File Compression


  

  1. Overview
  2. (Compressed) File Extension Chart

Overview

  

"There is no reason anyone would want a computer in their home"
- Ken Olsen, president, chairman and founder of Digital Equipment Corp., 1977

Use a computer for any amount of time and you will quickly realize that its storage space is limited. One way to deal with this problem is to use a compression software package that "squishes" unused programs into small boxes, thus freeing up a little more of your disk space for other programs.

It turns out that storage space problems are not limited to your local computer. As the number of files available through web/FTP increases daily, web/FTP sites are actively looking for ways to squeeze more files into a limited amount of space. They accomplish this by using file compression.

The good news is that a compressed file takes up a lot less space on the web/FTP site's computer. The bad news is that a compressed file is absolutely useless until you uncompress it. Wait ... it gets worse. Before you can uncompress a file, you have to know what compression method was used to compress the file in the first place. Unfortunately, there is no one standard file compression method -- there are HUNDREDS of different file compression methods in use today.
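To see both the good news and the bad news in action, here is a minimal sketch in Python. The language and the sample data are my own choices for illustration, and the standard gzip module used here implements only one of the many compression methods; it is not necessarily the one a given web/FTP site uses.

```python
import gzip

# Made-up sample data: repetitive text tends to compress well.
original = b"the quick brown fox jumps over the lazy dog\n" * 200

# Compress the data, then uncompress it again with the matching method.
compressed = gzip.compress(original)
restored = gzip.decompress(compressed)

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print("round trip intact:", restored == original)
```

The compressed bytes are not readable as text; only after decompressing with the matching method do you get the original contents back.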

If you have to know what compression method was used before you can uncompress a file, how are you ever going to figure out which method was used? Well, it is actually pretty easy:

  1. Most web/FTP directories have an Index or a README file (or something similar) that lists all the files in that directory. Some really nice sites have expanded README files that also mention which compression method was used and where you can get a free copy of the software needed to uncompress the files.

  2. Look at the files' extensions. By looking at the extensions and comparing them to the chart below, you will be able to determine what compression method was used and what particular software is needed to uncompress the file.
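The extension lookup in step 2 can be sketched in a few lines of Python. The CHART dictionary is only a small excerpt of this document's chart, and the names CHART and lookup are hypothetical, chosen for illustration.

```python
# A small excerpt of the chart, keyed by extension.
# Each value is (transfer mode, program used to uncompress).
CHART = {
    ".tar.gz": ("Binary", "gzip, tar"),
    ".tar.Z": ("Binary", "uncompress, tar"),
    ".gz": ("Binary", "gunzip"),
    ".Z": ("Binary", "uncompress"),
    ".zip": ("Binary", "unzip"),
    ".tar": ("Binary", "tar"),
    ".hqx": ("Ascii", "StuffIt"),
    ".uue": ("Ascii", "uudecode"),
    ".txt": ("Ascii", "None"),
}


def lookup(filename):
    """Return (extension, transfer mode, program) for filename, or None."""
    # Try longer extensions first so that ".tar.gz" wins over ".gz".
    for ext in sorted(CHART, key=len, reverse=True):
        if filename.endswith(ext):
            mode, program = CHART[ext]
            return ext, mode, program
    return None


print(lookup("archive.tar.gz"))
print(lookup("notes.txt"))
print(lookup("mystery.xyz"))
```

Matching is case-sensitive in this sketch, a simplification that keeps ".Z" distinct from lowercase extensions; a file named PROGRAM.ZIP would need extra handling.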

Fortunately, most uncompression software is either public domain (meaning that it is completely free) or shareware (meaning that you can get a copy of it for free, but the author expects you to send him some money for the program if you decide to keep it and use it).

(Compressed) File Extension Chart

  

The list below shows some of the most popular extensions that you are bound to encounter during your visits to web/FTP sites around the world. It also shows the transfer mode needed to retrieve files with each extension, which software package you need to uncompress the files after you retrieve them, and some additional comments about each extension.
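As a rough sketch of why the transfer mode matters, the Python below chooses between text-mode and binary-mode retrieval in the standard ftplib module based on a file's extension. The ASCII_EXTENSIONS tuple is a partial excerpt of the chart, and the helper names transfer_mode and fetch are assumptions made for illustration. Treating every unlisted extension as Binary is a simplification.

```python
from ftplib import FTP

# Some of the extensions the chart marks as Ascii (partial excerpt).
ASCII_EXTENSIONS = (".txt", ".hqx", ".uue", ".uu", ".shar", ".ps")


def transfer_mode(filename):
    """Return "Ascii" or "Binary" for filename, defaulting to Binary."""
    if filename.lower().endswith(ASCII_EXTENSIONS):
        return "Ascii"
    return "Binary"


def fetch(ftp: FTP, filename):
    """Download filename over an already-connected, logged-in FTP session."""
    if transfer_mode(filename) == "Ascii":
        with open(filename, "w") as out:
            # retrlines delivers each line without its line ending,
            # so a newline is added back when writing.
            ftp.retrlines("RETR " + filename, lambda line: out.write(line + "\n"))
    else:
        with open(filename, "wb") as out:
            # retrbinary passes raw blocks of bytes to the callback.
            ftp.retrbinary("RETR " + filename, out.write)


print(transfer_mode("README.txt"))
print(transfer_mode("archive.tar.gz"))
```

Retrieving a Binary file in Ascii mode can corrupt it, which is why the chart lists a transfer mode for every extension.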

Even though the list is quite extensive, keep in mind that there are literally HUNDREDS of compression methods in use today, and there is no way that I can list all of them. If you find that a particular extension is not listed below and feel strongly that it should be included, please let me know.

File Extension | Transfer mode | Use this to uncompress | CEE UCL program | Description
.adi | Ascii | None | acad | AutoCAD Device-Independent Binary Plotter Format.
.ai | Ascii | None | illustrator | Adobe Illustrator File Format.
.aiff, .aif, .aifc, .faic | Binary | None | xplay | Audio Interchange File Format; sound file format used on Unix and Macintosh.
.arc | Binary | ARC, ARCE | arc | MS-DOS format, which requires the use of the ARC or ARCE programs. (Hardly used anymore.)
.arj | Binary | Arj | arj | Another MS-DOS format; requires the use of Arj.
.asf | Binary | None | No | .ASF is Microsoft's proprietary file format for storing audio and video information [for Microsoft Media Player] and is specially designed to run over networks like the Internet. Kind of a counterpart to RealAudio's .RA/.RAM/.RM3 format.
.au | Binary | None | xplay | Digital audio file format used on Sun and NeXT workstations (=Unix).
.bmp | Binary | None | gimp, photoshop, xv, im | Standard Windows image format on DOS and Windows-compatible computers. The BMP format supports RGB, indexed-color, grayscale, and Bitmap color modes, and does not support alpha channels. You can specify either Microsoft Windows or IBM OS/2 format and a bit depth for the image. For 4-bit and 8-bit images using Windows format, you can also specify RLE compression.
.cab | Binary | None | N/A | Microsoft Cabinet file format.
.ccitt | Binary | None | im | CCITT Group 3 and Group 4 Encoding graphic file format.
.cdf | Ascii | None | StarOffice, wabi | Common Data Format for spreadsheet and matrix data.
.cgm | Binary | None | xv, im | Computer Graphics Metafile file format; for vector and raster graphics/picture files.
.cmf | Binary | None | xplay | Creative Labs Music Format; sound file format for the SoundBlaster sound card family.
.cpt | Binary | Compact Pro | No | Compact Pro compression format for Mac and Power PC.
.dem | Ascii | None | arc | USGS Digital Elevation Model spatial raster data file format.
.dgn | Binary | Microstation | arc/info | Microstation drawing file.
.dlg | Ascii | None | arc | USGS Digital Line Graph spatial vector data file format.
.doc | Ascii/Binary | None | StarOffice, wabi | Another common extension for text documents; no decompression is needed. Be careful, though: .doc and .DOC extensions are also used for Microsoft Word documents, which are Binary files. The duck theory will help you determine the difference.
.drw | Binary | Designer | illustrator | Micrografx Designer/Draw Plus drawing file format.
.dwg | Binary | AutoCAD | acad | AutoCAD drawing file.
.dxf | Ascii | AutoCAD | acad | AutoCAD Data Exchange Format file for transferring CAD data among applications.
.eps | Ascii | None | illustrator, gimp, photoshop | The Encapsulated PostScript (EPS) language file format can contain both vector and bitmap graphics and is supported by virtually all graphic, illustration, and page-layout programs. The EPS format is used to transfer PostScript language artwork between applications.
.exe | Binary | None | N/A | Self-extracting MS-DOS executable (creates files on disk when run). Run the file, or try unzip, lha or arj on it.
.gif | Binary | None | gimp, photoshop, xv, im | CompuServe's GIF87a and GIF89a Graphics Interchange Format; for graphics/picture files. GIF files are images compressed with the LZW algorithm to minimize file size and electronic transfer time. The GIF89a format supports interlacing and can define background transparency.
.gz | Binary | gunzip (=gzip) | gzip | GNU zip (gzip), a Unix compression format. To uncompress, type gunzip filename.gz (or gzip -d filename.gz) at your host system's command line. (Not compatible with Zip.)
.hap | Binary | HAP | No | Hamarsoft HAP archiver (Markov modeling + arithmetic coding).
.hdf | Binary | None | im | Hierarchical Data Format; graphic file format.
.hpk | Binary | hpack | No | hpack (archiver with strong encryption).
.hqx | Ascii | StuffIt | No | A Macintosh BinHex format; transfer in text mode. You need the StuffIt program to decode a ‘.hqx’ file. BinHex is *not* a compression program; it is similar to uuencode but handles multiple forks.
.iff | Binary | None | im | Amiga Interchange File Format (IFF) is used for working with Video Toaster and transferring files to and from the Commodore Amiga system. In addition, this format is supported by a number of paint programs on IBM computers. IFF is the best export format to use with DeluxePaint from Electronic Arts. The IFF format supports RGB, indexed-color, grayscale, and Bitmap color modes, and does not support alpha channels.
.lha, .lzh | Binary | LHa, LHarc | lha | Another MS-DOS format. Originated in Japan.
.jbig | Binary | None | im | Joint Bi-level Image Experts Group format; for graphics files on Unix.
.jpg, .jfif | Binary | None | gimp, photoshop, xv, im | JPEG format by the Joint Photographic Experts Group (JPEG); for photographs and other continuous-tone image files. JPEG format supports CMYK, RGB, and grayscale color modes, and does not support alpha channels. Unlike the GIF format, JPEG retains all color information in an RGB image but compresses file size by selectively discarding data. A JPEG image is automatically decompressed when opened. A higher level of compression results in lower image quality, and a lower level of compression results in better image quality.
.mid, .midi | Binary | None | timidity | MIDI (Musical Instrument Digital Interface) sound file format.
.miff | Binary | None | im | ImageMagick's file format for raster images.
.mpg, .mpeg | Binary | None | xanim | Video file compressed by the MPEG-1 standard.
.mp2, .mpa2, .mp2a, .mpa, .abs, .mpega | Binary | None | mxaudio | MPEG Audio Layer II compressed audio.
.mp3 | Binary | None | mxaudio | MPEG Audio Layer III compressed audio.
.pak | Binary | pak | No | pak for MS-DOS (LZW algorithm).
.pcm | Binary | None | xaudio | Pulse Code Modulation sound file format; pretty much hardware independent.
.pbm | Binary | None | gimp, photoshop, xv, im | Portable Bitmap, monochrome, graphic file format (normally on Unix).
.pcx | Binary | None | gimp, photoshop, im | ZSoft's PCX graphic file format commonly used on the IBM PC. PCX format supports RGB, indexed-color, grayscale, and Bitmap color modes, and does not support alpha channels. PCX supports the RLE compression method. Images can have a bit depth of 1, 4, 8, or 24.
.pdf | Binary | None | acroread | Adobe's Portable Document Format. PDF format supports RGB, indexed-color, CMYK, grayscale, Bitmap, and Lab color modes, and does not support alpha channels. The format supports JPEG and ZIP compression, except for Bitmap-mode files, which use CCITT Group 4 compression when saved as Photoshop PDF.
.pgm | Binary | None | gimp, photoshop, xv, im | Portable Graymap, grayscale, graphic file format (normally on Unix).
.pit | Binary | PackIt | No | PackIt compression format used on Macintosh.
.pix | Binary | None | xv, im | Color RGB image file format.
.png | Binary | None | gimp, im | Portable Network Graphics file format. Developed as a patent-free alternative to GIF, the Portable Network Graphics (PNG) format is used for losslessly compressing and displaying images on the World Wide Web. Unlike GIF, PNG supports 24-bit images and produces background transparency without jagged edges; however, some older versions of Web browsers may not support PNG images. The PNG format supports grayscale and RGB color modes with a single alpha channel, and Bitmap and indexed-color modes without alpha channels. PNG uses the saved alpha channel to define transparency in the file.
.pnm | Binary | None | gimp, xv, im | Portable AnyMap graphic file format for Unix.
.ppm | Binary | None | gimp, xv, im | Portable Pixmap graphic file format (normally on Unix).
.ppt | Binary | PowerPoint | StarOffice, wabi | Microsoft PowerPoint presentation file format.
.ps | Ascii | None | gs, gv | A PostScript document (in Adobe's page description language). You can print this file on any PostScript-capable printer or use a previewer, like the GNU project's Ghostscript.
.psd | Binary | Photoshop | photoshop, illustrator | Adobe's Photoshop graphic file format.
.qt, .mov | Binary | None | xanim | Apple's QuickTime movie file format.
.ra, .ram, .rm | Binary | None | raplayer, rvplayer | RealAudio sound file format.
.raw | Binary | None | xplay | Raw signed Pulse Code Modulation sound file format.
.riff | Binary | None | xaudio | Resource Interchange File Format sound file. Tagged structure and basis for the .wav file format.
.rtf | Ascii | None | StarOffice, wabi | Rich Text Format. Used in translating native document formats among various word processors.
.snd | Binary | None | xaudio | Raw signed Pulse Code Modulation sound file format.
.sgi | Binary | None | im | Silicon Graphics (SGI) Image file format.
.shar | Ascii | sh | sh | Unix SHell ARchive. This is not a compressed file format. To extract on Unix, use "sh foo.shar", where "foo.shar" is the name of the shar archive.
.sea | Binary | None | No | Self-extracting archive for Macintosh.
.sit | Binary | StuffIt | No | A Macintosh compression format that requires the StuffIt program.
.sqz | Binary | Squeeze | No | Squeeze for MS-DOS. Based on LZ77 with hashing compression.
.tar | Binary | tar, xtar | tar, xtar | Unix Tape ARchive. Often used to combine several related files into one large file (tar by itself does not compress). All Unix systems will have a program called tar for "un-tarring" such files. Often, a "tar'd" file will also be compressed with the gzip program, so you first have to use gunzip and then tar. To ungzip and un-tar at the same time, type "gzip -cd filename.tar.gz | tar xpvf -" at your host system's command line.
.tar.gz | Binary | gzip, tar | Yes | tar + gzip.
.tar.Z | Binary | uncompress, tar | Yes | tar + compress.
.tga | Binary | None | gimp, photoshop, xv, im | Truevision Targa graphic file format. The Targa format supports 32-bit RGB files with a single alpha channel, and indexed-color, grayscale, and 16-bit and 24-bit RGB files without alpha channels.
.tif, .tiff | Binary | None | gimp, photoshop, xv, im | Aldus' Tagged Image File Format (TIFF) graphic file format. TIFF is a flexible bitmap image format supported by virtually all paint, image-editing, and page-layout applications. Also, virtually all desktop scanners can produce TIFF images. The TIFF format supports CMYK, RGB, and grayscale files with alpha channels, and Lab, indexed-color, and Bitmap files without alpha channels. TIFF also supports LZW compression.
.txt | Ascii | None | editors | By itself, this means the file is an ASCII text file rather than a program, and does not need to be uncompressed.
.uue, .uu | Ascii | uudecode | uuencode, uudecode | Transfer as a text file; use uudecode to convert it to binary; then, if the decoded file is itself compressed, uncompress it according to its format.
.voc | Binary | None | xaudio | Creative Labs' Voice audio file format.
.wav | Binary | None | wabi, xaudio | Microsoft Windows sound file format based on the RIFF format.
.wmf | Binary | None | StarOffice, wabi | Windows Metafile Format by Microsoft. Graphics file format.
.wpg | Binary | None | N/A | WordPerfect Graphics Metafile.
.xbm | Binary | None | gimp, xv, im | X Bitmap graphic file format for the X Window system.
.Z | Binary | uncompress | compress, uncompress | Unix compression based on the LZW algorithm. To uncompress, type "uncompress filename.Z" at your system prompt.
.zip | Binary | unzip, Pkunzip | unzip | This indicates the file has been compressed with a common MS-DOS compression program known as PKZIP. On a Unix system, you can use unzip to un-Zip a file.
.zoo | Binary | zoo | No | A Unix and MS-DOS compression format. Use a program called zoo to uncompress. (Almost obsolete.)
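
The chart's .tar entry suggests piping gzip -cd into tar to ungzip and un-tar in one step. For readers without those programs at hand, roughly the same thing can be sketched with Python's standard tarfile module. The helper name untar_gz and the throwaway archive built here are assumptions made for illustration, not part of any tool named in the chart.

```python
import os
import tarfile
import tempfile


def untar_gz(archive_path, dest_dir):
    """Ungzip and un-tar in one step; return the names of the archive members."""
    os.makedirs(dest_dir, exist_ok=True)
    # Mode "r:gz" reads a tar archive that was compressed with gzip.
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return names


# Demonstration on a throwaway archive built in a temporary directory.
with tempfile.TemporaryDirectory() as work:
    source = os.path.join(work, "hello.txt")
    with open(source, "w") as f:
        f.write("hello\n")

    # Build filename.tar.gz containing the single file hello.txt.
    archive_path = os.path.join(work, "filename.tar.gz")
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source, arcname="hello.txt")

    extracted = untar_gz(archive_path, os.path.join(work, "out"))
    with open(os.path.join(work, "out", "hello.txt")) as f:
        content = f.read()

print(extracted, repr(content))
```

As with the command-line tools, only extract archives from sites you trust, since an archive can contain files with unexpected paths.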