What is Data Compression?

By R. Kayne

Updated: May 16, 2024

Data compression is a general term for a group of technologies that encode large files in order to shrink them down in size. Smaller files take up less room, leaving more storage space. They are also faster to transfer over a network, whether that network is the Internet, an intranet, or a local area network (LAN).

In the 1970s, various techniques were available to archive files, or place them together in a single package to avoid sending multiple files between computers. The idea was soon augmented with data compression techniques, so the term “archive” is now often used to describe a compressed file.

Compressing data involves applying an algorithm that finds repetitive patterns in a file and replaces them with shorter codes. The codes work like a shorthand map that is stored with the compressed file. When the file is decompressed, the map restores all of the removed bits, reconstituting the complete file. This technique can be used with text, graphics, executable programs, and multimedia files, though some types of files compress better than others.
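To make the shorthand idea concrete, here is a minimal sketch of run-length encoding, one of the simplest compression schemes. It is a toy illustration only, not the method used by zip or RAR programs, and the function names rle_encode and rle_decode are made up for this example.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (char, count) pairs back into the original string."""
    return "".join(char * count for char, count in pairs)


# 18 characters with long runs of repeated letters
original = "A" * 6 + "B" * 3 + "C" * 8 + "D"

encoded = rle_encode(original)
restored = rle_decode(encoded)

print(encoded)               # [('A', 6), ('B', 3), ('C', 8), ('D', 1)]
print(restored == original)  # True
```

The four pairs play the role of the shorthand: they take less room than the 18 original characters, yet decoding them rebuilds the text exactly. Data with few repeated runs would gain little or nothing from this scheme, which is one reason some files compress better than others.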

Today’s most commonly recognized data compression format dates back to the DOS operating system, before Microsoft Windows™ became ubiquitous in the mid-1990s. Its author, Phil Katz, released it with his PKZIP program, and the compressed files came to be called zipped files, the idea being that when the files were unzipped (decompressed), the full contents “popped” out. Files created with this method have the extension .zip.

Files that are extremely large even when compressed can be split into pieces before being sent over a network. The pieces are collected and reassembled on the receiving end. A widely used format for this, also from the days of DOS, is RAR, named after its author, Eugene Roshal. Programs that support Roshal ARchive files can create a multi-part set of RARs from a large multimedia file, for example, or decompress an existing set to reassemble a movie or program. These files have the extension .rar, or for multi-part files, .part01.rar, .part02.rar, and so on, or .r01, .r02, and so on.
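The split-and-reassemble step can be sketched in a few lines. This is only an illustration of the general idea and does not produce or read real RAR volumes; the helper names split_into_parts and reassemble, the 4,096-byte part size, and the part labels are all assumptions chosen for the example.

```python
def split_into_parts(data: bytes, part_size: int) -> list[bytes]:
    """Cut data into consecutive chunks of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def reassemble(parts: list[bytes]) -> bytes:
    """Join the chunks back together in their original order."""
    return b"".join(parts)


# Stand-in for a large file: 10,240 bytes of sample data
payload = bytes(range(256)) * 40

parts = split_into_parts(payload, 4096)

# Hypothetical labels, loosely modeled on multi-part archive naming
labels = [f"movie.part{n:02d}" for n in range(1, len(parts) + 1)]

for label, part in zip(labels, parts):
    print(label, len(part))

print(reassemble(parts) == payload)  # True
```

The receiving end only needs every piece and the correct order to rebuild the file, which is why multi-part archives number their pieces.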

Various music formats use other data compression techniques to shrink files while maintaining as much of the original quality as possible. The most obvious example is the .mp3 format. In this case, however, the compression is lossy: the file is not an archive, and playing it back does not restore the original recording. The bits that are removed to achieve the smaller file size are gone for good. Other techniques used for music files retain more quality but also result in bigger file sizes.
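The difference between the two kinds of compression can be shown with a small sketch. The lossless half uses Python's standard zlib module; the lossy half simply throws away the low four bits of each sample. That crude rounding is an assumption made for illustration and is far simpler than what an MP3 encoder actually does, but it shows why discarded detail cannot be recovered.

```python
import zlib

# Eight made-up audio sample values, one byte each
samples = bytes([12, 13, 15, 200, 201, 203, 90, 91])

# Lossless: decompressing returns exactly the original bytes
packed = zlib.compress(samples)
unpacked = zlib.decompress(packed)
print(unpacked == samples)  # True


def quantize(data: bytes) -> bytes:
    """Keep only the top four bits of each byte, discarding fine detail."""
    return bytes(value & 0xF0 for value in data)


# Lossy: nearby values collapse into one, and the originals are gone
lossy = quantize(samples)
print(list(lossy))  # [0, 0, 0, 192, 192, 192, 80, 80]
print(lossy == samples)  # False
```

After quantizing, 200, 201, and 203 have all become 192, and nothing in the result records which was which. The payoff is that the long runs of identical values are now much easier to compress.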

Data compression programs are widely available online. A zipped or RAR'd file requires a program that supports that method to unzip or unRAR it. Most programs support multiple types of compressed files, and many of these programs are freeware.
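Many programming languages can also read and write zipped files directly. As a rough sketch, the example below uses Python's standard zipfile module to build a small archive in memory and then unzip it; the file names and contents are invented for the example.

```python
import io
import zipfile

# Build the archive in memory rather than on disk
buffer = io.BytesIO()

with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("notes.txt", "hello " * 1000)  # highly repetitive text
    archive.writestr("readme.txt", "A second file in the same archive.")

# Reopen the same bytes for reading, as an unzip program would
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    info = archive.getinfo("notes.txt")
    notes = archive.read("notes.txt").decode("utf-8")

print(names)                                # ['notes.txt', 'readme.txt']
print(info.file_size)                       # 6000
print(info.compress_size < info.file_size)  # True
print(notes == "hello " * 1000)             # True
```

The archive holds both files in a single package, and the repetitive notes.txt is stored in far fewer bytes than its original 6,000, yet it comes back intact when read.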
