Image compression - Misplaced Pages

(Redirected from Image file compression) Reduction of image size to save storage and transmission costs

This article includes a list of general references, but it lacks sufficient corresponding inline citations. Please help to improve this article by introducing more precise citations. (April 2010) (Learn how and when to remove this message)

Image compression is a type of data compression applied to digital images, to reduce their cost for storage or transmission. Algorithms may take advantage of visual perception and the statistical properties of image data to provide superior results compared with generic data compression methods which are used for other digital data.

Lossy and lossless image compression

Image compression may be lossy or lossless. Lossless compression is preferred for archival purposes and often for medical imaging, technical drawings, clip art, or comics. Lossy compression methods, especially when used at low bit rates, introduce compression artifacts. Lossy methods are especially suitable for natural images such as photographs in applications where minor (sometimes imperceptible) loss of fidelity is acceptable to achieve a substantial reduction in bit rate. Lossy compression that produces negligible differences may be called visually lossless.

Methods for lossy compression:

Transform coding – This is the most commonly used method.
- Discrete Cosine Transform (DCT) – The most widely used form of lossy compression. It is a type of Fourier-related transform, and was originally developed by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974. The DCT is sometimes referred to as "DCT-II" in the context of a family of discrete cosine transforms (see discrete cosine transform). It is generally the most efficient form of image compression.
  - DCT is used in JPEG, the most popular lossy format, and the more recent HEIF.
- The more recently developed wavelet transform is also used extensively, followed by quantization and entropy coding.
Color quantization - Reducing the color space to a few "representative" colors in the image. The selected colors are specified in the color palette in the header of the compressed image. Each pixel just references the index of a color in the color palette. This method can be combined with dithering to avoid posterization.
- Whole-image palette, typically 256 colors, used in GIF and PNG file formats.
- block palette, typically 2 or 4 colors for each block of 4x4 pixels, used in BTC, CCC, S2TC, and S3TC.
Chroma subsampling. This takes advantage of the fact that the human eye perceives spatial changes of brightness more sharply than those of color, by averaging or dropping some of the chrominance information in the image.
Fractal compression.
More recently, methods based on Machine Learning were applied, using Multilayer perceptrons, Convolutional neural networks, Generative adversarial networks and Diffusion models. Implementations are available in OpenCV, TensorFlow, MATLAB's Image Processing Toolbox (IPT), and the High-Fidelity Generative Image Compression (HiFiC) open source project.

Methods for lossless compression:

Run-length encoding – used in default method in PCX and as one of possible in BMP, TGA, TIFF
Predictive coding – used in DPCM
Entropy encoding – the two most common entropy encoding techniques are arithmetic coding and Huffman coding
Adaptive dictionary algorithms such as LZW – used in GIF and TIFF
DEFLATE – used in PNG, MNG, and TIFF
Chain codes

Other properties

The best image quality at a given compression rate (or bit rate) is the main goal of image compression, however, there are other important properties of image compression schemes:

Scalability generally refers to a quality reduction achieved by manipulation of the bitstream or file (without decompression and re-compression). Other names for scalability are progressive coding or embedded bitstreams. Despite its contrary nature, scalability also may be found in lossless codecs, usually in form of coarse-to-fine pixel scans. Scalability is especially useful for previewing images while downloading them (e.g., in a web browser) or for providing variable quality access to e.g., databases. There are several types of scalability:

Quality progressive or layer progressive: The bitstream successively refines the reconstructed image.
Resolution progressive: First encode a lower image resolution; then encode the difference to higher resolutions.
Component progressive: First encode grey-scale version; then adding full color.

Region of interest coding. Certain parts of the image are encoded with higher quality than others. This may be combined with scalability (encode these parts first, others later).

Meta information. Compressed data may contain information about the image which may be used to categorize, search, or browse images. Such information may include color and texture statistics, small preview images, and author or copyright information.

Processing power. Compression algorithms require different amounts of processing power to encode and decode. Some high compression algorithms require high processing power.

The quality of a compression method often is measured by the peak signal-to-noise ratio. It measures the amount of noise introduced through a lossy compression of the image, however, the subjective judgment of the viewer also is regarded as an important measure, perhaps, being the most important measure.

History

Entropy coding started in the late 1940s with the introduction of Shannon–Fano coding, the basis for Huffman coding which was published in 1952. Transform coding dates back to the late 1960s, with the introduction of fast Fourier transform (FFT) coding in 1968 and the Hadamard transform in 1969.

An important development in image data compression was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed, T. Natarajan and K. R. Rao in 1973. JPEG was introduced by the Joint Photographic Experts Group (JPEG) in 1992. JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format. JPEG was largely responsible for the wide proliferation of digital images and digital photos, with several billion JPEG images produced every day as of 2015.

Lempel–Ziv–Welch (LZW) is a lossless compression algorithm developed by Abraham Lempel, Jacob Ziv and Terry Welch in 1984. It is used in the GIF format, introduced in 1987. DEFLATE, a lossless compression algorithm developed by Phil Katz and specified in 1996, is used in the Portable Network Graphics (PNG) format.

The JPEG 2000 standard was developed from 1997 to 2000 by a JPEG committee chaired by Touradj Ebrahimi (later the JPEG president). In contrast to the DCT algorithm used by the original JPEG format, JPEG 2000 instead uses discrete wavelet transform (DWT) algorithms. It uses the CDF 9/7 wavelet transform (developed by Ingrid Daubechies in 1992) for its lossy compression algorithm, and the Le Gall–Tabatabai (LGT) 5/3 wavelet transform (developed by Didier Le Gall and Ali J. Tabatabai in 1988) for its lossless compression algorithm. JPEG 2000 technology, which includes the Motion JPEG 2000 extension, was selected as the video coding standard for digital cinema in 2004.

Huffman Coding

Huffman coding is a fundamental technique used in image compression algorithms to achieve efficient data representation. Named after its inventor David A. Huffman, this method is widely employed in various image compression standards such as JPEG and PNG.

Principle of Huffman Coding

Huffman coding is a form of entropy encoding that assigns variable-length codes to input symbols based on their frequencies of occurrence. The basic principle is to assign shorter codes to more frequently occurring symbols and longer codes to less frequent symbols, thereby reducing the average code length compared to fixed-length codes.

Application in Image Compression

In image compression, Huffman coding is typically applied after other transformations like Discrete Cosine Transform (DCT) in the case of JPEG compression. After transforming the image data into a frequency domain representation, Huffman coding is used to encode the transformed coefficients efficiently.

Steps in Huffman Coding for Image Compression

Frequency Analysis: Calculate the frequency of occurrence of each symbol or symbol combination in the transformed image data.
Constructing the Huffman Tree: Build a Huffman tree based on the symbol frequencies. The tree is constructed recursively by combining the nodes with the lowest frequencies until a single root node is formed.
Assigning Codewords: Traverse the Huffman tree to assign variable-length codewords to each symbol, with shorter codewords assigned to more frequent symbols.
Encoding: Replace the original symbols in the image data with their corresponding Huffman codewords to generate the compressed data stream.

Benefits of Huffman Coding in Image Compression

Lossless Compression: Huffman coding can be used in both lossy and lossless image compression techniques, providing flexibility in balancing between compression ratio and image quality.
Efficiency: By assigning shorter codes to frequently occurring symbols, Huffman coding reduces the average code length, resulting in efficient data representation and reduced storage requirements.
Compatibility: Huffman coding is widely supported and can be seamlessly integrated into existing image compression standards and algorithms.

Conclusion

Huffman coding plays a crucial role in image compression by efficiently encoding image data into a compact representation. Its ability to adaptively assign variable-length codewords based on symbol frequencies makes it an essential component in modern image compression techniques, contributing to the reduction of storage space and transmission bandwidth while maintaining image quality.

Notes and references

"Image Data Compression".
Ahmed, N.; Natarajan, T.; Rao, K.R. (1974). "Discrete Cosine Transform" (PDF). IEEE Transactions on Computers: 90–93. doi:10.1109/T-C.1974.223784. S2CID 149806273. Archived from the original (PDF) on 2011-11-25.
Gilad David Maayan (Nov 24, 2021). "AI-Based Image Compression: The State of the Art". Towards Data Science. Retrieved 6 April 2023.
Bühlmann, Matthias (2022-09-28). "Stable Diffusion Based Image Compression". Medium. Retrieved 2022-11-02.
"High-Fidelity Generative Image Compression". Retrieved 6 April 2023.
Burt, P.; Adelson, E. (1 April 1983). "The Laplacian Pyramid as a Compact Image Code". IEEE Transactions on Communications. 31 (4): 532–540. CiteSeerX 10.1.1.54.299. doi:10.1109/TCOM.1983.1095851. S2CID 8018433.
Shao, Dan; Kropatsch, Walter G. (February 3–5, 2010). Špaček, Libor; Franc, Vojtěch (eds.). "Irregular Laplacian Graph Pyramid" (PDF). Computer Vision Winter Workshop 2010. Nové Hrady, Czech Republic: Czech Pattern Recognition Society. Archived (PDF) from the original on 2013-05-27.
Claude Elwood Shannon (1948). Alcatel-Lucent (ed.). "A Mathematical Theory of Communication" (PDF). Bell System Technical Journal. 27 (3–4): 379–423, 623–656. doi:10.1002/j.1538-7305.1948.tb01338.x. hdl:11858/00-001M-0000-002C-4314-2. Archived (PDF) from the original on 2011-05-24. Retrieved 2019-04-21.
David Albert Huffman (September 1952), "A method for the construction of minimum-redundancy codes" (PDF), Proceedings of the IRE, vol. 40, no. 9, pp. 1098–1101, doi:10.1109/JRPROC.1952.273898, archived (PDF) from the original on 2005-10-08
Pratt, W.K.; Kane, J.; Andrews, H.C. (1969). "Hadamard transform image coding". Proceedings of the IEEE. 57: 58–68. doi:10.1109/PROC.1969.6869.
Ahmed, Nasir (January 1991). "How I Came Up With the Discrete Cosine Transform". Digital Signal Processing. 1 (1): 4–5. Bibcode:1991DSP.....1....4A. doi:10.1016/1051-2004(91)90086-Z.
"T.81 – DIGITAL COMPRESSION AND CODING OF CONTINUOUS-TONE STILL IMAGES – REQUIREMENTS AND GUIDELINES" (PDF). CCITT. September 1992. Archived (PDF) from the original on 2000-08-18. Retrieved 12 July 2019.
"The JPEG image format explained". BT.com. BT Group. 31 May 2018. Retrieved 5 August 2019.
"What Is a JPEG? The Invisible Object You See Every Day". The Atlantic. 24 September 2013. Retrieved 13 September 2019.
Baraniuk, Chris (15 October 2015). "Copy protections could come to JPEGs". BBC News. BBC. Retrieved 13 September 2019.
"The GIF Controversy: A Software Developer's Perspective". 27 January 1995. Retrieved 26 May 2015.
L. Peter Deutsch (May 1996). DEFLATE Compressed Data Format Specification version 1.3. IETF. p. 1. sec. Abstract. doi:10.17487/RFC1951. RFC 1951. Retrieved 2014-04-23.
Taubman, David; Marcellin, Michael (2012). JPEG2000 Image Compression Fundamentals, Standards and Practice: Image Compression Fundamentals, Standards and Practice. Springer Science & Business Media. ISBN 9781461507994.
^ Unser, M.; Blu, T. (2003). "Mathematical properties of the JPEG2000 wavelet filters" (PDF). IEEE Transactions on Image Processing. 12 (9): 1080–1090. Bibcode:2003ITIP...12.1080U. doi:10.1109/TIP.2003.812329. PMID 18237979. S2CID 2765169. Archived from the original (PDF) on 2019-10-13.
Sullivan, Gary (8–12 December 2003). "General characteristics and design considerations for temporal subband video coding". ITU-T. Video Coding Experts Group. Retrieved 13 September 2019.
Bovik, Alan C. (2009). The Essential Guide to Video Processing. Academic Press. p. 355. ISBN 9780080922508.
Le Gall, Didier; Tabatabai, Ali J. (1988). "Sub-band coding of digital images using symmetric short kernel filters and arithmetic coding techniques". ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing. pp. 761–764 vol.2. doi:10.1109/ICASSP.1988.196696. S2CID 109186495.
Swartz, Charles S. (2005). Understanding Digital Cinema: A Professional Handbook. Taylor & Francis. p. 147. ISBN 9780240806174.

Data compression methods

Lossless

Entropy type	Adaptive coding Arithmetic Asymmetric numeral systems Golomb Huffman Adaptive Canonical Modified Range Shannon Shannon–Fano Shannon–Fano–Elias Tunstall Unary Universal Exp-Golomb Fibonacci Gamma Levenshtein
Dictionary type	Byte pair encoding Lempel–Ziv 842 LZ4 LZJB LZO LZRW LZSS LZW LZWL Snappy
Other types	BWT CTW CM Delta Incremental DMC DPCM Grammar Re-Pair Sequitur LDCT MTF PAQ PPM RLE
Hybrid	LZ77 + Huffman Deflate LZX LZS LZ77 + ANS LZFSE LZ77 + Huffman + ANS Zstandard LZ77 + Huffman + context Brotli LZSS + Huffman LHA/LZH LZ77 + Range LZMA LZHAM RLE + BWT + MTF + Huffman bzip2

Lossy

Transform type	Discrete cosine transform DCT MDCT DST FFT Wavelet Daubechies DWT SPIHT
Predictive type	DPCM ADPCM LPC ACELP CELP LAR LSP WLPC Motion Compensation Estimation Vector Psychoacoustic

Audio

Concepts	Bit rate ABR CBR VBR Companding Convolution Dynamic range Latency Nyquist–Shannon theorem Sampling Silence compression Sound quality Speech coding Sub-band coding
Codec parts	A-law μ-law DPCM ADPCM DM FT FFT LPC ACELP CELP LAR LSP WLPC MDCT Psychoacoustic model

Image

Concepts	Chroma subsampling Coding tree unit Color space Compression artifact Image resolution Macroblock Pixel PSNR Quantization Standard test image Texture compression
Methods	Chain code DCT Deflate Fractal KLT LP RLE Wavelet Daubechies DWT EZW SPIHT

Video

Concepts	Bit rate ABR CBR VBR Display resolution Frame Frame rate Frame types Interlace Video characteristics Video quality
Codec parts	DCT DPCM Deblocking filter Lapped transform Motion Compensation Estimation Vector Wavelet Daubechies DWT

Theory

Community

Hutter Prize

People

Mark Adler

Multimedia compression and container formats

Video
compression

ISO, IEC, MPEG	DV MJPEG Motion JPEG 2000 MPEG-1 MPEG-2 Part 2 MPEG-4 Part 2 / ASP Part 10 / AVC Part 33 / IVC MPEG-H Part 2 / HEVC MPEG-I Part 3 / VVC MPEG-5 Part 1 / EVC Part 2 / LCEVC
ITU-T, VCEG	H.120 H.261 H.262 H.263 H.264 / AVC H.265 / HEVC H.266 / VVC
SMPTE	VC-1 VC-2 VC-3 VC-5 VC-6
TrueMotion and AOMedia	TrueMotion S VP3 VP6 VP7 VP8 VP9 AV1
Chinese Standard	AVS1 P2/AVS+(GB/T 20090.2/16) AVS2 P2(GB/T 33475.2,GY/T 299.1) HDR Vivid(GY/T 358) AVS3 P2(GY/T 368)
Others	Apple Video AVS Bink Cinepak Daala DVI FFV1 Huffyuv Indeo Lagarith Microsoft Video 1 MSU Lossless OMS Video Pixlet ProRes 422 4444 QuickTime Animation Graphics RealVideo RTVideo SheerVideo Smacker Sorenson Video/Spark Theora Thor Ut WMV XEB YULS

Audio
compression

ISO, IEC, MPEG	MPEG-1 Layer II Multichannel MPEG-1 Layer I MPEG-1 Layer III (MP3) AAC HE-AAC AAC-LD MPEG Surround MPEG-4 ALS MPEG-4 SLS MPEG-4 DST MPEG-4 HVXC MPEG-4 CELP MPEG-D USAC MPEG-H 3D Audio
ITU-T	G.711 A-law µ-law G.718 G.719 G.722 G.722.1 G.722.2 G.723 G.723.1 G.726 G.728 G.729 G.729.1
IETF	Opus iLBC Speex Vorbis FLAC
3GPP	AMR AMR-WB AMR-WB+ EVRC EVRC-B EVS GSM-HR GSM-FR GSM-EFR
ETSI	AC-3 AC-4 DTS
Bluetooth SIG	SBC LC3
Chinese Standard	AVS1 P10(GB/T 20090.10) AVS2 P3(GB/T 33475.3) Audio Vivid(GY/T 363) DRA(GB/T 22726) ExAC(SJ/T 11299.4)
Others	ACELP ALAC Asao ATRAC CELT Codec 2 iSAC Lyra MELP Monkey's Audio MT9 Musepack OptimFROG OSQ QCELP RCELP RealAudio RTAudio SD2 SHN SILK Siren SMV SVOPC TTA True Audio TwinVQ VMR-WB VSELP WavPack WMA MQA aptX aptX HD aptX Low Latency aptX Adaptive LDAC LHDC LLAC

Image
compression

IEC, ISO, IETF, W3C, ITU-T, JPEG	CCITT Group 4 GIF HEIC / HEIF HEVC JBIG JBIG2 JPEG JPEG 2000 JPEG-LS JPEG XL JPEG XR JPEG XS JPEG XT PNG TIFF TIFF/EP TIFF/IT
Others	APNG AV1 AVIF BPG DjVu EXR FLIF ICER MNG PGF QOI QTVR WBMP WebP

Containers

ISO, IEC	MPEG-ES MPEG-PES MPEG-PS MPEG-TS ISO/IEC base media file format MPEG-4 Part 14 (MP4) Motion JPEG 2000 MPEG-21 Part 9 MPEG media transport
ITU-T	H.222.0 T.802
IETF	RTP Ogg Matroska
SMPTE	GXF MXF
Others	3GP and 3G2 AMV ASF AIFF AVI AU BPG Bink Smacker BMP DivX Media Format EVO Flash Video HEIF IFF M2TS Matroska WebM QuickTime File Format RatDVD RealMedia RIFF WAV MOD and TOD VOB, IFO and BUP

Collaborations

Methods

Entropy
LPC
- ACELP
- CELP
- LSP
- WLPC
Lossless
Lossy
LZ
- DEFLATE
- LZW
PCM
- A-law
- µ-law
- ADPCM
- DPCM
Transforms
- DCT
- FFT
- MDCT
- Wavelet
  - Daubechies
  - DWT

Lists

See Compression methods for techniques and Compression software for codecs

Categories: