Comparing compression in different formats
This is a comparison of a small selection of compression methods used in georeferenced raster imagery. The intention is to find which method gives the best "bang for buck": the best quality for the smallest file size. The measure of quality is SSIM (structural similarity index), a perception-based model for calculating the difference between two images.
Method
We compare a lossless baseline image to compressed versions of the same dataset in the ECW, WEBP and JPEG formats, and measure the SSIM using the Python package scikit-image. This is repeated with a range of quality values for the compressed images. This should give an indication of which method is most efficient. In addition, it indicates which quality value achieves the same SSIM in each of the compression methods.
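To make the measure concrete, here is a minimal sketch of the SSIM expression evaluated once over a whole image, treated as a flat list of pixel values. This is a simplification for illustration, not the computation used for the results: the function name `global_ssim` is hypothetical, and the constants `k1=0.01` and `k2=0.03` are the values commonly used with SSIM.

```python
from statistics import fmean


def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """SSIM evaluated once over two equal-length sequences of pixel values.

    Illustration only: population statistics over a single window that
    covers the whole image, with no sliding window and no averaging.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Small constants that stabilise the division when means or variances are near zero.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean([(a - mu_x) ** 2 for a in x])
    var_y = fmean([(b - mu_y) ** 2 for b in y])
    cov_xy = fmean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Made-up pixel values, only to show the call.
baseline = [10, 50, 90, 130]
compressed = [12, 48, 95, 120]
print(global_ssim(baseline, baseline))
print(global_ssim(baseline, compressed))
```

In scikit-image the corresponding function is `skimage.metrics.structural_similarity`. It evaluates the same expression in a sliding window and averages the result, so its values will differ from this single-window version.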
Note that each of the file formats uses a different scale for the input "quality" parameter. We include a wide range of the legal values for each format, so this difference should not matter.
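Since the quality scales are not comparable, formats can be matched through SSIM instead. The sketch below linearly interpolates between measured (quality, SSIM) pairs to estimate the quality value that reaches a target SSIM. The function name `quality_for_ssim` and the sample numbers are hypothetical placeholders, not measurements from this comparison.

```python
def quality_for_ssim(samples, target):
    """Estimate the lowest quality value whose SSIM reaches `target`.

    samples: iterable of (quality, ssim) pairs measured for one format.
    Returns None when no pair of neighbouring samples brackets the target.
    """
    points = sorted(samples)
    for (q0, s0), (q1, s1) in zip(points, points[1:]):
        if s0 == target:
            return float(q0)
        # Only rising segments are considered, so a local dip in SSIM is skipped.
        if s0 < target <= s1:
            return q0 + (target - s0) * (q1 - q0) / (s1 - s0)
    return None


# Placeholder values for illustration only.
example = [(10, 0.5), (50, 0.8), (90, 1.0)]
print(quality_for_ssim(example, 0.9))
print(quality_for_ssim(example, 0.4))
```

Running this once per format with the same target gives one quality value per format, which can then be compared on file size.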
The baseline image has the following attributes:
Attribute | Value |
---|---|
File size | 273 MB |
Format | GeoTIFF |
Compression | LZW |
Dimensions | 10000 x 10000 px |
Pixel size | 0.25 meters |
The compressed ECW images were created with FME 2021.2. JPEG and WEBP were used as tile formats in GeoPackage datasets, with overviews. These were created with GDAL 3.4.1.
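The exact GDAL commands are not recorded here. As a sketch of one plausible way to produce such a GeoPackage, the snippet below builds `gdal_translate` and `gdaladdo` command lines using the GPKG driver's `TILE_FORMAT` and `QUALITY` options. The helper name `gpkg_commands`, the file names, the `average` resampling and the overview levels are all assumptions.

```python
import shlex


def gpkg_commands(src, dst, tile_format, quality, levels=(2, 4, 8, 16)):
    """Build command lines for a tiled GeoPackage with overviews (assumed workflow)."""
    translate = [
        "gdal_translate", "-of", "GPKG",
        "-co", f"TILE_FORMAT={tile_format}",
        "-co", f"QUALITY={quality}",
        src, dst,
    ]
    # The same settings are passed as open options so that overview tiles
    # are written in the same format and quality (assumed to be needed).
    overviews = [
        "gdaladdo", "-r", "average",
        "-oo", f"TILE_FORMAT={tile_format}",
        "-oo", f"QUALITY={quality}",
        dst, *[str(level) for level in levels],
    ]
    return [translate, overviews]


# Print the commands instead of running them; file names are placeholders.
for command in gpkg_commands("baseline.tif", "webp_q75.gpkg", "WEBP", 75):
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` on a machine where the GDAL command-line tools are installed.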
Results
The graph shows that at high compression/low SSIM, the ECW format is easily the most effective. However, as SSIM increases, the three formats converge, and at around SSIM=0.9 the formats produce equally sized files.
Comparing formats at SSIM 0.77
Comparing formats at SSIM 0.9
Comparing ECW q89 to q90
The ECW format shows an unexpected rise in SSIM when the quality is lowered from q90 to q89. Comparing the two images, it seems that significant noise is introduced at q90. File size does not change much between the two.