It's Time to Move Past JPEG

JPEG made its way onto the scene in the early 1990s, with the first part of its specification published in 1992.

JPEG relies on a mathematical process called the Discrete Cosine Transform (DCT). The DCT describes each 8x8 block of pixels as a sum of standing cosine waves: instead of storing the color at each pixel, you store the amplitudes of the 64 standing waves that could contribute to the block.
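To make the "64 amplitudes instead of 64 pixels" idea concrete, here is a minimal sketch of an 8x8 DCT and its inverse in plain Python. It assumes the common orthonormal DCT-II scaling and skips everything else a real JPEG encoder does (level shifting, quantization, entropy coding); the function names are made up for illustration.

```python
import math

N = 8  # JPEG transforms the image in 8x8 blocks


def _scale(k):
    # Orthonormal scaling (an assumption of this sketch) so that the
    # inverse is the exact mirror of the forward pass.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _wave(position, frequency):
    # One sample of a standing cosine wave of the given frequency.
    return math.cos((2 * position + 1) * frequency * math.pi / (2 * N))


def dct_2d(block):
    """Turn an 8x8 block of pixel values into 64 wave amplitudes."""
    return [[_scale(u) * _scale(v)
             * sum(block[x][y] * _wave(x, u) * _wave(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]


def idct_2d(coeffs):
    """Rebuild the 8x8 block by adding the 64 weighted waves back together."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v]
                 * _wave(x, u) * _wave(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)]
            for x in range(N)]


# A perfectly flat block: every pixel has the same value.
flat = [[100] * N for _ in range(N)]
flat_coeffs = dct_2d(flat)

# A horizontal gradient, transformed and then reconstructed.
ramp = [[16 * y for y in range(N)] for _ in range(N)]
ramp_back = idct_2d(dct_2d(ramp))
```

With this scaling the flat block collapses to a single non-zero amplitude (the constant "wave", 800 here), and the gradient survives the round trip unchanged. The compression itself comes from rounding off the small amplitudes, which this sketch leaves out.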

In the late 1990s the JPEG group developed a new image format, JPEG 2000, based on a different type of transform: the Discrete Wavelet Transform (DWT). The DWT is similar in concept to the DCT, except that instead of having 64 different waves, you have one waveform that changes position. By storing the amplitude of that waveform at each position, you can reconstruct the image by adding all of the contributions up.

There are a handful of advantages. The image is broken into much larger blocks (up to 1024x1024), meaning you won't have the same sort of macroblock noise that JPEG has.

Also, since the wave values are location-dependent, the image is less susceptible to high-frequency noise around edges.
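The "one waveform that changes position" idea, and why it keeps edges local, can be sketched with the Haar wavelet, the simplest wavelet there is. This is a stand-in chosen for readability: JPEG 2000 itself uses longer wavelet filters (the 5/3 and 9/7 filters) and works in two dimensions, and the function names below are hypothetical.

```python
def haar_forward(signal):
    """Multi-level 1-D Haar transform; length must be a power of two."""
    out = list(signal)
    n = len(out)
    while n > 1:
        half = n // 2
        # Each pair of neighbours becomes one average and one difference.
        avg = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avg + diff
        n = half  # repeat on the averages, i.e. the same wave at a larger scale
    return out


def haar_inverse(coeffs):
    """Undo haar_forward by re-adding the differences level by level."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [a + d, a - d]
        out[:2 * n] = merged
        n *= 2
    return out


# A flat row of pixels with one bright bump in it.
row = [4, 4, 4, 4, 10, 10, 4, 4]
coeffs = haar_forward(row)
restored = haar_inverse(coeffs)
print(coeffs)    # [5.5, -1.5, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0]
print(restored)  # [4.0, 4.0, 4.0, 4.0, 10.0, 10.0, 4.0, 4.0]
```

Only three of the eight coefficients are non-zero, and the ones that describe the bump belong to wavelets positioned over the bump. Coarsely rounding them disturbs that neighborhood rather than rippling across the whole block.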

JPEG 2000 never caught on, but not because it's a bad format. The JPEG 2000 standard was ambitious - it allowed for all sorts of different color spaces, color depths, and additional color channels. Because transform blocks could be up to 1024x1024, it was also very memory intensive for the time: 2 MB of temporary decode space was significant on machines that had 32 MB of RAM.
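As a rough back-of-the-envelope check on that figure, here is the arithmetic for a single 1024x1024 block. The 2 bytes per sample is an assumption made for this sketch; the real working size depends on the decoder and on how many channels it holds at once.

```python
BLOCK_SIDE = 1024      # largest transform block mentioned above
BYTES_PER_SAMPLE = 2   # assumption: 16-bit working values, one channel
RAM_MIB = 32           # typical machine from the paragraph above

block_mib = BLOCK_SIDE * BLOCK_SIDE * BYTES_PER_SAMPLE / 2**20
share = block_mib / RAM_MIB
print(f"{block_mib:.0f} MiB per block, {share:.2%} of {RAM_MIB} MiB of RAM")
# prints: 2 MiB per block, 6.25% of 32 MiB of RAM
```

For comparison, an 8x8 JPEG block at the same sample size is 128 bytes.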

Looking at JPEG 2000

The following images were compressed in Photoshop.

JPEG: Save For Web, Progressive, 50% quality
JPEG 2000: 1024x1024 blocks, float, quality level 20

Alfa Romeo

Alfa JPEG vs JPEG 2000

Immediately we have a demonstration of both the pros and the cons of wavelet-based image compression. The smooth gradients in the car's paint are preserved very well with minimal noise around the edges. However, a lot of high-frequency detail in the road is blurred out.

Fall Foliage

Fall Foliage JPEG vs JPEG 2000

Another good demonstration of the trade-off between DCT and DWT. JPEG 2000 reduces noise on the white wall, but also loses some detail in the far-background greenery.

The JPEG 2000 image also looks slightly more vibrant, which is more accurate to the original. JPEG typically reduces the resolution of the image's color channels (chroma subsampling), which results in a loss of saturation along colored edges.
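The saturation loss is easy to reproduce on a tiny example. The sketch below assumes 4:2:0-style subsampling done with a 2x2 box average and nearest-neighbour upsampling; real encoders and decoders use a variety of filters, and the helper names are hypothetical.

```python
NEUTRAL = 128  # chroma value that means "no color" in 8-bit YCbCr


def subsample_2x2(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks."""
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def upsample_nearest(plane):
    """Bring the plane back to full size by repeating each sample 2x2."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# A one-pixel-wide colored vertical line on a colorless background.
chroma = [[NEUTRAL, 200, NEUTRAL, NEUTRAL] for _ in range(4)]
decoded = upsample_nearest(subsample_2x2(chroma))
print(decoded[0])  # [164.0, 164.0, 128.0, 128.0]
```

The line's chroma drops from 200 to 164, halfway back toward neutral, and the leftover color bleeds into the neighboring pixel. That is the dulling you see along thin colored edges.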

Dragalia Lost

Dragalia Lost JPEG vs JPEG 2000

If you've been following along you'll probably already be able to pick out the differences here. The JPEG image has clusters of noise around edges, and the colors dull slightly as they approach the black lines.

In JPEG 2000 you can clearly see that there are still artifacts. Because the artifacts are spread out over multiple pixels they can be more visible at 100% zoom in certain scenarios - though these are usually the same scenarios that cause banding and block artifacts in JPEG.

Stop trying to make JPEG 2000 a thing.

They called JPEG 2000 "JPEG 2000" so it would sound futuristic, so it's safe to say that its time came and went without it ever seeing significant adoption.

(Side note: the Library of Congress uses JPEG 2000 for archival scans. It may never have become a widely used web format, but it is very good for archival storage.)

Keep an eye on video encoding.

Every video encoder needs to be able to encode keyframes. The h.265 keyframe encoding algorithm was turned into an image file format called BPG. BPG is unlikely to gain a lot of traction, but it did open a lot of eyes. It's incredibly impressive but nobody wants to risk implementing it because of patent concerns surrounding h.265.

Similar h.265 patent concerns led several industry titans to form the Alliance for Open Media, a group whose mission is to develop royalty-free media standards for use on the web. This group leans heavily on work done by Xiph.org on an open-source h.265 competitor called Daala, which they are extending into a new standard called AV1.

Change can happen.

The most important thing to remember about the Alliance for Open Media is that its members include Microsoft, Google, and Mozilla - the "big 3" of web browser authors. Also, including Xiph.org means that the open-source community will probably have a strong say in the group. These are exactly the groups that need to be involved in order to adopt any sort of standard on the web. There's also participation from Apple (Apple, MS, and Google being the big names in mobile and desktop OS) and AMD, ARM, Intel and Nvidia (crucial for creating hardware that can speed up encoding/decoding).

YouTube has actually rolled out a playlist of videos that you can stream in AV1 (if your browser meets all the requirements and you enable it).

So AV1 is a real technology that's got backing and is making its way onto the web - which is great because just this past February (2019) version 1.0 of the AV1 Image File Format (AVIF) specification was officially released. Microsoft and Netflix have both posted some test files and an insider build of Windows supports AVIF natively.

Here are some AV1-encoded videos you can use to see if your browser can play the format.

Sintel Trailer: 1080p24 at 1 Mbps.

R:Racing Revolution replay: 480p60 at 1 Mbps.

It's hard to imagine a world where everybody simply switches over to a new format, but it happened before when JPEG was introduced. Before then, photos were stored as BMP, TIFF, TGA, or even GIF (and they took up a ton of space). There's a long way to go, but a lot of people are already putting in a lot of work.

Who knows - unlike JPEG 2000 it might actually result in something.