The untimely demise of an image upscaler
ML-based image enhancement models are great. Unfortunately, some of them are great only once.
Yesterday, my dumb Twitter post unexpectedly went viral. Some of the commenters read too much into it and pixel-peeped the attached photos — but what struck me was that some of their zoomed-in crops weren’t just pixelated or blurry. Their images contained new details, seamlessly conjured by an ML image upscaling algorithm on their devices.
Today’s post is not a rant against ML: I think that advanced image upscaling algorithms are fantastic. The tech can be misused and the results aren’t perfect, but I have a cache of low-res, overly-compressed images and videos from the early days of digital imaging. I don’t care what purists think: I’m itching to get this media in a better shape to preserve old memories.
The utility of ML-based restoration algorithms isn’t limited to archival data. On the internet, images and videos get reposted over and over again, acquiring cumulative damage due to repeated compression and resizing. To illustrate the issue, let’s take a random cell phone photo I snapped earlier this year in Seattle. The sequence starts with a 2000px original, followed by a 500px thumbnail, a re-upscaled image — and then the product of 20 consecutive downscale-upscale roundtrips:
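The "dumb" version of this roundtrip loop is easy to reproduce. Here's a minimal sketch using Pillow; the function name and parameters are mine, but the procedure mirrors the one described above: shrink to a 500px-wide thumbnail, blow it back up to the original size with Lanczos resampling, and repeat.

```python
from PIL import Image

def lanczos_roundtrip(img, small_width=500, rounds=20):
    """Repeatedly downscale an image to `small_width` pixels wide, then
    upscale it back to its original size, using Lanczos resampling for
    both steps. Each pass adds a little more softening and ringing."""
    full = img.size
    small = (small_width, round(full[1] * small_width / full[0]))
    for _ in range(rounds):
        img = img.resize(small, Image.LANCZOS).resize(full, Image.LANCZOS)
    return img
```

With a 2000px original, twenty calls through this loop should produce degradation comparable to the last frame of the sequence above.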
Such artifacts are sometimes blamed on JPEG compression, but nowadays, repeated resizing is usually the bigger culprit. Either way, the end result is that if we can no longer locate the original file, ML restoration techniques may be our only hope for preserving some of the internet’s best memes.
The random encounter with ML upscaling on Twitter made me wonder. The degradation caused by unintentionally stacking “dumb” image-processing algorithms is ugly, but the process is gradual and the damage is localized. In a world where every other vendor is starting to sprinkle ML enhancements on top of their products, would the cumulative artifacts introduced by such technologies build up in a similarly subdued way, or would it all go off the rails fast?
To answer this question, I repeated my earlier downscale-upscale experiment, but replaced the Lanczos upscale step with a call to Topaz Gigapixel AI, the preeminent ML-based upscaling tool used by pro photographers. With a bit of UI automation, I generated 200 downscale-upscale roundtrips and then merged them into the following video. I relied on Topaz’s “High Fidelity v2” model at the recommended settings (35-25-100); these parameters don’t appear to be critical:
The clip is best viewed full-screen.
The results are striking: the output of ML upscaling, when looped onto itself, collapses quite rapidly. This includes the near-instant deterioration of the “NO CASH LEFT ON PREMISES” sign, despite its excellent legibility in the intermediate 500px image.
In the experiment above, I used a lossless format (PNG) to move data between the programs. I wanted to isolate the ML-mediated effects from compression artifacts — but as it turns out, although switching to JPEG at 90% hastens the corruption, it doesn’t alter the outcome much:
Aside from being trippy, the videos have practical implications: today’s ML upscaling techniques are fantastic, but they might only be safe to use once. If the filters are applied on-the-fly at multiple points in the lifetime of a multimedia clip, the cumulative damage is likely to be far worse than the effects of “dumb” algorithms that the vendors are seeking to replace.
If you liked this article, please subscribe! Unlike most other social media, Substack is not a walled garden and not an addictive doomscrolling experience. It’s just a way to stay in touch with the writers you like.
PS. For another fun image-processing experiment, check out this story of a cat.