Earlier this month, I posted a gentle introduction to two pivotal frequency-domain algorithms: the discrete Fourier transform (DFT) and the discrete cosine transform (DCT). The first one has countless applications in signal processing, from Auto-Tune to MRI. The second one reigns supreme in the world of lossy compression, including JPEG, MP3, and H.264. I penned that deep dive simply because I could never find an intuitive and satisfying explanation of how and why these algorithms work.
We need the binary compressed output as well (and a routine to decompress it) – the whole purpose of losing data is to be able to shrink its size, isn't it? Also, I'm very curious as to how well this thing can compress text.
To preempt the pedantry: the way JPEG works is that it performs this lossy transformation, and then compresses and decompresses the quantized coefficients using traditional lossless compression (Huffman coding). There's also a color space transform and chroma subsampling beforehand, but that's not relevant to text.
Anyway, this page is skipping the lossless compression and decompression parts because they would have no effect on the data. The entire point is to demonstrate the degradation you'd experience if you applied the same algorithm to text, similar to simulating the impact of JPEG compression from the original input bitmap to your screen.
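For anyone who wants to poke at the lossy step described above, here is a minimal sketch of the idea, not the page's actual code. It assumes ASCII input, an 8-character block, a single uniform quantization step, and an orthonormal DCT-II; the names `dct`, `idct`, `lossy_text`, `step`, and `block_size` are made up for illustration. Real JPEG uses a per-coefficient quantization table rather than one step size.

```python
import math


def dct(block):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(block)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(block))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            s += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def lossy_text(text, step=1.0, block_size=8):
    """Transform, quantize, dequantize, and inverse-transform ASCII text.

    The lossless entropy-coding stage is deliberately skipped, since it
    would not change the characters that come out the other end.
    """
    codes = [ord(ch) for ch in text]
    result = []
    for start in range(0, len(codes), block_size):
        block = codes[start:start + block_size]
        coeffs = dct(block)
        # Quantization is the only lossy step: divide, round, multiply back.
        quantized = [round(c / step) for c in coeffs]
        restored = idct([q * step for q in quantized])
        # Assumption: clamp to printable ASCII so the output stays readable.
        result.extend(min(126, max(32, round(v))) for v in restored)
    return "".join(chr(c) for c in result)


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog"
    for step in (1e-6, 10.0, 40.0):
        print(step, repr(lossy_text(sample, step=step)))
```

With a tiny `step` the text survives the round trip intact; larger values discard more of each coefficient, and the characters drift accordingly. The output always has the same length as the input, which is why skipping the entropy coder doesn't change what you see.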
Great work!
It's a great password generator as well
Could you use it to bypass an LLM/GPT input filter, depending on how it is set up?
It could work well for sneaking banned words past filtering that is done as a first step.
love it! - except that I envy you for your afternoons ;-)