Oh my! The 20 minutes spent reading (and listening) to this article were one of the best 20 minutes spent on social media this week. Thanks a lot, very interesting!
Very fun and interesting, especially to me since I have spent years dealing with visual effects while dealing with tinnitus. But I think you missed the most interesting tie-in: time. You used a music clip, which is frequency (among other things) changing over time, but then compared it with a static image. Your eyes are also constantly processing moving images ... else vision would be pretty boring after that first instant a baby opens its eyes. So, I think an interesting knock-on would be to extend this study by using a short video clip rather than a simple still image of Skye. Observing how the various artifacts and alterations behave over time might surprise you further.
Yeah, I thought about it, but I suspect it wouldn't be all that interesting visually. The temporal equivalent of pixelation (first example) is just low frame rate, which is visually inoffensive but boring, and essentially makes the same point (compared to vision, hearing is weird).
The second example would be motion blur - again, not any more illuminating than what I have here, probably?
The third... logically, it's probably just a video with colors posterized, so no benefit from the temporal component at all.
The fourth video would be a bit more interesting - motion trails - but again, not hugely different from the static visual.
The only case where it could get wacky is probably the last one, the FFT manipulation, if we do it along the t axis instead of x-y. Of course, maybe there are other filters worth covering here.
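If anyone wants to play with the temporal variant, here's a rough sketch (the synthetic clip and the cutoff bin are made up for illustration): treat the video as a stack of frames, FFT each pixel along the time axis, and zero out the fast temporal components.

```python
import numpy as np

# A synthetic "video": (frames, height, width) of random flicker.
rng = np.random.default_rng(0)
video = rng.random((64, 32, 32))

# Per-pixel FFT along the time axis, then zero out the fast components --
# the temporal analogue of the x-y frequency-domain manipulation.
spectrum = np.fft.rfft(video, axis=0)
spectrum[8:] = 0  # keep only DC and the 7 slowest temporal bins
filtered = np.fft.irfft(spectrum, n=video.shape[0], axis=0)

# Each pixel's time-average survives (DC is kept), but fast flicker is gone.
```

On a real clip this would look like extreme temporal smearing; other masks (band-pass, phase scrambling) should be just as easy to try.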
This is great! The echo example illustrates an interesting finding in psychoacoustics - since echoes are naturally occurring, our auditory system effectively filters them out, while giving us information about the space. It's pretty amazing.
I have been thinking lately how "curated" our perception of reality really is. How much editing, filtering, extrapolation and transformation our brain does with incoming sensory input signals and how programmable or configurable this editing/filtering/extrapolation/transformation might be.
> Also note that the loss of fidelity is far more rapid for audio than for quantized images!
Interestingly, the image is cheating! The numbers in an image file don't linearly correspond to pixel brightness. The darker values are given way more bins than they should be to reduce the effects of quantization. Without gamma encoding, an 8-bit image would have noticeable quantization artifacts in the darker areas. (Try setting an image editor to "linear light" and 8 bits.)
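A quick way to see the effect (a hypothetical sketch; a plain gamma-2.2 curve stands in for the real sRGB transfer function): quantize a linear-light ramp to 8 bits directly, then through the gamma curve, and compare errors in the shadows.

```python
import numpy as np

linear = np.linspace(0.0, 1.0, 100_001)  # linear-light intensities in [0, 1]

# Direct 8-bit quantization of the linear values:
q_linear = np.round(linear * 255) / 255

# Gamma-encode, quantize to 8 bits, then decode back to linear light:
gamma = 2.2
q_gamma = (np.round(linear ** (1 / gamma) * 255) / 255) ** gamma

# Worst-case error in the darkest 1% of the range:
dark = linear < 0.01
err_linear = np.abs(q_linear[dark] - linear[dark]).max()
err_gamma = np.abs(q_gamma[dark] - linear[dark]).max()
print(err_linear, err_gamma)  # the gamma-encoded path has much finer dark steps
```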
You can do a similar trick with audio: A-law/mu-law coding. It was used as an early form of compression for phone systems. It's much simpler than doing anything to the digital bitstream: instead of needing a computer, it's just a few op-amps on either end.
(The same trick also works for other kinds of limited-dynamic-range, high-noise channels, like analog radio links and tape recorders.)
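For the curious, the companding curve itself is simple. A sketch using the textbook continuous mu-law formula with mu = 255 (as in G.711); the plain 8-bit quantizer here is a simplification of the real codec's segmented encoding:

```python
import numpy as np

mu = 255.0  # G.711 uses mu = 255

def mulaw_compress(x):
    # map x in [-1, 1] through the logarithmic companding curve
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mulaw_expand(y):
    # exact inverse of mulaw_compress
    return np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu

x = np.linspace(-1.0, 1.0, 200_001)  # a full-scale ramp of "samples"

# 8-bit quantization applied directly vs. through the companding curve:
q_direct = np.round(x * 127) / 127
q_mulaw = mulaw_expand(np.round(mulaw_compress(x) * 127) / 127)

quiet = np.abs(x) < 0.01  # quiet samples are where companding pays off
err_direct = np.abs(q_direct[quiet] - x[quiet]).max()
err_mulaw = np.abs(q_mulaw[quiet] - x[quiet]).max()
```

The quiet samples land on much finer steps through the companding curve, at the cost of coarser steps near full scale, exactly like the dark-pixel trick above.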
Nicely done, although I think it would also be informative to show the frequency response of each 'technique'. That connects the dots a bit more, perhaps. If you do that, and also plot the frequency spectrogram of the sound clip, you can see when the entire spectrogram falls within the pass band of the frequency response, so you wouldn't expect to hear much difference.
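To illustrate what I mean (a toy sketch; a 5-tap moving average stands in for any of the article's actual techniques): feed the filter a unit impulse and take the FFT of the output to get its magnitude response.

```python
import numpy as np

def moving_average(x, n=5):
    # a placeholder "technique": a simple n-tap moving-average filter
    return np.convolve(x, np.ones(n) / n, mode="full")[: len(x)]

# The filter's output for a unit impulse is its impulse response;
# the FFT of that gives the frequency response.
impulse = np.zeros(1024)
impulse[0] = 1.0
response = np.abs(np.fft.rfft(moving_average(impulse)))

# response[0] is the DC gain (~1.0); higher bins roll off toward the notches.
# Any spectrogram content sitting in the flat region near DC passes unchanged.
```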
Probably! My main motivation was to explore the image / audio angle, and unfortunately, frequency-domain representations of images are not easy to grok.
Thank you! It’s been the best 20 minutes I’ve wasted all day.
When I grow up (I am 45 now), I want to be lcamtuf.