On the topic of discrete cosine transform, one obvious question might be "why cosine instead of sine". The most important consideration is that at k=0, the cosine part evaluates to 1, so F[0] (the 0 Hz DCT component) can capture the DC bias / constant offset of the input signal. In contrast, with sine, F[0] is always zero.
It is perhaps also worth noting that there is an "compromise" solution between DFT and DCT: discrete Hartley transform (DHT). Like DCT, it produces real-only outputs. Like DFT, it uses 2π*k-spaced bins, with information about certain signal phases ending up in the bins above the Nyquist frequency, instead of the "partial-turn" ones. The added perk of DHT is that unlike DCT, it doesn't require hacks for orthogonality. The formula is:
That said, the DHT encoding is less convenient for signal processing than DFT while saving little if any CPU time; and for compression, it's slower than DCT. Because of this, it hasn't really found a mainstream use. It's essentially a nerdy curiosity that gets brought up as a "superior" alternative in some online threads.
I think using calculus could make explanations easier to understand. I am no expert, but summation notation can scare people, especially because so many schools never teach it outside of calculus. Integrals seem less complicated, especially for Laplace transforms and similar. Then again, I could be completely wrong.
>>> For example, the Wikipedia article on the matter assaults the reader with around three dozen cryptic formulas, but offers no accessible explanation how or why the algorithm works.
Yes, yes, yes, and again, yes.
I am a Wikipedia fanboy to the point of having been making a monthly donation for years now, but I would not hesitate to say that, as time has marched on, and nearly all articles have improved, many technical articles have become ... pretty much incomprehensible. To the sort of person who is looking to Wkp for a quick reminder/refresher/intro, I mean. And I say this as someone who is not completely innumerate -- applied math major in college, research analyst and programming jobs afterwards, like that.
Sometimes I remember to check Simple Wikipedia when I am feeling bamboozled, but (a) I don't always, and (b) I shouldn't have to. An article on, say, the Discrete Fourier Transform should read like this post of yours does, with the wannabe textbook authorship farther down the page, or even better, relegated to an "advanced"-type of article.
Anyway, thanks, once again, for your crystal clear explanations.
The limited resolution in the frequency domain seems like a nuisance, but it’s inherent. Frequency is about movement, and motion takes time. You need to observe slow movement for longer to detect and measure it than fast movement. (Consider measuring the speed of a snail.) Movements with precise timing (the ticking of a clock) need quick, higher-frequency components.
In some cases, you can do better if you make assumptions about the shape of the movement. You don’t need to wait for a full revolution of a clock hand to measure how fast it’s going, if you assume constant velocity. But this doesn’t work for unpredictable movements.
Yeah. I think the "wrapping a waveform around a circle" metaphor is subtly less intuitive, for two reasons:
1) The "wrapping" of a two-dimensional object around a curve is actually a non-trivial operation if you want to model it accurately. The "rotating plotter arm" analogy gives you a more obvious sense of where every point of the waveform lands.
2) Strictly speaking, the "circle" in DFT isn't a circle if your signal has no DC bias (well, OK, it's a circle with zero diameter). A circle can be imagined in the presence of DC bias, but its diameter varies depending on the waveform.
I think there's also an inaccuracy in that video with regard to the significance of the real & imaginary part of the vector. If we could just use the real part as the DFT result, we wouldn't need a two-dimensional treatment to begin with. The problem is that using the real part works only if you neatly align the phase of your waveform with the plot, which is what he's doing in the video, but which doesn't represent real life.
On the topic of discrete cosine transform, one obvious question might be "why cosine instead of sine". The most important consideration is that at k=0, the cosine part evaluates to 1, so F[0] (the 0 Hz DCT component) can capture the DC bias / constant offset of the input signal. In contrast, with sine, F[0] is always zero.
It is perhaps also worth noting that there is an "compromise" solution between DFT and DCT: discrete Hartley transform (DHT). Like DCT, it produces real-only outputs. Like DFT, it uses 2π*k-spaced bins, with information about certain signal phases ending up in the bins above the Nyquist frequency, instead of the "partial-turn" ones. The added perk of DHT is that unlike DCT, it doesn't require hacks for orthogonality. The formula is:
Forward DHT: F[k] = sum (n = 0 ... N-1, s[n] * cas(2π*k*n/N) )
Inverse DHT: s[n] = 1/N * sum (k = 0 ... N-1, F[k] * cas(2π*n*k/N) )
...where cas(t) = cos(t) + sin(t).
That said, the DHT encoding is less convenient for signal processing than DFT while saving little if any CPU time; and for compression, it's slower than DCT. Because of this, it hasn't really found a mainstream use. It's essentially a nerdy curiosity that gets brought up as a "superior" alternative in some online threads.
I think using calculus could make explanations easier to understand. I am no expert, but summation notation can scare people, especially because so many schools never teach it outside of calculus. Integrals seem less complicated, especially for Laplace transforms and similar. Then again, I could be completely wrong.
>>> For example, the Wikipedia article on the matter assaults the reader with around three dozen cryptic formulas, but offers no accessible explanation how or why the algorithm works.
Yes, yes, yes, and again, yes.
I am a Wikipedia fanboy to the point of having been making a monthly donation for years now, but I would not hesitate to say that, as time has marched on, and nearly all articles have improved, many technical articles have become ... pretty much incomprehensible. To the sort of person who is looking to Wkp for a quick reminder/refresher/intro, I mean. And I say this as someone who is not completely innumerate -- applied math major in college, research analyst and programming jobs afterwards, like that.
Sometimes I remember to check Simple Wikipedia when I am feeling bamboozled, but (a) I don't always, and (b) I shouldn't have to. An article on, say, the Discrete Fourier Transform should read like this post of yours does, with the wannabe textbook authorship farther down the page, or even better, relegated to an "advanced"-type of article.
Anyway, thanks, once again, for your crystal clear explanations.
The limited resolution in the frequency domain seems like a nuisance, but it’s inherent. Frequency is about movement, and motion takes time. You need to observe slow movement for longer to detect and measure it than fast movement. (Consider measuring the speed of a snail.) Movements with precise timing (the ticking of a clock) need quick, higher-frequency components.
In some cases, you can do better if you make assumptions about the shape of the movement. You don’t need to wait for a full revolution of a clock hand to measure how fast it’s going, if you assume constant velocity. But this doesn’t work for unpredictable movements.
Great post. 3Blue1Brown has a video with some great animations of this same concept:
https://youtu.be/spUNpyF58BY
Yeah. I think the "wrapping a waveform around a circle" metaphor is subtly less intuitive, for two reasons:
1) The "wrapping" of a two-dimensional object around a curve is actually a non-trivial operation if you want to model it accurately. The "rotating plotter arm" analogy gives you a more obvious sense of where every point of the waveform lands.
2) Strictly speaking, the "circle" in DFT isn't a circle if your signal has no DC bias (well, OK, it's a circle with zero diameter). A circle can be imagined in the presence of DC bias, but its diameter varies depending on the waveform.
I think there's also an inaccuracy in that video with regard to the significance of the real & imaginary part of the vector. If we could just use the real part as the DFT result, we wouldn't need a two-dimensional treatment to begin with. The problem is that using the real part works only if you neatly align the phase of your waveform with the plot, which is what he's doing in the video, but which doesn't represent real life.