Part 3 - forward and inverse DFT derived and explained

Forward DFT is a function that computes scaling factors of complex sinusoids. These scaling factors may then be used by inverse DFT to reconstruct signal.

Computing scaling factor $y_k$ is done by *almost* projecting signal x onto csin with k-th frequency. Projection here is in a meaning "projecting line segment onto a line" or "projecting one vector onto another".

The important thing is that samples and scaling factors are two representations of the same signal and thanks to DFT and IDFT we know how to convert between them.

What follows is a bit of computations that show where does DFT comes from. If you want, you can skip next section.

Computing projection of signal onto complex sinusoid

Before reading this part you really shall refresh your knowledge about complex numbers and vector projections.

Projecting vector $\vec{a}$ onto $\vec{b}$ is expressed as

$$\frac{<\vec{a}, \vec{b}>}{<\vec{b}, \vec{b}>} \vec{b}$$

where $<\vec{a}, \vec{b}>$ is a dot product of two vectors.

Assuming that length(a)=length(b)=N then by definition

$$<\vec{a}, \vec{b}> = \sum_{n=0}^{N-1}a_n b_n$$

But for vectors of complex numbers, there is an additional twist

$$<\vec{a}, \vec{b}> = \sum_{n=0}^{N-1}a_n \overline{b_n}$$

where $\overline{b_n}$ mean a complex conjugate of $b_n$.

This modification is not just a strange thing created by mathematicians. It is essential for DFT and IDFT to work together. Without it effects of DFT and IDFT on signal would not cancel each other.

Before moving further let's see what complex conjugate does to csin as it will be needed later.

$$\overline{csin(k,n)} = \overline{e^{j*2\pi*k/N*n}} = \\ \overline{cos(2\pi*k/N*n) + j*sin(2\pi*k/N*n)} = \\ cos(2\pi*k/N*n) - j*sin(2\pi*k/N*n) = \\ cos(-2\pi*k/N*n) + j*sin(-2\pi*k/N*n) = \\ e^{-j*2\pi*k/N*n}$$

For making transition above I have used symmetry properties of sine and cosine that are

$$cos(x) = cos(-x)$$

$$-sin(x) = sin(-x)$$

Summarizing - complex conjugate of a complex sinusoid just puts minus sign next to imaginary unit.

Before we can project signal x onto csin with k-th frequency, let's define how vector of that csin looks like

$$\vec{csin_k} = (csin(k,0), csin(k,1), ..., csin(k,N-1))$$

Now we are ready to go. Let's try to project signal $\vec{x}$ onto complex sinusoid with k-th frequency $\vec{csin_k}$ and see what happens.

$$project(\vec{x}, \vec{csin_k}) = \frac{<\vec{x}, \vec{csin_k}>}{<\vec{csin_k}, \vec{csin_k}>} * \vec{csin_k}$$

Let's compute these two dot products separately.

$$ <\vec{x}, \vec{csin_k}> = \sum_{n=0}^{N-1}x_n*\overline{csin(k,n)} = \\ \sum_{n=0}^{N-1}x_n*\overline{e^{j*2\pi*k/N*n}} = \\ \sum_{n=0}^{N-1}x_n*e^{-j*2\pi*k/N*n} $$

$$ <\vec{csin_k}, \vec{csin_k}> = \sum_{n=0}^{N-1}csin(k,n)*\overline{csin(k,n)} = \\ \sum_{n=0}^{N-1} e^{j*2\pi*k/N*n} * e^{-j*2\pi*k/N*n} = \\ \sum_{n=0}^{N-1} e^{j*2\pi*k/N*n - j*2\pi*k/N*n} = \\ \sum_{n=0}^{N-1} e^{0} = \sum_{n=0}^{N-1} 1 = N $$

Now we can substitute dot products in the projection

$$\frac{<\vec{x}, \vec{csin_k}>}{<\vec{csin_k}, \vec{csin_k}>} * \vec{csin_k} = \frac{ \sum_{n=0}^{N-1}x_n*e^{-j*2\pi*k/N*n} }{ N } * \vec{csin_k} = \\ \frac{1}{N} (\sum_{n=0}^{N-1}x_n*e^{-j*2\pi*k/N*n}) * \vec{csin_k} $$

How DFT works?

Summarizing previous section, projecting signal x onto complex sinusoid with k-th frequency is expressed as

$$project(\vec{x}, \vec{csin_k}) = \frac{1}{N} (\sum_{n=0}^{N-1}x_n*e^{-j*2\pi*k/N*n}) * \vec{csin_k} = y_k * \vec{csin_k} $$

As mentioned eariler signal $x = (x_0, x_1, ..., x_{N-1})$ can be represented as a sum of scaled complex sinusoids.

$$x_n = \sum_{k=0}^{N-1}y_k * csin(k,n) = \sum_{k=0}^{N-1}y_k * e^{j*2\pi*k/N*n}$$

And from the result computed above we have a formula for the scaling factor

$$y_k = \frac{1}{N} \sum_{n=0}^{N-1}x_n*e^{-j*2\pi*k/N*n}$$

So a scaling factor $y_k$ is the same one that is used for scaling $\vec{csin_k}$ when doing projection of $\vec{x}$ onto $\vec{csin_k}$.

By using scaling factors $y_k$ we modify an impact of a k-th csin on values of a signal.
If $y_k$ is a big value then $csin(k,n)$ will have a big influence on the final value of the signal.
If $y_k$ is a small value then $csin(k,n)$ will have negligible (or even zero) effect on the final value of the signal. This describes how inverse DFT works.

And looking from the other perspective. If the signal contains frequency represented by $csin(k,n)$, then projecting signal onto that k-th csin will give big value of $y_k$.
If the signal does not contain frequency represented by $csin(k,n)$, then projecting signal onto that k-th csin will give very small value of $y_k$, possibly even zero. This describes how forward DFT works.

To understand why it all is true you can look at a subject of changing basis of a linear space. Helpful may be knowing that csins used by dft are orthogonal to one another and thus form a basis of a linear space.

This is the framework that constitutes DFT. In real life there is usually a slight modification to this. Scaling factor $1/N$ is not used in forward DFT, but it is incorporated into inverse DFT instead. This is because not always we want exact scaling factors - their relative values are often good enough. With this change formulas became

$$y_k = \sum_{n=0}^{N-1}x_n*e^{-j*2\pi*k/N*n}$$

$$x_n = \frac{1}{N} \sum_{k=0}^{N-1}y_k * e^{j*2\pi*k/N*n}$$

Now we can, fully aware of the meaning and origin, write

$$DFT(x) = y = (y_0, y_1, y_2, ..., y_{N-1})$$

$$y_k = \sum_{n=0}^{N-1}x_n*e^{-j*2\pi*k/N*n}$$

$$IDFT(y) = x = (x_0, x_1, x_2, ..., x_{N-1})$$

$$x_n = \frac{1}{N} \sum_{k=0}^{N-1}y_k * e^{j*2\pi*k/N*n}$$

where $x$ is an input signal and $y$ is a list of scaling factors.

Considering that there are two equivalent ways of representing signal, they have their own names. Signal $x$ sampled in time or space (song, height of a terrain) is called time domain signal. Signal represented by scaling factors $y_k$ is called frequency domain signal.

Forward DFT and signal spectrum

So forward DFT gives scaling factors of complex sinusoids, but how does it relate to spectrum of a signal or frequencies in a recorded sound?

If a signal contains frequency represented by k-th csin, then its scaling factor $y_k$ will be some big value and definetely nonzero.
If a signal does not contain frequency represented by k-th csin, then its scaling factor will be zero, or something very close to zero.
Thus from looking at scaling factors $y_k$ we can say what frequencies does the signal contain, which is the same as computing its frequency spectrum.

But be aware that performance of DFT is severly limited by a few factors that cause spectral leakage which causes DFT to show that signal contains frequencies that are really not there.

next part