Modified Discrete Cosine Transform explained

Purpose of this article is to explain what is MDCT and how to use it. It does not go deep into details why MDCT works.

At the end of the article there is available implementation of MDCT.

To be able to follow this article you should know:

basic math including trigonometric functions
summation using capital sigma notation
concepts of signal transforms - understanding basics of DFT or other transform is very helpful
programming in C

Quick navigation menu

What is MDCT?
The formulas
Example calculations
How to use MDCT on a real signal?
Implementation in C

What is MDCT?

Modified Discrete Cosine Transform (MDCT) is a transform that changes representation of a signal. In this manner it reminds Discrete Fourier Transform, but that is the only similarity. MDCT works only with real values and uses sinusoids instead of complex exponentials to represent signal.

MDCT is mainly used in audio compression eg. in MP3.

MDCT is unusual between transforms in two ways

it has twice as many inputs as outputs (usually there is equal number of these)
inverse transform does not restore original signal directly (usually it does)

You will see later where these properties fit and how to restore original signal from values of inverse MDCT.

The formulas

MDCT consists of forward and inverse transform

Forward MDCT

$$y_k = \sum_{n=0}^{2N-1}x_n\cos(2\pi\frac{k+\frac{1}{2}}{2N}(n+\frac{1}{2}+\frac{N}{2}))$$

Inverse MDCT

$$z_n = \frac{1}{N}\sum_{k=0}^{N-1}y_k\cos(2\pi\frac{k+\frac{1}{2}}{2N}(n+\frac{1}{2}+\frac{N}{2}))$$

$x_n$ is $n$-th sample of a signal
$y_k$ is a scaling factor for k-th sinusoid
$z_n$ is $n$-th sample of inverse MDCT output signal
$n = 0..2N-1$ is index of the sample
$k = 0..N-1$ is index of the frequency
$N$ is half length of the signal
$2N$ is a length of the signal

Because forward MDCT takes 2N samples but gives only N values it loses some information. This is why inverse MDCT gives $z_n$, not $x_n$ as these values are not the same.

Example calculations

Time to see how to calculate MDCT by hand.

Let signal $x=(1,3,5,7)$,
length(x)=4, so 2N=4, N=2, n=0..3, k=0..1

Forward MDCT calculations

$$y_0 = \sum_{n=0}^{2N-1}x_n\cos(2\pi\frac{0+\frac{1}{2}}{2N}(n+\frac{1}{2}+\frac{N}{2})) = \\ \sum_{n=0}^{4-1}x_n\cos(2\pi\frac{0+\frac{1}{2}}{4}(n+\frac{1}{2}+\frac{2}{2})) = \sum_{n=0}^{3}x_n\cos(2\pi\frac{1}{8}(n+\frac{3}{2})) = \\ x_0\cos(2\pi\frac{1}{8}(0+\frac{3}{2})) + x_1\cos(2\pi\frac{1}{8}(1+\frac{3}{2})) + x_2\cos(2\pi\frac{1}{8}(2+\frac{3}{2})) + x_3\cos(2\pi\frac{1}{8}(3+\frac{3}{2})) = \\ 1 * \cos(2\pi\frac{3}{16}) + 3 * \cos(2\pi\frac{5}{16}) + 5 * \cos(2\pi\frac{7}{16}) + 7 * \cos(2\pi\frac{9}{16}) = \\ 1 * 0.38268 + 3 * (-0.38268) + 5 * (-0.92388) + 7 * (-0.92388) = -11.852 $$

$$y_1 = \sum_{n=0}^{2N-1}x_n\cos(2\pi\frac{1+\frac{1}{2}}{2N}(n+\frac{1}{2}+\frac{N}{2})) = \\ \sum_{n=0}^{3}x_n\cos(2\pi\frac{1+\frac{1}{2}}{4}(n+\frac{1}{2}+\frac{2}{2})) = \sum_{n=0}^{3}x_n\cos(2\pi\frac{3}{8}(n+\frac{3}{2})) = \\ x_0\cos(2\pi\frac{3}{8}(0+\frac{3}{2})) + x_1\cos(2\pi\frac{3}{8}(1+\frac{3}{2})) + x_2\cos(2\pi\frac{3}{8}(2+\frac{3}{2})) + x_3\cos(2\pi\frac{3}{8}(3+\frac{3}{2})) = \\ 1 * \cos(2\pi\frac{9}{16}) + 3 * \cos(2\pi\frac{15}{16}) + 5 * \cos(2\pi\frac{21}{16}) + 7 * \cos(2\pi\frac{27}{16}) = \\ 1 * (-0.92388) + 3 * 0.92388 + 5 * (-0.38268) + 7 * (-0.38268) = -2.7444 $$

Summarizing: y = mdct(x) = [-11.852, -2.7444].

Inverse MDCT calculations

$$z_0 = \frac{1}{N}\sum_{k=0}^{N-1}y_k\cos(2\pi\frac{k+\frac{1}{2}}{2N}(0+\frac{1}{2}+\frac{N}{2})) = \\ \frac{1}{2}\sum_{k=0}^{2-1}y_k\cos(2\pi\frac{k+\frac{1}{2}}{4}(0+\frac{1}{2}+\frac{2}{2})) = \frac{1}{2}\sum_{k=0}^{1}y_k\cos(2\pi\frac{k+\frac{1}{2}}{4} * \frac{3}{2}) = \\ \frac{1}{2}( y_0\cos(2\pi\frac{0+\frac{1}{2}}{4} * \frac{3}{2}) + y_1\cos(2\pi\frac{1+\frac{1}{2}}{4} * \frac{3}{2}) ) = \\ \frac{1}{2}( -11.852 * \cos(2\pi\frac{3}{16}) + (-2.7444) * \cos(2\pi\frac{9}{16}) = -1.0 $$

$$z_1 = \frac{1}{N}\sum_{k=0}^{N-1}y_k \cos(2\pi\frac{k+\frac{1}{2}}{2N}(1+\frac{1}{2}+\frac{N}{2})) = 1.0 $$

$$z_2 = \frac{1}{N}\sum_{k=0}^{N-1}y_k \cos(2\pi\frac{k+\frac{1}{2}}{2N}(2+\frac{1}{2}+\frac{N}{2})) = 6.0 $$

$$z_3 = \frac{1}{N}\sum_{k=0}^{N-1}y_k \cos(2\pi\frac{k+\frac{1}{2}}{2N}(3+\frac{1}{2}+\frac{N}{2})) = 6.0 $$

In short: z = imdct(y) = [-1.0, 1.0, 6.0, 6.0]

Another values of MDCT and IMDCT

You should try to compute MDCT yourself. Here are some example values you can use for calculations and to check your result.

x=[10, 2, 7, 9]
length(x)=4, 2N=4, N=2, n=0..3, k=0..1

y = mdct(x) = [-11.72, -13.51]

z = imdct(y) = [4.00, -4.00, 8.00, 8.00]

How to use MDCT on a real signal?

Usage of MDCT is not as simple as using forward and inverse transform. It consists of following steps:

get signal s
divide signal into overlapping blocks
compute forward MDCT for each block
do something with the signal in MDCT form
compute inverse MDCT for each output of forward MDCT
restore signal from values of inverse MDCT

In short:

divide(signal) = bx
mdct(bx) = y_bx
imdct(y_bx) = z_bx
restore(z_bx) = signal

Let's see how it is done using some signal.

Example 1

Let signal $s = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]$

length(s) = 16

Let's divide signal into overlapping blocks with 8 samples each

b0 = [1, 2, 3, 4, 5, 6, 7, 8]
b1 = [5, 6, 7, 8, 9, 10, 11, 12]
b2 = [9, 10, 11, 12, 13, 14, 15, 16]

First half of block b1 is the same as second half of block b0.
Fist half of block b2 is the same as second half of block b1.
This is what it means that blocks shall overlap.

In MDCT every block shall overlap exactly half of previous block. If overlap is less than half of block then original signal cannot be restored.

It is also common to say that MDCT uses overlapping window functions to divide signal into overlapping blocks. In this context a window function takes whole signal and returns a single block. From the use of window functions blocks may also be called windows.

To compute forward MDCT for a signal divide signal into blocks b0, b1, b2, .., and calculate MDCT for each block separately.

y_b0 = mdct(b0) = [-25.42 , -4.78 , 4.11 , 3.17]
y_b1 = mdct(b1) = [-39.92 , -9.87 , 7.51 , 6.06]
y_b2 = mdct(b2) = [-54.42 , -14.96 , 10.91 , 8.94]

To restore signal from MDCT first compute inverse MDCT from y_bx values.

z_b0 = imdct(y_b0) = [-1.50, -0.50, 0.50, 1.50, 6.50, 6.50, 6.50, 6.50]
z_b1 = imdct(y_b1) = [-1.50, -0.50, 0.50, 1.50, 10.50, 10.50, 10.50, 10.50]
z_b2 = imdct(y_b2) = [-1.50, -0.50, 0.50, 1.50, 14.50, 14.50, 14.50, 14.50]

When dividing original signal s into blocks, each block was overlapped. Now is the time to use this. Add values of z_bx where original blocks were overlapped ie. second half of z_b0 to fist half of z_b1, second half of z_b1 to first half of z_b2.

z_b0[4] + z_b1[0] = 6.50 + (-1.50) = 5.0
z_b0[5] + z_b1[1] = 6.50 + (-0.50) = 6.0
z_b0[6] + z_b1[2] = 6.50 + 0.50 = 7.0
z_b0[7] + z_b1[3] = 6.50 + 1.50 = 8.0
z_b1[4] + z_b2[0] = 10.50 + (-1.50) = 9.0
z_b1[5] + z_b2[1] = 10.50 + (-0.50) = 10.0
z_b1[6] + z_b2[2] = 10.50 + 0.50 = 11.0
z_b1[7] + z_b2[3] = 10.50 + 1.50 = 12.0

You can see clearly that these are the values from the middle of the original signal s.

It is not possible to restore original values of s from beginning and end of s without adding overlapped blocks.

To restore those values just add a few zeros to the start and end of s, divide signal into new overlapped blocks and follow the procedure. Following example illustrates this.

Example 2

Let s = [12, 1, 9, 4, 5, 13, 14, 15, 10, 3, 7, 6, 16, 11, 2, 8].

length(s) = 16

This time we want to restore whole signal. We divide signal into blocks of 8 samples and each block overlaps 4 samples. So let's add 4 zeros to the beginning and the end of s. This way all values from signal s will have overlapped blocks and thus every value can be restored.

s0 = [0, 0, 0, 0, 12, 1, 9, 4, 5, 13, 14, 15, 10, 3, 7, 6, 16, 11, 2, 8, 0, 0, 0, 0].

It does not matter if you choose 0 or other value to extend signal. It is just that 0 is a simple, nonmagical value.

Divide s0 into overlapped blocks of 8 values.

b0 = [0, 0, 0, 0, 12, 1, 9, 4]
b1 = [12, 1, 9, 4, 5, 13, 14, 15]
b2 = [5, 13, 14, 15, 10, 3, 7, 6]
b3 = [10, 3, 7, 6, 16, 11, 2, 8]
b4 = [16, 11, 2, 8, 0, 0, 0, 0]

Compute forward MDCT of each block.

y_b0 = mdct(b0) = [-23.11, -5.19, 10.14, 6.94]
y_b1 = mdct(b1) = [-40.23, -21.95, -0.48, 20.34]
y_b2 = mdct(b2) = [-28.86, 5.17, 7.35, -0.40]
y_b3 = mdct(b3) = [-31.26, -7.83, 13.77, 18.05]
y_b4 = mdct(b4) = [6.20, -12.85, 9.04, -2.18]

Compute inverse MDCT of y_bx.

z_b0 = imdct(y_b0) = [-0.00, 0.00, -0.00, 0.00, 8.00, 5.00, 5.00, 8.00]
z_b1 = imdct(y_b1) = [4.00, -4.00, 4.00, -4.00, 10.00, 13.50, 13.50, 10.00]
z_b2 = imdct(y_b2) = [-5.00, -0.50, 0.50, 5.00, 8.00, 5.00, 5.00, 8.00]
z_b3 = imdct(y_b3) = [2.00, -2.00, 2.00, -2.00, 12.00, 6.50, 6.50, 12.00]
z_b4 = imdct(y_b4) = [4.00, 4.50, -4.50, -4.00, 0.00, 0.00, -0.00, 0.00]

Add overlapped blocks to restore original values.

z_b0[4] + z_b1[0] = 8.00 + 4.00 = 12.00
z_b0[5] + z_b1[1] = 5.00 + (-4.00) = 1.00
z_b0[6] + z_b1[2] = 5.00 + 4.00 = 9.00
z_b0[7] + z_b1[3] = 8.00 + (-4.00) = 4.00
z_b1[4] + z_b2[0] = 10.00 + (-5.00) = 5.00
z_b1[5] + z_b2[1] = 13.50 + (-0.50) = 13.00
z_b1[6] + z_b2[2] = 13.50 + 0.50 = 14.00
z_b1[7] + z_b2[3] = 10.00 + 5.00 = 15.00
z_b2[4] + z_b3[0] = 8.00 + 2.00 = 10.00
z_b2[5] + z_b3[1] = 5.00 + (-2.00) = 3.00
z_b2[6] + z_b3[2] = 5.00 + 2.00 = 7.00
z_b2[7] + z_b3[3] = 8.00 + (-2.00) = 6.00
z_b3[4] + z_b4[0] = 12.00 + 4.00 = 16.00
z_b3[5] + z_b4[1] = 6.50 + 4.50 = 11.00
z_b3[6] + z_b4[2] = 6.50 + (-4.50) = 2.00
z_b3[7] + z_b4[3] = 12.00 + (-4.00) = 8.00

Implementation

You can download example implementation of MDCT and IMDCT written in C.
It also comes with a simple makefile.

The file contains simple MDCT and IMDCT implementations and usage as well as usage of MDCT on a divided signal and how to restore it. The implementation is heavy commented.

All code written here is licensed under Creative Commons Zero license, whose full text can be found at https://creativecommons.org/publicdomain/zero/1.0/. It basically means that you can do with the code whatever you like without asking for any permissions.

Article written 16.03.2017