Fourier Transformation

The Fourier Transformation correlates a time signal with the complex exponential eigenfunctions of different frequencies. If there is a lot of energy for a given frequency, the correlation with the eigenfunction will also be greater. The Fourier Transformation results in spectral components, which indicate the energy that each frequency contributes to the time domain signal. ^4bfee2

Types of Fourier Transformations¶

The Fourier Transformation differs between time-discrete vs. time-continuous and finite vs. infinite signals, resulting in four different transformations. However, they all follow the same principle of correlating the time-signals with complex eigenfunctions.

Continuous-Time Fourier Transform¶

The name-giving and most general Fourier Transformation is the Fourier Transform for infinite time-continuous signals. It results in a infinite frequency-continuous spectral representation. For a frequency, the spectral coefficient is given by:

\[ S(f)=\int\limits_{-\infty}^{\infty}s(t)e^{-j2{\pi}ft}dt \]

The inverse transformation is then given by:

\[ s(t)=\int\limits_{-\infty}^{\infty}S(f)e^{j2{\pi}ft}df \]

^f14e5c

Fourier Series¶

For periodic time-domain signals (=finite signals), the #Fourier Transform can be simplified to a sum of discrete frequencies. For the forward transformation, the signal needs only to be correlated with the eigenfunction over one period length:

\[ S(k)=\int\limits\limits_{0}^{P}s(t)e^{-j2{\pi}kt}dt \]

For the inverse, we can now sum up the eigenfunctions, weighted by their spectral component:

\[ s(t)=\sum\limits\limits_{-\infty}^{\infty}S(k)e^{j2{\pi}kt} \]

Sinusoidal Representation¶

The FS was originally formulated using sine and cosine functions:

\[ s(t)=\frac{a_0}{2}+\sum\limits_{h=1}^{\infty}(a_{h}\cos(2\pi hf_{0}t)+b_{h}\sin(2\pi hf_{0}t)) \]

Discrete-Time Fourier Transform¶

The DTFT is applied signals with a discrete time domain. It results in a periodic frequency spectrum:

\[ S(f)=\sum\limits\limits_{-\infty}^{\infty}s[n]e^{-j2{\pi}fn} \]

Just as in the forward transformation of the #Fourier Series, the inverse DTFT can be performed by integrating only over one period length of the spectrum:

\[ s[n]=\frac{1}{2\pi}\int\limits\limits_{2\pi}S(f)e^{j2{\pi}fn}df \]

Discrete Fourier Transform¶

The DFT is the FT transformation when applied to a finite and discrete time domain signal. It is the most widely applied type of Fourier Transformation as pretty much all digital signals are finite and have discrete timesteps. The forward transformation is given by:

\[ S(k)=\sum\limits_{n=0}^{N-1}s[n]e^{-j\frac{2\pi}{N}kn} \]

and its inverse transformation is:

\[ s[n]=\frac{1}{N}\sum\limits_{k=0}^{N-1}S(k)e^{j\frac{2\pi}{N}kn} \]

When applying the DFT, we often only consider a fragment of a time domain signal. To decrease Spectral Leakage, most often Analysis Window are applied to the time signal first.

Fast Fourier Transform¶

The DFT is well suited for the spectral transformation of almost all digital signals, but has one problem: It doesn't scale well. Its computational costs grow quadratically, \(O(n^2)\). But there is a more efficient implementation of the DFT, called the Fast Fourier Transform (FFT), which grows in \(O(n\log{n})\).

The algorithm splits up the time-domain signal \(s[n]\) into two batches, one for the even indices and one for the odd indices. The spectral components of the upper and lower parts of the frequencies are symmetrical in each of those batches, so only half of the frequencies need to be calculated in each batch. This batching approach can now be repeated, each time halving the amount of frequencies that need to be calculated. After \(\log{n}\) steps, there is only one frequency left to be calculated, consisting of \(n\) weighted summations, resulting in the \(O(n\log{n})\) complexity.

Short Time Fourier Transform (STFT)¶

The STFT is a #Discrete Fourier Transform that is applied to windowed segments of a time signal. This allows to retain some temporal information in the frequency domain. The type, length and overlap of the Analysis Window determines the temporal vs. spectral resolution of the STFT.

Definition¶

The STFT is defined as:

\[ X[k,I]=\sum\limits_{n=0}^{N-1}w[n]x[n+IL]exp({-j\frac{2\pi kn}{N}}) \]

where

\(k=0,\ldots,N-1\) represents the frequency
\(I\) is the frame index (\(I\)-th window)
\(n\) is the index around which the window is centered
\(N\) is the window length
\(L\) is the frame shift (and \(N-L\) is the overlap of the windowed segments)

The \(k\)-th frequency is given as \(f_k=\frac{k}{N}f_s\). As the highest frequency that can be represented without Spectral Leakage is, by Nyquist, is \(\frac{1}{2}f_s\), we can usually discard the upper half of the frequencies (\(k>\frac{N}{2}\)).

Bandwidth¶

A wider window has a poor temporal resolution, but a higher frequency resolution

\(\rightarrow\) Wide window = Narrowband spectrogram

A narrower window on the other hand has great temporal resolution, but worse frequency resolution.

\(\rightarrow\) Narrow window = Wideband spectrogram

Pasted image 20240513192352.png

By Nyquist, the bandwidth of the signal is \(\frac{f_s}{2}=8kHz\), which is why it is capped of at that frequency.

Choosing the correct bandwidth is very dependent on the input signal and requirements of the analysis. In the figure above, the time domain signal is sampled at \(f_s=16kHz\) and then windowed with a

time resolution \(N=f_s\cdot32ms\) and frequency resolution \(Ω = \frac{2π}{N} = 31.25 Hz\) (upper)
time resolution \(N=f_s\cdot8ms\) and frequency resolution \(Ω = \frac{2π}{N} = 125Hz\) (lower) The fine-grained frequency resolution of the upper graph is great for analyzing the effects of longer speech sound like the vocals. For shorter sounds like plosives however (duration ca. \(40-60ms\)), the temporal resolution of the upper graph might be too low.

In general, a typical setup for natural sounds would be windows on the order of \(32ms\) with an overlap of around 50 to 75%. As sounds in nature often follow a \(6db\) attenuation, higher frequencies are often boosted through a pre-emphasis filter. Lastly, the segments are often produced by a tapered Analysis Window.

Discrete Cosine Transformation¶

…