In this article, we'll explore visualization techniques for audio signals that let us derive additional insights from the data.

Spectrogram

Spectrograms offer a powerful representation of the data. A spectrogram plots the power of a signal (in dB) over time for a given range of frequencies. This allows us to spot periodic patterns over time, as well as regions of activity.

Spectrograms are used in state-of-the-art sound classification algorithms to turn signals into images, so that CNNs can be applied on top of those images.
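To make the "signal to image" step concrete, here is a minimal sketch of how a matrix of dB values could be rescaled into an 8-bit grayscale image. The function name db_to_uint8, the 80 dB dynamic range and the toy input matrix are assumptions made for illustration, not part of any library or of a specific classification pipeline.

```python
import numpy as np

def db_to_uint8(S_db, top_db=80.0):
    """Rescale a dB spectrogram to an 8-bit grayscale image (hypothetical helper)."""
    # Keep only the top `top_db` decibels below the maximum
    S = np.clip(S_db, S_db.max() - top_db, S_db.max())
    # Min-max scale to [0, 1], guarding against a constant input
    scaled = (S - S.min()) / max(S.max() - S.min(), 1e-12)
    # Flip vertically so that low frequencies end up at the bottom of the image
    return np.round(255 * scaled).astype(np.uint8)[::-1]

# Toy "spectrogram": 4 frequency bins x 3 time frames, values from -80 dB to 0 dB
S_db = np.linspace(-80.0, 0.0, 12).reshape(4, 3)
img = db_to_uint8(S_db)
print(img.shape, img.dtype, img.min(), img.max())
```

The resulting array can then be saved or stacked into a batch like any other single-channel image.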

There are several types of spectrograms to plot.

Linear-frequency power spectrogram

A linear-frequency power spectrogram represents time on the x-axis, frequency in Hz on a linear scale on the y-axis, and power in dB as the color of each point.

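import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display

# filename is the path to the audio file to analyze
y, sr = librosa.load(filename)
# The STFT is complex-valued: take its magnitude before converting to dB
D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)

plt.figure(figsize=(12,8))
librosa.display.specshow(D, sr=sr, x_axis='time', y_axis='linear')
plt.colorbar(format='%+2.0f dB')
plt.title('Linear-frequency power spectrogram')
plt.show()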
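To see what such a spectrogram contains, here is a minimal NumPy-only sketch of the same idea: cut the signal into overlapping windowed frames, take the FFT of each frame, and express the magnitudes in dB relative to the maximum. The function name power_spectrogram_db and the synthetic 440 Hz test tone are assumptions for illustration. The frame size (2048) and hop (512) mirror librosa's defaults, but unlike librosa this sketch does not pad or center the frames, so the values will not match exactly.

```python
import numpy as np

def power_spectrogram_db(y, n_fft=2048, hop_length=512):
    """Hann-windowed STFT magnitude in dB relative to the maximum (illustrative)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop_length
    # One column per frame: rows are samples within the frame
    frames = np.stack(
        [y[i * hop_length : i * hop_length + n_fft] * window for i in range(n_frames)],
        axis=1,
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=0))  # (1 + n_fft // 2, n_frames)
    # Floor the magnitudes to avoid log(0), then reference to the maximum
    return 20.0 * np.log10(np.maximum(magnitude, 1e-10) / magnitude.max())

sr = 22050
t = np.arange(sr) / sr                 # one second of audio
y = np.sin(2 * np.pi * 440.0 * t)      # synthetic 440 Hz tone

S_db = power_spectrogram_db(y)
freqs = np.fft.rfftfreq(2048, d=1.0 / sr)
peak_hz = freqs[S_db.mean(axis=1).argmax()]
print(S_db.shape, peak_hz)
```

The strongest row of the matrix sits in the frequency bin closest to 440 Hz, which is the horizontal line one would see in the plot.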

Log-frequency power spectrogram

This spectrogram presents the same information, but with a logarithmic scale on the y-axis for the frequencies. As in our case, this scale is often better suited when most of the information is located at lower frequencies and the higher frequencies contain mostly noise.

plt.figure(figsize=(12,8))
librosa.display.specshow(D, sr=sr, x_axis='time', y_axis='log')
plt.colorbar(format='%+2.0f dB')
plt.title('Log-frequency power spectrogram')
plt.show()

Constant-Q power spectrogram

Unlike the Fourier transform, but similar to the mel scale, the constant-Q transform uses a logarithmically spaced frequency axis.

# The CQT is complex-valued: take its magnitude before converting to dB
CQT = librosa.amplitude_to_db(np.abs(librosa.cqt(y, sr=sr)), ref=np.max)
plt.figure(figsize=(12,8))
librosa.display.specshow(CQT, x_axis='time', y_axis='cqt_hz')
plt.colorbar(format='%+2.0f dB')
plt.title('Constant-Q power spectrogram (Hz)')
plt.show()
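The "logarithmically spaced" frequency axis can be illustrated with a few lines of NumPy. In this sketch, the starting frequency (about C1), the 12 bins per octave and the 84 bins are chosen to mirror librosa's default CQT settings, and the Q definition (center frequency divided by the spacing to the next bin) is one common convention, stated here as an assumption.

```python
import numpy as np

f_min = 32.70            # roughly the note C1, in Hz
bins_per_octave = 12     # one bin per semitone
n_bins = 84              # 7 octaves

k = np.arange(n_bins)
# Geometric spacing: each bin is a fixed ratio above the previous one
cqt_freqs = f_min * 2.0 ** (k / bins_per_octave)

# The ratio between neighbouring bins is constant ...
ratios = cqt_freqs[1:] / cqt_freqs[:-1]
# ... so the ratio of center frequency to bin spacing (Q) is constant too
Q = cqt_freqs[:-1] / np.diff(cqt_freqs)

print(cqt_freqs[:3], cqt_freqs[-1])
print(ratios[0], Q[0])
```

By contrast, the bins of a Fourier transform are equally spaced in Hz, so their relative resolution gets coarser toward low frequencies.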

Chromagram

A chromagram displays the intensity of each pitch class (C, C♯, D, D♯, E, F, F♯, G, G♯, A, A♯, B) for each time interval. One main property of chroma features is that they capture harmonic and melodic characteristics of music, while being robust to changes in timbre and instrumentation.

C = librosa.feature.chroma_cqt(y=y, sr=sr)
plt.figure(figsize=(12,8))
librosa.display.specshow(C, x_axis='time', y_axis='chroma')
plt.colorbar()
plt.title('Chromagram')
plt.show()
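The idea of a pitch class can be sketched as follows: a frequency is converted to a semitone number, and the octave is discarded by taking the result modulo 12. The helper name pitch_class and the A4 = 440 Hz tuning reference are assumptions for illustration. An actual chromagram sums the energy of all frequency bins that fall into each class, which this sketch does not do.

```python
import numpy as np

PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def pitch_class(freq_hz, a4=440.0):
    """Map a frequency in Hz to the nearest pitch class (hypothetical helper)."""
    # MIDI note number: 69 is A4, and each semitone is a factor of 2**(1/12)
    midi = int(np.round(69 + 12 * np.log2(freq_hz / a4)))
    # Notes one octave apart share the same pitch class
    return PITCH_CLASSES[midi % 12]

for f in (261.63, 329.63, 440.0, 880.0):
    print(f, pitch_class(f))
```

Because 440 Hz and 880 Hz map to the same class, the representation is insensitive to the octave in which a note is played.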

Tempogram

The tempo, measured in beats per minute (BPM), is the rate of the musical beat. The tempogram is a feature matrix that indicates the prevalence of certain tempi at each moment in time.

plt.figure(figsize=(12,8))
Tgram = librosa.feature.tempogram(y=y, sr=sr)
librosa.display.specshow(Tgram, x_axis='time', y_axis='tempo')
plt.colorbar()
plt.title('Tempogram')
plt.show()
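librosa computes the tempogram as a local autocorrelation of the onset strength envelope. The sketch below shows the underlying principle on a single global window: a synthetic click track at 120 BPM is autocorrelated, and the lag with the strongest peak is converted back to BPM. The frame rate, the click track and the 30 to 300 BPM search range are all assumptions chosen for illustration.

```python
import numpy as np

frame_rate = 100.0     # envelope frames per second (assumed)
bpm_true = 120.0
n = 1000               # 10 seconds of envelope

# Synthetic onset envelope: one impulse per beat
period = int(round(60.0 / bpm_true * frame_rate))   # frames between beats
envelope = np.zeros(n)
envelope[::period] = 1.0

# Autocorrelation for non-negative lags
ac = np.correlate(envelope, envelope, mode='full')[n - 1:]

# Search only lags that correspond to plausible tempi (30 to 300 BPM)
min_lag = int(frame_rate * 60.0 / 300.0)
max_lag = int(frame_rate * 60.0 / 30.0)
best_lag = min_lag + int(np.argmax(ac[min_lag:max_lag + 1]))

bpm_est = 60.0 * frame_rate / best_lag
print(best_lag, bpm_est)
```

A tempogram repeats this kind of analysis on short windows that slide along the signal, which yields one column of tempo strengths per time frame.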

Spectrum

The spectrum of a discrete signal is computed using the fast Fourier transform (FFT) and displays the magnitude (or the energy) at each frequency within the signal.

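from scipy.fft import fft

# The FFT is complex-valued: plot its magnitude.
# For a real signal, the second half of the spectrum mirrors the first.
X = np.abs(fft(y))
f = np.linspace(0, sr, len(X), endpoint=False)
plt.figure(figsize=(12, 8))
plt.plot(f, X)
plt.xlabel('Frequency (Hz)')
plt.ylabel('Magnitude')
plt.show()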
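Since the spectrum of a real signal is symmetric, it is often enough to keep the non-negative frequencies only. Here is a minimal sketch using NumPy's rfft on a synthetic signal made of two tones. The test signal and the 2/N amplitude scaling are assumptions for illustration; that scaling is only meaningful away from the DC and Nyquist bins.

```python
import numpy as np

sr = 8000
t = np.arange(sr) / sr                      # one second, so bins are 1 Hz apart
y = 1.0 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

X = np.fft.rfft(y)                          # non-negative frequencies only
freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)

# Scale so that a sine of amplitude A shows up with magnitude A
magnitude = 2.0 * np.abs(X) / len(y)

# Frequencies of the two strongest components, in ascending order
top2 = np.sort(freqs[np.argsort(magnitude)[-2:]])
print(top2, magnitude[440], magnitude[1000])
```

The two peaks land on the frequencies of the two tones, with heights matching their amplitudes.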

Power Spectral Density

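The power spectral density of a signal describes how its power is distributed across the frequency components composing that signal. Here, it is estimated with Welch's method.

from scipy import signal

# Pass the sampling rate so that frequencies are expressed in Hz
freqs, psd = signal.welch(y, fs=sr)

plt.figure(figsize=(12, 8))
plt.semilogx(freqs, psd)
plt.title('Power spectral density')
plt.xlabel('Frequency (Hz)')
plt.ylabel('Power')
plt.show()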
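To make the notion of "distribution of power" concrete, the sketch below computes a simple one-sided periodogram with NumPy and checks that integrating it over frequency gives back the mean power of the signal. This is a simplified stand-in, not Welch's method itself: Welch's method averages such periodograms over overlapping windowed segments to reduce the variance of the estimate. The synthetic tone is an assumption for illustration.

```python
import numpy as np

sr = 8000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440 * t)             # synthetic tone, mean power 0.5

n = len(y)                                   # even number of samples
X = np.fft.rfft(y)
freqs = np.fft.rfftfreq(n, d=1.0 / sr)

# Periodogram: power per Hz
psd = np.abs(X) ** 2 / (sr * n)
# One-sided spectrum: double every bin except DC and Nyquist
psd[1:-1] *= 2.0

# Integrating the density over frequency recovers the mean power
df = freqs[1] - freqs[0]
total_power = np.sum(psd) * df
print(total_power, np.mean(y ** 2), freqs[np.argmax(psd)])
```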

Conclusion: I hope you enjoyed this article. These types of plots are now commonly used as input images for CNNs that classify sounds.

Sound Visualization

Maël Fabien