Music Upsampler

Ep.2 – Using GNU Octave

Screenshot of GNU Octave GUI

Recap

The Chord Mojo can be connected via USB, essentially acting as an external sound card, which is mainly how I use it. Connected this way it is able to accept audio up to 768kHz/32-bit.

In all the time I’ve owned it I’ve never come across any popular music that is recorded at that rate or resolution. The best I’ve been able to find is 192kHz/24-bit.

My aim was therefore to try and upsample a 44.1kHz/16-bit recording, standard for most Audio CDs, into a 768kHz/32-bit audio file which I could then play through the Chord Mojo.

Introducing GNU Octave

I decided to use GNU Octave to rapidly prototype something before moving over to C/C++ to get speed improvements. I used MATLAB extensively at university and always liked its ease of use, particularly for numerical computation, but as I couldn’t afford the cost of owning a copy myself I moved to GNU Octave, which is a highly compatible free alternative, albeit slower.

One of the things I like about MATLAB/GNU Octave is that it has a vast array, no pun intended, of functions that import from and export to various file formats. This allows you to focus on what you want to achieve rather than having to worry about the lower-level stuff.

Working the problem

As a reminder my aim is to use the Whittaker–Shannon interpolation formula to upsample an audio file.

$$x(t) = \sum_{n=-\infty}^{\infty} x[n] \, {\rm sinc}\left(\frac{t - nT}{T}\right)$$
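To get a feel for the formula before applying it to real audio, here is a minimal sketch that evaluates a truncated version of the sum on a toy signal. The 8 kHz rate, the 1 kHz sine and all variable names are assumptions made up for this illustration; they are not part of the original script.

```octave
% Hypothetical toy signal: 8 samples of a 1 kHz sine sampled at 8 kHz.
fs = 8000;                 % illustrative original sample rate in Hz
T  = 1/fs;                 % original sample interval
n  = (0:7)';               % sample indices as a column vector
x  = sin(2*pi*1000*n*T);   % the samples x[n]

% Evaluate the (finite) sum at an instant halfway between samples 3 and 4.
t  = 3.5*T;
xt = sum(x .* sinc((t - n*T)/T));

% At an original sample instant the sum gives back that sample, because
% sinc is 1 at zero and 0 at every other integer.
x3 = sum(x .* sinc((3*T - n*T)/T));

fprintf('x(3.5T) = %f, x(3T) = %f, x[3] = %f\n', xt, x3, x(4));
```

With only 8 samples the sum is heavily truncated, so the in-between value is only a rough estimate of the underlying sine, but the value at the sample instant should match the stored sample to within rounding error.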

I started off by creating a simple WAV file of me saying the word ‘test’, which is 0.405 seconds long. With a short audio file like this I can iterate and fix bugs quickly.

I use audioinfo to get information about the WAV file, such as its length. I could define these variables manually, but doing it this way will allow me to run the code on different files in future.

info = audioinfo('test.wav');

The WAV file stores data as 16-bit signed integers, but when audioread imports the data into a matrix it is normalised to between -1 and 1, which is easier for me to work with.

Input = audioread('test.wav');
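If you want to see the difference between the stored integers and the normalised values for yourself, something like the sketch below should do it. It writes its own tiny 16-bit stereo file to a temporary location so that it is self-contained; the sample values and the temporary file name are made up for the example.

```octave
% Write a hypothetical 3-sample, 16-bit stereo file so the example is self-contained.
fname = [tempname() '.wav'];
audiowrite(fname, [0 0; 0.5 -0.5; -0.25 0.25], 44100, 'BitsPerSample', 16);

info   = audioinfo(fname);            % metadata only, no sample data
scaled = audioread(fname);            % doubles, normalised to between -1 and 1
raw    = audioread(fname, 'native');  % the stored 16-bit integers

fprintf('%d Hz, %d samples, %d channels\n', ...
	info.SampleRate, info.TotalSamples, info.NumChannels);
fprintf('default read: %s, native read: %s\n', class(scaled), class(raw));
disp(scaled);
disp(raw);

delete(fname);                        % tidy up the temporary file
```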

The next step was to create the Output array to store the data for the upsampled audio file, \(x(t)\). I worked out the total number of samples in the new file using the ceil function so I don’t cut it short, as 768kHz is not a whole-number multiple of 44.1kHz.

NewSampleRate = 768000;
NewTotalSamples = ceil((info.TotalSamples*NewSampleRate)/info.SampleRate);
Output = zeros(NewTotalSamples,2);
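As a quick worked check of that rounding, the sketch below runs the same calculation on two made-up input lengths (441 and 100 samples are illustrative only, not taken from the test file). The ratio between the two rates reduces to 2560/147, so most input lengths do not map onto a whole number of output samples.

```octave
OldSampleRate = 44100;
NewSampleRate = 768000;
Ratio = NewSampleRate/OldSampleRate;   % 2560/147, not a whole number

% Hypothetical input lengths, chosen only to illustrate the rounding.
ExactCase   = ceil((441*NewSampleRate)/OldSampleRate);  % 441 samples is 10 ms at 44.1 kHz
RoundedCase = ceil((100*NewSampleRate)/OldSampleRate);  % 1741.49... rounds up

fprintf('ratio = %.4f, 441 -> %d, 100 -> %d\n', Ratio, ExactCase, RoundedCase);
```

The 441-sample case divides exactly into 7680 output samples, while the 100-sample case is rounded up to 1742 so the tail of the audio is not lost.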

GNU Octave is very slow when executing for loops, so I wanted to avoid using them as much as possible. Looking at the Whittaker–Shannon formula, for each point in time of the new upsampled waveform a function is evaluated using every single sample in the original waveform and then the values are summed. This lends itself well to array programming and lookup tables.

I created a lookup table for \(t\), as this is determined by the new sample rate, and a vector for \(n\), which had to be transposed to align with the direction the data is stored in the Input matrix.

NewInterval = 1/NewSampleRate;
NewTimeLine = (0:(NewTotalSamples-1))*NewInterval;
n = transpose(0:(info.TotalSamples-1));

To implement the summation in the Whittaker–Shannon formula I simplified the sinc function to reduce the number of operations.

$${\rm sinc}\left(\frac{t - nT}{T}\right) = {\rm sinc}\left(\frac{t}{T}-n\right)$$
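Spelling out the step in between: if the original sample rate is written as \(f_s = 1/T\) (a symbol introduced here for convenience, corresponding to info.SampleRate in the script), dividing by the interval becomes multiplying by the sample rate, which is the form the code uses.

$$\frac{t - nT}{T} = \frac{t}{T} - n = f_s \, t - n, \qquad f_s = \frac{1}{T}$$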

I then used a for loop to calculate \(x(t)\), one output sample per iteration.

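% Evaluate the Whittaker-Shannon sum once per output sample.
for t = 1:NewTotalSamples
	% Weight every input sample by its sinc value at this instant, then sum each channel.
	Output(t,:) = sum(Input.*sinc((info.SampleRate*NewTimeLine(t))-n));
end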
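One possible way to cut down the number of loop iterations, sketched here as an idea rather than as what the script does, is to process the output in blocks and let a matrix multiplication do the summation. The tiny synthetic input, the BlockSize value and the names Looped, Blocked and Kernel are all assumptions for this example. I haven't benchmarked it, and the block size trades memory for fewer iterations.

```octave
% Hypothetical small stereo input so the comparison runs quickly.
SampleRate    = 44100;
NewSampleRate = 768000;
k = (0:199)';
Input = [sin(2*pi*1000*k/SampleRate), cos(2*pi*500*k/SampleRate)];
TotalSamples    = size(Input, 1);
NewTotalSamples = ceil((TotalSamples*NewSampleRate)/SampleRate);
NewTimeLine     = (0:(NewTotalSamples-1))/NewSampleRate;
n = 0:(TotalSamples-1);   % row vector here: one column per input sample

% Reference: one output sample per iteration, as in the loop above.
Looped = zeros(NewTotalSamples, 2);
for t = 1:NewTotalSamples
	Looped(t,:) = sum(Input .* sinc(SampleRate*NewTimeLine(t) - n'));
end

% Blocked: build a (block x TotalSamples) sinc matrix, then multiply by Input.
BlockSize = 512;          % assumed value; tune to the memory available
Blocked = zeros(NewTotalSamples, 2);
for first = 1:BlockSize:NewTotalSamples
	idx = first:min(first+BlockSize-1, NewTotalSamples);
	Kernel = sinc(SampleRate*NewTimeLine(idx)' - n);  % column minus row broadcasts
	Blocked(idx,:) = Kernel * Input;                  % the sum over n for both channels
end

fprintf('largest difference: %g\n', max(abs(Looped(:) - Blocked(:))));
```

Both versions compute the same sums, so any difference should be down to floating-point rounding.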

The last step was to convert the Output array back into a 32-bit audio file which I could play through the Chord Mojo.

audiowrite('output.wav',Output,NewSampleRate,'BitsPerSample',32);
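A quick way to confirm the file really was written at the intended rate and bit depth is to read its header back with audioinfo. The sketch below uses a made-up 10 ms tone and a temporary file name rather than the real output, and it also reports the peak level, since interpolated values can overshoot the original samples.

```octave
NewSampleRate = 768000;

% Hypothetical output: 10 ms of a quiet 1 kHz stereo tone.
t = (0:(NewSampleRate/100 - 1))' / NewSampleRate;
Output = 0.25*[sin(2*pi*1000*t), sin(2*pi*1000*t)];

% Report the peak so values beyond the -1 to 1 range are noticed before writing.
Peak = max(abs(Output(:)));
fprintf('peak level: %f\n', Peak);

fname = [tempname() '.wav'];
audiowrite(fname, Output, NewSampleRate, 'BitsPerSample', 32);

% Read the header back to check what was actually stored.
info = audioinfo(fname);
fprintf('%d Hz, %d-bit, %d samples\n', ...
	info.SampleRate, info.BitsPerSample, info.TotalSamples);

delete(fname);   % tidy up the temporary file
```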

Examining the output

After running the script I used Audacity to compare the output file against the input file. In the screenshot below the upsampled file is the top one and there doesn’t appear to be any obvious distortions which is a good start.

Screenshot showing output audio file above test audio file in Audacity.

Zooming in further, you can see that the upsampled audio signal contains more detail.

Zoomed in screenshot showing output audio file above test audio file in Audacity.

Getting in as close as Audacity will allow, you can see the increase in the sample rate compared with the original file.

Zoomed in screenshot showing sample rate comparison of output audio file above test audio file in Audacity.

I’m pretty happy with this first attempt however there are two areas I’d like to focus on next:

  1. I’m running the script on a computer with an Intel Core i5-5300U processor and 16GB DDR3-1600 memory running GNU Octave 5.2. Even though the original audio file is only 0.405 seconds long, it takes roughly 7 minutes 20 seconds to generate the upsampled file. If I’m going to use this to upsample a longer file I’m going to need to figure out some ways to speed this up, as the number of calculations grows quadratically with length: every output sample is a sum over every input sample, and both counts scale with the duration.
  2. I’m also keen to carry out some spectral analysis to see if my upsampling method has changed the audio signal beyond that which is obvious to see in Audacity.
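To put rough numbers on the first point, the sketch below counts the sinc evaluations for a few durations. Every output sample needs one evaluation per input sample, and both counts scale with the length of the file, so the total scales with the square of the duration. Apart from 0.405 seconds, the durations are hypothetical, and the sample counts are estimated from the duration rather than read from a file.

```octave
SampleRate    = 44100;
NewSampleRate = 768000;
Durations = [0.405 1 10 180];   % seconds; all but the first are hypothetical

InSamples  = round(Durations*SampleRate);               % estimated input samples
OutSamples = ceil(InSamples*NewSampleRate/SampleRate);  % output samples
SincCalls  = InSamples .* OutSamples;                   % one per input/output pair

for k = 1:numel(Durations)
	fprintf('%8.3f s: %.3g sinc evaluations (%.0fx the first)\n', ...
		Durations(k), SincCalls(k), SincCalls(k)/SincCalls(1));
end
```

Going from 1 second to 10 seconds multiplies the work by 100, not 10, which is why a full-length track is out of reach for the loop as it stands.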

If you’re interested and want to have a go yourself I’ve put a copy of all the files on GitHub here.