Music Upsampler

Ep.1 – Upsampling my music to 768kHz

Close-up photo of the Chord Mojo DAC

The Set-Up

Since my teenage years I’ve always been into high-end audio. The engineer in me is fascinated with the technology behind everything, probably more so than the final sound. There’s also a little bit of one-upmanship in there as well.

It started with watching movies in surround sound but evolved to include listening to music, something which I got more into at university when I didn’t have a TV. Unfortunately my aspirations have always been larger than my wallet.

Not only is high-end audio an expensive hobby, but finding the space to make the most of a good pair of speakers is difficult for me these days, so I started to become more interested in headphones.

A couple of years ago I decided to buy a DAC to improve the sound coming from my laptop, which is how I listen to most of my music. Having done a lot of research, and after much deliberation, I finally settled on the Chord Mojo, which was getting good reviews at the time (here and here) and still is over four years later. I’ve been very happy with it and think it sounds brilliant, however…

The Problem

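The purpose of a DAC is to reconstruct a continuous (i.e. smooth) signal from a series of samples. Sampling theory tells you how to sample a signal and then perfectly reconstruct it, provided the original contains no frequencies at or above half the sample rate, by using the Whittaker–Shannon interpolation formula, where T is the sampling period and x[n] is the n-th sample: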

$$x(t) = \sum_{n=-\infty}^{\infty} x[n] \, {\rm sinc}\left(\frac{t - nT}{T}\right)$$

Implementing this exactly is impossible in practice, however, for two reasons:

  1. Looking closer at the formula, you can see that reconstructing the signal at any single point in time requires a sum over every sample in the recording. For a typical pop song which is 3 min 30 sec in length and sampled at 44.1 kHz, this is roughly 9.3 million samples, and that’s just for a mono recording.
  2. To do that for every point along the signal to make it ‘smooth’, you would need to repeat the above step an infinite number of times.
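The sample count in the first point is simple arithmetic. A quick check in Python (the variable names are mine, chosen purely for illustration):

```python
# Samples per channel in a 3 min 30 s track sampled at 44.1 kHz.
duration_s = 3 * 60 + 30          # 210 seconds
sample_rate_hz = 44_100           # CD-quality sample rate

samples_per_channel = duration_s * sample_rate_hz
print(samples_per_channel)        # 9261000
```

A stereo recording doubles that, and the exact formula would need every one of those samples for each output point it computes.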

If you can’t reconstruct the signal perfectly then you’re going to need to make compromises in one or both of these areas.
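To make those two compromises concrete, here is a minimal sketch in plain Python. It is not how the Chord Mojo or any particular DAC works. It simply truncates the sum to the samples nearest the point being evaluated (compromise one) and evaluates it at a finite number of points per sample period (compromise two). The names `reconstruct`, `upsample`, `half_width` and `factor` are my own assumptions, and a real design would normally apply a window to the sinc rather than chopping it off abruptly.

```python
import math

def sinc(x: float) -> float:
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t_over_T, half_width=32):
    """Approximate x(t) at t = t_over_T * T using only nearby samples.

    half_width is a hypothetical truncation parameter: at most
    2 * half_width samples around the evaluation point are summed.
    """
    centre = math.floor(t_over_T)
    lo = max(0, centre - half_width + 1)
    hi = min(len(samples), centre + half_width + 1)
    return sum(samples[n] * sinc(t_over_T - n) for n in range(lo, hi))

def upsample(samples, factor=16, half_width=32):
    """Evaluate the truncated sum at `factor` points per sample period."""
    n_out = (len(samples) - 1) * factor + 1
    return [reconstruct(samples, k / factor, half_width) for k in range(n_out)]

# Example input: a 1 kHz sine sampled at 44.1 kHz (256 samples).
fs = 44_100
samples = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(256)]

up = upsample(samples, factor=16)   # 16 x 44.1 kHz = 705.6 kHz
print(len(up))                      # 4081
```

Increasing `half_width` brings the result closer to the ideal formula at the cost of more multiplications per output point, which is exactly the trade-off a real reconstruction filter has to make.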

A trivial DAC design uses a zero-order hold: it holds each sample’s value until the next one arrives and makes no attempt to recreate the signal anywhere other than at the sample instants. If you look at the Whittaker–Shannon interpolation formula again you’ll see that the reconstructed signal equals the sampled signal at those points, because the sinc function is 1 at zero and 0 at every other integer, i.e. when:

$$t = nT$$
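As a rough illustration, a zero-order hold can be modelled in discrete time by repeating each sample a fixed number of times, which produces a staircase. The sketch below also checks numerically that the interpolation formula returns the original sample when evaluated at a sample instant. The function name `zero_order_hold` and the tiny four-sample input are assumptions made for the example.

```python
import math

def sinc(x: float) -> float:
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def zero_order_hold(samples, factor):
    """Repeat each sample `factor` times, giving a staircase output."""
    return [s for s in samples for _ in range(factor)]

samples = [0.0, 0.5, 1.0, 0.5]

held = zero_order_hold(samples, 4)
print(held)
# [0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5]

# At t = mT the weights are sinc(m - n): 1 for n == m, and zero
# (up to floating-point rounding) for every other n.
m = 2
reconstructed = sum(samples[n] * sinc(m - n) for n in range(len(samples)))
print(abs(reconstructed - samples[m]) < 1e-12)   # True
```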

From here you can add further complexity to reconstruct the signal to improve the quality of the sound. There are many different ways to do this and manufacturers are coming up with new approaches all the time.

The article here gives you some information on the approach taken by the Chord Mojo. In essence, it uses an FPGA to reconstruct the signal, with a particular focus on accurately reproducing the time-domain behaviour of the audio signal, based on the premise that the human brain is more sensitive to transients when perceiving sound.

The Heist

While there is no doubt that what the Chord Mojo does is impressive, it inevitably still adds its own ‘colour’ to the sound, and I wanted to see if there was any way of pre-processing the audio signal to minimise or eliminate this.

In this series I’m going to try different ways of doing this to see if they bring any improvement.

Let’s get started…