MCU #12: music with one bit to spare
Retrofitting waveform playback onto a single digital output pin of a microcontroller.
For the past couple of days, I’ve been working on a new handheld gaming project: a homage to Lemmings, a blockbuster puzzle game from the 1990s. In the game, you guide a cohort of hapless creatures to safety by assigning them specific skills — for example, instructing a lemming to dig a hole or scale a wall.
The hardware I’m using for this project is a variant of the bare-metal Cortex-M7 design I developed for my earlier game, Bob the Cat. Here’s my wife play-testing an early prototype of that device:
In Bob the Cat, I handled sound synthesis with rudimentary, monophonic “chip tunes” — essentially, lists of <frequency, duration> pairs. I programmed the hardware pulse-width modulation (PWM) subsystem to produce square-wave notes; a 1 kHz timer interrupt routine advanced the list and dynamically tweaked the PWM duty cycle (i.e., pulse width) to add complex, fuzzy overtones to the otherwise sterile beeps.
In the hands of the right person, this sound generation system would be quite capable; it could do more than some early 8-bit home computers. Unfortunately, I have no talent for musical composition, and I overestimated the availability of suitable tracks on the internet. For Bob the Cat, I settled on a rendition of Popcorn, an instrumental song from the 1960s, scavenged from an old Nokia ringtone. But when I showed my wife a prototype of the new game chirping out a version of Chicken Dance, she rolled her eyes — and I knew I had to do better.
Let there be sound
For a while, I thought about implementing a fully-fledged MIDI synthesizer to play back the more plentiful .mid files, but the complexity of that rivaled the complexity of the game itself. The alternative was to bite the bullet and support arbitrary-waveform audio — i.e., play back the data contained in .wav or .mp3 files. Waveform files are massive and hard to compress, but given the flimsy PCB speaker and the deliberate retro vibes, I figured I didn’t need high fidelity: I could probably settle for 6-bit audio and a 16 kHz sample rate.
Alas, waveform synthesis deals with analog voltages, and all I had on the PCB was a tiny speaker hooked up more or less directly to a digital output pin. The only other audio hardware was a 100 µF DC-blocking capacitor and a 6.8 µF MLCC lowpass to cut down on high-frequency hiss. The microcontroller (Microchip’s SAM S70 series) did have a built-in digital-to-analog converter but, as luck would have it, the DAC pin was in use for the LCD data bus.
After some consideration, I decided to YOLO it and get it done without mail-ordering a new PCB. First, I set up PWM on the PA11 pin as before, but upped its internal clock to a bit over 9 MHz. This is done by taking the 150 MHz system bus clock and dividing it by 16:
PMC->PMC_PCER0 = (1 << 31); /* Enable PWM0 controller */
PMC->PMC_PCER0 = (1 << 10); /* Enable PIOA controller */
/* Disable PIOA control of PA11 (LQFP-64 pin 27) */
PIOA->PIO_PDR = (1 << 11);
/* Enable PA11 alternate function B (PWM output). */
PIOA->PIO_ABCDSR[0] = (1 << 11);
PIOA->PIO_ABCDSR[1] = 0;
/* PWM0 channel 0: system bus clock / 16 = 9.375 MHz */
PWM0->PwmChNum[0].PWM_CMR = 4;
Next, I configured the PWM counter to run with a period of 64 — i.e., an effective output frequency of about 146 kHz, comfortably above the audible range:
PWM0->PwmChNum[0].PWM_CPRD = 64;
PWM0->PwmChNum[0].PWM_CDTY = 0;
PWM0->PWM_ENA = 1;
With these settings, the PWM module advances an internal counter (PWM_CCNT) at a rate governed by the configured clock, going from 0 to PWM_CPRD - 1. The counter is also continuously compared to the duty cycle register (PWM_CDTY). The PA11 pin is set to zero when PWM_CCNT < PWM_CDTY, and to one otherwise:
This gives us 64 levels of control over the pulse length of the output signal. The output voltage averaged over time should be proportional to the signal’s duty cycle — and I hoped that the existing 6.8 µF lowpass capacitor, together with the inductance and the mechanical inertia of the speaker, would do some impromptu averaging in hardware.
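To make the arithmetic concrete, here is a small host-side sketch (not part of the firmware) that models one PWM period using the comparison rule described above. The function names are made up for illustration, and `supply_mv` is an assumed parameter, since the article doesn't state the I/O rail voltage.

```c
#include <assert.h>
#include <stdint.h>

/* Rate of complete PWM cycles in Hz (rounded down) for a given bus clock,
 * prescaler, and counter period. */
uint32_t pwm_carrier_hz(uint32_t mck_hz, uint32_t prescale, uint32_t period) {
  assert(prescale != 0 && period != 0);
  return mck_hz / prescale / period;
}

/* Count the ticks in one PWM period during which the pin is high, using the
 * rule from the text: low while counter < duty, high otherwise. */
uint32_t pwm_high_ticks(uint32_t period, uint32_t duty) {
  assert(period != 0 && duty <= period);
  uint32_t high = 0;
  for (uint32_t ccnt = 0; ccnt < period; ccnt++) {
    if (ccnt >= duty) high++;
  }
  return high;
}

/* Time-averaged pin voltage in millivolts for an idealized output stage.
 * supply_mv is an assumption; pass whatever rail the board actually uses. */
uint32_t pwm_average_mv(uint32_t period, uint32_t duty, uint32_t supply_mv) {
  return supply_mv * pwm_high_ticks(period, duty) / period;
}
```

With the values from the text, `pwm_carrier_hz(150000000, 16, 64)` works out to 146484 Hz. Under the stated rule, a larger duty value keeps the pin low for longer, so the averaged voltage falls as the sample value rises. That amounts to inverting the waveform's polarity, which should not be audible on a single speaker.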
All aboard the pulse train
The next step was to configure a 16 kHz timer interrupt for pushing out waveform samples into the PWM duty register. We take the 150 MHz system bus clock, divide it by 128, and then set up a counter period of 73:
PMC->PMC_PCER0 = (1 << 24);          /* Power up TC0.Ch1 */
TC0->TcChannel[1].TC_CMR =
  (1 << 15) |                        /* "Waveform" (normal) mode */
  (2 << 13) |                        /* Count up, reset on RC compare */
  (3);                               /* Clock: MCK/128 (~1.172 MHz) */
TC0->TcChannel[1].TC_RC = 72;        /* Period 73. TC0 freq ~16.05 kHz */
TC0->TcChannel[1].TC_CCR =
  (1 << 2) |                         /* Reset and start counting */
  (1);                               /* Counter enable */
TC0->TcChannel[1].TC_IER = (1 << 4); /* Generate IRQ on RC compare */
In this setup, the timer works somewhat similarly to the PWM module: it has an internal counter and a comparator that fires off an IRQ upon reaching the value in TC_RC, then resets to zero in the next cycle. That said, the semantics differ slightly from the PWM behavior: because the timer dwells at the upper threshold for one clock tick, it has a period of TC_RC + 1.
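The off-by-one is easy to get wrong, so here is a host-side sketch of how the TC_RC value could be derived from a target sample rate. The helper names are hypothetical and the code only does the arithmetic; it doesn't touch any hardware registers.

```c
#include <assert.h>
#include <stdint.h>

/* Pick TC_RC so that the timer period (TC_RC + 1 ticks) is the tick count
 * nearest to the requested interrupt rate. */
uint32_t tc_rc_for_rate(uint32_t mck_hz, uint32_t divider, uint32_t rate_hz) {
  assert(divider != 0 && rate_hz != 0);
  uint32_t tick_hz = mck_hz / divider;
  uint32_t period = (tick_hz + rate_hz / 2) / rate_hz; /* round to nearest */
  assert(period >= 1);
  return period - 1;
}

/* Interrupt rate in Hz (rounded down) that a given TC_RC actually yields. */
uint32_t tc_actual_rate_hz(uint32_t mck_hz, uint32_t divider, uint32_t rc) {
  assert(divider != 0);
  return mck_hz / divider / (rc + 1);
}
```

For a 150 MHz clock, a divider of 128, and a 16 kHz target, this gives TC_RC = 72 and an actual rate of about 16053 Hz, matching the comments in the configuration code. If the audio were resampled to exactly 16000 Hz, the resulting 0.3% speed-up would likely be too small to notice.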
The 16 kHz timer interrupt routine (TC1_Handler) simply grabs audio samples from flash memory and then loads the values into the PWM_CDTYUPD register. Because it operates on packed 6-bit data units, it uses an internal counter (wav_subpos) to cycle through four offsets before advancing the input data pointer by three bytes:
struct audio_6bit {
  u8 s0:6, s1:6, s2:6, s3:6;
} __attribute__((packed));

#define pwm_output_wav6(_v) PWM0->PwmChNum[0].PWM_CDTYUPD = (_v)

void wav_interrupt() {
  ...
  switch (wav_subpos) {
    case 0: pwm_output_wav6(wav_data->s0); wav_subpos++; return;
    case 1: pwm_output_wav6(wav_data->s1); wav_subpos++; return;
    case 2: pwm_output_wav6(wav_data->s2); wav_subpos++; return;
  }
  pwm_output_wav6(wav_data->s3);
  wav_subpos = 0;
  wav_data++;
  ...
}
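The article doesn't show how the packed data is produced, so the following is only a sketch of one possible host-side packer, plus a shift-based reader to check it. Bit-field layout is implementation-defined in C. The sketch assumes the layout GCC typically uses on little-endian targets, where s0 occupies the lowest six bits of the first byte. The names `pack_wav6` and `unpack_wav6` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack groups of four 6-bit samples into three bytes each. Assumed layout:
 * sample k of a group sits at bits 6k..6k+5 of a little-endian 24-bit word. */
void pack_wav6(const uint8_t *samples, size_t count, uint8_t *out) {
  assert(count % 4 == 0);
  for (size_t i = 0; i < count; i += 4) {
    uint32_t w = 0;
    for (int k = 0; k < 4; k++) {
      w |= (uint32_t)(samples[i + k] & 0x3f) << (6 * k);
    }
    *out++ = (uint8_t)(w & 0xff);
    *out++ = (uint8_t)((w >> 8) & 0xff);
    *out++ = (uint8_t)((w >> 16) & 0xff);
  }
}

/* Read sample number idx back out of the packed stream. */
uint8_t unpack_wav6(const uint8_t *packed, size_t idx) {
  const uint8_t *g = packed + (idx / 4) * 3;
  uint32_t w = g[0] | ((uint32_t)g[1] << 8) | ((uint32_t)g[2] << 16);
  return (uint8_t)((w >> (6 * (idx % 4))) & 0x3f);
}
```

Explicit shifts like these don't depend on the compiler's bit-field layout, which makes them a reasonable choice for the tool that generates the data, even if the firmware itself reads the samples through the struct.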
Had I used standard 8-bit or 16-bit sampling, I could’ve relied on direct memory access (DMA) transfers that bypass the CPU. But with the unusual memory-saving 6-bit encoding, the transfers need to be done by hand.
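To put a number on the savings, here is a back-of-the-envelope sketch of the storage arithmetic. The function names are invented for this example, and the 1 MiB figure in the note below is purely illustrative; the article doesn't say how much flash is set aside for audio.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of storage consumed per second of mono audio. */
uint32_t pcm_bytes_per_second(uint32_t rate_hz, uint32_t bits_per_sample) {
  return rate_hz * bits_per_sample / 8;
}

/* Whole seconds of mono audio that fit in a given number of bytes. */
uint32_t pcm_seconds_in(uint64_t bytes, uint32_t rate_hz,
                        uint32_t bits_per_sample) {
  assert(rate_hz != 0 && bits_per_sample != 0);
  return (uint32_t)(bytes * 8 / ((uint64_t)rate_hz * bits_per_sample));
}
```

At 16 kHz, 6-bit samples take 12000 bytes per second versus 16000 for 8-bit, a 25% reduction. A hypothetical 1 MiB budget would hold about 87 seconds of 6-bit audio compared to 65 seconds at 8 bits.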
The Cortex-M7 chip has a pretty neat IRQ preemption system. Because I use other interrupts for tasks such as screen refresh, I bumped their relative priority down a notch via NVIC_SetPriority(). This way, when the higher-ranking TC1 IRQ fires, it’s always handled in real time — even if it means interrupting other async code.
The result is a surprisingly clean playback of the audio, at least if we consider the abysmal speaker and the fact that the song is reduced to 6-bit resolution:
If you want to have a look at the code, a work-in-progress snapshot of the game is available here.
For an in-depth article on ADCs and DACs, follow this link. For more on electronics and MCU programming, click here.