Synthesizing Procedural Audio (In Unity)

In this article, we’re going to create PCM data and play it through Unity. This article assumes you know the basics of digital audio [1][2].
Note that to keep this article on-topic, we will hold off on talking about streaming audio.

The code for creating and calculating audio is pretty straightforward, but there’s a lot going on in the background. So, before we talk about how to create our own playable audio in Unity, let’s go over some basics of what we need to interface with to get audio playing on a computer.

This is not only because it’s useful fundamental knowledge, but it will also become relevant for a future audio streaming article.

Dedicated Audio Subsystems

Let’s take a quick refresher on digital audio. Digital audio is created by ordering a sequence of samples, tens of thousands of them per second. Those sample values control how the voltage of an electromagnet in your speaker(s) changes over time. These rapidly alternating values move the speaker membrane, which creates the vibrations we hear as sound.

To do this, a sample needs to be read from the array of PCM data and set as the value for the speaker to use. And this action needs to happen a lot: if our PCM is supposed to run at 44,100 Hz, it happens every 1/44,100 seconds, or roughly once every 22.7 microseconds. While computers are extremely fast, this action happens continuously and needs to happen precisely on time.

Because of these tight timing requirements, computers have their own dedicated audio hardware. This means your audio’s compute load and timing are largely independent of what your CPU is doing.
If your computer has ever crashed or stuttered while audio was playing, you may have noticed that the audio could keep playing even though the rest of the machine was halted.

While your CPU has to deal with

  • games,
  • apps,
  • background processes,
  • scheduled tasks,
  • hardware and driver interrupts,
  • etc.,

your sound hardware has a single job, so that it’s never interrupted and always on time: read PCM samples from memory and set the speaker to the right voltage exactly when it’s supposed to.
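Conceptually, that single job is a tight loop: read a sample, drive the speaker, wait for the next tick. The sketch below is only an illustration of the idea in C#; SetSpeakerVoltage and WaitForNextSampleTick are hypothetical stand-ins for what a DAC and its hardware clock actually do.

```csharp
// Conceptual sketch of what dedicated audio hardware does.
// The two helper methods are hypothetical; in reality this is
// done by a DAC driven by a hardware clock, not software.
public static class PlaybackLoopSketch
{
    public static void Run(float[] pcm)
    {
        for (int i = 0; i < pcm.Length; ++i)
        {
            SetSpeakerVoltage(pcm[i]);   // convert the sample to an output voltage
            WaitForNextSampleTick();     // e.g., every 1/44,100 s at 44.1 kHz
        }
    }

    static void SetSpeakerVoltage(float sample) { /* performed by the DAC */ }
    static void WaitForNextSampleTick() { /* performed by the hardware clock */ }
}
```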

Applications play audio by making requests to the audio system and providing it with PCM data to play.

Audio Buffers

This means that in order to make your PC play audio, you need to interface with this hardware and its supporting systems.
This usually involves:

  • requesting a writable block of audio memory,
  • writing PCM to that block of memory,
  • releasing your control from that block of memory,
  • requesting the audio systems play that block.

For low-level systems and APIs, this is a fairly involved process. But usually, an engine or middleware will wrap this process and make it easy, such as the Web Audio API in JavaScript or Unity’s AudioClip class.
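As a rough illustration, the four steps above might look like the sketch below against a made-up low-level audio API. None of the IAudioDevice calls here belong to a real library; they only mirror the steps above, and real APIs (WASAPI, ALSA, Core Audio) differ in the details.

```csharp
// Hypothetical low-level audio API, invented purely for illustration.
public interface IAudioDevice
{
    float[] AcquireBuffer(int sampleCount); // 1. request a writable block of audio memory
    void ReleaseBuffer(float[] buffer);     // 3. release our control of that block
    void PlayBuffer(float[] buffer);        // 4. request the system play that block
}

public static class BufferWorkflow
{
    public static void PlayTone(IAudioDevice dev, float[] pcm)
    {
        float[] block = dev.AcquireBuffer(pcm.Length);
        pcm.CopyTo(block, 0);               // 2. write our PCM into the block
        dev.ReleaseBuffer(block);
        dev.PlayBuffer(block);
    }
}
```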

Mixing

It should also be noted that this dedicated sound hardware is in charge of playing multiple audio sources (PCM buffers) at the same time, since multiple sounds may be playing simultaneously across applications, or even within the same one. This is called audio mixing, and it is the system’s responsibility. When we provide audio, we only need to worry about our piece of audio in isolation; the mixer handles playing it alongside everything else.
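Mixing itself is conceptually simple: sum the simultaneous sources sample-by-sample and keep the result in range. A minimal sketch (real mixers typically attenuate or soft-clip rather than hard-clamp, but the principle of summing is the same):

```csharp
// Minimal mixing sketch: sum two PCM buffers sample-by-sample,
// clamping the result to the valid [-1, 1] range.
public static class Mixer
{
    public static float[] Mix(float[] a, float[] b)
    {
        int len = System.Math.Max(a.Length, b.Length);
        float[] ret = new float[len];

        for (int i = 0; i < len; ++i)
        {
            // Treat the shorter buffer as silent past its end.
            float sum =
                (i < a.Length ? a[i] : 0.0f) +
                (i < b.Length ? b[i] : 0.0f);

            ret[i] = System.Math.Clamp(sum, -1.0f, 1.0f);
        }
        return ret;
    }
}
```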

Creating Audio

For Unity, the AudioClip.Create() function is all we need; specifically, the overload that doesn’t take streaming callbacks. After creating the AudioClip, we call its SetData() method. All the rest of the complexity is wrapped behind those function calls.

Here’s the example we’ll be focusing on. For simplicity, some parameters can only be changed from the Unity inspector. The full sample is available here.
Be careful with the sawtooth and square wave examples: because of their many upper partials, they naturally sound very loud, even at moderate amplitudes.


Let’s start off with this sample code below. An AudioSource component is needed to play an AudioClip.

using UnityEngine;

public class AudioTest : MonoBehaviour
{
    public AudioSource audioSource;

    public float amp = 0.75f;

    public float length = 3.0f;
    public float freq = 440.0f;
    public int sampsPerSec = 44100;

    public void Awake()
    {
        if(this.audioSource == null)
            this.audioSource = this.gameObject.AddComponent<AudioSource>();
    }
    
    ... // The rest covered below
}

We’ve previously covered some basic wave primitives, so let’s implement them. These code samples no doubt have room for optimization.

public static float [] CreateSine(float vel, float len, float freq, int samplesPerSec)
{ 
    float tau = Mathf.PI * 2.0f;

    int samples = (int)(samplesPerSec * len);
    float [] ret = new float[samples];

    for(int i = 0; i < samples; ++i)
        ret[i] = Mathf.Sin((float)i/(float)samplesPerSec * freq * tau) * vel;

    return ret;
}

public static float[] CreateSquare(float vel, float len, float freq, int samplesPerSec)
{
    int samples = (int)(samplesPerSec * len);
    float[] ret = new float[samples];

    for (int i = 0; i < samples; ++i)
    {
        float lam = (float)i / (float)samplesPerSec * freq % 1.0f;
        ret[i] = lam > 0.5f ? vel : -vel;
    }
    return ret;
}

public static float[] CreateSawtooth(float vel, float len, float freq, int samplesPerSec)
{
    int samples = (int)(samplesPerSec * len);
    float[] ret = new float[samples];

    for (int i = 0; i < samples; ++i)
    {
        float lam = (float)i / (float)samplesPerSec * freq % 1.0f;
        ret[i] = (lam * 2.0f - 1.0f) * vel;
    }
    return ret;
}

public static float[] CreateTriangle(float vel, float len, float freq, int samplesPerSec)
{
    int samples = (int)(samplesPerSec * len);
    float[] ret = new float[samples];

    for (int i = 0; i < samples; ++i)
    {
        float lam = (float)i / (float)samplesPerSec * freq % 1.0f;
        ret[i] = 
            (lam < 0.5f) ? 
                (lam * 4 - 1) * vel :
                (3.0f + lam * -4) * vel;
    }
    return ret;
}

It’s all basically the same pattern:

  • Convert the sample index to the current time value (in seconds) by dividing it by the number of samples per second.
  • Convert the time value to a number of tone cycles by multiplying the seconds value by the frequency.
  • Find how far along the current cycle of the repeating signal (the phase) you are, usually by taking the value modulo 1 (the % operator).
  • Take the phase value and remap it to a PCM value between [-1.0, 1.0].
Converting from sample number to tone values.
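That shared pattern can be distilled into a single phase helper. The sketch below factors it out of the generators above; PhaseAt and Saw are names introduced here for illustration, not part of the article’s sample project.

```csharp
// The common pattern factored out: sample index -> seconds -> cycles -> phase.
public static class Phase
{
    // Returns the phase in [0, 1) for sample i of a tone at freq Hz.
    public static float PhaseAt(int i, float freq, int samplesPerSec)
    {
        float seconds = (float)i / (float)samplesPerSec; // step 1: sample -> time
        float cycles  = seconds * freq;                  // step 2: time -> cycles
        return cycles % 1.0f;                            // step 3: cycles -> phase
    }

    // Step 4 example: remap the phase to a sawtooth PCM value in [-1, 1].
    public static float Saw(float phase)
    {
        return phase * 2.0f - 1.0f;
    }
}
```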

And then, once we have the PCM as a float array, we can use that to set the PCM data of an AudioClip.

    void SetAudio(float [] pcm, int sampsPerSec)
    {
        AudioClip ac = AudioClip.Create("", pcm.Length, 1, sampsPerSec, false);
        ac.SetData(pcm, 0);

        this.audioSource.Stop();
        this.audioSource.clip = ac;
        this.audioSource.time = 0.0f; // Only matters if the source was previously playing something else
        this.audioSource.Play();
    }

Finally, since this is an example, some code to drive it all. As mentioned, to keep things simple, some variables can only be changed through the inspector.

    public void OnGUI()
    {
        GUILayout.Label($"Generation Length: {this.length} seconds");
        GUILayout.Label($"Tone Frequency : {this.freq} Hz");
        GUILayout.Label($"Audio SampleRate : {this.sampsPerSec} samples/second");
        GUILayout.Space(20.0f);

        GUILayout.Label("Audio Gen Amplitude");
        this.amp = GUILayout.HorizontalSlider(this.amp, 0.0f, 1.0f, GUILayout.Width(200.0f));

        if(GUILayout.Button("Generate Sine") == true)
        {
            float [] pcm = CreateSine(this.amp, this.length, this.freq, this.sampsPerSec);
            this.SetAudio(pcm, this.sampsPerSec);
        }

        if (GUILayout.Button("Generate Square") == true)
        {
            float[] pcm = CreateSquare(this.amp, this.length, this.freq, this.sampsPerSec);
            this.SetAudio(pcm, this.sampsPerSec);
        }

        if (GUILayout.Button("Generate Triangle") == true)
        {
            float[] pcm = CreateTriangle(this.amp, this.length, this.freq, this.sampsPerSec);
            this.SetAudio(pcm, this.sampsPerSec);
        }

        if (GUILayout.Button("Generate Sawtooth") == true)
        {
            float[] pcm = CreateSawtooth(this.amp, this.length, this.freq, this.sampsPerSec);
            this.SetAudio(pcm, this.sampsPerSec);
        }

        GUILayout.Space(20.0f);

        if(this.audioSource.clip != null)
        {
            if(this.audioSource.isPlaying == true)
            {
                if(GUILayout.Button("Stop") == true)
                {
                    this.audioSource.Stop();
                }
            }
            else
            {
                if (GUILayout.Button("Play") == true)
                {
                    this.audioSource.time = 0.0f;
                    this.audioSource.Play();
                }
            }
        }

        GUILayout.Space(20.0f);
        GUILayout.Label("Audio Source Volume");
        this.audioSource.volume = GUILayout.HorizontalSlider(this.audioSource.volume, 0.0f, 1.0f, GUILayout.Width(200.0f));
    }

Just a Simple Example

This example is just that, an example; it’s not too complex on its own. But it’s the foundation for synthesizing music and sound effects. We can also imagine that, instead of synthesizing tone data, the same knowledge of loading PCM into an AudioClip could be used to load PCM data (from a file or an online source) at runtime.

A Warning With Audio Sizes

Because of how many samples are required to play high-quality audio, the memory needed to represent longer stretches of audio grows quickly. To get around this, strategies like audio compression and audio streaming are employed.
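For a rough sense of scale: at 44,100 samples per second, with 4-byte float samples, one minute of stereo audio already costs 44,100 × 2 × 4 × 60 = 21,168,000 bytes, about 21 MB. A quick back-of-the-envelope helper:

```csharp
// Back-of-the-envelope memory cost of raw float PCM.
public static class PcmSize
{
    public static long Bytes(int samplesPerSec, int channels, float seconds)
    {
        const int bytesPerSample = sizeof(float); // 4 bytes per float sample
        long sampleCount = (long)(samplesPerSec * channels * (double)seconds);
        return sampleCount * bytesPerSample;
    }
}

// PcmSize.Bytes(44100, 2, 60.0f) -> 21,168,000 bytes (about 21 MB per minute)
```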

Admittedly, audio compression isn’t too useful when you’re providing raw PCM data to push directly into the audio API, but streaming can be very helpful. That subject, however, is for a future article.

– Stay strong, code on. William Leu

Explore more articles here.
Explore more audio articles here.