Add support for decoding the Quite Ok Audio Format (QOA) #2592

Draft
wants to merge 3 commits into
base: master

@ssievert42 commented Jan 5, 2025
Contributor

This adds support for decoding the Quite Ok Audio Format (QOA).

QOA does reasonably fast lossy audio compression from 16 bits per sample down to 3.2 bits per sample.
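
As a rough illustration of where the 3.2 bits per sample figure comes from, here is a small sketch based on my reading of the QOA file layout: each 64-bit slice carries 20 samples, a frame holds up to 256 slices per channel, and each frame adds an 8-byte header plus 16 bytes of LMS state per channel. The function name `qoaEncodedSize` is made up for this example and is not part of this PR or of the reference encoder.

```javascript
// Sketch only: constants follow my understanding of the QOA format layout.
const QOA_SLICE_LEN = 20;          // samples encoded per 64-bit slice
const QOA_SLICES_PER_FRAME = 256;  // max slices per channel in one frame
const QOA_FRAME_LEN = QOA_SLICE_LEN * QOA_SLICES_PER_FRAME; // 5120 samples

// Payload bits per sample, ignoring headers: 64 bits / 20 samples
const QOA_BITS_PER_SAMPLE = 64 / QOA_SLICE_LEN;

// Hypothetical helper: estimated size in bytes of a QOA file
// holding `totalSamples` samples per channel.
function qoaEncodedSize(totalSamples, channels) {
  let size = 8; // file header: magic + sample count
  for (let done = 0; done < totalSamples; done += QOA_FRAME_LEN) {
    const frameSamples = Math.min(QOA_FRAME_LEN, totalSamples - done);
    const slices = Math.ceil(frameSamples / QOA_SLICE_LEN);
    size += 8;                     // frame header
    size += 16 * channels;         // LMS history + weights per channel
    size += 8 * slices * channels; // 8 bytes per slice
  }
  return size;
}

// 10 seconds of 8 kHz mono audio
const samples = 8000 * 10;
console.log(QOA_BITS_PER_SAMPLE, qoaEncodedSize(samples, 1), samples * 2);
```

Under these assumptions, 10 seconds of 8 kHz mono audio comes out at roughly a fifth of the size of the same audio as raw 16-bit PCM.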

To encode some audio, the reference encoder / decoder can be used.

I tested this on Jolt.js; reading QOA encoded 8 kHz mono audio from Storage, decoding it and playing it on a speaker using the Waveform class works without noticeable glitches, but I haven't benchmarked it yet.

This implementation uses about 2 KB of additional flash, of which about 500 bytes are just error messages.

ToDo:

  • add some usage examples
  • maybe don't allocate 2 KB on the stack as a source data buffer to pass to the QOA decoder (feels like this is catapulting us into undefined behaviour land)
  • maybe enable on other boards?

Using ffmpeg and a locally-built version of the QOA reference encoder / decoder, you can create a QOA file like this:

ffmpeg -i audio_source.wav -ac 1 -ar 8k audio.wav
./qoaconv audio.wav audio.qoa

Decode example on the Linux build: https://gist.github.com/ssievert42/eb1edb7227f397d59fc0f2b0b4423d7c
Playback example on Jolt.js: https://gist.github.com/ssievert42/584d3142008b7acffe69f33e25f1622c

I can't really hear a difference between 8 and 16 bits per sample, but weirdly enough everything seems a tad too slow on Jolt.js, even when playing uncompressed PCM from Storage.
Changing both the playback sample rate from 8 kHz to 9 kHz and the frequency on the H0 and H1 pins from 80 kHz to 90 kHz sounds a lot more like the source audio though 🤷

@ssievert42 force-pushed the qoa branch 2 times, most recently from 97d90f2 to 20221c0 on January 6, 2025 17:45
@gfwilliams
Member

Thanks! I might have to have a think about this though - I like the idea, but ideally I'd like to be able to pull it into other Espruino builds as well (but then on something like Puck.js 2k is a big deal for something that most users wouldn't need).

It feels like maybe if this were implemented as a module with Inline C then it could just be pulled into whatever board wanted it - the decoder seems reasonably simple so I think that might be a reasonable option...

Odd about the sample rate too - I'd have to try and do some tests, I guess it's possible that the utility timer is processing things a bit too slowly (not taking into account the processing overhead).

@ssievert42
Contributor Author

if this were implemented as a module with Inline C then it could just be pulled into whatever board wanted it

This gave me the opportunity to play with inline C I've been waiting for :)

I made a (currently still kind of WIP and in need of cleanup) drop-in replacement module (well, mostly) using inline C: https://gist.github.com/ssievert42/f97dbf3566b61fa5d20a56a7cd641d1d
Usage example on Jolt.js: https://gist.github.com/ssievert42/1a5c9d39e5dc927dcfed0762477d4f32

But how about both?
Having builtin support allows using this on the Linux build (and potentially in the emulator) as well as on unofficial boards (without having to self-host the espruino compiler).
And on official boards you could use the module with inline C to save a few KB of default build binary size.

On a side note: Looks like Jolt.js is missing from the supported boards in https://github.com/gfwilliams/EspruinoCompiler/blob/master/src/utils.js; I had to add it to get my self-hosted espruino compiler to generate binaries for Jolt.js.
Is that still the correct repo?

@gfwilliams
Member

This looks great, thanks! It'd be really nice to add the Waveform initialisation and 'nextBuffer' code into the module so you could literally just do require("QOA").play(require("Storage").read("file.qoa")).then(...);

Personally I'd prefer just to have the one implementation of this as the InlineC. It looks like because you're just dealing with flat arrays here (and not having to use iterators like you were in the included version) it may actually be smaller/faster to have the inline C version anyway.

Having builtin support allows using this on the Linux build (and potentially in the emulator) as well as on unofficial boards

... although we have no way to output sound from Linux or the emulator so I'm not sure that's such a big deal? But yes, there is a point for unofficial boards, but I feel like if there was demand I could look at adding support for unofficial boards to the Espruino-hosted compiler (maybe with some rate limit).

I feel like right now that's going a bit far though, as apart from you (who already has the compiler set up) I'm not aware of anyone else that'd want to use this on non-official boards at the moment.

... also for the Pipboy build I already have WAV and AVI file parsing, and while maybe it's not quite as small as QOA at some point I'd really like to drag that code out and allow other boards to include it.

Looks like Jolt.js is missing from the supported boards in EspruinoCompiler

Yes, that's the correct repo. I just looked and I hadn't committed the code (just done) - the Espruino-hosted one already worked I think?

One thing if you fancied another challenge - having to build your own QOA encoder to use this would be a bit of an issue for a lot of people. Looks like there's a JS encoder though so if you submitted this to EspruinoDocs you could actually add an encoder inside the documentation page for the QOA module to make it really easy to encode audio!

The Web browser can decode and re-sample audio itself, so if you use the code below and shove the data from v into the QOA encoder you should be sorted:

<input class="form-input" id="fileInput" type="file">
  <script>
      const fileInput = document.getElementById('fileInput');
      fileInput.addEventListener('change', (event) => {
        const file = event.target.files[0];
        if (!file) return; // no file selected
        const SAMPLERATE = 8000;
        // decodeAudioData resamples the decoded audio to this context's sample rate
        const offlineAudioContext = new OfflineAudioContext(1, SAMPLERATE*10/*max buffer length*/, SAMPLERATE);

        const reader = new FileReader();
        reader.onload = (e) => {
          offlineAudioContext.decodeAudioData(e.target.result)
            .then(audioBuffer => {
              const bufferLength = audioBuffer.length;

              // Get the PCM data from the audio buffer
              const pcmData = new Float32Array(bufferLength);
              audioBuffer.copyFromChannel(pcmData, 0, 0); // copy from first channel only
              // TODO: could average channels? (see audioBuffer.numberOfChannels)
              // convert it to 16 bit format
              for (let i = 0; i < pcmData.length; i++) {
                let v = Math.round(pcmData[i] * 32767);
                // clamp to the signed 16 bit range
                if (v<-32768) v=-32768;
                if (v>32767) v=32767;
                // FIXME now encode data...
              }
            })
            .catch(error => {
              console.error('Error decoding audio data:', error);
            });
        };
        reader.readAsArrayBuffer(file);
      });
  </script>
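
The conversion loop in the snippet above could be pulled out into a standalone helper that also handles the "average channels" TODO. This is only a sketch: `mixToInt16` is a hypothetical name, and it assumes every channel is a Float32Array of the same length with samples nominally in the range -1 to 1.

```javascript
// Hypothetical helper: average all channels into mono and convert
// float samples to signed 16 bit PCM, clamping anything out of range.
function mixToInt16(channels) {
  const length = channels[0].length;
  const out = new Int16Array(length);
  for (let i = 0; i < length; i++) {
    let sum = 0;
    for (const channel of channels) sum += channel[i];
    let v = Math.round((sum / channels.length) * 32767);
    if (v < -32768) v = -32768; // clamp to the signed 16 bit range
    if (v > 32767) v = 32767;
    out[i] = v;
  }
  return out;
}

// Example with two short, made-up channels
const left = new Float32Array([1, 0.5, -1]);
const right = new Float32Array([-1, 0.5, -1]);
console.log(mixToInt16([left, right]));
```

In the browser, the input could be built with something like `Array.from({length: audioBuffer.numberOfChannels}, (_, c) => audioBuffer.getChannelData(c))`, and the resulting Int16Array would then be what gets handed to the QOA encoder.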
