Introducing the Audio Data API extension

The Audio Data API extension extends the HTML5 specification of the <audio> and <video> media elements by exposing audio metadata and raw audio data. This lets scripts visualize audio data, process it, and generate new audio data.

Obsolete since Gecko 28 (Firefox 28 / Thunderbird 28 / SeaMonkey 2.25)
This feature is obsolete. Although it may still work in some browsers, its use is discouraged since it could be removed at any time. Try to avoid using it.

Deprecated since Gecko 22 (Firefox 22 / Thunderbird 22 / SeaMonkey 2.19)
This feature has been removed from the Web. Though some browsers may still support it, it is in the process of being dropped. Do not use it in old or new projects. Pages or Web apps using it may break at any time.

Please note that this document describes a non-standard experimental API. This API has been deprecated since Gecko 22, disabled since Gecko 28, and removed from Gecko 31. You should use the Web Audio API instead.

Reading audio streams

The loadedmetadata event

When the metadata of the media element is available, it triggers a loadedmetadata event. Once this event has fired, the media element exposes the following attributes:

  • mozChannels: Number of channels
  • mozSampleRate: Sample rate, in samples per second
  • mozFrameBufferLength: Number of samples, across all channels, collected in each framebuffer

This information is needed later to decode the audio data stream. The following example extracts the data from an audio element:

<!DOCTYPE html>
<html>
  <head>
    <title>JavaScript Metadata Example</title>
  </head>
  <body>
    <audio id="audio-element"
           src="song.ogg"
           controls="true"
           style="width: 512px;">
    </audio>
    <script>
      // Declared up front so the handler does not create implicit globals.
      var channels, rate, frameBufferLength;
      var audio = document.getElementById('audio-element');
      function loadedMetadata() {
        channels          = audio.mozChannels;
        rate              = audio.mozSampleRate;
        frameBufferLength = audio.mozFrameBufferLength;
      }
      audio.addEventListener('loadedmetadata', loadedMetadata, false);
    </script>
  </body>
</html>

The MozAudioAvailable event

As the audio is played, sample data is made available to the audio layer and the audio buffer (size defined in mozFrameBufferLength) gets filled with those samples. Once the buffer is full, the event MozAudioAvailable is triggered. This event therefore contains the raw samples of a period of time. Those samples may or may not have been played yet at the time of the event and have not been adjusted for mute or volume settings on the media element. Playing, pausing, and seeking the audio also affect the streaming of this raw audio data.

The MozAudioAvailable event has two attributes:

  • frameBuffer: Framebuffer (i.e., an array) containing decoded audio sample data (i.e., floats)
  • time: Timestamp for these samples measured from the start in seconds

The framebuffer contains an array of audio samples. Note that the samples are not separated by channel; they are interleaved. For example, a two-channel signal is delivered as: Channel1-Sample1 Channel2-Sample1 Channel1-Sample2 Channel2-Sample2 Channel1-Sample3 Channel2-Sample3.
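
As an illustration of that interleaved layout, the following sketch splits a framebuffer into one array per channel. The helper name deinterleave is hypothetical and not part of the API; it assumes the channel count is the value read from mozChannels.

```javascript
// Hypothetical helper (not part of the Audio Data API): split an interleaved
// framebuffer into one Float32Array per channel.
function deinterleave(frameBuffer, channels) {
  var framesPerChannel = Math.floor(frameBuffer.length / channels);
  var result = [];
  for (var c = 0; c < channels; c++) {
    result.push(new Float32Array(framesPerChannel));
  }
  for (var i = 0; i < framesPerChannel; i++) {
    for (var ch = 0; ch < channels; ch++) {
      // Sample i of channel ch sits at index i * channels + ch.
      result[ch][i] = frameBuffer[i * channels + ch];
    }
  }
  return result;
}

// Example with a two-channel buffer laid out as C1 C2 C1 C2 C1 C2.
var interleaved = new Float32Array([0.1, -0.1, 0.2, -0.2, 0.3, -0.3]);
var split = deinterleave(interleaved, 2);
// split[0] holds the channel 1 samples, split[1] the channel 2 samples.
```

In an event handler, this could be called as deinterleave(event.frameBuffer, audio.mozChannels).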

We can extend the previous example to display the timestamp and the first two samples in a <pre> element:

<!DOCTYPE html>
<html>
  <head>
    <title>JavaScript Visualization Example</title>
  </head>
  <body>
    <audio id="audio-element"
           src="revolve.ogg"
           controls="true"
           style="width: 512px;">
    </audio>
    <pre id="raw">hello</pre>
    <script>
      // Declared up front so the handler does not create implicit globals.
      var channels, rate, frameBufferLength;
      var raw = document.getElementById('raw');
      var audio = document.getElementById('audio-element');
      function loadedMetadata() {
        channels          = audio.mozChannels;
        rate              = audio.mozSampleRate;
        frameBufferLength = audio.mozFrameBufferLength;
      }
      function audioAvailable(event) {
        var frameBuffer = event.frameBuffer;
        var t = event.time;
        var text = "Samples at: " + t + "\n";
        text += frameBuffer[0] + "  " + frameBuffer[1];
        // textContent is enough here because the output is plain text.
        raw.textContent = text;
      }
      audio.addEventListener('MozAudioAvailable', audioAvailable, false);
      audio.addEventListener('loadedmetadata', loadedMetadata, false);
    </script>
  </body>
</html>

Creating an audio stream

It is also possible to create and set up an <audio> element for raw writing from script (i.e., without a src attribute). Content scripts can specify the audio stream's characteristics, then write audio samples. A script must first create an audio object and then call mozSetup() to specify the number of channels and the sample rate (in Hz). For example:

// Create a new audio element
var audioOutput = new Audio();
// Set up the audio element with a 2-channel, 44.1 kHz audio stream.
audioOutput.mozSetup(2, 44100);

Once this is done, the samples need to be created. They have the same format as the ones delivered by the MozAudioAvailable event. The samples are then written to the audio stream with mozWriteAudio(). Note that not all of the samples are necessarily written to the stream: the function returns the number of samples actually written, which tells the caller where to resume on the next write. For example:

// Write samples using a JS Array
var samples = [0.242, 0.127, 0.0, -0.058, -0.242, ...];
var numberSamplesWritten = audioOutput.mozWriteAudio(samples);
// Write samples using a Typed Array
var samples = new Float32Array([0.242, 0.127, 0.0, -0.058, -0.242, ...]);
var numberSamplesWritten = audioOutput.mozWriteAudio(samples);
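
Because mozWriteAudio() may accept fewer samples than it was given, a caller typically keeps the unwritten tail for the next attempt. The following is a minimal sketch of that bookkeeping. The names writeWithRemainder and fakeWrite are hypothetical, introduced here so the logic can run without the obsolete API; fakeWrite merely stands in for a call such as audioOutput.mozWriteAudio(samples).

```javascript
// Hypothetical helper: try to write `samples` with `writeFn` (a stand-in for
// a call such as audioOutput.mozWriteAudio) and return whatever was not accepted.
function writeWithRemainder(writeFn, samples) {
  var written = writeFn(samples);
  // subarray() returns a view of the unwritten tail without copying.
  return samples.subarray(written);
}

// Stand-in for the real stream (assumption): accepts at most 4 samples per call.
function fakeWrite(samples) {
  return Math.min(samples.length, 4);
}

var pending = new Float32Array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6]);
var afterFirst = writeWithRemainder(fakeWrite, pending);     // 2 samples still pending
var afterSecond = writeWithRemainder(fakeWrite, afterFirst); // nothing left to write
```

With the real API, writeFn would be something like function (s) { return audioOutput.mozWriteAudio(s); }.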

In the following example, we create an audio pulse:

<!DOCTYPE html>
<html>
  <head>
    <title>Generating audio in real time</title>
    <script type="text/javascript">
      function playTone() {
        var output = new Audio();
        // One channel at 44.1 kHz.
        output.mozSetup(1, 44100);
        // 22050 mono samples at 44100 Hz last half a second.
        var samples = new Float32Array(22050);
        for (var i = 0; i < samples.length; i++) {
          samples[i] = Math.sin(i / 20);
        }
        output.mozWriteAudio(samples);
      }
    </script>
  </head>
  <body>
    <p>This demo plays a half-second tone when you click the button below.</p>
    <button onclick="playTone();">Play</button>
  </body>
</html>
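
In the example above, Math.sin(i / 20) at 44100 samples per second yields a tone of roughly 351 Hz (44100 / (2π × 20)). To choose the pitch and duration explicitly, the samples could be generated along these lines. This is a sketch only, and makeSine is a hypothetical helper name, not part of the API.

```javascript
// Hypothetical helper: build `durationSeconds` of a mono sine tone.
function makeSine(frequency, sampleRate, durationSeconds) {
  var samples = new Float32Array(Math.round(sampleRate * durationSeconds));
  var step = 2 * Math.PI * frequency / sampleRate; // phase advance per sample
  for (var i = 0; i < samples.length; i++) {
    samples[i] = Math.sin(i * step);
  }
  return samples;
}

// One second of a 440 Hz tone at 44.1 kHz (44100 mono samples).
var tone = makeSine(440, 44100, 1);
```

The resulting array could then be passed to output.mozWriteAudio(tone) on a stream set up with mozSetup(1, 44100).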

The mozCurrentSampleOffset() method gives the audible position of the audio stream, meaning the position of the last heard sample.

// Get current audible position of the underlying audio stream, measured in samples.
var currentSampleOffset = audioOutput.mozCurrentSampleOffset();

Audio data should be written with mozWriteAudio() at a regular interval and in equal portions, so that the stream stays a little ahead of the current sample offset, where "a little" means something on the order of 500 ms of samples. The sample offset currently being played by the hardware can be obtained with mozCurrentSampleOffset(). For example, with two channels at 44100 samples per second, a writing interval of 100 ms, and a pre-buffer of 500 ms, each write would contain 2 * 44100 / 10 = 8820 samples, and the total number of samples written so far would be kept at about currentSampleOffset + 2 * 44100 / 2, that is, currentSampleOffset + 44100.
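
The arithmetic can be sketched as a small function. The name samplesToWrite and its parameters are hypothetical, and the sketch assumes, as the figures in the text do, that the sample offset counts samples across all channels.

```javascript
// Hypothetical helper: how many samples to write now so that the stream stays
// `prebufferSeconds` ahead of the playback position.
function samplesToWrite(channels, sampleRate, prebufferSeconds,
                        currentSampleOffset, totalWritten) {
  var prebufferSamples = channels * sampleRate * prebufferSeconds;
  var target = currentSampleOffset + prebufferSamples;
  return Math.max(0, Math.floor(target - totalWritten));
}

// Figures from the text: 2 channels, 44100 Hz, 100 ms interval, 500 ms pre-buffer.
var portion = 2 * 44100 / 10;                             // samples per 100 ms write
var needed = samplesToWrite(2, 44100, 0.5, 0, 0);         // start-up: fill the pre-buffer
var steady = samplesToWrite(2, 44100, 0.5, 8820, 44100);  // after 100 ms of playback
```

Here needed covers the whole 500 ms pre-buffer, while steady equals one 100 ms portion, which is the regime described in the text.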

It's also possible to auto-detect the minimal duration of the pre-buffer, such that the sound plays without interruption and the lag between writing and playback is minimal. To do this, start writing the data in small portions and wait for the value returned by mozCurrentSampleOffset() to become greater than 0.

var prebufferSize = sampleRate * 0.020; // Initial buffer is 20 ms
var autoLatency = true, started = new Date().valueOf();
...
// Auto latency detection
if (autoLatency) {
  prebufferSize = Math.floor(sampleRate * (new Date().valueOf() - started) / 1000);
  if (audio.mozCurrentSampleOffset()) { // Play position moved?
    autoLatency = false;
  }
}
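
The fragment above omits its surrounding write loop. As a self-contained sketch of the same detection step, the version below replaces the real stream and clock with stand-ins so that it can run anywhere. The names createLatencyDetector, getOffset, and now are hypothetical; getOffset stands in for audio.mozCurrentSampleOffset() and now for a millisecond clock such as Date.now.

```javascript
// Hypothetical wrapper around the detection step shown above.
function createLatencyDetector(sampleRate, getOffset, now) {
  var started = now();
  var state = { prebufferSize: sampleRate * 0.020, autoLatency: true }; // 20 ms initially
  state.update = function () {
    if (!state.autoLatency) { return; }
    // Grow the pre-buffer to cover the time elapsed since writing began.
    state.prebufferSize = Math.floor(sampleRate * (now() - started) / 1000);
    if (getOffset() > 0) { // Play position moved?
      state.autoLatency = false;
    }
  };
  return state;
}

// Simulated run (assumption): the playback position starts moving after 60 ms.
var clock = 0, offset = 0;
var detector = createLatencyDetector(44100,
  function () { return offset; },
  function () { return clock; });
clock = 30; detector.update();               // still waiting, pre-buffer grows
clock = 60; offset = 512; detector.update(); // playback started, detection stops
clock = 90; detector.update();               // no further change
```

Once detection stops, prebufferSize holds the number of samples corresponding to the measured start-up latency.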

Processing an audio stream

Since the MozAudioAvailable event and the mozWriteAudio() method both use Float32Array values, it is possible to take the output of one audio stream and pass it directly (or process first and then pass) to a second. The first audio stream needs to be muted so that only the second audio element is heard.

<audio id="a1" 
       src="song.ogg"
       controls>
</audio>
<script>
var a1 = document.getElementById('a1'),
    a2 = new Audio(),
    buffers = [];
function loadedMetadata() {
  // Mute a1 audio.
  a1.volume = 0;
  // Set up a2 to be identical to a1, and play through there.
  a2.mozSetup(a1.mozChannels, a1.mozSampleRate);
}
function audioAvailable(event) {
  // Write the current framebuffer
  var frameBuffer = event.frameBuffer;
  writeAudio(frameBuffer);
}
a1.addEventListener('MozAudioAvailable', audioAvailable, false);
a1.addEventListener('loadedmetadata', loadedMetadata, false);
function writeAudio(audio) {
  buffers.push(audio);
  // If there's buffered data, write that
  while (buffers.length > 0) {
    var buffer = buffers.shift();
    var written = a2.mozWriteAudio(buffer);
    // If not all of the data was written, keep the rest in the buffers.
    if (written < buffer.length) {
      // subarray() is used because typed arrays in the Gecko versions that
      // shipped this API did not provide slice().
      buffers.unshift(buffer.subarray(written));
      return;
    }
  }
}
</script>
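
The example above passes the samples through unchanged. For the "process first and then pass" case, a processing step could look like the following sketch, which simply scales the samples. The name applyGain is hypothetical and not part of the API.

```javascript
// Hypothetical processing step: scale every sample by `gain`, returning a
// new Float32Array so the original framebuffer is left untouched.
function applyGain(frameBuffer, gain) {
  var out = new Float32Array(frameBuffer.length);
  for (var i = 0; i < frameBuffer.length; i++) {
    out[i] = frameBuffer[i] * gain;
  }
  return out;
}

// Halve the amplitude of a short buffer.
var quieter = applyGain(new Float32Array([0.5, -1.0, 0.25]), 0.5);
```

In the example above, audioAvailable() could then call writeAudio(applyGain(event.frameBuffer, 0.5)) instead of writing the framebuffer directly.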

 Last updated by: teoli,