Audio and video manipulation

The beauty of the web is that you can combine technologies to create new forms. Having native audio and video in the browser means we can use these data streams with technologies such as <canvas>, WebGL or Web Audio API to modify audio and video directly, for example adding reverb/compression effects to audio, or grayscale/sepia filters to video. This article provides a reference to explain what you need to do.

Video Manipulation

The ability to read the pixel values from each frame of a video can be very useful.

Video and Canvas

<canvas> is a useful way of drawing onto web pages; it is very powerful and can be coupled tightly with video.

The general technique is to:

  1. Write a frame from the <video> element to an intermediary <canvas> element.
  2. Read the data from the intermediary <canvas> element and manipulate it.
  3. Write the manipulated data to your "display" <canvas>.
  4. Pause and repeat.

We can set up our video player and <canvas> element like this:

<video id="my-video" controls="true" width="480" height="270">
  <source src="http://jplayer.org/video/webm/Big_Buck_Bunny_Trailer.webm" type="video/webm">
  <source src="http://jplayer.org/video/m4v/Big_Buck_Bunny_Trailer.m4v" type="video/mp4">
</video>
  ...
<canvas id="my-canvas" width="480" height="270"></canvas>

and manipulate them like this (in this case we are making a black and white version of the video):

var processor = {  
  timerCallback: function() {  
    if (this.video.paused || this.video.ended) {  
      return;  
    }  
    this.computeFrame();  
    var self = this;  
    setTimeout(function () {  
      self.timerCallback();  
    }, 16); // roughly 60 frames per second  
  },
  doLoad: function() {
    this.video = document.getElementById("my-video");
    this.c1 = document.getElementById("my-canvas");
    this.ctx1 = this.c1.getContext("2d");
    var self = this;  
    this.video.addEventListener("play", function() {
      self.width = self.video.width;  
      self.height = self.video.height;  
      self.timerCallback();
    }, false);
  },  
  computeFrame: function() {
    this.ctx1.drawImage(this.video, 0, 0, this.width, this.height);
    var frame = this.ctx1.getImageData(0, 0, this.width, this.height);
    var l = frame.data.length / 4;  
    for (var i = 0; i < l; i++) {
      var grey = (frame.data[i * 4 + 0] + frame.data[i * 4 + 1] + frame.data[i * 4 + 2]) / 3;
      frame.data[i * 4 + 0] = grey;
      frame.data[i * 4 + 1] = grey;
      frame.data[i * 4 + 2] = grey;
    }
    this.ctx1.putImageData(frame, 0, 0);
    return;
  }
};  

Once the page has loaded, you can call:

processor.doLoad();

Note: Due to potential security issues if your video is on a different domain to your code, you'll need to enable CORS (Cross Origin Resource Sharing) on your video server.

Note: The above is a minimal example of how to manipulate video with canvas. For efficiency, consider using requestAnimationFrame instead of setTimeout in browsers that support it.
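
As a rough sketch of that suggestion, the loop below schedules each frame with requestAnimationFrame rather than setTimeout. The names toGreyscale and startProcessing are hypothetical helpers, not part of any API. The element IDs are the ones from the markup above, and the pixel loop is the same simple channel averaging used in computeFrame.

```javascript
// Average the R, G and B channels of RGBA pixel data in place.
// `data` is a Uint8ClampedArray such as ImageData.data.
function toGreyscale(data) {
  for (var i = 0; i < data.length; i += 4) {
    var grey = (data[i] + data[i + 1] + data[i + 2]) / 3;
    data[i] = grey;
    data[i + 1] = grey;
    data[i + 2] = grey;
    // data[i + 3] (alpha) is left untouched.
  }
  return data;
}

// Hypothetical helper: processes frames while the video is playing.
function startProcessing(video, canvas) {
  var ctx = canvas.getContext("2d");

  function step() {
    if (video.paused || video.ended) {
      return; // stop scheduling frames until "play" fires again
    }
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    var frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
    toGreyscale(frame.data);
    ctx.putImageData(frame, 0, 0);
    requestAnimationFrame(step);
  }

  video.addEventListener("play", function () {
    requestAnimationFrame(step);
  }, false);
}

// Only wire things up when running in a browser.
if (typeof document !== "undefined") {
  var video = document.getElementById("my-video");
  var canvas = document.getElementById("my-canvas");
  if (video && canvas) {
    startProcessing(video, canvas);
  }
}
```

Keeping the pixel loop in its own function makes it easy to swap in a different filter without touching the scheduling code.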

Video and WebGL

WebGL is a powerful API that uses canvas to (typically) render three-dimensional scenes. You can combine WebGL and the <video> element to create video textures, which means you can put video inside 3D scenes.
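
A minimal sketch of the texture side of this, assuming gl is a WebGL rendering context obtained from a canvas and video is a <video> element that has started playing. The helper names createVideoTexture and updateVideoTexture are hypothetical. Shaders, geometry and the render loop are left out.

```javascript
// Create a texture configured for video frames, whose dimensions are
// usually not powers of two (so no mipmaps and no repeat wrapping).
function createVideoTexture(gl) {
  var texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
  return texture;
}

// Copy the current video frame into the texture.
// Intended to be called once per rendered frame.
function updateVideoTexture(gl, texture, video) {
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);
}
```

In a render loop you would call updateVideoTexture just before drawing the geometry that samples the texture.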

Playback Rate

We can also adjust the rate at which audio and video play using the playbackRate property of the <audio> and <video> elements (see HTMLMediaElement). playbackRate is a number that represents a multiple of the normal playback rate: for example, 0.5 represents half speed while 2 represents double speed.

HTML:

<video id="my-video" controls src="http://jplayer.org/video/m4v/Big_Buck_Bunny_Trailer.m4v"></video>

JavaScript:

var myVideo = document.getElementById('my-video');
myVideo.playbackRate = 2;

Note: Try the playbackRate example live.

Note: playbackRate works with both the <audio> and <video> elements; however, in both cases the rate changes but the pitch doesn't. To manipulate the audio's pitch you need to use the Web Audio API; see the AudioBufferSourceNode.playbackRate property.
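
To illustrate the Web Audio route: changing playbackRate on an AudioBufferSourceNode resamples the audio, so speed and pitch change together. The sketch below assumes an AudioContext and an already decoded AudioBuffer. The names rateForSemitones and playShifted are hypothetical, and the 2^(n/12) mapping is the usual equal-temperament relation rather than anything defined by the API.

```javascript
// Playback rate needed to shift pitch by a number of semitones
// (12 semitones up doubles the rate, 12 down halves it).
function rateForSemitones(semitones) {
  return Math.pow(2, semitones / 12);
}

// Hypothetical helper: plays `buffer` shifted by `semitones`.
function playShifted(context, buffer, semitones) {
  var source = context.createBufferSource();
  source.buffer = buffer;
  source.playbackRate.value = rateForSemitones(semitones);
  source.connect(context.destination);
  source.start(0);
  return source;
}
```

For example, playShifted(context, buffer, 12) plays the buffer one octave higher and in half the time.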

Audio Manipulation

playbackRate aside, to manipulate audio you'll typically use the Web Audio API.

Selecting an audio source

We can use the audio track of an <audio> or <video> element as a source to feed into the Web Audio API, or a plain audio buffer, or a simple sine wave/oscillator, or a stream (e.g. from WebRTC/getUserMedia). The corresponding source interfaces are MediaElementAudioSourceNode, AudioBufferSourceNode, OscillatorNode and MediaStreamAudioSourceNode.
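
The four kinds of source each have their own factory method on the audio context. The sketch below gathers them in one place. createSource and its kind strings are hypothetical and only the context methods themselves come from the Web Audio API.

```javascript
// Hypothetical helper: returns a source node of the requested kind.
// `input` is an <audio>/<video> element, an AudioBuffer, a frequency in Hz,
// or a MediaStream, depending on `kind`.
function createSource(context, kind, input) {
  switch (kind) {
    case "element":
      return context.createMediaElementSource(input);
    case "buffer":
      var bufferSource = context.createBufferSource();
      bufferSource.buffer = input;
      return bufferSource;
    case "oscillator":
      var oscillator = context.createOscillator();
      oscillator.type = "sine";
      oscillator.frequency.value = input;
      return oscillator;
    case "stream":
      return context.createMediaStreamSource(input);
    default:
      throw new Error("Unknown source kind: " + kind);
  }
}
```

Whichever kind is used, the returned node is connected to the rest of the graph in the same way, for example createSource(context, "oscillator", 440).connect(context.destination). Buffer and oscillator sources also need start() to be called before they produce sound.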

Audio Filters

The Web Audio API provides a range of filters that can be applied to audio using the BiquadFilterNode, for example:

HTML:

<video id="my-video" controls src="myvideo.mp4" type="video/mp4"></video>

JavaScript:

// Create the audio context that owns every node in the graph.
var context = new AudioContext();
var audioSource = context.createMediaElementSource(document.getElementById("my-video"));
var filter = context.createBiquadFilter();
audioSource.connect(filter);
filter.connect(context.destination);

Note: Unless you have CORS enabled, your video should be on the same domain as your code to avoid security issues.

Common filters that can be applied to nodes:

  • Low Pass: Allows frequencies below the cutoff frequency to pass through and attenuates frequencies above the cutoff.
  • High Pass: Allows frequencies above the cutoff frequency to pass through and attenuates frequencies below the cutoff.
  • Band Pass: Allows a range of frequencies to pass through and attenuates the frequencies below and above this frequency range.
  • Low Shelf: Allows all frequencies through, but adds a boost (or attenuation) to the lower frequencies.
  • High Shelf: Allows all frequencies through, but adds a boost (or attenuation) to the higher frequencies.
  • Peaking: Allows all frequencies through, but adds a boost (or attenuation) to a range of frequencies.
  • Notch: Allows all frequencies through, except for a set of frequencies.
  • Allpass: Allows all frequencies through, but changes the phase relationship between the various frequencies.

Example:

var filter = context.createBiquadFilter();
filter.type = "lowshelf";
filter.frequency.value = 1000;
filter.gain.value = 25;
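
The filter names listed above correspond to the lowercase strings accepted by BiquadFilterNode.type. The sketch below wraps that in a small helper. configureFilter, FILTER_TYPES and the options object are hypothetical. Only the type strings and the frequency, Q and gain parameters come from the API.

```javascript
// Type strings accepted by BiquadFilterNode.type.
var FILTER_TYPES = [
  "lowpass", "highpass", "bandpass", "lowshelf",
  "highshelf", "peaking", "notch", "allpass"
];

// Hypothetical helper: applies a type and optional parameter values
// to an existing BiquadFilterNode.
function configureFilter(filter, options) {
  if (FILTER_TYPES.indexOf(options.type) === -1) {
    throw new Error("Unsupported filter type: " + options.type);
  }
  filter.type = options.type;
  if (options.frequency !== undefined) {
    filter.frequency.value = options.frequency;
  }
  if (options.Q !== undefined) {
    filter.Q.value = options.Q;
  }
  if (options.gain !== undefined) {
    filter.gain.value = options.gain;
  }
  return filter;
}
```

For example, configureFilter(context.createBiquadFilter(), { type: "lowpass", frequency: 800 }) gives a low-pass filter with an 800 Hz cutoff. Note that gain is only used by the lowshelf, highshelf and peaking types.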

Note: See BiquadFilterNode for more information.

Convolutions and Impulses

It's also possible to apply impulse responses to audio using the ConvolverNode. An impulse response is the sound created after a brief impulse of sound (like a hand clap). It characterizes the environment in which the impulse was created (for example, the echo created by clapping your hands in a tunnel).

Example:

var convolver = context.createConvolver();
convolver.buffer = this.impulseResponseBuffer;
// Connect the graph.
source.connect(convolver);
convolver.connect(context.destination);
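
The example above assumes an impulse response buffer is already available, typically one decoded from a recorded audio file. When no recording is at hand, a common stand-in is a burst of noise that fades out. The sketch below generates one. makeImpulseSamples and makeImpulseBuffer are hypothetical names, and the envelope shape is an arbitrary choice rather than a model of any real space.

```javascript
// Noise in [-1, 1) shaped by an envelope that falls from 1 towards 0.
// A larger `decay` makes the tail die away faster.
function makeImpulseSamples(length, decay) {
  var samples = new Float32Array(length);
  for (var i = 0; i < length; i++) {
    var envelope = Math.pow(1 - i / length, decay);
    samples[i] = (Math.random() * 2 - 1) * envelope;
  }
  return samples;
}

// Hypothetical helper: builds a stereo AudioBuffer of the given duration,
// with independent noise in each channel.
function makeImpulseBuffer(context, seconds, decay) {
  var length = Math.floor(context.sampleRate * seconds);
  var buffer = context.createBuffer(2, length, context.sampleRate);
  for (var channel = 0; channel < buffer.numberOfChannels; channel++) {
    buffer.getChannelData(channel).set(makeImpulseSamples(length, decay));
  }
  return buffer;
}
```

The result can be assigned directly, for example convolver.buffer = makeImpulseBuffer(context, 2, 3).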

Note: See ConvolverNode for more information.

Spatial Audio

We can also position audio using a panner node. A panner node allows us to define a source cone as well as positional and directional elements, all in 3D space defined using 3D Cartesian coordinates.

Example:

var panner = context.createPanner();
panner.coneOuterGain = 0.2;
panner.coneOuterAngle = 120;
panner.coneInnerAngle = 0;
panner.connect(context.destination);
source.connect(panner);
source.start(0);
// Position the listener at the origin.
context.listener.setPosition(0, 0, 0);
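
As a sketch of the positional side, the helpers below place a source on a horizontal circle around a listener at the origin. orbitPosition and placeOnOrbit are hypothetical names. The code uses PannerNode.setPosition to match the listener.setPosition call above. Newer code may prefer the positionX, positionY and positionZ parameters instead.

```javascript
// Coordinates of a point on a horizontal circle of the given radius,
// centred on the origin. `angle` is in radians.
function orbitPosition(angle, radius) {
  return {
    x: radius * Math.cos(angle),
    y: 0,
    z: radius * Math.sin(angle)
  };
}

// Hypothetical helper: moves a panner node to that point.
function placeOnOrbit(panner, angle, radius) {
  var position = orbitPosition(angle, radius);
  panner.setPosition(position.x, position.y, position.z);
  return position;
}
```

Calling placeOnOrbit repeatedly with an increasing angle makes the sound appear to circle the listener.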

Note: See PannerNode for more information.

JavaScript Codecs

It's also possible to manipulate audio at a low level using JavaScript. This can be useful should you want to create audio codecs.
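
As one small example of this kind of low-level work, a decoder written in JavaScript eventually has to hand its output to the Web Audio API as 32-bit floating point samples in the range -1 to 1. The sketch below assumes the decoder produces signed 16-bit PCM. pcm16ToFloat32 is a hypothetical name.

```javascript
// Convert signed 16-bit PCM samples to floats in the range [-1, 1).
// `pcm` is an Int16Array (or any array of integers in that range).
function pcm16ToFloat32(pcm) {
  var out = new Float32Array(pcm.length);
  for (var i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] / 32768;
  }
  return out;
}
```

The converted samples can then be copied into an AudioBuffer, for example with buffer.getChannelData(0).set(pcm16ToFloat32(pcm)).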

Libraries currently exist for the following formats:

Note: At Audiocogs, you can try out a few demos; Audiocogs also provides a framework, Aurora.js, which is intended to help you author your own codecs in JavaScript.

Examples

Tutorials

Reference

Document Tags and Contributors

 Contributors to this page: chrisdavidmills
 Last updated by: chrisdavidmills,