SAX

SAX, short for Simple API for XML, is an event-driven parsing API. SAX was the first widely adopted API for XML in Java, and was later implemented in several other programming language environments. Starting with Firefox 2, a SAX parser is available to XUL applications and extensions. For more information, see the SAX homepage.

Quick start

The SAX parser functionality is available through the XML reader component. To create one, use the following code:

var xmlReader = Components.classes["@mozilla.org/saxparser/xmlreader;1"]
                          .createInstance(Components.interfaces.nsISAXXMLReader);

After you have created the SAX parser, set the handlers for the events you are interested in and start the parsing process. All functionality is available through the nsISAXXMLReader interface.

Set the handlers

Handlers are user-defined objects that implement SAX handler interfaces; which interfaces they implement depends on the kind of information they need from the parser. Once parsing starts, handlers receive a series of callbacks describing the content of the XML being parsed. The following handler interfaces are available:

Interface              Purpose
nsISAXContentHandler   Receive notification of the logical content of a document (e.g. elements, attributes, whitespace, and processing instructions).
nsISAXDTDHandler       Receive notification of basic DTD-related events.
nsISAXErrorHandler     Receive notification of errors in the input stream.
nsISAXLexicalHandler   SAX2 extension handler for lexical events (e.g. comment and CDATA nodes, DTD declarations, and entities).

The following is an example implementation of the content handler, the most commonly used of these interfaces:

function print(s) {
  dump(s + "\n");
}
xmlReader.contentHandler = {
  // nsISAXContentHandler
  startDocument: function() {
    print("startDocument");
  },
  endDocument: function() {
    print("endDocument");
  },
  startElement: function(uri, localName, qName, /*nsISAXAttributes*/ attributes) {
    var attrs = [];
    for(var i=0; i<attributes.length; i++) {
      attrs.push(attributes.getQName(i) + "='" + 
                 attributes.getValue(i) + "'");
    }
    print("startElement: namespace='" + uri + "', localName='" + 
          localName + "', qName='" + qName + "', attributes={" + 
          attrs.join(",") + "}");
  },
  endElement: function(uri, localName, qName) {
    print("endElement: namespace='" + uri + "', localName='" + 
          localName + "', qName='" + qName + "'");
  },
  characters: function(value) {
    print("characters: " + value);
  },
  processingInstruction: function(target, data) {
    print("processingInstruction: target='" + target + "', data='" + 
          data + "'");
  },
  ignorableWhitespace: function(whitespace) {
    // don't care
  },
  startPrefixMapping: function(prefix, uri) {
    // don't care
  },
  endPrefixMapping: function(prefix) {
    // don't care
  },
  // nsISupports
  QueryInterface: function(iid) {
    if(!iid.equals(Components.interfaces.nsISupports) &&
       !iid.equals(Components.interfaces.nsISAXContentHandler))
      throw Components.results.NS_ERROR_NO_INTERFACE;
    return this;
  }
};
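
A content handler does not have to print what it receives. SAX parsers are allowed to deliver the text of a single element in several characters callbacks, so a handler that needs the full text usually buffers it until the matching endElement arrives. The following is a minimal sketch of that idea, not part of the SAX API: the createTextCollector function and its results property are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: returns a content handler that collects the text
// content of every element instead of printing events.
function createTextCollector() {
  var stack = [];    // one text buffer per currently open element
  var results = [];  // one {qName, text} entry per closed element
  return {
    results: results,
    // nsISAXContentHandler
    startDocument: function() {},
    endDocument: function() {},
    startElement: function(uri, localName, qName, attributes) {
      stack.push("");
    },
    endElement: function(uri, localName, qName) {
      results.push({ qName: qName, text: stack.pop() });
    },
    characters: function(value) {
      // May be called more than once per text run, so append.
      if (stack.length > 0) {
        stack[stack.length - 1] += value;
      }
    },
    processingInstruction: function(target, data) {},
    ignorableWhitespace: function(whitespace) {},
    startPrefixMapping: function(prefix, uri) {},
    endPrefixMapping: function(prefix) {},
    // nsISupports (only called inside a Mozilla environment)
    QueryInterface: function(iid) {
      if (!iid.equals(Components.interfaces.nsISupports) &&
          !iid.equals(Components.interfaces.nsISAXContentHandler))
        throw Components.results.NS_ERROR_NO_INTERFACE;
      return this;
    }
  };
}

// Usage, assuming xmlReader was created as shown in the quick start:
//   var collector = createTextCollector();
//   xmlReader.contentHandler = collector;
//   ... parse, then read collector.results
```

Elements are recorded in the order in which they close, so a child element appears in results before its parent. Text is attributed only to the innermost open element.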

Start parsing

The XML reader component can parse XML from a string, from an nsIInputStream, or asynchronously via the nsIStreamListener interface. Below is an example of parsing from a string:

xmlReader.parseFromString("<f:a xmlns:f='g' d='1'><BBQ/></f:a>", "text/xml");

This call results in the following output (assuming the content handler from the example above is used):

startDocument
startElement: namespace='g', localName='a', qName='f:a', attributes={d='1'}
startElement: namespace='', localName='BBQ', qName='BBQ', attributes={}
endElement: namespace='', localName='BBQ', qName='BBQ'
endElement: namespace='g', localName='a', qName='f:a'
endDocument
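
The input above is well-formed, so only the content handler is called. Problems in the input are reported through nsISAXErrorHandler instead. The sketch below collects such reports in an array. It assumes that the interface declares the methods error, fatalError, and ignorableWarning, each receiving a locator and a message, and that the reader exposes an errorHandler attribute; verify both against the IDL of your platform version. The createErrorCollector function and its problems property are hypothetical names.

```javascript
// Hypothetical helper: returns an error handler that records every
// reported problem instead of printing it.
function createErrorCollector() {
  var problems = [];
  // Builds one callback per severity level.
  function record(severity) {
    return function(locator, message) {
      problems.push({ severity: severity, message: String(message) });
    };
  }
  return {
    problems: problems,
    // nsISAXErrorHandler (method names assumed, see the note above)
    error: record("error"),
    fatalError: record("fatalError"),
    ignorableWarning: record("warning"),
    // nsISupports (only called inside a Mozilla environment)
    QueryInterface: function(iid) {
      if (!iid.equals(Components.interfaces.nsISupports) &&
          !iid.equals(Components.interfaces.nsISAXErrorHandler))
        throw Components.results.NS_ERROR_NO_INTERFACE;
      return this;
    }
  };
}

// Usage, assuming xmlReader was created as shown in the quick start:
//   var collector = createErrorCollector();
//   xmlReader.errorHandler = collector;
//   ... parse, then inspect collector.problems
```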
