Detailed description and usage:
This is an overview of the classes that this package defines and how to use
each one.
class AbstractSAXParser: This class defines an abstract SAX parser. If you want to build your own
SAX parser or adapt an existing parser, you should implement this class's methods.
What does a SAX parser do? It parses the XML document and generates "events" that are passed to a listener
object. Note that the parser doesn't process the XML document at all; it just parses the document and generates
events that will be processed by a listener object (an AbstractFilter object).
Class methods to be implemented are:
Method | Description |
AbstractSAXParser() | The constructor should build the parser and can receive arguments if needed, for example the name of an XML file indicating the document to be parsed. How the parser learns where the document to parse is located is left to the parser implementation. |
other optional methods | Other optional methods that set XML documents from different sources, parser options, etc. can be added as needed. |
parse() | This is the principal method of the class. Parsers must parse the XML source and generate the proper events by calling the following methods defined in this class: startElementHandler($parser,$name,$attribs), endElementHandler($parser,$name) and characterDataHandler($data). Those methods call the same methods on the listener object, thus propagating SAX events to the listener as they are produced by the parser. |
TIP: Note that you can implement an AbstractSAXParser for non-XML data. The parser converts the non-XML data to XML simply by producing events,
and the events are then processed using filters that are prepared for XML processing.
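The tip above can be sketched as follows. This is a minimal illustration, not code from the package: DemoKeyValueParser and DemoCollector are hypothetical stand-ins (they do not extend the package's classes), the "key=value" input format is invented for the example, and the listener method names are the ones listed in the overview.

```php
<?php
// Hypothetical finalizer: records the events it receives as XML text.
class DemoCollector
{
    public $xml = '';

    public function startElementHandler($name, $attribs)
    {
        $this->xml .= '<' . $name;
        foreach ($attribs as $key => $value) {
            $this->xml .= ' ' . $key . '="' . htmlspecialchars($value) . '"';
        }
        $this->xml .= '>';
    }

    public function endElementHandler($name)
    {
        $this->xml .= '</' . $name . '>';
    }

    public function characterDataHandler($data)
    {
        $this->xml .= htmlspecialchars($data);
    }
}

// Hypothetical parser for "key=value" lines. It never builds an XML string;
// it only emits the events an XML parser would have produced.
class DemoKeyValueParser
{
    private $lines;
    private $listener;

    public function __construct(array $lines)
    {
        $this->lines = $lines;
    }

    public function setListener($obj)
    {
        $this->listener = $obj;
    }

    public function parse()
    {
        $this->listener->startElementHandler('settings', []);
        foreach ($this->lines as $line) {
            [$key, $value] = explode('=', $line, 2);
            $this->listener->startElementHandler('setting', ['name' => trim($key)]);
            $this->listener->characterDataHandler(trim($value));
            $this->listener->endElementHandler('setting');
        }
        $this->listener->endElementHandler('settings');
    }
}

$collector = new DemoCollector();
$parser = new DemoKeyValueParser(['host=localhost', 'port=8080']);
$parser->setListener($collector);
$parser->parse();
echo $collector->xml, "\n";
```

Because the listener only sees events, any filter written for XML input could sit between such a parser and the final output without knowing that the source was not XML.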
class ExpatParser: This class is an implementation of the AbstractSAXParser class using the PHP built-in Expat parser, which is precisely a SAX parser.
This version reads the XML from a file whose name is passed as an argument to the constructor. The class can be used as follows:
$parser = new ExpatParser("foo1.xml");
$filter = new SomeFilterHere();  // any class extending AbstractFilter
$parser->setListener($filter);   // the filter will receive the parser's events
$parser->parse();
Note that all the processing is done in the filter, so it is time to see what a filter is (keep reading, it's easy!).
class AbstractFilter: Filters are objects that receive SAX events (from a parser or another filter), process them
in some useful way and then pass the events to another filter or output the result in some way. Filters that don't
propagate events are called "finalizer" filters and are typically filters that output the document to the browser or
a file.
Filters must extend the AbstractFilter class, implementing the following methods:
Method | Description |
startElementHandler($name,$attribs) | This method is called when an element starts in the XML document. The method receives the element name as well as an associative array with the element's attributes. A filter can artificially call this method to "create" elements in the result. |
endElementHandler($name) | This method is called when an element ends. It receives the name of the element. |
characterDataHandler($data) | This method is called when text data occurs in the XML document. Note that context is not provided, so the filter object must keep track of the context if needed, using member variables, a stack or similar methods. |
Other methods | Other methods specific to what the filter should do may be added as well. |
Besides those methods, all filters have a predefined "setListener" method that allows you to set a listener object
for the filter's events, which is what is needed to propagate events from one filter to another.
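To make the context-tracking point concrete, here is a minimal sketch of a filter that keeps a stack of open element names and uses it to decide what to do with text. All class names are hypothetical stand-ins: DemoFilter only imitates what the overview says about AbstractFilter (handlers plus setListener), and DemoUppercaseName is not the package's own FilterName, only an illustration of the same idea.

```php
<?php
// Hypothetical stand-in for AbstractFilter: forwards every event unchanged
// to the listener, if one has been set.
class DemoFilter
{
    protected $listener = null;

    public function setListener($obj)
    {
        $this->listener = $obj;
    }

    public function startElementHandler($name, $attribs)
    {
        if ($this->listener) {
            $this->listener->startElementHandler($name, $attribs);
        }
    }

    public function endElementHandler($name)
    {
        if ($this->listener) {
            $this->listener->endElementHandler($name);
        }
    }

    public function characterDataHandler($data)
    {
        if ($this->listener) {
            $this->listener->characterDataHandler($data);
        }
    }
}

// Uppercases text only while the innermost open element is <name>.
class DemoUppercaseName extends DemoFilter
{
    private $stack = [];  // names of the elements that are currently open

    public function startElementHandler($name, $attribs)
    {
        $this->stack[] = $name;
        parent::startElementHandler($name, $attribs);
    }

    public function endElementHandler($name)
    {
        array_pop($this->stack);
        parent::endElementHandler($name);
    }

    public function characterDataHandler($data)
    {
        if (end($this->stack) === 'name') {
            $data = strtoupper($data);
        }
        parent::characterDataHandler($data);
    }
}

// Finalizer: collects the result instead of propagating events.
// Attributes are ignored to keep the sketch short.
class DemoCollector extends DemoFilter
{
    public $xml = '';

    public function startElementHandler($name, $attribs) { $this->xml .= "<$name>"; }
    public function endElementHandler($name) { $this->xml .= "</$name>"; }
    public function characterDataHandler($data) { $this->xml .= $data; }
}

$out = new DemoCollector();
$filter = new DemoUppercaseName();
$filter->setListener($out);

// Drive the filter by hand, the way a parser would.
$filter->startElementHandler('person', []);
$filter->startElementHandler('name', []);
$filter->characterDataHandler('something');
$filter->endElementHandler('name');
$filter->startElementHandler('city', []);
$filter->characterDataHandler('somewhere');
$filter->endElementHandler('city');
$filter->endElementHandler('person');

echo $out->xml, "\n";
```

The stack is what lets characterDataHandler tell the text of a name element apart from the text of any other element, since the event itself carries no such information.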
As an example, two filters are provided in the package: FilterName and FilterNameBold. FilterName uppercases the content of all
<name>something</name> elements, producing for example <name>SOMETHING</name>.
FilterNameBold adds a bold (<b>) element inside every name element, thus converting <name>something</name> into
<name><b>something</b></name>.
The FilterOutput class is a "finalizer" filter that doesn't propagate events; it just outputs the XML content to the browser,
so it is useful as the last filter in filter chains for testing.
If you want to convert all name elements to uppercase, use the classes as follows:
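include_once("class_sax_filters.php");
$f1 = new ExpatParser("applications.xml");
// Keep element names as written (Expat upper-cases them by default)
$f1->parserSetOption(XML_OPTION_CASE_FOLDING, 0);
$f2 = new FilterName();
$f3 = new FilterOutput();
// Wire the chain from the end backwards: FilterName -> FilterOutput first,
// then parser -> FilterName
$f2->setListener($f3);
$f1->setListener($f2);
$f1->parse();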
We create an Expat parser, a FilterName object and a FilterOutput object.
First we set the FilterOutput as the FilterName listener, which means that events created by FilterName
will be passed to FilterOutput.
Then we set the FilterName as the parser listener, which means that events generated at the parser level
will be propagated to FilterName. Since FilterName passes events to FilterOutput, FilterOutput is the last
link in the filter chain.
The order in which listeners are set is very important: when we set the parser's listener, that object
must already have been given its own listener in order to do something.
Then we just call the parse method. The parser parses the XML document, generating
events; the events are passed to FilterName, where name elements are uppercased, and then
propagated to FilterOutput, where the content is just printed.
Filter chains can be as complex as you want, linking several filters to perform a complex task. Filters can
add elements, remove elements (absorbing events) and change elements, thus allowing any kind of XML processing
from queries to transformations.
SAX filters are a sound way to modularize SAX processing of XML documents. When documents are very large,
SAX-based processing is the memory-efficient choice, since SAX never reads the whole document into memory; it just
processes the document chunk by chunk.
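As an illustration of removing elements by absorbing events, the sketch below drops one element together with everything nested inside it. The classes are hypothetical stand-ins written for this example (DemoFilter imitates the pass-through behaviour assumed for AbstractFilter); the element name "password" is invented.

```php
<?php
// Hypothetical stand-in for AbstractFilter: passes events through unchanged.
class DemoFilter
{
    protected $listener = null;

    public function setListener($obj) { $this->listener = $obj; }

    public function startElementHandler($name, $attribs)
    {
        if ($this->listener) { $this->listener->startElementHandler($name, $attribs); }
    }

    public function endElementHandler($name)
    {
        if ($this->listener) { $this->listener->endElementHandler($name); }
    }

    public function characterDataHandler($data)
    {
        if ($this->listener) { $this->listener->characterDataHandler($data); }
    }
}

// Absorbs every event between the start and end of the target element.
class DemoDropElement extends DemoFilter
{
    private $target;
    private $depth = 0;  // > 0 while we are inside a dropped element

    public function __construct($target)
    {
        $this->target = $target;
    }

    public function startElementHandler($name, $attribs)
    {
        if ($this->depth > 0 || $name === $this->target) {
            $this->depth++;   // absorb: do not propagate
            return;
        }
        parent::startElementHandler($name, $attribs);
    }

    public function endElementHandler($name)
    {
        if ($this->depth > 0) {
            $this->depth--;   // absorb the matching end event
            return;
        }
        parent::endElementHandler($name);
    }

    public function characterDataHandler($data)
    {
        if ($this->depth > 0) {
            return;           // absorb text inside the dropped element
        }
        parent::characterDataHandler($data);
    }
}

// Finalizer that collects the result (attributes ignored for brevity).
class DemoCollector extends DemoFilter
{
    public $xml = '';

    public function startElementHandler($name, $attribs) { $this->xml .= "<$name>"; }
    public function endElementHandler($name) { $this->xml .= "</$name>"; }
    public function characterDataHandler($data) { $this->xml .= $data; }
}

$out = new DemoCollector();
$drop = new DemoDropElement('password');
$drop->setListener($out);

$drop->startElementHandler('user', []);
$drop->startElementHandler('name', []);
$drop->characterDataHandler('ann');
$drop->endElementHandler('name');
$drop->startElementHandler('password', []);
$drop->characterDataHandler('secret');
$drop->endElementHandler('password');
$drop->endElementHandler('user');

echo $out->xml, "\n";
```

Adding an element works the other way round: the filter calls the listener's start and end handlers itself at the point where the new element should appear.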
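The chunk-by-chunk behaviour can be seen directly with PHP's built-in XML parser functions, independent of this package. The sketch below is an assumption-laden illustration: it feeds the parser 16 bytes at a time from an in-memory stream (standing in for a large file) and records the element names as the start events arrive. The sample document and the chunk size are invented.

```php
<?php
$names = [];

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler(
    $parser,
    // Start handler: remember each element name as it is reported.
    function ($parser, $name, $attribs) use (&$names) {
        $names[] = $name;
    },
    // End handler: nothing to do in this sketch.
    function ($parser, $name) {
    }
);

// An in-memory stream keeps the example self-contained; a real script
// would open the XML file with fopen() instead.
$stream = fopen('php://memory', 'w+');
fwrite($stream, '<apps><app><name>foo</name></app><app><name>bar</name></app></apps>');
rewind($stream);

// Only one small chunk is held in memory at any time. The third argument
// tells the parser whether this is the final piece of the document.
while (!feof($stream)) {
    $chunk = fread($stream, 16);
    if (!xml_parse($parser, $chunk, feof($stream))) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
}
fclose($stream);
unset($parser);

echo implode(',', $names), "\n";
```

Tags that happen to be split across two chunks are still reported once, because the parser keeps its own state between calls to xml_parse.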
Documentation
Classes
AbstractSAXParser
Extends: None
Description: This is an abstract class defining the methods that SAX parsers must implement in order to be able to work with SAX filters.
AbstractSAXParser
void AbstractSAXParser()
- The constructor may or may not receive arguments pointing to the XML source to be parsed. This depends heavily on the parser itself: we may have parsers for XML files, parsers for XML strings, etc.
-
- Parameters:
-
- Returns:
- Throws:
None
parse
void parse()
- This method parses the XML source specified to the parser in some way. While parsing this method must call the proper methods in order to propagate events to this class' listener. The methods to be called are startElementHandler($parser,$name,$attribs), endElementHandler($parser,$name) and characterDataHandler($data). Note that these methods must not be implemented by the parser.
-
- Parameters:
-
- Returns:
- Throws:
None
setListener
void setListener(Object $obj)
- This method sets the listener of a parser object. The listener is a Filter object extending the AbstractFilter class that will receive the events generated by the parser and do something with them.
-
- Parameters:
-
$obj - An object of a Filter class extending the AbstractFilter class
- Returns:
- Throws:
None
AbstractFilter
Extends: None
Description: This class defines the methods that must be implemented by a Filter.
Method Summary |
void | setListener(object $obj) | Sets the Filter's listener object |
void | startElement(string $name, array $attribs) | Method that is called when an XML element starts |
void | endElement(string $name) | Method that is called when an element ends |
void | characterDataHandler(string $data) | Method that will be called when text data is found |
setListener
void setListener(object $obj)
- This method defines an object that will be used as a listener for events propagated from a Filter. This method is already implemented in the abstract class so Filters don't have to implement it.
-
- Parameters:
-
$obj - An object from a class extending the AbstractFilter class
- Returns:
- Throws:
None
startElement
void startElement(string $name, array $attribs)
- This method should be implemented by Filters. It receives the name of the element and its attributes. What the method does depends on the filter.
-
- Parameters:
-
$name - Name of the element
$attribs - An associative array containing the attributes of the element. You can process it using
a construct like: foreach($attribs as $attrName=>$value) { }
- Returns:
- Throws:
None
endElement
void endElement(string $name)
- This method should be implemented by filters. It will be called when an XML element ends.
-
- Parameters:
-
$name - Name of the element that ends
- Returns:
- Throws:
None
characterDataHandler
void characterDataHandler(string $data)
- This method should be implemented by filters. It is called when text is found in an XML document. It can be called several times for the same text node (in chunks), and no context information is provided; the filter should track context itself if it needs to know, for example, the name of the element where the text was found.
-
- Parameters:
-
$data - The text chunk found
- Returns:
- Throws:
None
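Since the same text node may arrive in several chunks, a filter that needs the complete text usually buffers the chunks and acts only when the next element event arrives. The sketch below shows one way to do that. DemoTextBuffer is a hypothetical class that does not extend the package's AbstractFilter; its method names follow the summary in this reference section.

```php
<?php
// Hypothetical filter-like class: joins text chunks and remembers which
// element each complete piece of text belongs to.
class DemoTextBuffer
{
    public $texts = [];    // list of [element name, complete text]
    private $stack = [];   // names of the elements that are currently open
    private $buffer = '';  // text collected since the last element event

    public function startElement($name, $attribs)
    {
        $this->flush();            // text before a child element is complete
        $this->stack[] = $name;
    }

    public function endElement($name)
    {
        $this->flush();            // flush while the element is still on the stack
        array_pop($this->stack);
    }

    public function characterDataHandler($data)
    {
        $this->buffer .= $data;    // may be called many times per text node
    }

    private function flush()
    {
        if ($this->buffer !== '') {
            $this->texts[] = [end($this->stack), $this->buffer];
            $this->buffer = '';
        }
    }
}

$filter = new DemoTextBuffer();
$filter->startElement('name', []);
$filter->characterDataHandler('some');   // first chunk of the text node
$filter->characterDataHandler('thing');  // second chunk of the same node
$filter->endElement('name');

echo $filter->texts[0][0], ': ', $filter->texts[0][1], "\n";
```

A transformation applied per chunk (for example uppercasing) does not need this, but anything that compares or matches the whole text does.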
ExpatParser
Extends: AbstractSAXParser
Description: This is an implementation of the AbstractSAXParser class using the PHP internal Expat parser.
Method Summary |
void | ExpatParser(string $xmlfile) | Constructor |
void | parse() | Parses the XML document |
void | setListener(object $obj) | Sets the listener object of the ExpatParser |
void | parserSetOption(constant $option, some $value) | Sets options for the Expat parser |
ExpatParser
void ExpatParser(string $xmlfile)
- The constructor receives the name of the XML file to be parsed.
-
- Parameters:
-
$xmlfile - Name of the file containing the XML document to be parsed
- Returns:
- Throws:
None
parse
void parse()
- This method parses the XML file pointed to by the filename indicated when the parser was constructed. The method will parse the document and propagate events to the listener object. The setListener method must have been called before parsing.
-
- Parameters:
-
- Returns:
- Throws:
None
setListener
void setListener(object $obj)
- This method is used to set the listener object for the parser: the first Filter in the chain. The object must be an instance of a class extending the AbstractFilter class.
-
- Parameters:
-
$obj - An object from a class extending the AbstractFilter class
- Returns:
- Throws:
None
parserSetOption
void parserSetOption(constant $option, some $value)
- Sets options for the Expat parser
-
- Parameters:
-
$option - For example XML_OPTION_CASE_FOLDING, to set whether case folding is applied to the document. (See the PHP documentation for the options that can be set for an Expat parser.)
$value - Value for the option being set
- Returns:
- Throws:
None
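The effect of XML_OPTION_CASE_FOLDING can be checked with PHP's built-in XML parser functions alone. The sketch below does not use ExpatParser; it assumes only that parserSetOption passes the option on to the underlying parser. elementNames is a hypothetical helper written for this example.

```php
<?php
// Hypothetical helper: returns the element names reported by the start
// handler for the given case-folding setting.
function elementNames(string $xml, int $caseFolding): array
{
    $names = [];
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attribs) use (&$names) {
            $names[] = $name;
        },
        function ($parser, $name) {
        }
    );
    xml_parse($parser, $xml, true);
    return $names;
}

$folded = elementNames('<name>x</name>', 1);    // case folding on (the default)
$unfolded = elementNames('<name>x</name>', 0);  // case folding off

echo implode(',', $folded), "\n";    // NAME
echo implode(',', $unfolded), "\n";  // name
```

With folding left on, a filter that compares element names against "name" would never match, which is why a chain that looks for lowercase names turns the option off before parsing.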
FilterOutput
Extends: AbstractFilter
Description: This is a finalizer filter that must always be used at the end of a filter chain. This filter absorbs SAX events, outputting the XML document to the browser.
Method Summary |
This class doesn't have any methods |