RDF Parser (class_rdf_parser.php)

Description: This class is a PHP port of the Repat parser by Jason Diamond. Repat, originally written in C, was ported to PHP using PHP's XML parser functions; the result is a SAX-like RDF parser in native PHP code. RDF is a W3C initiative for describing metadata; you can find everything about RDF on the W3C site.

The class implements an event-driven interface for parsing RDF: when an RDF statement is found, a handler that you define (a PHP function) is called, and you can do whatever you want with the RDF triple (subject, predicate, object).
See the usage information below and the class documentation for a full description of this class.

NEWS:
  • 06-13-2002 First version of this class released.
The class code and its documentation are hosted at SourceForge. Please visit our SourceForge page for releases, documentation, bug tracking, support forums and mailing lists.

Features
  • All the valid RDF XML syntaxes defined by the W3C are supported. (See the W3C RDF XML syntax spec.)
  • Can parse RDF embedded in other XML vocabularies.
  • You can define SAX handlers for non-RDF XML elements, which allows you to parse RDF and non-RDF content in a single pass.

To-dos
  • If you have a to-do in mind, let me know.

Contact: Luis Argerich (lrargerich@yahoo.com)

Detailed description and usage:

The first thing to do is to create an object of the class and then create a new RDF parser:

$rdf=new Rdf_parser();
$rdf->rdf_parser_create( NULL );

Now we have to define the handlers to be used. You can use the following methods to set handlers:

  • rdf_set_statement_handler($handler)
  • rdf_set_element_handler($start_handler,$end_handler)
  • rdf_set_warning_handler($warning_handler)
  • rdf_set_character_data_handler($character_data_handler)


The Statement handler

The statement handler is the most important handler, since it is the one called when RDF statements are found. We'll define the handler here and then describe each parameter:

  $rdf->rdf_set_statement_handler("my_statement_handler");
  function my_statement_handler(&$user_data, $subject_type, $subject, $predicate, 
                                 $ordinal,$object_type,$object,$xml_lang )
  {
    // Code here
  }


The statement handler parameters

Parameter: Description
user_data: A PHP variable that can be set using rdf_set_user_data($data). Once the variable is set, you can access or use it in every handler for whatever you want.
subject_type: The subject type of the RDF statement. Possible values are:
  • RDF_SUBJECT_TYPE_URI
  • RDF_SUBJECT_TYPE_DISTRIBUTED
  • RDF_SUBJECT_TYPE_PREFIX
  • RDF_SUBJECT_TYPE_ANONYMOUS
subject: The value of the subject of the RDF statement.
predicate: The predicate of the RDF statement.
ordinal: The ordinal position, if the statement describes a member of an RDF collection.
object_type: The object type of the RDF statement. Possible values are:
  • RDF_OBJECT_TYPE_RESOURCE
  • RDF_OBJECT_TYPE_LITERAL
  • RDF_OBJECT_TYPE_XML
object: The value of the statement's object.
xml_lang: The value of the xml:lang attribute, if set.
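
As a rough illustration of the parameters above, the following sketch shows a statement handler that stores every triple it receives. The function name collect_statement and the $GLOBALS['rdf_triples'] store are hypothetical and not part of the class; the constant RDF_OBJECT_TYPE_LITERAL is assumed to be defined by class_rdf_parser.php, as listed above.

```php
<?php
// Hypothetical store for the triples seen so far (not part of the class).
$GLOBALS['rdf_triples'] = array();

// Statement handler sketch: the parameter order follows the prototype above.
function collect_statement(&$user_data, $subject_type, $subject, $predicate,
                           $ordinal, $object_type, $object, $xml_lang)
{
    $GLOBALS['rdf_triples'][] = array(
        'subject'    => $subject,
        'predicate'  => $predicate,
        'object'     => $object,
        // Assumes RDF_OBJECT_TYPE_LITERAL is defined by class_rdf_parser.php.
        'is_literal' => ($object_type == RDF_OBJECT_TYPE_LITERAL),
        'lang'       => $xml_lang,
    );
}
```

The handler would be registered with $rdf->rdf_set_statement_handler("collect_statement") before parsing starts. A global store is used here only to keep the sketch independent of how user_data is passed around.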


Other handlers

The start and end element handlers are triggered (if set) when a non-RDF XML element is found by the parser. The start element handler should receive $user_data, $name and $attributes, while the end element handler receives $user_data and $name.

The character data handler

This handler is triggered (if set) when data is found outside RDF elements. The handler receives two arguments: $user_data, and $data containing the characters found.

The warning handler

The warning handler is triggered (if set) when an RDF error is detected. The handler receives one argument: $message, containing the description of the error.

Parsing

To parse an RDF document, use the $rdf->rdf_parse($s,$len,$is_final) method. You can parse the document in chunks by calling this method repeatedly. The arguments are: $s is the data to be parsed, $len is the length of that data, and $is_final is a boolean indicating whether this chunk is the last chunk of the document (no more data). The method returns true if everything went well, or false if there is an XML error while parsing the document.
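
Putting the steps of this section together, a complete parsing run might be sketched as follows. The helper name parse_rdf_string and its parameters are hypothetical. $rdf is assumed to be an Rdf_parser object (so class_rdf_parser.php must already be included), and rdf_parser_free is called without arguments on the assumption that it needs none.

```php
<?php
// Hypothetical helper (not part of the class): parses a whole document held
// in a string and returns true on success, false on an XML error.
function parse_rdf_string($rdf, $document, $base, $statement_handler)
{
    // Create the underlying parser; NULL means no explicit encoding.
    if ($rdf->rdf_parser_create(NULL) === false) {
        return false;
    }
    // Name of the PHP function to call for every RDF statement found.
    $rdf->rdf_set_statement_handler($statement_handler);
    // Relative URIs in the document are resolved against this base.
    $rdf->rdf_set_base($base);
    // The whole document is passed as a single, final chunk.
    $ok = $rdf->rdf_parse($document, strlen($document), true);
    // Assumption: rdf_parser_free() takes no arguments.
    $rdf->rdf_parser_free();
    return (bool) $ok;
}
```

A call could then look like parse_rdf_string(new Rdf_parser(), $xml, "http://example.org/doc.rdf", "my_statement_handler"), where the URL is only a placeholder.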

Documentation

Classes

Rdf_parser

Extends: None
Description: This class parses RDF documents in any valid RDF syntax according to the RDF specification. It produces events that can be intercepted by PHP callback functions.

Method Summary
 Boolean rdf_parser_create(string $encoding)
          Creates an RDF parser
 void rdf_parser_free()
          Frees resources allocated by an RDF parser
 void rdf_set_user_data(Any $user_data)
          Sets a PHP variable as the parser user data
 void rdf_set_statement_handler(string $handler)
          Set the handler to be called when statements are found.
 void rdf_set_parse_type_literal_handler(string $start, string $end)
          Sets handlers for parse type literals
 void rdf_set_element_handler(string $start, string $end)
          Sets handlers to be called when a non-RDF element starts or ends
 void rdf_set_character_data_handler(string $handler)
          This allows you to define a handler for text outside RDF elements.
 void rdf_set_warning_handler(string $handler)
          Sets a warning handler to be called if the RDF document is broken
 Boolean rdf_parse(string $s, int $len, boolean $is_final)
          Parses a portion of an RDF document or a whole document
 void rdf_set_base(string $base)
          Sets the base name of the document being parsed
 

Method Detail

rdf_parser_create

Boolean rdf_parser_create(string $encoding)
This method creates a new RDF parser. It is the first method you should call before parsing an RDF document.
 
Parameters:
$encoding - This optional parameter indicates the encoding to be used to parse the document. See the XML parser extension of PHP for valid encodings.
Returns:
False if the parser cannot be created
Throws:
None

rdf_parser_free

void rdf_parser_free()
This method clears the internal state of the RDF parser. You can parse multiple documents with the same object if you call this method and then rdf_parser_create.
 
Parameters:
None
Returns:
Nothing
Throws:
None
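
The reuse pattern described above can be sketched as a loop that frees and re-creates the parser between documents. The helper name parse_documents is hypothetical, and $rdf is assumed to be an Rdf_parser object. The statement handler is registered again after every rdf_parser_create call, because the text does not say whether handlers survive rdf_parser_free, and rdf_parser_free is assumed to take no arguments.

```php
<?php
// Hypothetical helper: parses several documents with one parser object.
// $documents maps a label to the document text; the result maps the same
// label to true (parsed) or false (XML error).
function parse_documents($rdf, $documents, $statement_handler)
{
    $results = array();
    foreach ($documents as $label => $document) {
        $rdf->rdf_parser_create(NULL);
        // Re-register the handler for each new parser (see the note above).
        $rdf->rdf_set_statement_handler($statement_handler);
        $results[$label] = (bool) $rdf->rdf_parse($document, strlen($document), true);
        // Reset the internal state before the next document.
        $rdf->rdf_parser_free();
    }
    return $results;
}
```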

rdf_set_user_data

void rdf_set_user_data(Any $user_data)
This method allows you to set a PHP variable as "user data". Handlers then receive this variable and can access or modify it. Note that this is completely optional.
 
Parameters:
$user_data - A PHP variable to be used as user_data inside the parser
Returns:
Nothing
Throws:
None

rdf_set_statement_handler

void rdf_set_statement_handler(string $handler)
When the parser sees an RDF statement, it calls the handler registered with this method.
 
Parameters:
$handler - The handler prototype is:
  function my_statement_handler(&$user_data, $subject_type, $subject, $predicate, $ordinal, $object_type, $object, $xml_lang)
Where:
  • $user_data: the PHP variable passed as user_data (if set).
  • $subject_type: the kind of subject of the RDF statement. Possible values are RDF_SUBJECT_TYPE_URI, RDF_SUBJECT_TYPE_DISTRIBUTED, RDF_SUBJECT_TYPE_PREFIX and RDF_SUBJECT_TYPE_ANONYMOUS.
  • $subject: the subject of the RDF statement.
  • $predicate: the predicate of the RDF statement.
  • $ordinal: the ordinal position, if the statement describes a member of an RDF collection.
  • $object_type: the type of object of the RDF statement. Possible values are RDF_OBJECT_TYPE_RESOURCE, RDF_OBJECT_TYPE_LITERAL and RDF_OBJECT_TYPE_XML.
  • $object: the object of the statement.
  • $xml_lang: the value of the xml:lang attribute, if present.
Returns:
Nothing
Throws:
None

rdf_set_parse_type_literal_handler

void rdf_set_parse_type_literal_handler(string $start, string $end)
This function sets handlers to be called when a parse type literal starts and when it ends.
 
Parameters:
$start - The handler receives just one parameter: &$user_data. You can ignore it if you are not using user_data in the parser.
$end - The handler receives just one parameter: &$user_data
Returns:
Nothing
Throws:
None

rdf_set_element_handler

void rdf_set_element_handler(string $start, string $end)
RDF can be used in conjunction with other vocabularies. In such a case, you can use this method to define normal SAX start and end element handlers to process non-RDF elements.
 
Parameters:
$start - The handler receives &$user_data, $name and $attributes, containing user_data (if used), the name of the element, and the attributes of the element. Attributes are received as an associative array ($att_name=>$att_value).
$end - The handler should receive &$user_data, $name containing user_data if set and the name of the ending element.
Returns:
Nothing
Throws:
None
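
A pair of element handlers could be sketched as follows, recording each non-RDF element with its nesting depth and attributes. The function names and the $GLOBALS entries are hypothetical and not part of the class.

```php
<?php
// Hypothetical stores (not part of the class).
$GLOBALS['xml_elements'] = array();
$GLOBALS['xml_depth'] = 0;

// Start handler: receives user_data, the element name and its attributes.
function my_start_element(&$user_data, $name, $attributes)
{
    $GLOBALS['xml_elements'][] = array(
        'name'       => $name,
        // Depth counts only non-RDF elements, since only those reach
        // these handlers.
        'depth'      => $GLOBALS['xml_depth'],
        'attributes' => $attributes, // associative array: name => value
    );
    $GLOBALS['xml_depth']++;
}

// End handler: receives user_data and the name of the closing element.
function my_end_element(&$user_data, $name)
{
    $GLOBALS['xml_depth']--;
}
```

Both handlers would be registered together with $rdf->rdf_set_element_handler("my_start_element", "my_end_element").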

rdf_set_character_data_handler

void rdf_set_character_data_handler(string $handler)
When data is found outside RDF elements, the function defined by this method is called (if set).
 
Parameters:
$handler - The handler should receive &$user_data and $data containing user_data (if used) and the data found.
Returns:
Nothing
Throws:
None
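
A character data handler can be as small as the sketch below, which appends the text it receives to a buffer. The function name and the $GLOBALS['xml_text'] buffer are hypothetical. Appending rather than overwriting assumes that, as with PHP's XML parser functions, one run of text may be delivered in several calls.

```php
<?php
// Hypothetical buffer for text found outside RDF elements.
$GLOBALS['xml_text'] = '';

// Character data handler: receives user_data and the characters found.
function my_character_data(&$user_data, $data)
{
    // Append, because the same text node may arrive in more than one piece
    // (an assumption, see above).
    $GLOBALS['xml_text'] .= $data;
}
```

It would be registered with $rdf->rdf_set_character_data_handler("my_character_data").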

rdf_set_warning_handler

void rdf_set_warning_handler(string $handler)
If the parser detects an RDF error, the function defined by this method is called (if set).
 
Parameters:
$handler - The handler receives just one parameter: $warning, with the description of the error detected. You can do whatever you want with the description: log it, display it, etc.
Returns:
Nothing
Throws:
None
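
One possible warning handler, shown here only as a sketch, keeps the messages in a list and also writes them to PHP's error log. The function name and the $GLOBALS['rdf_warnings'] list are hypothetical.

```php
<?php
// Hypothetical list of RDF warnings collected during parsing.
$GLOBALS['rdf_warnings'] = array();

// Warning handler: receives only the description of the RDF error.
function my_warning_handler($warning)
{
    $GLOBALS['rdf_warnings'][] = $warning;
    // error_log() sends the message to the configured PHP error log.
    error_log('RDF warning: ' . $warning);
}
```

After registering it with $rdf->rdf_set_warning_handler("my_warning_handler"), an empty list once parsing has finished would indicate that no RDF errors were reported.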

rdf_parse

Boolean rdf_parse(string $s, int $len, boolean $is_final)
This method parses RDF documents. You can process a document in chunks with repeated calls to this method, which limits the amount of memory used and lets you parse huge RDF documents without consuming all the server resources. Example:

  $input = fopen("some_file_or_url", "r");
  $done = false;
  while (!$done) {
    $buf = fread($input, 512);
    // feof() reports whether the end of the input has been reached
    $done = feof($input);
    if (!$rdf->rdf_parse($buf, strlen($buf), $done)) {
      // process_error here
      break; // stop feeding data after an XML error
    }
  }
  fclose($input);
 
Parameters:
$s - This is a string containing a chunk of the RDF document being parsed.
$len - This contains the length of the data to be parsed, usually strlen($s)
$is_final - A boolean indicating whether this is the final chunk of the document. For example, if you are parsing a file (or URL), use feof($fp) to indicate whether this is the final chunk.
Returns:
False if there's an error, true if everything is ok
Throws:
If an XML error is detected, the function returns false. If an RDF error is detected, the function defined by rdf_set_warning_handler is called (if set).

rdf_set_base

void rdf_set_base(string $base)
If you are parsing a document from a URL, use the URL as the base. If you are parsing a file, use file://path as the base. This method is important, since URIs are constructed relative to the base when parsing the document.
 
Parameters:
$base - The URI of the RDF document being parsed. If you are parsing a file, use file://path.
Returns:
Nothing
Throws:
None
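
The rule above for choosing a base can be sketched as a small helper. The function name rdf_base_for is hypothetical and not part of the class, and the sketch only handles plain Unix-style paths; other path styles would need extra care.

```php
<?php
// Hypothetical helper: derives a base URI from the location of a document.
function rdf_base_for($location)
{
    // Anything that already starts with a scheme (http://, ftp://, ...)
    // is treated as a URL and used as-is.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $location)) {
        return $location;
    }
    // Local paths get the file:// prefix described above.
    return 'file://' . $location;
}
```

The result would be passed to the parser with $rdf->rdf_set_base(rdf_base_for($location)) before the first call to rdf_parse.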