Posted on 2007-10-04 20:39:50-07 by thb
svn branch

At the moment you cannot access the responseDate OAI element, although I consider it essential for organized harvesting over time.

Upon an Identify request you get responseDate for free; this, however, is a side effect of the Identify parser collecting everything, not only the subelements of <Identify>.

Investigating the flow of control I gathered the following:

The different methods build up a filtering parser hierarchy with the following constituents:

- error parser (acting on local names only, ignorant of any context)
- resumption token parser (acting on local names only, ignorant of any context)
- verb-specific parser

For listRecords, some rudimentary dispatching now happens:

* on record elements, create a new Record object
* set (this level's!) handler to the following filter hierarchy:
  - record header parser
  - metadata parser (pluggable)

(Thus the metadata parser sits on the 5th level of the filter hierarchy and is also confronted with the OAI::metadata element itself.)

IMO almost everything should be handled by a top-level parser to be defined in Base.pm. This parser should monitor the top-level elements and beyond:

- OAI-PMH
- responseDate
- request
- dispatch to error element parser on error element
- dispatch to verb-specific parser on verb-specific element

This top-level parser could optionally (e.g. controlled by a global $N:O:H:NSaware flag) do some namespace checking.
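To make the idea concrete, here is a rough sketch of such a top-level parser. It is written with Python's xml.sax purely for illustration, not in Perl, and every name in it (TopLevelHandler, NameCollector, parse_response, ns_aware) is hypothetical and not part of the module. It assumes the verb-specific and error parsers are ordinary SAX content handlers.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


class TopLevelHandler(ContentHandler):
    """Hypothetical top-level parser: captures responseDate and hands the
    error element or the verb element (with its subtree) to a delegate."""

    def __init__(self, verb_handlers, error_handler=None, ns_aware=False):
        super().__init__()
        self.verb_handlers = verb_handlers  # local name -> handler
        self.error_handler = error_handler
        self.ns_aware = ns_aware            # stand-in for the proposed flag
        self.response_date = None
        self._depth = 0                     # OAI-PMH root is depth 1
        self._text = None
        self._delegate = None

    def startElementNS(self, name, qname, attrs):
        uri, local = name
        self._depth += 1
        if self._delegate is not None:      # inside a dispatched subtree
            self._delegate.startElementNS(name, qname, attrs)
            return
        if self._depth != 2:                # only children of the root matter
            return
        if self.ns_aware and uri != OAI_NS:
            return                          # optional namespace check
        if local == "responseDate":
            self._text = []
        elif local == "error":
            self._delegate = self.error_handler
        else:
            self._delegate = self.verb_handlers.get(local)
        if self._delegate is not None:
            self._delegate.startElementNS(name, qname, attrs)

    def characters(self, content):
        if self._delegate is not None:
            self._delegate.characters(content)
        elif self._text is not None:
            self._text.append(content)

    def endElementNS(self, name, qname):
        if self._delegate is not None:
            self._delegate.endElementNS(name, qname)
            if self._depth == 2:            # dispatched element is closed
                self._delegate = None
        elif self._text is not None and self._depth == 2:
            self.response_date = "".join(self._text).strip()
            self._text = None
        self._depth -= 1


class NameCollector(ContentHandler):
    """Trivial stand-in for a verb-specific parser: records local names."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        self.names.append(name[1])


def parse_response(xml_bytes, handler):
    """Run a namespace-aware SAX parse of xml_bytes through handler."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler
```

With this layout responseDate is available for every verb, and the verb-specific parser only ever sees its own element and what is below it.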

The verb-specific parser for listRecords should behave as follows:

on each record object:
  - dispatch to header parser on header element
  - handle metadata element
  - dispatch to custom metadata parser on metadata element
globally:
  - dispatch to resumption token parser

Here the custom metadata parser would live on the 3rd level of the filter hierarchy.
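The behaviour described for listRecords could look roughly like the following. Again this is only a sketch in Python's xml.sax, not Perl, and all names (ListRecordsHandler, HeaderHandler, NameRecorder, parse_list_records, metadata_factory) are invented for illustration. For brevity the resumption token is collected inline instead of by a separate parser, and the handler skips everything outside the ListRecords element so it can be tried standalone; in the proposed design it would sit below the top-level parser.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class HeaderHandler(ContentHandler):
    """Collects the text of header children (identifier, datestamp, ...)."""

    def __init__(self):
        super().__init__()
        self.fields, self._name, self._text = {}, None, []

    def startElementNS(self, name, qname, attrs):
        self._name, self._text = name[1], []

    def characters(self, content):
        self._text.append(content)

    def endElementNS(self, name, qname):
        if self._name == name[1] and name[1] != "header":
            text = "".join(self._text).strip()
            self.fields.setdefault(name[1], []).append(text)
        self._name = None


class NameRecorder(ContentHandler):
    """Stand-in for a pluggable metadata parser: records local names."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElementNS(self, name, qname, attrs):
        self.seen.append(name[1])


class ListRecordsHandler(ContentHandler):
    """Hypothetical verb-specific parser; depth 1 is ListRecords itself."""

    def __init__(self, metadata_factory):
        super().__init__()
        self.metadata_factory = metadata_factory  # returns a fresh handler
        self.records, self.resumption_token = [], None
        self._depth, self._record, self._token = 0, None, None
        self._delegate, self._release_at, self._in_metadata = None, 0, False

    def _dispatch(self, handler, name, qname, attrs):
        self._delegate, self._release_at = handler, self._depth
        handler.startElementNS(name, qname, attrs)

    def startElementNS(self, name, qname, attrs):
        local = name[1]
        if self._depth == 0 and local != "ListRecords":
            return                                # outside our verb element
        self._depth += 1
        if self._delegate is not None:
            self._delegate.startElementNS(name, qname, attrs)
        elif self._depth == 2 and local == "record":
            self._record = {"header": HeaderHandler(),
                            "metadata": self.metadata_factory()}
            self.records.append(self._record)
        elif self._depth == 2 and local == "resumptionToken":
            self._token = []
        elif self._depth == 3 and self._record and local == "header":
            self._dispatch(self._record["header"], name, qname, attrs)
        elif self._depth == 3 and self._record and local == "metadata":
            self._in_metadata = True              # handled here, not passed on
        elif self._depth == 4 and self._in_metadata:
            self._dispatch(self._record["metadata"], name, qname, attrs)

    def characters(self, content):
        if self._delegate is not None:
            self._delegate.characters(content)
        elif self._token is not None:
            self._token.append(content)

    def endElementNS(self, name, qname):
        if self._depth == 0:
            return
        if self._delegate is not None:
            self._delegate.endElementNS(name, qname)
            if self._depth == self._release_at:
                self._delegate = None
        elif self._depth == 3:
            self._in_metadata = False
        elif self._depth == 2:
            if self._token is not None:
                self.resumption_token = "".join(self._token).strip()
                self._token = None
            self._record = None
        self._depth -= 1


def parse_list_records(xml_bytes, metadata_factory):
    """Namespace-aware parse of a ListRecords response."""
    handler = ListRecordsHandler(metadata_factory)
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler
```

Note that the custom metadata parser is only handed the payload below the metadata element, never the OAI metadata element itself.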

One might even consider including top-level parsing of those verbs with resumption tokens, since they have a certain amount of overlap.
