Appendix E. XML metadata import

The functionality of the editor/administrator application can be extended with so-called extensions. One kind of extension imports metadata from external sources, and the XML import extension is of this kind. It imports bibliographic descriptions from XML files (details about XML can be found here).

The XML metadata import extension uses the XQuery language (details about XQuery can be found here).

To make the import available, an appropriate configuration for the XML extension has to be provided. By default, the XML extension configuration allows an editor to import metadata from the RDF and MASTER formats (both formats use XML to represent metadata).

The XML extension is configured using two property files, tests.properties and conversion.properties (property files contain key=value pairs).

The relation between the two files is strict: for every XQuery test in the tests.properties file there are corresponding conversion rules in the conversion.properties file. For a given metadata XML file, the import mechanism runs the test queries from the tests.properties file. If a test returns one or more values, the metadata are imported from the XML file using the conversion rules in the conversion.properties file that correspond to that successful XQuery test.

The tests.properties file contains XQuery queries which test whether a metadata file can be imported using the corresponding conversion rules. The key of each test identifies its conversion rules in the conversion.properties file.
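The test step described above can be sketched in Python as follows. This is only an illustration of the selection logic, not the extension's actual code: the function name select_rule_set and the run_query callable (standing in for whatever XQuery engine evaluates a query and returns a list of values) are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def select_rule_set(
    tests: Sequence[Tuple[str, str]],
    run_query: Callable[[str], List[str]],
) -> Optional[str]:
    """Return the key of the first test whose query yields at least one value.

    `tests` holds (key, query) pairs in the order they appear in
    tests.properties. `run_query` is a hypothetical stand-in for the
    XQuery engine; it must return a list of result values.
    """
    for key, query in tests:
        if run_query(query):  # a non-empty result means the test succeeded
            return key
    return None  # no test matched, so nothing can be imported


# Stub evaluator for demonstration only: pretends that just the
# RDF test query finds something in the document.
def fake_engine(query: str) -> List[str]:
    return ["<rdf:RDF/>"] if "'RDF'" in query else []


tests = [
    ("master", "for $x in fn:doc({document})/*[fn:compare(fn:name(), 'msDescription')=0] return $x"),
    ("rdf_dc", "for $x in fn:doc({document})/*[fn:compare(fn:local-name(), 'RDF')=0] return $x"),
]
print(select_rule_set(tests, fake_engine))  # prints "rdf_dc"
```

The returned key is then the prefix under which the conversion rules are looked up in conversion.properties.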

For example, let us assume that we have the following files (this example presents extension's default configuration):

tests.properties file:

master=for $x in fn:doc({document})/*[fn:compare(fn:name(), 'msDescription')=0] return $x
rdf_dc=for $x in fn:doc({document})/*[fn:compare(fn:local-name(), 'RDF')=0] return $x

conversion.properties file:

master.Title=for $x in fn:doc({document})//msHeading/title return $x
master.Creator=for $x in fn:doc({document})//msHeading/author return $x 
master.Description=for $x in fn:doc({document})//msContents/overview return $x
master.Publisher=for $x in fn:doc({document})//msContents/respStmt/resp/name return $x
master.Contributor=for $x in fn:doc({document})/msDescription/msContents/respStmt//resp return $x
master.Date=for $x in fn:doc({document})//msHeading/origDate return $x
master.Type=for $x in fn:doc({document})//physDesc/form return $x
master.Identifier=for $x in fn:doc({document})//msIdentifier/country/settlement/repository/idno return $x
master.Source=for $x in fn:doc({document})//msPart//idno return $x
master.Language=for $x in fn:doc({document})//msContents/textLang return $x
master.Language=for $x in fn:doc({document})//msContents/textLang/@otherLangs return $x
master.Rights=for $x in fn:doc({document})//msIdentifier/repository return $x

rdf_dc.Title=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Title'] return $x
rdf_dc.Creator=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Creator'] return $x
rdf_dc.Subject=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Subject'] return $x
rdf_dc.Description=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Description'] return $x
rdf_dc.Publisher=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Publisher'] return $x
rdf_dc.Contributor=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Contributor'] return $x
rdf_dc.Date=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Date'] return $x
rdf_dc.Type=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Type'] return $x
rdf_dc.Identifier=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Identifier'] return $x
rdf_dc.Source=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Source'] return $x
rdf_dc.Language=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Language'] return $x
rdf_dc.Relation=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Relation'] return $x
rdf_dc.Coverage=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Coverage'] return $x
rdf_dc.Rights=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Rights'] return $x

As we can see, the conversion.properties file contains conversion rules which correspond to the tests in the tests.properties file. Each key in the conversion.properties file is composed of the key from the tests.properties file, a dot, and the identifier (RDF name) of a dLibra attribute. The values returned by a query in the conversion.properties file are assigned to the attribute with that RDF name.
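The key structure can be illustrated with a short Python sketch that groups conversion rules by test key and attribute name. It rests on assumptions that are not confirmed by the extension itself: the helper name parse_conversion_rules is hypothetical, the key is split at its first dot, and repeated keys are collected into a list instead of overwriting each other. Note that each line is split at the first "=" only, because the queries themselves contain "=" characters.

```python
from typing import Dict, List


def parse_conversion_rules(text: str) -> Dict[str, Dict[str, List[str]]]:
    """Group rules as {test_key: {attribute_rdf_name: [query, ...]}}."""
    rules: Dict[str, Dict[str, List[str]]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split at the FIRST "=" only; the XQuery text may contain more.
        key, sep, query = line.partition("=")
        if not sep:
            continue  # not a key=value line
        # Assumption: the test key ends at the first dot of the key.
        test_key, dot, attribute = key.strip().partition(".")
        if not dot:
            continue  # key has no attribute part
        rules.setdefault(test_key, {}).setdefault(attribute, []).append(query.strip())
    return rules


sample = """\
master.Title=for $x in fn:doc({document})//msHeading/title return $x
master.Language=for $x in fn:doc({document})//msContents/textLang return $x
master.Language=for $x in fn:doc({document})//msContents/textLang/@otherLangs return $x
rdf_dc.Title=for $x in fn:doc({document})//*[fn:local-name()='Description']/*[fn:local-name()='Title'] return $x
"""

rules = parse_conversion_rules(sample)
print(sorted(rules))                     # prints ['master', 'rdf_dc']
print(len(rules["master"]["Language"]))  # prints 2
```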

Let us assume that we want to import file A, which contains metadata in XML format. The import mechanism runs the XQuery queries from the tests.properties file. The first test which returns a non-empty list of values decides which conversion rules will be applied for the metadata import. Let us assume that it was the test whose key is master. The import mechanism then chooses the conversion rules from the conversion.properties file whose keys start with master. Next, the values returned by the XQuery queries are assigned to the specific attributes; for example, the attribute with the Title RDF name receives the values returned by the query for $x in fn:doc({document})//msHeading/title return $x. If an attribute needs more than one query, a line with the additional query should be added (for example, Language has two queries).
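To make the assignment step concrete, here is a small Python sketch that applies a few rules to a made-up MASTER-like document. It is an approximation under stated assumptions: Python's standard library has no XQuery engine, so the limited XPath support of xml.etree.ElementTree stands in for the real queries, and the sample document, the RULES table and the apply_rules function are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Made-up sample document, for illustration only.
SAMPLE = """<msDescription>
  <msHeading>
    <title>Psalter</title>
    <author>Unknown</author>
  </msHeading>
  <msContents>
    <textLang>Latin</textLang>
  </msContents>
</msDescription>"""

# XPath stand-ins for three of the "master" XQuery rules.
# Each attribute maps to a list, since an attribute may have several queries.
RULES: Dict[str, List[str]] = {
    "Title": [".//msHeading/title"],
    "Creator": [".//msHeading/author"],
    "Language": [".//msContents/textLang"],
}


def apply_rules(xml_text: str, rules: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Collect, per attribute, the text of every element matched by its queries."""
    root = ET.fromstring(xml_text)
    values: Dict[str, List[str]] = {}
    for attribute, paths in rules.items():
        found: List[str] = []
        for path in paths:
            found.extend(el.text.strip() for el in root.findall(path) if el.text)
        values[attribute] = found
    return values


print(apply_rules(SAMPLE, RULES))
# prints {'Title': ['Psalter'], 'Creator': ['Unknown'], 'Language': ['Latin']}
```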

Every XQuery query should use the {document} string to specify the document on which the query is performed. The extension automatically replaces this string with the appropriate path to the XML file.
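The substitution can be pictured with the following Python sketch. It assumes that the path is inserted as a quoted XQuery string literal, since fn:doc expects a string argument; whether the extension adds the quotes itself or expects a different path form is not stated here, and the function name bind_document is hypothetical.

```python
def bind_document(query: str, path: str) -> str:
    """Replace every {document} placeholder with a quoted XQuery string literal."""
    # In an XQuery string literal "&" must be written as an entity reference
    # and a double quote is escaped by doubling it.
    escaped = path.replace("&", "&amp;").replace('"', '""')
    return query.replace("{document}", '"' + escaped + '"')


template = "for $x in fn:doc({document})//msHeading/title return $x"
print(bind_document(template, "/tmp/metadata.xml"))
# prints: for $x in fn:doc("/tmp/metadata.xml")//msHeading/title return $x
```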