* some notes on the relation of XML and ISIS

XML is in widespread use as a lingua franca
for gluing software components together.
Several tools for this can be found at xml.apache.org.

What is missing here is an efficient, easy-to-use
way of storing XML data. Only the most trivial cases
are easily mapped onto the relational data model,
which uses flat records consisting of a fixed number
of fields. The data structures modelled in XML
typically have a variable number of children.
Hierarchical databases like ADABAS C are well suited
and are actually used by Software AG in their Tamino XML DB,
but aren't widely and freely available.

ISIS records can be easily and canonically converted to XML.
Anything up to the first subfield delimiter is the body (a text node);
subfields become attributes
(strictly speaking, this is valid XML only for non-repeated subfields).
Other special subdivisions of field content, like the typical
<key word>, may be split into real child nodes.

The result (as generated by make pdemo) may look like:

<key>Educational Psychology</key>
<key>universities</key>
<v70>Okatcha, F.M.M.O.</v70>
<v24>Personal statement</v24>
<v12 p="Tbilisi, USSR" d="1976">Symposium on the Psychological Bases of Programmed Learning </v12>

Instead of tag numbers and subfield characters,
symbolic names from the FDT may be used.
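
The ISIS-to-XML direction can be sketched in a few lines of Python.
This is only an illustration, not the code behind make pdemo. It assumes
that a record is a list of (tag, value) pairs and that '^' is the subfield
delimiter; the names field_to_xml and record_to_xml, the <record> wrapper
and the optional tag-to-name mapping (standing in for the FDT) are made up
for this sketch.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def field_to_xml(tag, value, names=None):
    """Render one ISIS field as an XML element string.

    names is an optional {tag number: symbolic name} mapping; without it
    the element is called v<tag>.
    """
    name = (names or {}).get(tag, "v%d" % tag)
    # Everything up to the first '^' is the body, the rest are subfields.
    body, *subfields = value.split("^")
    attrs = ""
    for sf in subfields:
        if sf:  # skip empty pieces such as a trailing '^'
            # One-character subfield code becomes the attribute name.
            # Repeated codes would yield duplicate attributes (invalid XML).
            attrs += " %s=%s" % (sf[0], quoteattr(sf[1:]))
    # Split <key word> markup into <key> child nodes, escape plain text.
    content = ""
    for i, part in enumerate(re.split(r"<([^<>]*)>", body)):
        content += "<key>%s</key>" % escape(part) if i % 2 else escape(part)
    return "<%s%s>%s</%s>" % (name, attrs, content, name)


def record_to_xml(fields, names=None):
    """Wrap all fields of one record in a (hypothetical) <record> element."""
    lines = ["<record>"]
    lines += ["  " + field_to_xml(tag, value, names) for tag, value in fields]
    lines.append("</record>")
    return "\n".join(lines)
```

Calling field_to_xml(12, "Symposium^pTbilisi, USSR^d1976") yields
<v12 p="Tbilisi, USSR" d="1976">Symposium</v12>. Subfield codes that are
digits would not be legal attribute names and need a symbolic name instead.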

XML data structures can be
easily and efficiently mapped to the data model of ISO2709.

The general conversion (based on a SAX parser) works as follows:
- When encountering an opening tag, look up its name in the FDT.
  If there is no FDT provided, create one on the fly.
  If the FDT does not contain the tag name,
  create a new entry using tag number max(100,1+highest tag in FDT).
  Create a field using the tag number found and field value '+'.
- When encountering an attribute, look up its name in the FDT.
  Create a new subfield entry if needed using code 'a'
  or 1+highest code used (for this tag).
  Append a subfield using the code found.
- When encountering an empty tag (the current field ends with />),
  change the starting '+' to '-'.
- When encountering a text node, add a field using tag number 0
  with the node's body as value.
- When encountering a closing tag, look up its name as for opening tags
  and add a field with an empty value.
- As an additional optimization, most text nodes can be eliminated
  by using the initial value of a node to represent an immediately
  following text node.
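
The steps above can be sketched with Python's standard xml.sax module.
This is a minimal illustration under several assumptions, not the actual
converter: the FDT is modelled as a plain dict from element name to tag
number, '^' is assumed as the subfield delimiter, whitespace-only text is
dropped, and the names IsisHandler, xml_to_isis and merge_text are invented
here. SAX does not report whether an element was written as <x/> or
<x></x>, so the sketch treats both as empty. The merge_text option is one
possible reading of the text-node optimization.

```python
import xml.sax


class IsisHandler(xml.sax.ContentHandler):
    """Collect [tag, value] fields while a SAX parser walks the document."""

    def __init__(self, fdt=None, merge_text=False):
        super().__init__()
        self.fdt = dict(fdt or {})  # element name -> tag number
        self.codes = {}             # element name -> {attribute: code}
        self.fields = []
        self.merge_text = merge_text
        self.text = []              # buffered character data
        self.open_index = None      # opening field with nothing after it yet

    def tag_for(self, name):
        if name not in self.fdt:
            highest = max(self.fdt.values(), default=0)
            self.fdt[name] = max(100, 1 + highest)
        return self.fdt[name]

    def code_for(self, name, attr):
        codes = self.codes.setdefault(name, {})
        if attr not in codes:
            codes[attr] = chr(ord(max(codes.values())) + 1) if codes else "a"
        return codes[attr]

    def flush_text(self):
        text = "".join(self.text).strip()
        self.text = []
        if not text:
            return
        if self.merge_text and self.open_index is not None:
            # Text-node elimination: keep the text as the initial value
            # of the opening field, in front of any subfields.
            field = self.fields[self.open_index]
            field[1] = field[1][0] + text + field[1][1:]
        else:
            self.fields.append([0, text])
            self.open_index = None

    def startElement(self, name, attrs):
        self.flush_text()
        value = "+"
        for attr in attrs.getNames():
            value += "^" + self.code_for(name, attr) + attrs.getValue(attr)
        self.fields.append([self.tag_for(name), value])
        self.open_index = len(self.fields) - 1

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        self.flush_text()
        if self.open_index is not None:
            # No field was added since the opening one: change '+' to '-'.
            field = self.fields[self.open_index]
            field[1] = "-" + field[1][1:]
        else:
            self.fields.append([self.tag_for(name), ""])
        self.open_index = None


def xml_to_isis(xml_text, fdt=None, merge_text=False):
    """Return the list of (tag, value) fields for one XML document."""
    handler = IsisHandler(fdt, merge_text)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return [tuple(field) for field in handler.fields]
```

For the input <doc lang="en"><title> Notes </title><empty/></doc> and no
FDT, the sketch assigns tags 100, 101 and 102 on the fly and produces an
opening field "+^aen" for doc, an opening field, a tag 0 text field and a
closing field for title, a single "-" field for empty, and a closing field
for doc. With merge_text=True the three title fields collapse into one
field 101 with value "-Notes".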

For example, look at RDF:
> http://www.w3.org/RDF
> http://archive.dstc.edu.au/RDU/reports/RDF-Idiot

<DC:Creator parseType="Resource">
  <vCard:FN> Dr Jacky J Crystal </vCard:FN>
  <vCard:TITLE> Director </vCard:TITLE>
  <vCard:EMAIL> jacky@dstc.com.au </vCard:EMAIL>
  <vCard:ROLE> Researcher </vCard:ROLE>
</DC:Creator>

or, with text-node elimination, to

101 -Dr Jacky J Crystal

using about half the bytes it takes to store the original.

If everything that can be an attribute
(not substructured, not repeatable) had been made an attribute
instead of a child, it would read (with explicitly assigned
subfield codes) much more efficiently, like

100 ^pResource^fDr Jacky J Crystal^tDirector^ejacky@dstc.com.au^rResearcher
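
As a small illustration of explicitly assigned subfield codes, the
following Python sketch builds such a field from a list of name/text pairs.
The function name flatten_to_field and the layout of the code table are
assumptions made for this example; the codes themselves are the ones used
in the line above.

```python
def flatten_to_field(tag, parts, codes):
    """Build one ISIS field from attributes and simple children.

    parts is a list of (name, text) pairs; codes maps each name to its
    explicitly assigned one-character subfield code.
    """
    seen = set()
    value = ""
    for name, text in parts:
        if name in seen:
            # Repeatable parts cannot be represented as a single subfield.
            raise ValueError("repeated part %r cannot become a subfield" % name)
        seen.add(name)
        value += "^" + codes[name] + text.strip()
    return "%d %s" % (tag, value)


CODES = {"parseType": "p", "FN": "f", "TITLE": "t", "EMAIL": "e", "ROLE": "r"}
PARTS = [
    ("parseType", "Resource"),
    ("FN", " Dr Jacky J Crystal "),
    ("TITLE", " Director "),
    ("EMAIL", " jacky@dstc.com.au "),
    ("ROLE", " Researcher "),
]
LINE = flatten_to_field(100, PARTS, CODES)
```

LINE then holds exactly the field shown above. A part that occurs twice
raises an error, since only non-repeated, unstructured values qualify.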

$Id: xmlisis.txt,v 1.7 2003/06/23 14:43:42 kripke Exp $