--- /dev/null
+Using views in OpenIsis.
+
+A "view", like a VIEW in SQL, creates new, typically temporary records based on
+existing ones by means of some transformation like selecting a subset of the
+available fields (a projection), retagging fields or manipulating field values.
+
+
+As a general concept, a view can be implemented using any algorithm
+in any of the available programming languages to create new records
+(and need not refer only to record contents, but may also access other
+resources like files).
+
+In a narrower sense, however, a view is a special kind of transformation
+defined by a "view record". The fields of a view record have tags
+as they should appear in the target, typically some valid tags of the source
+plus, for example, index control tags, if the view describes indexing.
+
+
+In the following, the term "alphanumeric" denotes any ASCII letter or digit,
+or any non-ASCII character.
+"Word character" denotes any alphanumeric, hyphen '-' or underscore '_'.
+
+
+The value of a field in a view record can have one of several forms:
+- if it is empty,
+ the tag is passed to the source record's v command (see below).
+- if it starts with a %,
+ the rest of the value (w/o the %) is passed to the source record's v command.
+ If the tag is not 0, '=tag;' is prepended.
+- if the value starts with any word character,
+ it is used literally.
+- if it starts with a quote,
+ the rest of the value is used literally (w/o the quote).
+ If the value's last character is a quote, it is discarded.
+- if it starts with an @,
+ the rest of the value names a view to be included
+- if it starts with an &,
+ the rest of the value is the name of an extension exit to call
+- if it starts with a {,
+ the rest of the value is a script to be executed in the host language
+ (after stripping an optional } as last character)
+- any other form
+ (i.e. starting with other ASCII punctuation) is reserved for future use
+
+Example: the view
+$
+24
+70
+$
+is a simple projection selecting fields 24 and 70 from the source.
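+
+The dispatch on the first character of a view field value can be sketched
+as follows. This is only an illustration in Python, not the OpenIsis
+implementation: the names is_word_char, classify and QUOTES are hypothetical,
+and it is an assumption that "quote" covers both ' and " and that a
+trailing quote is only discarded if it matches the leading one.
+
```python
QUOTES = "\"'"  # assumption: both quote characters count as "a quote"


def is_word_char(ch: str) -> bool:
    """ASCII letter or digit, any non-ASCII character, hyphen or underscore."""
    return ch in "-_" or ord(ch) > 127 or ch.isalnum()


def classify(tag: int, value: str):
    """Return a (kind, payload) pair for one field of a view record."""
    if value == "":
        # empty value: the tag itself is the format for the v command
        return ("v", str(tag))
    first, rest = value[0], value[1:]
    if first == "%":
        # rest goes to the v command; '=tag;' is prepended unless tag is 0
        prefix = f"={tag};" if tag != 0 else ""
        return ("v", prefix + rest)
    if is_word_char(first):
        return ("literal", value)
    if first in QUOTES:
        if rest.endswith(first):  # discard a closing quote, if any
            rest = rest[:-1]
        return ("literal", rest)
    if first == "@":
        return ("include", rest)
    if first == "&":
        return ("exit", rest)
    if first == "{":
        if rest.endswith("}"):  # strip the optional closing brace
            rest = rest[:-1]
        return ("script", rest)
    return ("reserved", value)  # other punctuation: reserved for future use
```
+
+With this sketch, classify(24, "") yields ("v", "24") and
+classify(24, "%^a") yields ("v", "=24;^a").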
+
+
+* the v command
+
+is described here as an abstract command.
+It is available in the C-API as well as from the language bindings,
+possibly with language specific variations.
+
+It resembles the core concepts of traditional formatting,
+including access to and looping over fields and subfields,
+selecting substrings and attaching optional literals.
+It is sort of the record's printf.
+Like printf, and unlike traditional formatting,
+it neither supports flow control nor screen rendering.
+
+
+It takes a source and target record plus a string specifying a format.
+Depending on the language environment, the source and/or target may be implicit.
+
+If the format starts with '=tag;', where tag is a tag,
+this gives the tag used in the target and as default.
+Otherwise, tags from the source are used in the target and default is *.
+
+The first (next) character is then checked for an encoding mode, see below.
+
+
+The format is a series of output specifications,
+consisting of a field tag (word characters, either numerical or by field name),
+selectors and modifiers. The special tag * selects all fields.
+Each spec may contain several subspecs, separated by commas,
+using the same child context (otherwise, specs and subspecs are the same).
+So the format is spec[;spec...], and a spec is subspec[,subspec...].
+
+
+The general operation of the v command is to loop over the record
+until the last occurrence has been seen for all tags.
+In the nth repetition, for each tag in any spec,
+the (n+i)th occurrence of a field with this tag is used,
+where i is an offset given by an occurrence selector,
+and it is determined whether this is the last occurrence.
+For every iteration, a new output field is started,
+and the format is processed as follows:
+- loop over the (main) specifications
+- loop over children (or use the given field)
+- loop over subspecs
+- loop over subfields (or use the whole field)
+- apply decoding
+- apply substring
+- apply encoding
+- attach literals
+- append the result to the target record
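+
+The outer loop over occurrences can be pictured with the following
+much-reduced sketch in Python. It is an illustration under strong
+assumptions, not the actual v command: the record is modelled as a list
+of (tag, value) pairs, each spec is just a tag, and occurrence offsets,
+children, subfields, decoding, substrings, encoding and literals are
+all left out. The name v_loop is hypothetical.
+
```python
def v_loop(record, tags):
    """Return one output string per iteration over the occurrences.

    record: list of (tag, value) pairs
    tags:   the tags named by the specs, in format order
    """
    # collect the occurrences of every tag used in the format
    by_tag = {t: [v for (rt, v) in record if rt == t] for t in tags}
    # iterate until the last occurrence was seen for all tags
    rounds = max((len(vs) for vs in by_tag.values()), default=0)
    out = []
    for n in range(rounds):
        # a new output field is started for every iteration;
        # tags that have run out of occurrences contribute nothing
        parts = [by_tag[t][n] for t in tags if n < len(by_tag[t])]
        out.append("".join(parts))
    return out
```
+
+For a record with one field 24 ("Title") and two fields 70 ("Smith",
+"Jones"), v_loop(record, [24, 70]) gives two output fields in this model:
+"TitleSmith" and "Jones".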
+
+
+Each spec starts with an optional decoding mode,
+optionally followed by a tag,
+optionally followed by a child selector,
+optionally followed by a subfield selector,
+optionally followed by string modifiers,
+optionally intermingled with occurrence selectors and literals:
+- , starts a new subspec
+- ; starts a new spec with default context reset to the last tag seen
+- . starts a child selector
+- ^% start a subfield selector
+- ([ start an occurrence selector
+- /~"'`|+ start a literal
+- : starts a substring selector
+- & calls an extension
+- { evaluates a script
+
+
+* encoding mode
+
+One of the following operators as first character of the format
+can select an output "encoding":
+- ? outputs a 1 if the selected entity exists, 0 otherwise
+- ! the opposite of ?
+- & applies HTML encoding
+- % applies URL encoding
+
+The test encodings ? and ! inhibit normal processing;
+they return immediately after checking the first occurrence of the first tag.
+For example, using a default of all tags (*), the format consisting
+solely of a '?' checks whether a record is empty.
+
+More special characters (but not the '*') may be designated in the future,
+so a format should always start with a tag (possibly explicit *).
+
+
+* decoding mode
+
+An uppercase character before the tag may denote a decoding mode:
+$
+- H heading mode:
+^x is replaced by ';' for x=a, ',' for x=b..i, '.' for others
+angle brackets are removed (>< replaced by '; '), <a> or <a=b> evaluates to a
+
+- D data mode:
+in addition to heading mode, if there is no explicit literal after this field,
+append ' ', if it ends in "punctuation", or '. ' else.
+
+- X index mode
+like heading, but <a> evaluates to nothing and <a=b> to b
+
+- M traditional
+For compatibility, specs reading MHx or MDx (x = L or U) set heading
+or data mode, resp., as default processing (before substringing).
+The case directive is ignored.
+$
+
+
+* child selector
+
+If a tag is immediately followed by a dot '.' and an optional tag,
+the field context is switched, for this spec and following specs separated by ',',
+to loop over the children with the given tag.
+Tag defaults to 0, selecting text nodes in the canonical XML representation.
+A * selects all children, a second . recursively selects all children.
+
+
+* subfield selectors
+
+The primary subfield selector is the hat '^', followed by one character.
+It can produce multiple items, like repetitions of a subfield or keywords.
+
+If the selector character is
+- alphanumeric
+ select the (repetitions of the) subfield tagged with this character.
+- an opening pairing brace
+ i.e. one of '(','{','[' or the angle bracket '<',
+ words between pairs of this brace are selected (commonly keywords).
+- a *
+ selects the part up to the first subfield delimiter
+- a space
+ selects naive words as sequences of alphanumerics
+- a )
+ selects parts between TABs (array mode)
+- other punctuation
+ like / or | selects parts between pairs of this character
+
+
+The percent sign '%' (think printf) works basically like the hat, but
+- removes quotes surrounding values
+- by default treats the TAB as subfield delimiter
+- if followed by a punctuation character or space,
+ treats this plus surrounding whitespace as delimiter,
+ not separating within quotes.
+- if followed by a ),
+ (optionally after another punctuation) goes to array mode,
+ that is there is no subfield indicator stripped from the values
+- if followed by multiple word characters,
+ (including '-' and '_', optionally after an initial punctuation)
+ searches for subfields starting with that sequence followed by '=' or ':'
+
+Examples:
+- '^)' splits at TABs
+- '%)' splits at TABs with quote removal
+- '%a' selects a sequence following a TAB and 'a'
+- '%,)' splits a line of comma separated values
+- '%;*' selects the primary value of a MIME property
+- '%;charset' selects the charset attribute of a MIME property
+
+
+* occurrence selector
+
+By default, all occurrences of fields, children and subfields are used.
+One or multiple occurrences can be selected explicitly following a tag,
+child selector or subfield selector using brackets [] (counting from 1)
+or parentheses (counting from 0) like (i) or (i..j).
+
+- If i is omitted, it defaults to the first (1 or 0, resp.).
+- If j is omitted, it defaults to the last.
+
+Alternatively, occurrences may be selected by contents.
+The general format is an optional subfield selector,
+followed by a comparison operator, followed by a literal.
+Only occurrences where the field or specified subfield matches
+the literal according to the comparison are selected.
+Parentheses select all such occurrences,
+while brackets select the first match
+and default to the first occurrence if none matches.
+
+Operators are
+- = for equality
+- ~ for contains
+- * for starts with
+- + for ends with
+The equality operator may be omitted where unambiguous.
+If some key subfield is known to occur at the start or end of the field,
+it is probably more efficient to test for +^zen than for ^z=en.
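+
+Both kinds of selection can be modelled on a plain Python list of
+occurrence values, as in the sketch below. The names by_range, by_content
+and OPS are hypothetical, parsing of the selector syntax is left out, and
+it is an assumption that the upper bound j is inclusive. The single form
+(i) corresponds to calling by_range with j equal to i.
+
```python
def by_range(occs, i=None, j=None, one_based=False):
    """Select occurrences i..j; brackets count from 1, parentheses from 0."""
    base = 1 if one_based else 0
    lo = (base if i is None else i) - base           # i defaults to the first
    hi = len(occs) if j is None else j - base + 1    # j defaults to the last
    return occs[max(lo, 0):max(hi, 0)]


OPS = {
    "=": lambda v, lit: v == lit,            # equality
    "~": lambda v, lit: lit in v,            # contains
    "*": lambda v, lit: v.startswith(lit),   # starts with
    "+": lambda v, lit: v.endswith(lit),     # ends with
}


def by_content(occs, op, literal, brackets=False):
    """Select occurrences whose value matches the literal."""
    hits = [v for v in occs if OPS[op](v, literal)]
    if not brackets:
        return hits          # parentheses: all matching occurrences
    if hits:
        return hits[:1]      # brackets: the first match
    return occs[:1]          # brackets: first occurrence if none matches
```
+
+With occs = ["en:a", "de:b", "en:c"], by_content(occs, "*", "en") selects
+both "en:" values, while by_content(occs, "*", "fr", brackets=True) falls
+back to the first occurrence, "en:a".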
+
+
+* literals
+
+Each tag, child or subfield selector may be followed by one or more literals.
+Every literal but the / extends to the next occurrence of the same
+special character by which it is introduced.
+This special character may be escaped using a backslash.
+A literal backslash may be escaped as two (but need not, except at the end).
+
+The special character governs when and where the literal is output:
+- " before the first occurrence
+ (of the entity in question; i.e. field, child or subfield)
+- ' before each
+- ` after each
+- | in between (after each but the last)
+- + after the last
+- / this single-character literal starts a new output field after each occurrence
+- ~ this literal is used if the given entity does NOT occur
+
+Literals are not subject to the string modifiers.
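+
+How the literal kinds combine around a list of occurrences can be
+sketched in Python as below. The function name attach and the dict
+representation of the literals are hypothetical, the '/' literal (which
+starts a new output field) is left out, and the relative order of
+literals that fall on the same position is an assumption.
+
```python
def attach(values, lits):
    """Join occurrence values with their literals.

    lits maps the introducing character (" ' ` | + ~) to its text.
    """
    if not values:
        return lits.get("~", "")              # entity does not occur
    out = []
    last = len(values) - 1
    for n, v in enumerate(values):
        if n == 0:
            out.append(lits.get('"', ""))     # before the first occurrence
        out.append(lits.get("'", ""))         # before each
        out.append(v)
        out.append(lits.get("`", ""))         # after each
        # in between all but the last, then the after-last literal
        out.append(lits.get("|" if n < last else "+", ""))
    return "".join(out)
```
+
+For example, attach(["a", "b"], {'"': "By: ", "|": ", ", "+": "."})
+gives "By: a, b." in this model.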
+
+
+* substring selector
+
+Introduced by a colon ':', it has the form :l or :o.l, where o and
+l are integers denoting an offset and length to cut from the currently
+selected value.
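+
+A substring selector maps naturally onto a slice, as in this Python
+sketch. The function name substring is hypothetical, and it is an
+assumption that the offset o counts from 0 and defaults to 0 in the
+short form :l.
+
```python
def substring(value, spec):
    """Cut a substring; spec is the text after the colon: "l" or "o.l"."""
    offset, dot, length = spec.rpartition(".")
    o = int(offset) if dot else 0   # no dot: only a length was given
    n = int(length)
    return value[o:o + n]
```
+
+Under these assumptions, substring("OpenIsis", "4") gives "Open" and
+substring("OpenIsis", "4.4") gives "Isis".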
+
+
+* extension exits
+
+An exit is a C-function (i.e., using C calling convention) in a dynamic library.
+TODO: describe interface.
+
+
+* script evaluation
+
+If a scripting environment like Tcl is available,
+a {} block may contain a script to be evaluated.
+TODO: describe interface.
+
+
+---
+ $Id: Views.txt,v 1.3 2003/06/02 07:49:08 kripke Exp $