Converting TEILite files to XML

Files to be converted should be normalized (all elements closed); element names may be upper or lower case, but the start and end tags of each pair must match in case. Files from UMich HTI are already normalized. First we make the file viewable in a browser; then we add HTML elements, or convert TEI elements to them, to display images and set A anchors.

I.

  1. Comment out the DOCTYPE declaration and all local declarations (entities, stylespecs, navigators, the works) and begin the file with the following two lines:
    <?xml version="1.0" encoding="UTF-8" ?>
    <?xml-stylesheet type="text/css" href="[path to cethtei.css]" ?>

    "xml" must be in lower case, with no space between the opening "<?" and the name, and the stylesheet instruction is spelled "xml-stylesheet" with a hyphen. UTF-8 is the default encoding, but declare it anyway. Note the ?'s at both ends.

  2. Close all milestones and page and line breaks (MILESTONE, PB and LB) with the trailing "/>", so that, for example, <LB> becomes <LB/>.

  3. Use "replace-string" (Emacs) or any search-and-replace command in your editor to change the ISO/TEI character entity names to decimal Unicode character references (e.g. &mdash; --> &#8212;).

    You can't use TEI-style character entities unless the DOCTYPE declaration and the rest are left in. With them left in, MSIE5 acts like a validating SGML browser, which in most cases you would rather it did not. You still have to hope for Unicode (or WGL4) font support on the browsing machines, but that is offered along with the MSIE5 download as part of the "Internet upgrades."
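
Steps 2 and 3 are mechanical enough to script when there are many files. The following is only a minimal sketch in Python, not part of the original procedure: the function names and the three-entry entity table are illustrative assumptions, and the table would have to be extended with whichever entity names your files actually use. It also assumes no attribute value contains a ">" character.

```python
import re

# Hypothetical sample table; extend it with the ISO/TEI entity names
# that actually occur in your files.
ENTITY_TO_CODEPOINT = {
    "mdash": 8212,   # em dash
    "ndash": 8211,   # en dash
    "eacute": 233,   # e with acute accent
}

# Elements that never have content and so must be written <PB .../> in XML.
EMPTY_ELEMENTS = ("MILESTONE", "PB", "LB")


def close_empty_elements(text):
    """Rewrite <PB N="3"> as <PB N="3"/>; already-closed tags stay closed."""
    names = "|".join(EMPTY_ELEMENTS)
    pattern = re.compile(r"<(%s)\b([^<>]*?)\s*/?>" % names, re.IGNORECASE)
    # group(1) keeps the element name in whatever case the file used.
    return pattern.sub(lambda m: "<%s%s/>" % (m.group(1), m.group(2)), text)


def replace_entities(text):
    """Swap known named entities for decimal references; leave the rest."""
    def swap(match):
        name = match.group(1)
        if name in ENTITY_TO_CODEPOINT:
            return "&#%d;" % ENTITY_TO_CODEPOINT[name]
        # Unknown names (including XML's own &amp; and &lt;) are untouched.
        return match.group(0)
    return re.sub(r"&([A-Za-z][A-Za-z0-9.]*);", swap, text)
```

Running replace_entities(close_empty_elements(text)) over the contents of a file applies both steps. Any entity name missing from the table is left in place, so a search for "&" afterwards shows what still needs a table entry.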

At this point, you should have something that displays in MSIE5 or Mozilla, though with no images or links.

II.

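  1. To include HTML markup, add a namespace declaration to the root <TEI.2> like so: <TEI.2 xmlns:html="http://www.w3.org/1999/xhtml">

    The URL is never fetched, but one is required. The prefix is case-sensitive and must match the "html:" used on the elements below. Namespace-aware browsers such as Mozilla compare the URL character for character, so keep to the standard lower-case spelling shown here.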

  2. To use IMG to render images (FIGUREs), replace the FIGURE tags with a single IMG tag with a closing "/": <html:img src="images/emersonpic.jpg" />

    Most attributes of the HTML IMG element can be used, except the JavaScript event attributes: WIDTH, HEIGHT, ALIGN, HSPACE, VSPACE and so on.

  3. To convert XREF and REF anchors into A anchors, again use the "html:" prefix, this time <html:a href= [etc]>, and likewise </html:a> to close the anchor. Again a string search for <REF TARGET="..."> works, replacing it with <html:a href="..."> and so on for the closing tags. Very fortunately, the MSIE5 and Mozilla browsers accept ID as a target for HREF links, so you don't have to alter the target of an html:a link to make it a "NAME" target (unless you want to make a return link from target back to source, a nice way to approximate the bidirectional ID/IDREF support in Panorama and others).
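
The REF search-and-replace in step 3 can be sketched in Python as well. This is an illustration under stated assumptions rather than a tested converter: the function name is made up, it only handles REF tags whose sole attribute is TARGET, and it assumes TARGET holds the ID of an element in the same file, which is why the href is given a leading "#".

```python
import re


def refs_to_anchors(text):
    """Turn <REF TARGET="x">...</REF> into <html:a href="#x">...</html:a>."""
    # Assumption: TARGET names an ID in this same file, hence the "#".
    opened = re.sub(
        r'<REF\s+TARGET="([^"]*)"\s*>',
        r'<html:a href="#\1">',
        text,
        flags=re.IGNORECASE,
    )
    # Closing tags carry no attributes, so a plain swap is enough.
    return re.sub(r"</REF\s*>", "</html:a>", opened, flags=re.IGNORECASE)
```

REF tags that carry other attributes do not match the first pattern and keep their opening tag, while their closing tags are still rewritten, so check the output for stray <REF before trusting it. XREF and FIGURE are left out because their replacements depend on file names that have to be supplied by hand.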