Content, Structure, Webbiness:
Elements of a Rhetoric of HTML

Introduction: "Lord, you know it ain't easy"

1. The alleged simplicity of HTML

 
Signs of the times:
  • Netscape Communicator comes out, enabling us to mark up our email in HTML;
  • MS Word 97 comes out with a prominent option in the File dropdown menu "Save as HTML";
  • Add-on manuals for English Composition appear teaching HTML in 30 to 60 pages.

And so the how-to promotion is in full swing: HTML is easy, HTML opens incredible riches, HTML doesn't require you to program, HTML is your friend--a sort of Disneyfied Aladdin's lamp but easier to operate. All of these are true, in their way, but offered as the whole story they tend to obscure what is tense and edgy in HTML as a medium and how it makes us rethink what texts and text-making are all about--what might make it interesting, in short, to write in. Yes, HTML is used by computer scientists and action artists, archivists and activists, but this does not prove its neutrality, though it may testify to its adaptability. Read the lists, the handbooks, the reflective articles, and you find contrary claims being made about the medium and contrary practices in it. It is here that the story of HTML's simplicity runs out and interesting issues begin to emerge. You can see HTML at the center of a number of conflicting forces, tendencies, allegiances, among which are those touched on in the initial graphic:

  1. searching a database vs. exploring a maze
  2. public interest/domain vs. private enterprise/property
  3. text content vs. image (display)
  4. reader choices vs. author direction
  5. universal reader vs. techie elite

1.1 Database and maze

*e.g. the Info documentation for Emacs, still on line on most UNIX computers, and WinHelp, the Windows help engine.

 

*see the very widely linked Yale Center for Advanced Instructional Media Web Style Guide by Patrick Lynch and Sarah Horton

 

*see Nicholas C. Burbules, "Rhetorics of the Web: hyperreading and critical literacy" in Ilana Snyder, ed. Page to Screen, Routledge, 1998: 102-22.

 

1.1 One early and widespread use of hypertext was for computer documentation*, and today HTML is taking over that function and retains some of the assumptions and outlook of information seekers. Hypertext was a necessity of early on-line documentation, since a screenful at that time was not very much: you had to link a whole lot of small screenfuls, each of which replaced the other when you "jumped" to it (as was the case on the Wintel side of things until multiple concurrent screens became possible with Windows 95). Technical information is not as strongly ordered linearly as many other genres such as narrative fiction, argument, or even instruction, so little was lost in a medium that made jumps from one small cluster to another. Put another way, information in documentation consists largely of lists and descriptions of commands and functions. Writers on HTML who come from this information retrieval perspective* emphasize clear and regular navigational apparatus and document structure (e.g. everything that is a link should look like a link)--in short, the structure should disclose itself in standard, conventionalized ways. And, it is hardly necessary to add, the links should not be digressive or associative, but should lead the reader down a tree of descending generality, the hierarchy providing a more efficient way of finding one object among many others than a linear search. The preferred images of structure at this end of things are outlines*, tables (real tables, not layout grids), and databases. The reader is assumed not to be just sampling or browsing but to be pursuing an exhaustive search--trying to find all the information on the chosen topic that is available on the site.

But there is another traditional engagement with the computer screen that plays by different rules for different purposes: that of adventure game players--the many descendants of Dungeons and Dragons who now sponsor online versions of Choose Your Own Adventures, Myst/Riven (slow speed) or Quake II (with graphic acceleration), and surf for the sheer vertigo. Here the reader is presumed to be exploring a maze whose structure is not candidly, even earnestly, disclosed, and who may enter a room to find nothing, a monster, or jewels. Browsing and surfing--the words suggest recreation and leisure, the activity not goal-directed but for its own sake. To write for this mode is to hide and scatter links around a page for the reader to discover, to set border=0 on all tables, images, and frames, to hide the machinery and enhance the magic. Ars est celare artem.
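As a rough illustration of hiding the machinery, here is a minimal sketch in HTML of the period. The file names (treasure.html, door.gif) and the cell contents are invented for the example; only the border=0 setting comes from the text above.

```html
<!-- A layout table with its grid hidden: border="0" removes the visible rules. -->
<table border="0" cellpadding="8">
  <tr>
    <td>You are in a small room. There is nothing here.</td>
    <td>
      <!-- An image used as a link: border="0" suppresses the colored
           outline that would otherwise announce "this is a link". -->
      <a href="treasure.html"><img src="door.gif" alt="a door" border="0"></a>
    </td>
  </tr>
</table>
```

Change either border to 1 and the structure discloses itself again, which is what the documentation writer would prefer.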

1.2 Public/private

 

*maybe permission, but how many permissions to use original digital images have you seen acknowledged on the net?

 

*www.altx.com --"where the digerati meet the literati" --however warily and briefly

* Xanadu/xanadu; transcopyright and all

1.2 The UNIX culture, which has supported most of the Web servers and Internet Service connections, is the home breeding ground of the Free Software Foundation, X Window, and a staggering amount of free software, and it mated very readily with the "public interest" restrictions of the old Arpanet/Bitnet mentality. It produced a culture of potlatch, in which your prestige is reckoned by what programs you have written and given to the community, or other service you have provided. By that standard, one of the Web's greatest heroes is Larry Wall, the inventor of Perl. The first principle of the Free Software Foundation is that anyone who distributes a program licensed under their General Public License must also distribute the source, for it is by keeping the source secret that private companies maintain control of their products. With HTML, of course, the design required the downloading of source, including separate files for graphic and audio elements, so that nothing is easier to copy and reuse than somebody else's HTML code. Further, nothing is easier to grab than someone's images. What difference does it make whether you make an inline link to an image in somebody else's directory or download the image into your directory and link to it there? What if the other person (the "owner") moves it, or has slow or unreliable service? The Web is very casual about ownership of pages and sites: many pages aren't even signed and many more are not even dated, and without these, you cannot establish a claim of priority in court, and hence defend a copyright. But there is a reason for the lack of dating, which is that Web documents are expected to change, to be revised, corrected, updated. An un-updated page is an abandoned page. Web pages really have no date of completion and hence are rather unlike the "objects" that copyright law contemplates. And frankly, there is no great art or ingenuity involved in HTML markup, hence little originality to be ripped off.

More individual effort can be involved in the making of images or scripts, and some of the latter are offered for sale, so a sense of fair reward for labor comes into play, or at least acknowledgement. The code of the net in fact requires generous thanks and praise.* But the basic view is that in putting up original material, you are just paying back the debt you incurred to others when you browsed their freely posted work.

Such thinking makes perfect sense to some people, especially anarchist, idealist, Scandinavian coop-ers, but people have successfully claimed rights of property over their artistic and intellectual products for a couple of centuries now, and can be downright unpleasant if they take exception to your poaching on their domain. Publishers of printed works derive income from the copyrights they hold or defend for their authors, and providing free, digital copies constitutes 'reproducing' the work and is forbidden unless permitted by the publisher for the work in question. Since no one is at all sure what if anything "fair use" is, only works not under copyright can safely be reproduced or cited online. The effect of this standoff may be to deepen the already enormous chasm between online works and those in print.* They laughed, and still do, when Ted Nelson proposes assigning units of value to hits,* but as long as people hope to make money writing books and articles, they will not give their best stuff away on the Net, and the Net will be the place for free chapters and other teasers for the real thing--BOOKS.

1.3 Content/display

*Gunther Kress and Theo van Leeuwen, Reading Images, Routledge, 1997, pp.

*Alertbox, Oct. 97; Alertbox, Mar. 97

*Gunther Kress, "Visual and verbal modes of representation in electronically mediated communication" in Ilana Snyder, ed., Page to Screen, Routledge, 1998: 53-80.

*Many people still put up papers tagged with <PRE>

 

1.3 It has often been said that the ratio of text to image functions in print culture as an index of seriousness: the higher, the more serious.* The lowest ratios are found in picture books (of course), then books for beginning readers, and comic books. On the Web, however, text with no image signifies "clueless academic." Over and over Jakob Nielsen says "make your text 50% of what it would be as conventional text."* So the writer for the Web has several extra skills and sensibilities to develop.*

At present, predominantly text pages tend to be almost wholly text; other media are just not there, and even the text markup is minimal.* This is especially the case with literary sites: you get poems, stories, novels, all with few or no images and no hypertext links to other passages, comment or footnote popup windows, or other sites. Straight from OCR to Net. This reflects an archival mentality ("don't alter the text!"), and ducks the big question:

Do we really expect people to read (literary) texts online? Do we not hit the print button as soon as we see even a pageful of wall-to-wall text? What readership can you expect for online hyperfiction unless there is a goodly body of readers of online fiction, perhaps especially classics? Here there seem to be two great musts:

  1. writers must design texts for monitor display. They should use columns or other ways of narrowing the text block on the screen, margins, type fonts, colors, size, and leading to give the effect they want. Time was, it was impossible to control most of this, and good advice was not to become overly obsessed with what you can't control. Now, however, with Cascading Style Sheets and dynamic fonts, more of these display characteristics are coming under control.
  2. writers must make online text more rewarding to read than paper text: they should use the devices of popup windows for footnotes, secondary windows, charts, and illustrations. They should use hypertext links to tie in other, related online texts and images, even at the cost of potentially losing a reader to the other text. This is the way to take advantage of the medium. This text you are reading offers itself as an example.
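A sketch of the first point, assuming a browser with style sheet support. The class name "essay" and every particular value below are illustrative assumptions, not recommendations drawn from the sources cited here.

```html
<html>
<head>
<title>Narrow text block (sketch)</title>
<style type="text/css">
  /* Narrow the text block and give it a margin so it does not run wall to wall. */
  .essay {
    width: 32em;        /* a fairly short line; adjust to taste */
    margin-left: 10%;
    font-family: Georgia, "Times New Roman", serif;  /* tried in this order */
    font-size: 12pt;
    line-height: 1.5;   /* leading */
    color: #333333;
  }
</style>
</head>
<body>
<div class="essay">
  <p>Text set in a narrow column is easier to read on a monitor
  than text that fills the whole window.</p>
</div>
</body>
</html>
```

Because the rules sit in the style sheet and not in the tags, the same paragraph can be restyled without touching its markup.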

Even so, no format, however ingenious, can display a document so that it has maximum effect both in print and on the screen. Thus the paper form of, say, this document will seem thin or dense but in any case underdeveloped as an article of print. It should perhaps be thought of as an abstract of a virtual printed document.

Markup languages themselves are poised between marking up content according to its units of structure and marking it up for display. Just about the time HTML writers were ingeniously wracking standard markup to control display, Cascading Style Sheets received enough browser support to move physical markup back to a separate part. Of this more in the next chapter.

1.4 Choice/direction

 

*See Chapter 2

 

*or "jump", an old Windows Help and Multimedia Viewer name for it

 

*See George P. Landow, Hypertext 2.0, Johns Hopkins University Press, 1997: 52, 53, 100, 101, 134, 136, 173, 213, 244, 259

 

*e.g. The Iam poem from Dada Net Circus (chapter 3)

 

* The mechanism typically is not a hypertext href anchor but a META refresh tag in the HEAD

1.4 Reader choice, it is generally held, is the very essence of hypertext. There are plenty (too many!) information kiosks around! As sites get bigger, people are inclined to make sitemaps of possible traversals of the pages connected by hypertext links. These maps are usually based on one of several visual metaphors or schemes, such as a branching tree, a network, or a flow diagram.* Most sites enable multiple paths through the site, including exits to external sites (with possible returns via the Back button or the browsing records the machine keeps). A very crude measure of the amount of choice in a site is the ratio of links to pages: if it is one (1) or close to one, the site offers the minimal degree of choice for the reader; it is a box-car train or tunnel full of files linked only with a "Next" anchor. Here the reader can only go forward one step at a time, or one back.

In the classic view of hypertext, a choice to "go"* via a link replaces the current material on the screen with the material of another page completely and without a trace of the first page remaining. The two pages are two "places" and the reader cannot choose to be two (or more) places at the same time. The display screen is like the screen in a slide show. This classic view in fact reflects early hardware limitations on displays, the lowest common denominator design of the early Web browsers, and the general design of MS Windows before Win95. Now that JavaScript allows us to open several windows, size, resize, and place them and alter their contents, the reader's choices are potentially greatly increased. Readers can now experience multiple windows such as those that have been available for quite some time on X Window terminals and Macintoshes--displays that George Landow so frequently uses in his books to illustrate the hypertext writing space as collage and montage.* That is, you can venture forward without your starting point vanishing behind you, and you can explore both branches of a forked path, keep them both on screen, and compare them. You can open a window with notes and comments, or illustrations.
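A minimal sketch of such a secondary window for a note, assuming a JavaScript-enabled browser. The function name openNote, the file note1.html, and the window size are invented for the example; window.open is the standard call, though present-day browsers may block or restyle popups.

```html
<script type="text/javascript">
  // Open (or reuse) a small window named "notes" and load a page into it.
  // The main page stays where it is, so the starting point does not vanish.
  function openNote(url) {
    var w = window.open(url, "notes", "width=320,height=240,scrollbars=yes");
    if (w) {
      w.focus();
      return false;  // popup opened: cancel the ordinary link jump
    }
    return true;     // popup blocked: let the link replace the page as usual
  }
</script>

<p>Reader choice is the essence of hypertext
<a href="note1.html" onclick="return openNote(this.href);">[note]</a>.</p>
```

The link still carries an ordinary href, so a reader without JavaScript falls back to the classic one-page-replaces-another jump.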

Just as Landow's technology allowed him a certain foresight into what was to become available on virtually every desktop computer, so it limited him to thinking of hypertext sites as self-contained labyrinths. Lacking HTTP, his IRIS project at Brown could become rich and complex, but always self-contained, always under the control of the web master. It reflects the isolation of the studio, the museum, and the gallery--the three spaces we associate with art. In comparison, the Web is a street fair. And similarly, as he moved from mainframe to distributed computing on desktop computers and the Web, he stayed with the proprietary authoring software StorySpace, which is not HTML and not free.

Thus multiple screens appear, increasing the reader's options. They are basic to the new, trendy Dynamic HTML, which offers ways of opening them by mouse actions (controlled by the reader) or by system events and timing (controlled by the program--ultimately the author). Server "push" can replace one page with another on your screen, or one image with another, or one style with another, sometimes linking the pages with fades and dissolves.* Pages can take you for a ride, sometimes one you cannot stop. These new options may make browsing more exciting and surprising than it used to be, but notice that we have reversed direction on the choice parameter: the reader becomes a viewer or spectator as she so frequently is before the video screen. So perhaps we should say:* a page that automatically replaces a previous one is not hypertext-linked to it.
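For reference, the META refresh mechanism mentioned in the note looks roughly like this. The five-second delay and the file name page2.html are assumptions made for the example.

```html
<html>
<head>
<title>Page one</title>
<!-- After 5 seconds the browser replaces this page with page2.html.
     The reader has made no choice; the author set the timer. -->
<meta http-equiv="refresh" content="5; url=page2.html">
</head>
<body>
<p>Wait for it...</p>
</body>
</html>
```

An ordinary anchor to page2.html would leave the timing, and the decision to go at all, with the reader.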

1.5 Universal reader/techie elite

1.5 It may well be that the Internet is more nearly universal than any medium and body of texts has ever been and that the issue of "access" is basically just a matter of getting computers into people's hands and getting Internet Service Providers for the computers, so that the Net-wired computer takes its place alongside the family tube (or even, indeed, merges with the tube!). But this view erases the manifold differences in hardware and software that exist at any time between the digitally well furnished and those living on hand-me-downs with no sound cards (or no cards they are allowed to use), smaller monitors, and limited graphics capacity, with little ability to debug, repair, or enhance the system. Similarly, new web browsers support more expressive options, but the rate of adoption of the new, level four browsers has tailed off, so that a semi-stable division of the world into low-tech have-nots and high-tech haves seems to be emerging. For whom will you write?

When you choose to write in HTML, you choose to work in a zone of contestation and every choice you make tips ever so slightly in one direction or another. This field of forces is extremely unstable, constantly offering more expressive possibilities and greater author control, while making earlier work look stodgy and amateurish. At the same time, however, writing using features anywhere near "the edge" will lack impact on the many machines running pre-level 4 browsers. At present, the future of Cascading Style Sheets, JavaScript, and dynamic HTML seems relatively assured and perhaps half the machines in the field are able to browse these pages as written. But HTML that takes advantage of these increasing possibilities is not so easy to learn or write. Using tables to control layout, for example, made HTML twice as hard to write. Even a modest use of JavaScript--twice again. And so on. So it is hard to keep one's mind on content while learning to use the latest expressive options, or to put it another way, to make the options one is exploring expressive of something.

Summary

 

X Window, like Macintosh, has little use for TrueType Fonts at present; fortunately Dynamic Fonts allow me to specify a font that all the more recent browsers, regardless of the platform they are running on, can load and display.

 

Much could be done with boxes and the CSS, but I am not sure how reliable that is yet.

In a way, all of this explains what lay behind the choices I made when writing this document.

  • I wanted to achieve layout control, to avoid wall-to-wall prose, and yet to use moderately wide walls (800 pixels) which would fit within what is said to be the most common resolution and screen size. However, if someone had to view it at a width of 640 pixels, they could slide the window over to the right and give up easy reading in the notes column.
  • I wanted a font and font size that could be both viewed easily and printed, and developed a liking for Microsoft's Verdana font which was designed for viewing and which they press upon people to download. However, it is very big for its nominal size (its 10pt = 12pt of most others), which means that when other fonts are substituted for it on machines which don't have it, they will be too small and the layout will be messed up. So I gave up on lovely light Verdana and settled for another of Microsoft's new model fonts, Trebuchet MS.
  • I did not want to write for one platform, however, since my preferred setup is in Linux/X Window, and going back and forth was a good reminder of how non-general many things are. Similarly, I made a choice to write for level four browsers, even though they are only installed on at most half of all online computers, because I wanted the enhanced layout control of Cascading Style Sheets as well as various bits of Dynamic HTML.
  • I wanted to exploit the reader's being on line to make various hyperlinks and to create various kinds of secondary windows which optionally amplify points. These links and options hanging off of each screenful create, I hope, a sense of a third dimension to the page, but of course these things cannot be directly experienced as you read the printed page. The closest thing in print media might be a child's popup book, or one where there are doors and windows and chests for you to open. So the printed version is also only an abstract of what can be experienced on line.
  • I decided to use tables to further control layout. At first I used hidden tables, but would set the borders to "on" when I was debugging the pages because the borders and cells made the structure of the document much more evident. Finally, I decided to offer it that way for a first reading.
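The choices in the list above might come out something like the following sketch. It is a reconstruction under assumptions, not the markup of this document: the table width of 800 is read as pixels (which is how the width attribute counts), and the 550/250 split, the class name "notes", and the fallback fonts listed after Trebuchet MS are all guesses made for illustration.

```html
<html>
<head>
<title>Two-column layout (sketch)</title>
<style type="text/css">
  /* If Trebuchet MS is missing, the browser tries each fallback in turn. */
  td { font-family: "Trebuchet MS", Arial, Helvetica, sans-serif; font-size: 10pt; }
  td.notes { font-size: 9pt; }
</style>
</head>
<body>
<!-- border="1" shows the cells while debugging (or for a first reading);
     set it to "0" to hide the grid. -->
<table border="1" width="800" cellpadding="6">
  <tr valign="top">
    <td width="550">Main text goes here, well short of wall to wall.</td>
    <td width="250" class="notes">*A note in its own column.</td>
  </tr>
</table>
</body>
</html>
```

At a window width of 640 the notes cell is the part that slides out of view, which matches the trade-off described in the first item.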

So. Was it worth it?


dillon@u.washington.edu
On to chapter two: Markup