Thursday, January 20, 2011

Interfaces and Models

In my last post, I argued that TEI is a text modelling language, and in the prior post, I discussed a frequently-expressed request for TEI editors that hide the tags. Here, I'm going to assert that your editing interface (implicitly) expresses a model too, and because it does, generic, tag-hiding editors are a losing proposition.

Everything to do with human-computer interfaces uses models, abstractions, and metaphors. Your computer "desktop" is a metaphor that treats the primary, default interface like the surface of a desk, where you can leave stuff laying around that you want to have close at hand. "Folders" are like physical file folders. Word processors make it look like you're editing a printed page; HTML editors can make it look as though you're directly editing the page as it appears in a browser. These metaphors work by projecting an image that looks like something you (probably) already have a mental model of. The underlying model used by the program or operating system is something else again. Folders don't actually represent any physical containment on the system's local storage, for example. The WYSIWYG text you edit might be a stream of text and formatting instructions, or a Document Object Model (DOM) consisting of Nodes that model HTML elements and text.

If you're lucky, there isn't a big mismatch between your mental model and the computer's. But sometimes there is: we've all seen weirdly mis-formatted documents, where instead of using a header style for header text, the writer just made it bold, with a bigger font, and maybe put a couple of newlines after it. Maybe you've done this yourself, when you couldn't figure out the "right" way to do it. This kind of thing only bites you, after all, when you want to do something like change the font for all headers in a document.

And how do we cope if there's a mismatch between the human interface and the underlying model? If the interface is much simpler than the model, then you will only be able to create simple instances with it; you won't be able to use the model to its full capabilities. We see this with word processor-to-TEI converters, for example. The word processor can do structural markup, like headers and paragraphs, but it can't so easily do more complex markup. You could, in theory, have a tagless TEI editor capable of expressing the full range of TEI, but it would have to be as complex as the TEI is. You could hide the angle brackets, but you'd have to replace them with something else.

Because TEI is a language for producing models of texts, it is probably impossible to build a generic tagless TEI editor. In order for the metaphor to work, there must be a mapping from each TEI structure to a visual feature in the editor. But in TEI, there are always multiple ways of expressing the same information. The one you choose is dictated by your goals, by what you want to model, and by what you'll want the model to do. There's nothing to map to on the TEI side until you've chosen your model. Thus, while it's perfectly possible (and is useful,* and has been done, repeatedly) to come up with a "tagless" interface that works well for a particular model of text, I will assert that developing a generic TEI editor that hides the markup would be hard task.

This doesn't mean you couldn't build a tool to generate model-specific TEI editors, or build a highly-customizable tagless editor. But the customization will be a fairly hefty intellectual effort. And there's a potential disadvantage here too: creating such a customization implies that you know exactly how you want your model to work, and at the start of a project, you probably don't. You might find, for example, that for 1% of your texts, your initial assumptions about your text model are completely inadequate, and so it has to be refined to account for them. This sort of thing happens all the time.

My advice is to think hard before deciding to "protect" people from the markup. Text modeling is a skill that any scholar of literature could stand to learn.

UPDATE: a comment on another site by Joe Wicentowski makes me think I wasn't completely clear above. There's NOTHING wrong with building "padded cell" editors that allow users to make only limited changes to data. But you need to be clear about what you want to accomplish with them before you implement one.

*Michael C. M. Sperberg-McQueen has a nice bit on "padded cell editors" at

No comments: