Hi Desmond, really nice to have your input, even a...

2013-09-12T17:03:37.255-04:00

Hi Desmond, really nice to have your input, even a couple of years later! You seem to say that TEI compatibility is next to impossible, but I don't think that's the case. I do think it entails either agreement to work to the same standards or crosswalking between projects, and I certainly agree that it's not something you get for free. I'd forgotten about that Bamboo paper, but I look at it and giggle a little hysterically at the idea anyone would think TEI structured bibliography is that simple an animal.

I think too that it depends what you mean by "interoperable". I do agree that projects which seem to have the idea that you can take all the TEI texts in the world, throw them into some sort of bin, and then by some automatic process distill something useful from that are being naïve. I don't really see the point of that sort of project though...

Given your background, you'll understand that the majority of the texts I deal with (papyri and inscriptions) are your exceptions :-).

If I understand your last argument, it's that TEI is useless because you can't use it to produce some sort of critical hyperedition where every textual variation is recorded and aligned. I don't think TEI would be very useful for that sort of project. I can imagine having a TEI base text and recording variants as standoff annotations on that base—and in fact I've played around with doing that sort of thing to model manuscript collation—but I don't think doing the whole thing in TEI would be sensible. Actually, I think it would be nutty. And doomed.
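To make the standoff idea concrete, here is a minimal sketch in Python. It is only an illustration of the general approach, not the code I actually experimented with: the `Variant` class, the `apply_variants` function, the sigla "B" and "C", and the sample sentence are all made up for the example. Each variant points at a character range in the base text, so the base text itself stays free of variant markup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A hypothetical standoff annotation: one witness's reading for a span of the base text."""
    start: int    # character offset into the base text (inclusive)
    end: int      # character offset into the base text (exclusive)
    witness: str  # made-up siglum identifying the witness
    reading: str  # what the witness has in place of base[start:end]


def apply_variants(base: str, variants: list[Variant], witness: str) -> str:
    """Rebuild one witness's text from the base text plus that witness's variants."""
    own = sorted((v for v in variants if v.witness == witness), key=lambda v: v.start)
    pieces = []
    pos = 0
    for v in own:
        if v.start < pos:
            raise ValueError("overlapping variants for a single witness")
        pieces.append(base[pos:v.start])  # unchanged stretch of the base text
        pieces.append(v.reading)          # the witness's own reading
        pos = v.end
    pieces.append(base[pos:])
    return "".join(pieces)


base = "the quick brown fox"
variants = [
    Variant(4, 9, "B", "swift"),    # B substitutes a word
    Variant(10, 15, "C", "red"),    # C substitutes a different word
    Variant(19, 19, "B", " runs"),  # B adds text at the end (zero-width span)
]

print(apply_variants(base, variants, "B"))
print(apply_variants(base, variants, "C"))
```

Because the annotations live outside the text, different witnesses can vary over overlapping spans without any of the hierarchy problems that embedded markup would run into. The base text in a real project could be the text content of a TEI document, with offsets or pointers resolved against it.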

But that's not how critical editions work. An edition is a single reading derived from 1-n sources, where the editor chooses to surface only a subset of the source variations, their own conjectures, and those of earlier editors (and only those they consider relevant) in the apparatus, not the text. TEI is perfectly capable of representing that sort of edition, or an edition of a single manuscript.

So I think the variance argument is a red herring. TEI has its flaws, and indeed there are serious problems with some of its encoding recommendations. You don't get free interoperability with it and it's not suitable for every text-based project. None of that makes it useless though.

Thanks again for taking the time to reply to an old blog post! I enjoy thinking about this stuff and would love to continue the discussion.

I realise this post is really cold now, but I must...

2013-09-12T15:15:38.978-04:00

I realise this post is really cold now, but I must disagree with your assessment of TEI interoperability: "Compatibility is certainly achievable if both documents follow the same set of conventions" is, I think, a tad ambitious. Documents, after all, don't follow anything. It is people who follow conventions or not, and therein lies the problem. Patrick Durusau points out that there are more than 4 million ways to transcribe a single sentence taken from a printed book using the TEI Guidelines. Given that kind of variation, the chance that any two people would encode the same features in the same way is just about zero. As Allen Renear pointed out, a TEI tag added to describe an analog document has a completely different illocutionary force from exactly the same tag created by the author of an electronic document as part of his text. The first tag is pure interpretation, the second is pure fact. So in my view TEI texts cannot ever be interoperable. It's funny that a lot of people think they ought to be, or act as if they were. If you look at projects like TextGrid or the British version TextGrid VRE, or the TAPAS project, you'll see this assumption underlies the whole project. But it's a big mistake. You only have to look at Project Bamboo to see what happens when you try to make TEI texts interoperate. Or the original grant proposal of the TEI, where they make interoperability one of their goals, and then abandon it 25 years later. Or the chilling assessment of TEI's uselessness by the chairman of the TEI board.
As for your objection to my supposed "assumption that there is a text independent of the editorial apparatus. Maybe there is sometimes, but I can point at many examples where there is no text, as such, only readings. And a reading is, must be, an interpretive exercise." I guess you are thinking of fragmentary texts - but they are the exception, not the rule. I didn't ever say that you couldn't annotate a text containing computed variation. What I said was that although computerised comparison could never be perfect, it was still far more accurate than manually created variant recording using embedded markup. The reason is that recording variants using XML may seem to give you more freedom, but in fact it powerfully restricts the kinds of variation you can record. What about texts with 100 versions? Or major transpositions? Or variations in the markup itself? Or non-hierarchical variation? You soon get tied up in knots trying to represent that in XML, so that the automatic collation works out to be much better. Are you really saying that automatic comparison serves no purpose? I think you mean that you want to annotate variation, and you can, but you don't need embedded markup to do it. A reading that is literally different from some other reading is just different. There's no interpretation required to see that, and having that information to hand has got to be useful.
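As a rough illustration of what purely mechanical comparison looks like, here is a small Python sketch using the standard library's `difflib`. It is not a collation tool and not the software I have in mind above; the function name `diff_readings` and the sample sentences are invented for the example. It simply reports where two versions literally differ, word by word.

```python
import difflib


def diff_readings(a: str, b: str) -> list[tuple[str, str, str]]:
    """Return (operation, words_in_a, words_in_b) for every place where a and b differ."""
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(a=a_words, b=b_words, autojunk=False)
    return [
        (tag, " ".join(a_words[i1:i2]), " ".join(b_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replace / delete / insert
    ]


version_a = "the quick brown fox"
version_b = "the swift brown fox"

for op, left, right in diff_readings(version_a, version_b):
    print(op, repr(left), repr(right))
```

A plain sequence diff like this has no notion of a move, so a transposed passage would show up as a deletion in one place and an insertion in another. Handling transpositions, many versions, or variation in the markup itself needs something more specialised, but the point stands that the raw differences can be computed without anyone typing them in as tags.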

Comments on Scriptio Continua: TEI is a text modelling language
