Tuesday, December 28, 2010

DH Tea Leaves

From reading my (possibly) representative sample of DH proposals, I'd say the main theme of the conference will not be "Big Tent Digital Humanities" but "data integration". Of the 8 proposals I read, more than half were concerned with the problems of connecting data across projects, disciplines, and systems. My own proposal was too (making 9), so perhaps I did have a representative sample.

Data integration is a meaty problem, resistant to generalized solutions. To my mind the answers, such as they are, will rely on the same practices that good data curation uses: open formats, open source code, and good documentation that covers the "why" of decisions made for projects as well as the "how." Data integration is a process: you have to understand the sources and the semantics of their structures before you can connect them. So, while there are tools out there that can enable successful data integration, there are (as usual) no silver bullets. Grasping the meanings and assumptions embodied in each project's data structures has to be the first step, and this is only possible when those structures have been explained.

jodischneider.com said...

The word 'integration' in this context makes me think immediately of semantic technologies and the semantic web.

Of course, the same work on standardizing vocabularies and sharing underlying code could go on under other banners!

Digital humanities folks interested in interoperability may want to check the "From Metadata to Linked Data summer school" -- I'm not affiliated with it, but I can point you to their webpage: