I’ve just spent a couple of days in Belfast attending the Digital Resources for the Humanities and Arts (DRHA) conference. This is a conference for people who work in a relatively new discipline called digital humanities – which is something not many people have heard of and, I suspect, even those who have may be unsure what it actually is. So I thought this would be a good excuse to put down some of my own thoughts about digital humanities.
What are digital humanities, then? Simply, in digital humanities, you build computer applications to work with data in disciplines such as history, literature and art. For example, the recently released Irish 1911 Census is a typical project in digital humanities: it involves digitizing lots of hand-written documents, annotating and indexing them, designing a database to store them and building a pretty website on top to let people search and browse them.
For those who are not sure how this is different from other areas of computer applications, let me illustrate that with a personal story. Originally, I started my career as a “universal” IT person: I designed databases, developed software, built websites and I didn’t care what those products were “about”. Most people in this profession are like that: they don’t care what kind of data they’re working with. One day you could be developing a database of financial data for an accounting firm, another day you could be writing software to manage customer orders in a mail-order company. But I soon discovered that I can have more fun with data from the humanities than I can ever have with financial data or with data about mail orders. Why is it more fun? Because it’s harder!
Let me explain that. In a conventional computer application such as accounting or invoicing, the data you deal with is already pretty well structured even before you begin computerizing it. Typically, the data naturally breaks down into small bits like numbers and dates and snippets of text like people’s names and addresses. The relationships between these bits are pretty straightforward, too: in a personnel-management database a person has precisely one name, one address and one date of birth. In an invoicing system an invoice goes to exactly one customer, was issued on a particular date, and carries an exact amount of money. And so on: everything is nicely rigorous and formalized. In digital humanities, on the other hand, you often work with large chunks of content that doesn’t easily break into smaller bits, and even when it does, the relationships between those bits are uncomfortably complex. For example, in a database of historical biographies, a person may have more than one name, the names may be of different kinds (spelling variations, pen names), the person’s date of birth may or may not be known, it may only be known with a certain precision (such as when we only know the year, or we estimate that the person was born “sometime in the 1800s”) and so on. When building a database for something like that, you need to provide for many more eventualities than when building a database for personnel management. If you’re a data analyst and if your entire experience is in fields such as invoicing and personnel management, this is likely to make your head spin.
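To make the biography example a little more concrete, here is a minimal sketch in Python of how such a record might be modelled. It is only one possible approach, not a recommended schema: the class names (Person, Name, NameKind, FuzzyDate) are invented for illustration, the sample person is made up, and reading “sometime in the 1800s” as the years 1800 to 1899 is an assumption. The idea is simply that a person holds a list of typed names rather than one name, and that a date of birth is either missing or stored as an earliest/latest range.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class NameKind(Enum):
    """Hypothetical categories; a real project would likely need more."""
    PRIMARY = "primary"
    SPELLING_VARIANT = "spelling variant"
    PEN_NAME = "pen name"


@dataclass(frozen=True)
class Name:
    text: str
    kind: NameKind


@dataclass(frozen=True)
class FuzzyDate:
    """A date known only to lie somewhere between two bounds (inclusive)."""
    earliest: date
    latest: date

    def __post_init__(self) -> None:
        if self.earliest > self.latest:
            raise ValueError("earliest must not be after latest")

    @classmethod
    def from_year(cls, year: int) -> "FuzzyDate":
        # Only the year is known: the range covers the whole year.
        return cls(date(year, 1, 1), date(year, 12, 31))

    @property
    def is_exact(self) -> bool:
        return self.earliest == self.latest

    def could_be(self, day: date) -> bool:
        return self.earliest <= day <= self.latest


@dataclass
class Person:
    # Zero, one or many names, each tagged with its kind.
    names: List[Name] = field(default_factory=list)
    # None means the date of birth is not known at all.
    born: Optional[FuzzyDate] = None

    def names_of_kind(self, kind: NameKind) -> List[str]:
        return [n.text for n in self.names if n.kind is kind]


# A made-up person, purely for illustration.
person = Person(
    names=[
        Name("Mary Byrne", NameKind.PRIMARY),
        Name("Mary Burn", NameKind.SPELLING_VARIANT),
        Name("M. B. Quill", NameKind.PEN_NAME),
    ],
    born=FuzzyDate.from_year(1823),
)

# "Sometime in the 1800s", read here (as an assumption) as 1800 to 1899.
sometime_in_1800s = FuzzyDate(date(1800, 1, 1), date(1899, 12, 31))

# A person about whom we know nothing yet is still a valid record.
unknown = Person()
```

Compare this with a personnel database, where the equivalent record would be a single name column and a single, mandatory date column. Every optional field and every list above is a question you have to remember to ask your customer.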
And spin it often does. The trouble is that most database designers are trained on “standard” problems such as invoicing and personnel. Pretty much every database textbook I have ever seen contained an exercise on invoicing. Not one had an exercise on biographies, as far as I remember. When you then expose such a designer to the unstructured or loosely structured data that floats around in the humanities, the result is much frustration, at least in the beginning. I know this because I’m speaking from personal experience.
The frustration occurs because people with a humanities background are not used to communicating with the kind of formalized rigour that people with an IT background require. To simplify it a little, IT people think in crisp categories while humanities people think in fuzzy categories. So, when you’re designing an invoicing database for a customer and you ask that customer whether an invoice can ever be issued to more than one customer at once, she will answer “no” and you can take her word for it because you’re in a “classical” computer application, your customer has probably dealt with IT people before and knows that when she says “no”, she must mean “no, never ever”. If, on the other hand, you’re designing a database for a glossary and you ask your customer whether a word can ever have more than one plural, she may well answer “no” and you may well take her word for it, only to discover later that the “no” really meant “sometimes yes”. I swear that this kind of thing has happened to me (and driven me up the wall) several times. But I must admit that when it does happen, both parties are guilty: neither one understands how the other one thinks and so they kind of near-miss each other in the communication.
I hope I have convinced you that digital humanities is a real discipline which requires its own particular set of skills. For somebody with a purely humanities background, “going digital” requires a stronger commitment to rigour and formality. For somebody with an IT background, “going humanities” requires a willingness to tackle larger amounts of complexity and to touch the limits of what you learned in school.
When I “went humanities” myself, that was a long time before I had actually used or heard the term “digital humanities”. For many years I was happily hacking away at things like dictionaries and placename databases and didn’t have much awareness that I had entered a new territory. But then, in 2008, the Digital Humanities Observatory (DHO) was founded at the Royal Irish Academy, everybody in my work environment suddenly started talking about this “digital humanities” thing and I realized that that’s what my home is called. I’ve dealt with linguistic data, with geographical data, with historical data, and lately with biographies. I am “in” digital humanities whether I like it or not.
So coming to the DRHA conference in Belfast (which was co-organized by the DHO) was like coming home. It was inspiring to learn what other people are doing and how they’re doing it. But if I’m allowed one complaint, I noticed that pretty much all the people at the conference were people with a humanities background who have an interest in IT, rather than the other way around. And I had noticed the same thing before when I was attending a DHO workshop on the Semantic Web in Dublin: practically all the people in the room were “humanities types” who were learning about this whole “computer thing”. I think I was the only IT person there. The result is that if you already have a strong background in IT and you go to these events, you find it slow-moving and you get bored easily. A lot of time is spent educating people about the benefits of rigorous data analysis, which is beneficial for people like historians and literary critics, but a person trained in IT already knows that.
I suspect that this is a common occurrence in digital humanities. The whole discipline sometimes seems like a computer fanclub for people from the humanities. But if digital humanities is to be a truly interdisciplinary field that straddles both humanities and IT, then it must cater for people coming from both directions, not just one. I’d like to see more events, workshops and seminars for IT people who have an interest in the humanities, like myself. As I hope I’ve demonstrated, IT folk actually need to learn a new skill or two before they can function in digital humanities successfully – for example how to spot when “no” means “sometimes yes”.