Thinking About Your DH Projects

12 November 2017 by pencechp

This is, more or less, a version of the talk that I gave earlier today at THATCampHSS.

There’s a pretty standard story that you can tell about the birth of most digital humanities projects. (At least, in my experience. Caveats abound here, of course, but let me overgeneralize for a while.) You’re working on some area of research, and one of three things tends to happen. You might stumble upon a big corpus of data that you’d like to work with as an organic whole. You might see a talk or find a website for a tool that is asking some kind of question that you think could be really useful for your project. Or you might basically start doing a digital analysis, often long-hand and slowly, and realize partway through that there are groups of people who are already doing this stuff.

All of those are great ways to get introduced to the field. But all are either tool-based or data-based. And they lend a general cast of “tool-and-data-centric” to much work in DH.[1] This will be reinforced if you go to most meetings in DH. Lots of people working on new tools, talking about new public corpora, crash courses in programming languages, all that good stuff.

And let me be clear: it is good stuff! You can’t do this work without the tech. And so we need to get together and talk about the tech! An opportunity like, say, the Epsilon project, about which I only just heard this weekend – a linked, marked-up database of long-19th century scientific correspondence, drawing on all kinds of sources and open to new submissions (taking all the transcriptions to TEI under the hood, finally, so that we might get some durability and interoperability) – is the kind of thing that you probably wouldn’t have heard of unless somebody talked to you about it! And it will only get better if it gets widely adopted. (Shameless plug: my own DH project, the evoText tool for analyzing journal articles in the evolutionary sciences, is the same kind of thing. But then I just told you about it, so now you know.)

But that focus means that the hard-won lessons about the parts of DH that don’t involve the tech and data sometimes get passed over. One classic example is the Project Management course at DHSI, which is consistently one of the most sought-after topics: humanists aren’t used to managing large, multi-part, distributed projects, and often don’t begin to realize that until they’re in too deep.

I want to point out three more such lessons, all of which I think you should think about very early in your DH work. I didn’t think about any of them until I’d been doing this stuff for years, and it’s led to me making a hash of more than one opportunity that I could have been properly prepared for had I been thinking clearly in advance about this stuff.

Choosing Your Research Question

I have to admit that the first thing that got me thinking about these morals was getting embarrassed at the pub. I was giving a DH talk in a couple spots in the UK over the course of this past May (eventually that material will hopefully be coming out in a volume edited by Andreas De Block and Grant Ramsey; watch this space). Several of the graduate students in the (totally awesome) Leeds HPS group have been interested in digital methods. But one of them asked me, in essence, the following question: How do you know that you have a project that’s amenable to digital analysis? How do you even get started? And I had basically no idea how to answer, which I was sad to have to admit.

One thing that inspired me to do was write up some clear discussions on how you can get started, at least with the tools that I’ve built. But it also got me thinking about how you can and should build a good DH research question. Here are some ideas (once again, ones that I wish I’d followed):

Work backwards and think counterfactually. That is, try not to start with tools. Start with results! Start with your research questions, what animates you to get out of bed in the morning. What do you wish you could learn about but don’t think you can? Of course, this can’t happen purely in a vacuum – some basic familiarity with the available digital methods is important. So it’s worth knocking around a few DH blogs, reading an article or two, going to a talk that sounds interesting. But don’t think that you have to just parrot back the methods and approaches that other people have used. Bend the tools to your will.

Dream big. And perhaps equally importantly, don’t let yourself get bogged down in worries about data or tools at this stage. Don’t get discouraged if you think you’d need too much data, or archival data you can’t get, or copyrighted data you don’t have rights for. Don’t worry that you don’t know which tools to use, and especially that you don’t know how to use the tools that are already out there. That will all come later. There’s world enough, time, and grant funding to solve those worries. (Write them if you have to!)

Processing Your Results

Now, let me fast forward past all the stuff about tools and data. One thing that we don’t do often enough is sit and think about what we’ll do when we get answers. I can tell you exactly how this has gone for me. You’ve been using your toolset for weeks/months/years. You know exactly what kind of data it spits out when you’re done. (It’s probably not even really human-readable.) You do some basic quick-hack visualizations just so you can look at what you’ve created. And you think you’ve got it! Here’s something interesting! In ten minutes or so, an entire outline of the paper you think you want to write will have materialized on your whiteboard.

This is awesome. (And man is it ever a good feeling after having slaved away at this stuff.) But you should plan for it happening in two different ways, neither of which I bothered to actually do ahead of time in my most recent project, and both of which left me pedaling fast to catch up.

Start with hypothesis-driven research. In our heart of hearts, I think everybody knows that we should be doing hypothesis-driven work. There are random patterns in every big dataset, and you don’t want to be misled by them, so you really need to have some idea in advance what your hypotheses are, and what you’d take the success and failure conditions for them to be.
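To make the worry about random patterns concrete, here is a minimal sketch in Python. It is not drawn from any of the tools mentioned in this post. It assumes a hypothetical corpus in which 1000 "features" (say, term frequencies) are pure noise, splits the documents into two arbitrary groups, and counts how many features nonetheless show a difference that a naive reader might call interesting. The function name, group sizes, and threshold are all illustrative assumptions.

```python
import random

def count_spurious_differences(n_features=1000, group_size=20,
                               threshold=0.62, seed=42):
    """Count pure-noise features whose two group means differ by more
    than `threshold`. Every value is drawn from the same distribution,
    so any 'difference' found here is a chance pattern."""
    rng = random.Random(seed)
    spurious = 0
    for _ in range(n_features):
        # Hypothetical measurements for two arbitrary groups of documents.
        group_a = [rng.gauss(0, 1) for _ in range(group_size)]
        group_b = [rng.gauss(0, 1) for _ in range(group_size)]
        diff = abs(sum(group_a) / group_size - sum(group_b) / group_size)
        # 0.62 is roughly two standard errors of the difference in means
        # for groups of 20, so about 5% of noise features should pass.
        if diff > threshold:
            spurious += 1
    return spurious

spurious = count_spurious_differences()
print(f"{spurious} of 1000 pure-noise features look 'interesting'")
```

Under these assumptions a few dozen features clear the bar even though nothing real is there, which is why it helps to write down the success and failure conditions for a hypothesis before looking.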

But let yourself be surprised! That said, I know that in my own work a decent number of the conclusions that I’ve come up with have emerged organically over the course of doing my work. The conclusion of some of my latest work was literally the opposite of what I’d expected to find. You can’t just tell yourself that you won’t follow those leads when you see them. But what you should do, I think, is to put in some energy ahead of time thinking about how you might validate the chance patterns that you otherwise stumble upon. I had to kill a lot of time – and do so when I already had an idea what answer I wanted to justify, which is epistemically dangerous – trying to figure out how to do this after the fact. Don’t be like me.
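One way to plan ahead for validating a surprise is to decide on a check like a permutation test before you start. The sketch below is only an illustration under stated assumptions, not the method used in any of the work described here: the data are made up (occurrences of some term per 10,000 words in two hypothetical sets of articles), and the function name and parameters are invented for the example. It asks how often a difference at least as large as the observed one would turn up if the group labels were shuffled at random.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate how often a difference in means at least as large as the
    observed one arises when group labels are assigned at random."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values and re-split them into two fake groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical term counts per 10,000 words (made-up numbers).
early_articles = [12, 15, 11, 14, 13, 16, 12, 15]
late_articles = [5, 7, 6, 4, 8, 6, 5, 7]

p = permutation_p_value(early_articles, late_articles)
print(f"estimated p-value: {p:.4f}")
```

A small value suggests the pattern is unlikely to be an accident of how the documents were grouped. The test does not say why the pattern exists, and it is most trustworthy when you chose it before you knew which answer you were hoping for.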

Generating New Research

Lastly, one of my favorite features of DH projects is that they are good not only at answering our first-level questions, but also at generating new questions that we hadn’t thought of before. In essence, this moral is quite simple, but very rarely discussed: think about what new research your work could produce.

This takes on lots of facets. Could you do more with more data? Is it possible that you’ll find a partial signal of a bigger phenomenon that you should try to explore more thoroughly? What about expanding into other areas – other authors, other sciences, more time periods? What if you tried to argue for an integrated-HPS conclusion, supporting an inductive philosophical conclusion on the basis of your data in the history of science?

All of these are questions that you would likely ask eventually. But if you’ve thought about them at the beginning of your project, you’ll do better work, with more foresight, and you’ll be better equipped to seize on opportunities as they arise. Now go forth and do digital HPS!

  1. This is, of course, also a common critique of the digital humanities – that we all only work on tools and have lost sight entirely of the humanistic research pursuits that should drive our research. I think this critique is overblown. (On the one hand, plenty of great scholarship is happening in DH, and on the other, plenty of other, non-digital humanists get tool-focused!) But I do think there’s something right about it, somewhere in there. This is, in no small part, my attempt to tease out that something.