"Digital humanities," or DH, is the use of digital technologies to pursue research questions in the humanities. Research data is the raw material we need to examine these questions.

When you think of the “digital humanities,” you might picture visualizations such as word clouds, multimedia web-publishing projects, or interactive historic maps. All digital humanities projects are built on data, so early considerations for a DH project of any scale include:

  • What data to use
  • Which formats the data needs to be in, depending on the platforms and tools involved
  • Where to collect and store data during project building

Examples of humanities research data include:

  • text files extracted from a corpus by optical character recognition (OCR) software
  • image files of archival items or artworks
  • geospatial data including raster and vector files
  • oral history sound files and transcripts
  • archival metadata


Digital collections are rich sources of humanities research data.  Increasingly, libraries, museums, and other cultural heritage institutions offer digital collections among their holdings.  See the Lafayette Special Collections & College Archives digital collections, for instance.

Further resources for digital archival research data:

Open-access digital collections, such as:

Open-access multi-institutional collaboratives: the Digital Public Library of America and the HathiTrust Digital Library and Research Center are hubs where multiple libraries share resources for scholarly use.

Proprietary digital archival databases, such as Adam Matthew primary source collections. You may be able to ask your library to fund access to proprietary databases.

Archival metadata is another rich source of humanities research data for digital scholarship. For an intriguing model, see DH scholar Anne Donlon’s project “From Encoded Archival Description to Network Graphs.” British Library Labs provides guides and datasets for digital scholarship using archival metadata.
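
To give a sense of how archival metadata can become network data, here is a minimal sketch in Python. It is an illustration only, not a reproduction of the method used in the project named above. It assumes finding aids encoded in an EAD-style XML with no namespaces, where indexed names appear in persname and corpname elements. The wrapping archive element, the collection ids, and all names in the sample are hypothetical. Two names are linked whenever they are indexed in the same finding aid, and the link's weight counts how many finding aids they share.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical EAD-style sample: two finding aids, each listing indexed names.
# Real EAD files are larger and may use XML namespaces, which this sketch ignores.
SAMPLE_EAD = """
<archive>
  <ead id="coll-1">
    <controlaccess>
      <persname>Person A</persname>
      <persname>Person B</persname>
      <corpname>Organization X</corpname>
    </controlaccess>
  </ead>
  <ead id="coll-2">
    <controlaccess>
      <persname>Person A</persname>
      <corpname>Organization X</corpname>
    </controlaccess>
  </ead>
</archive>
"""


def cooccurrence_edges(xml_text):
    """Count how often each pair of names is indexed in the same finding aid."""
    root = ET.fromstring(xml_text)
    edges = Counter()
    for finding_aid in root.iter("ead"):
        # Collect the distinct person and organization names in this finding aid.
        names = sorted({
            el.text.strip()
            for el in finding_aid.iter()
            if el.tag in ("persname", "corpname") and el.text
        })
        # Every pair of names sharing a finding aid becomes one weighted edge.
        for a, b in combinations(names, 2):
            edges[(a, b)] += 1
    return edges


edges = cooccurrence_edges(SAMPLE_EAD)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b} (weight {weight})")
```

The resulting edge list could be saved as a CSV file and loaded into a network visualization tool; which tool and format to use depends on the project.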