Does BHL use crowdsourced transcriptions?

We have implemented functionality that allows BHL Partners to upload transcriptions in place of the automatically generated OCR (Optical Character Recognition) text for archival materials digitized in BHL. This functionality supports transcriptions generated as part of Partner crowdsourcing projects on the Smithsonian Transcription Center, DigiVol, and From the Page.

Example of correspondence in BHL for which a crowdsourced transcription has been uploaded in place of the automatically generated OCR, making the document searchable and easier to read.

These transcriptions make archival materials in BHL, such as field notes and correspondence, full-text searchable and enable our taxonomic name recognition software to index the scientific names within their pages. Because the transcribed text can be viewed alongside the digitized page image, users can also more easily read materials with difficult-to-decipher handwriting. This functionality therefore makes it easier for researchers and the public to explore these valuable primary source materials and find specific information in their pages.

Interested in transcribing archival materials? Several BHL Partners have active transcription projects on various crowdsourcing platforms. Follow the links below to explore the opportunities and get involved:

Tags: crowdsourcing, citizen science, transcription, OCR, full text search, archives
