Does BHL use crowdsourced transcriptions?

We have implemented functionality that allows BHL staff to upload transcriptions in place of the automatically generated OCR (Optical Character Recognition) text for digitized materials in BHL’s collection. This functionality supports transcriptions generated as part of in-house or crowdsourced transcription projects hosted by BHL Partners. The Show Text tab now indicates whether the text has been:

automatically generated and uncorrected;
automatically generated and then error-corrected by machine, in which case it may still include inconsistencies;
or manually transcribed by humans.

Please note that BHL’s OCR is generated by its digitization partner, the Internet Archive, using Tesseract Open Source OCR (as of 2020) or ABBYY FineReader.

Web-based crowdsourced transcription projects are largely managed through the following platforms: DigiVol, FromThePage, and the Smithsonian Transcription Center.

Example of correspondence in BHL for which a crowdsourced transcription has been uploaded in place of the automatically generated OCR, making the document searchable and easy to read.

Especially for archival materials, like field notes and correspondence with handwritten text, transcriptions make these items full-text searchable and enable our taxonomic name recognition software to index scientific names within their pages. Since the transcribed text can be viewed alongside the digitized page image, users can also more easily read materials with difficult-to-decipher handwriting. Thus, this new functionality makes it easier for researchers and the public to explore these valuable primary source materials and access specific information from their pages.

Interested in transcribing materials? Several BHL Partners have transcription projects on various crowdsourcing platforms. Follow the links below to explore the opportunities and get involved:
