Purposeful Gaming

 
Purposeful gaming and BHL: engaging the public in improving and enhancing access to digital texts.
Although this project ended in November 2015, both the Smorball and Beanstalk games will continue to be available in 2016 at http://smorballgame.org and http://beanstalkgame.org, and player input will continue to improve OCR output from BHL. Thank you for playing and helping improve access to science resources!

About Smorball

Smorball wins “Best Serious Game” award at the Boston Festival of Indie Games!

Players of the more challenging Smorball game are asked to type the words they see as quickly and accurately as possible to help coach their team, the Eugene Melonballers, to victory and win the coveted Dalahäst Trophy in the fictional sport of Smorball. Each word typed correctly defeats an opposing smorbot and brings the Melonballers closer to the championships.

About Beanstalk

Players of the more relaxed Beanstalk game must type the words presented to them correctly in order to grow their beanstalk from a tiny tendril to a massive cloudscraper. The more words they type correctly, the faster the beanstalk grows. Players who accurately transcribe the most words will ascend to the top of the leaderboard as a result of their valuable contributions.

Both Smorball and Beanstalk were designed by Tiltfactor and are licensed as Free and Open Source Software (FOSS).

We’re not currently integrating material from other institutions in OUR build of the game, but the good news is that the games and their supporting software are open source, so you can fairly easily host your own.

There are a few steps to hosting your own Smorball or Beanstalk games:
1. Prepare your material. The games are OCR correction games, and in order to function they take data in the form of single words on which different OCR programs disagree. Each “difference” sent to the games must have a page image URL, a location on that page image, and two strings that represent what the two OCR programs THINK the word is. The games use these two strings to estimate whether or not the player has typed the right answer.
2. Host the game(s) and the game backend. You can find the game code here: https://github.com/tiltfactor/smorball and the code for the game database and data management server here: https://github.com/tiltfactor/SmorballBeanstalk-Backend.
3. Configure the games. If you want to run Beanstalk, make sure your version of Beanstalk has its own high score database (via parse.com). If you want the Facebook and Twitter buttons in your Smorball to go to your social media accounts, generate Facebook and Twitter developer API keys, etc.
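
As a rough illustration of step 1, the sketch below models one “difference” record and one possible way of judging a player’s answer against the two OCR strings. It is not taken from the Smorball or Beanstalk code: the field names, the (x, y, width, height) location format, the placeholder URL, and the rule of accepting an answer that matches either OCR string (ignoring case and surrounding spaces) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Difference:
    """One word on which two OCR outputs disagree (hypothetical field names)."""
    page_image_url: str               # URL of the scanned page image
    bbox: Tuple[int, int, int, int]   # assumed location format: (x, y, width, height)
    ocr_a: str                        # what the first OCR program thinks the word is
    ocr_b: str                        # what the second OCR program thinks the word is


def normalize(word: str) -> str:
    """Trim surrounding whitespace and ignore case (an assumed matching rule)."""
    return word.strip().casefold()


def is_probably_correct(diff: Difference, typed: str) -> bool:
    """Guess that the player is right if the input matches either OCR string."""
    candidates = {normalize(diff.ocr_a), normalize(diff.ocr_b)}
    return normalize(typed) in candidates


example = Difference(
    page_image_url="https://example.org/pages/0001.jpg",  # placeholder URL
    bbox=(120, 340, 96, 28),
    ocr_a="modern",
    ocr_b="rnodern",
)

print(is_probably_correct(example, "Modern"))  # True
print(is_probably_correct(example, "morden"))  # False
```

The actual games may weigh answers differently, for example by comparing many players’ input for the same word; consult the repositories linked in step 2 for the real data format.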

Project Overview

This project, generously funded by the Institute of Museum and Library Services (IMLS), aims to significantly improve access to digital texts by applying purposeful gaming to the data enhancement tasks needed for content in the Biodiversity Heritage Library (BHL). It tackles a major challenge for digital libraries: full-text searching is significantly hampered by poor output from Optical Character Recognition (OCR) software. Historic literature has proven particularly problematic because its varying fonts, typesetting, and layouts are difficult to render accurately. The European Union’s IMPACT project, a 2008-2012 effort to improve access to texts, states that OCR “does in many cases not produce satisfying results for historical documents. Recognition rates are poor or even useless. No commercial or other OCR engine is able to cope satisfactorily with the wide range of printed materials published between the start of the Gutenberg age in the 15th century and the start of the industrial production of books in the middle of the 19th century.” This state of affairs illustrates the pressing need to identify solutions beyond OCR for improving access to digital texts.

The BHL is an international consortium of the world’s leading natural history libraries, including the Missouri Botanical Garden’s Peter H. Raven Library, that have collaborated to digitize the public domain literature documenting the world’s biological diversity. This has resulted in the single largest open-licensed source of biodiversity literature, available both through the Internet Archive and through a customized portal at http://www.biodiversitylibrary.org. BHL is a perfect testbed for investigating alternative approaches to generating digital outputs, both because it is a large corpus (41 million pages of scanned texts accompanied by 41 million OCR outputs) and because most of its content is historic literature (the majority of BHL content was published between the 1450s and the 1900s). OCR is also largely ineffective on handwritten texts such as field notebooks, a growing content type in the BHL.

Purposeful Gaming and BHL will demonstrate whether or not digital games are a successful tool for analyzing and improving digital outputs from OCR and transcription activities. The premise is that, when the task is presented as a game, large numbers of users can be harnessed quickly and efficiently to review and correct particularly problematic words.

The project runs from December 1, 2013 through November 30, 2015 and will be conducted by the Missouri Botanical Garden’s Center for Biodiversity Informatics (CBI) in partnership with Harvard University, Cornell University, and the New York Botanical Garden.

Project Team

Missouri Botanical Garden

  • Trish Rose-Sandler, Data Project Coordinator, Center for Biodiversity Informatics
  • William Ulate, Senior Project Coordinator, Center for Biodiversity Informatics
  • Mike Lichtenberg, Programmer, Center for Biodiversity Informatics
  • Stephen Kappel, Programmer, Center for Biodiversity Informatics
  • Doug Holland, Director, Peter H. Raven Library
  • Mike Blomberg, Imaging Lab Coordinator, Peter H. Raven Library
  • Chuck Miller, Vice President of Information Technology and Chief Information Officer

Ernst Mayr Library of the Museum of Comparative Zoology at Harvard University

  • James Hanken, Director of the Museum of Comparative Zoology
  • Constance Rinaldo, Librarian of the Ernst Mayr Library
  • Joe deVeer, Project Manager
  • Robert Young, Special Collections Librarian
  • Patrick Randall, Outreach and Communications

The LuEsther T. Mertz Library, New York Botanical Garden

  • Susan Fraser, Director
  • Susan Lynch, Systems Librarian
  • John Mignault, Systems Librarian (previous)
  • Kevin Nolan, Digital Projects Manager
  • Lisa Studier, Metadata Cataloger
  • Yumi Choi, Catalog Librarian
  • Andrew Tschinkel, Scanning Technician
  • Paul Silverman, Scanning Technician

Cornell University Library

  • Martin Schlabach, Librarian
  • Kevin Nixon, Professor of Botany
  • Holly Mistlebauer

Original Proposal & Schedule

Project Narrative: Purposeful Gaming Narrative.

Schedule of Completion: Schedule of Completion.

Workflow diagram: Workflow Diagram.

Word comparison across outputs: Word Comparison across Outputs.

Reports

IMLS Final Report.

IMLS Narrative Report Year 1.

Presentations/Papers

Media Coverage

Games Coverage

Featured Texts

Choice of Game Designer

Initial Grant Award

Related

Contact Us

For more information, please contact the project’s Principal Investigator, Trish Rose-Sandler, at 314-577-9473 x6396 or trish.rose-sandler@mobot.org.
