OCR Detective

Review and correct digitised text snippets

OCR (Optical Character Recognition) is a fantastic technology that enables
computers to automatically read words from scanned images and convert them to
text that we can then read and search. However, even the best OCR algorithms
stumble over problems such as dirty pages and rough scans.

In this campaign, we ask you to review words alongside the text that an OCR
algorithm identified for them, and tell us whether the algorithm got it right --
and if not, what the correct text should be. The texts come from Marchiver's trove
of historical publications, and your input will help improve their corpus.

_This campaign is a collaboration with the
Marchiver project._

Start contributing!


  1. rotsee: 360 (36 answers)
  2. annelise: 160 (16 answers)
  3. chemikyn: 150 (15 answers)
  4. rlafuente: 80 (8 answers)
  5. annelise2: 70 (7 answers)
  6. atestuser: 70 (7 answers)
  7. xx: 30 (3 answers)
  8. daharvest: 10 (1 answer)
