Quick tip – Search for common OCR errors

Text mistakenly indexed as 'Harnerstein'

As genealogists, we often search images of text, for example when we use newspaper websites like Delpher. Often, optical character recognition (OCR) has been used to convert the images to searchable text. OCR isn’t perfect, especially with poor-quality ink, old fonts or digitized microfilms.

The mistakes that OCR techniques make are somewhat predictable: an m gets recognized as rn, an e as a c, and an l as a 1. By searching for the versions with and without a predictable OCR error, you may find surprising results.

For example, when searching for Hamerstein, also try Harnerstein to find cases where the m was mistaken for rn. In the case of Delpher, Harnerstein will give you just one hit. But if that’s an article about your ancestor, wouldn’t you be glad you found it?
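
If you have many names to check, you could generate the alternative spellings automatically. The following is only a minimal sketch in Python: the function name `ocr_variants` and the list `OCR_CONFUSIONS` are made up for this illustration, and the list holds just the three confusions mentioned above. It applies one misreading at a time and only matches lowercase letters, so treat it as a starting point rather than a complete tool.

```python
# Confusion pairs taken from the examples in this tip: (intended, misread as).
# This list is an assumption for illustration; real OCR errors are more varied.
OCR_CONFUSIONS = [("m", "rn"), ("e", "c"), ("l", "1")]


def ocr_variants(term, confusions=OCR_CONFUSIONS):
    """Return spellings of term with exactly one OCR-style misreading applied."""
    variants = set()
    for intended, misread in confusions:
        start = term.find(intended)
        while start != -1:
            # Swap this single occurrence and keep the rest of the term intact.
            variants.add(term[:start] + misread + term[start + len(intended):])
            start = term.find(intended, start + 1)
    variants.discard(term)
    return sorted(variants)


# Print every variant so each one can be tried as a separate search.
for variant in ocr_variants("Hamerstein"):
    print(variant)
```

Each printed spelling can then be entered as its own search, next to the correct spelling.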

About Yvette Hoitink

Yvette Hoitink, CG® is a professional genealogist in the Netherlands. She holds the Certified Genealogist credential from the Board for Certification of Genealogists and has a post-graduate certificate in Family and Local History from the University of Dundee. She has been doing genealogy for over 30 years and helps people from across the world find their ancestors in the Netherlands. Read about Yvette's professional genealogy services.

Comments

  1. Another excellent tip. Thanks!

  2. Mary Nash says

    Great tip. I have used Delpher a lot in finding info on my family both in Holland and Indonesia. Congratulations on winning the atlas!
