Column – Big Data

“Big data” is about finding patterns in large amounts of data, and it is used in many fields. Supermarkets use purchase data to optimize store layout, which is how they know to put the beer next to the nachos. Medical researchers look for correlations between genetic mutations and illnesses to help diagnose hereditary diseases.

Big data can help us as genealogists too. Genealogical data is being made available as open data, allowing researchers to use it for their own analyses. Some patterns will be predictable, such as infant mortality being lower when the parents’ income was higher, or families having fewer children if the parents married at an older age. But unexpected patterns can provide new insights.

I have made a modest start by analyzing the data from my one-place study of Winterswijk. I discovered that one in four children born in 1839 died before the age of eighteen, and that one in three of the surviving children emigrated to America. By analyzing family connections, I found that more than half of these emigrants settled in a place where a relative already lived. This “chain migration” was much more common than I had realized, and it emphasizes the importance of researching other emigrating relatives.
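For readers who want to try this kind of count on their own data, here is a minimal sketch in Python. It is not the analysis used for the Winterswijk study. The `Person` record, its field names, and the sample rows are all hypothetical, and the sample is constructed so the shares mirror the proportions quoted above. It also assumes, as a simplification, that a person with no known age at death survived childhood.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    # Hypothetical record layout; a real one-place study will differ.
    label: str
    birth_year: int
    death_age: Optional[int] = None   # None means age at death is unknown
    emigrated: bool = False
    joined_relative: bool = False     # settled where a relative already lived


def share(part, whole):
    """Fraction of `whole` that is in `part`; 0.0 if `whole` is empty."""
    return len(part) / len(whole) if whole else 0.0


def cohort_summary(people, birth_year, adult_age=18):
    """Summarize child mortality, emigration and chain migration for one birth year."""
    cohort = [p for p in people if p.birth_year == birth_year]

    def died_young(p):
        return p.death_age is not None and p.death_age < adult_age

    deaths = [p for p in cohort if died_young(p)]
    # Assumption: unknown age at death counts as surviving childhood.
    survivors = [p for p in cohort if not died_young(p)]
    emigrants = [p for p in survivors if p.emigrated]
    chain = [p for p in emigrants if p.joined_relative]
    return {
        "died_before_adulthood": share(deaths, cohort),
        "survivors_who_emigrated": share(emigrants, survivors),
        "emigrants_who_joined_relative": share(chain, emigrants),
    }


# Invented sample rows, for illustration only.
sample = [
    Person("child A", 1839, death_age=2),
    Person("child B", 1839, death_age=11),
    Person("child C", 1839, death_age=70),
    Person("child D", 1839, death_age=64),
    Person("child E", 1839),
    Person("child F", 1839, death_age=81),
    Person("child G", 1839, death_age=59, emigrated=True, joined_relative=True),
    Person("child H", 1839, death_age=77, emigrated=True),
    Person("child I", 1840, death_age=1),  # other birth year, ignored
]

summary = cohort_summary(sample, 1839)
for key, value in summary.items():
    print(f"{key}: {value:.0%}")
```

The hard part in practice is not the counting but deciding what counts as "a relative already lived there", which depends on how family links and places of settlement are recorded.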

Insights based on statistics can also help us prove relationships. When trying to identify parents, I can use the fact that the names of the candidate parents appear among the children’s names. That argument is stronger if I can show that 99% of families in that time and place named their children after their grandparents.

Statistical analysis by itself is not enough to come to a conclusion. Our ancestors may well be among the 1% who did not conform, so we have to demonstrate that the general pattern applies in their situation by finding supporting evidence in other sources. After all, we don’t want 1% of our tree to be incorrect.
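A figure like the share of families that named children after grandparents could be estimated along the following lines. This is only a sketch under stated assumptions: the dictionary layout and the names are invented for illustration, and names are compared by exact spelling (ignoring case), so variants of the same name would not be matched without extra normalization.

```python
def named_after_grandparent(family):
    """True if at least one child shares a given name with a grandparent."""
    grandparents = {name.lower() for name in family["grandparents"]}
    return any(child.lower() in grandparents for child in family["children"])


def naming_share(families):
    """Fraction of families in which a child carries a grandparent's name."""
    if not families:
        return 0.0
    matches = sum(1 for family in families if named_after_grandparent(family))
    return matches / len(families)


# Invented families, for illustration only.
families = [
    {"grandparents": ["Jan", "Berendina", "Hendrik", "Aaltje"],
     "children": ["Jan", "Aaltje", "Willem"]},
    {"grandparents": ["Gerrit", "Janna", "Derk", "Hendrika"],
     "children": ["Derk"]},
    {"grandparents": ["Willem", "Dela", "Arent", "Grada"],
     "children": ["Johanna", "Teunis"]},
    {"grandparents": ["Tobias", "Enneken", "Jan", "Christina"],
     "children": ["christina", "Jan"]},
]

naming_rate = naming_share(families)
print(f"Families with a child named after a grandparent: {naming_rate:.0%}")
```

A share computed this way only describes families whose grandparents are already known, so it supports an argument about a candidate couple without replacing evidence from other sources.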

Computer demonstration, 1966. Credits: Eric Koch, collection Nationaal Archief (CC-BY-SA)

Filed Under: Blog Tagged With: open data

Comments

Virgil Hoftiezer says

23 February 2018 at 11:16 pm

Very interesting. And how many of those 1 in 3 immigrants have descendants who have had their DNA tested????

- Yvette Hoitink says
  
  26 February 2018 at 10:30 am
  
  Plenty are showing up as my matches 🙂 It’s one of the ways I discover where the emigrants went to. I’ll look at a tree of a DNA match and see a Winterswijk name in a Dutch settlement in the US I didn’t already know about. I will then check the census records and usually find a small cluster of Winterswijkers. I will discuss this in my emigration lecture at NGS in Grand Rapids in May.
