Put your sibling tests to work with the Shared cM Investigator

What is it? The Shared cM Investigator is a new free tool at DNA Painter.

Why would I use it? It allows you to estimate how much DNA the parent of a set of siblings might have shared with a match.

What do I need in order to do this? You need a mystery match and the segments that two or more siblings share with that match.

I’m pleased to release a new tool today called the Shared cM Investigator. Intended for those who have siblings tested, but not the parents of those siblings, the tool uses segments and a simple mathematical equation to estimate how much DNA the parent might have shared with a match.

The tool was developed in collaboration with Amy Williams, whose lab published a paper in 2018 presenting the DRUID (deep relatedness utilizing identity by descent) method. Amy is also releasing a DRUID tool today, and she explains the technical details in this blog post.

The Shared cM Investigator tool

You can find the tool at https://dnapainter.com/tools/sci

Before you use it, you’ll need:

  1. Two or more siblings who have taken an autosomal DNA test with their data at a site that provides shared segments (e.g. 23andMe, FamilyTreeDNA, GEDmatch, Geneanet or MyHeritage)
  2. A match that you are investigating who is on the same site
  3. Segment data for the DNA that each sibling shares with the match under investigation (for information on how to find this, please see this help page)

Entering the data

The Shared cM Investigator entry screen

The interface is similar to the Distinct Segment Generator tool, but with separate boxes for each set of data:

  • You’ll see two fields, Sibling 1 and Sibling 2. Paste the segment data that each sibling shares with the match into a different numbered field (the order is not important)
  • In cases where you have data for more than two siblings, you can click ‘add data for another sibling’ and an additional field will appear
  • If you leave an additional sibling field blank, it will not be used in the calculation, unless…
    • … an additional sibling has tested and definitely does *not* share DNA with the match. In that case, check the box ‘Sibling has tested but shares no segments of DNA with the match’ so that this sibling is taken into consideration
  • Once you’re ready, click the ‘Estimate Shared cM for Parent’ button

What happens next

The Shared cM Investigator:

  • Calculates the total cM for the distinct segments that the siblings share
  • Takes this total and divides it by F, where $F = 1 - (1/2)^{|S|}$
    • (where $|S|$ is the number of siblings used)
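The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual code: the function names are made up, and each segment is given as (chromosome, start, end) with positions already expressed in cM, which is a simplification (real segment exports list base-pair positions alongside a cM length, and how the tool handles that is not described here).

```python
from collections import defaultdict


def merge_segments(segments):
    """Collapse overlapping (chromosome, start, end) intervals into distinct ones."""
    by_chromosome = defaultdict(list)
    for chromosome, start, end in segments:
        by_chromosome[chromosome].append((start, end))

    distinct = []
    for chromosome in sorted(by_chromosome):
        intervals = sorted(by_chromosome[chromosome])
        current_start, current_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= current_end:
                # Overlaps (or touches) the running interval: extend it.
                current_end = max(current_end, end)
            else:
                distinct.append((chromosome, current_start, current_end))
                current_start, current_end = start, end
        distinct.append((chromosome, current_start, current_end))
    return distinct


def estimate_parent_cm(sibling_segments):
    """Return (total distinct cM, estimated parent cM).

    sibling_segments holds one list of segments per tested sibling.
    Assumption: positions are in cM, so a segment's length is end - start.
    """
    n_siblings = len(sibling_segments)
    if n_siblings < 2:
        raise ValueError("the tool expects data for two or more siblings")
    pooled = [seg for segs in sibling_segments for seg in segs]
    distinct = merge_segments(pooled)
    total_distinct_cm = sum(end - start for _, start, end in distinct)
    coverage = 1 - 0.5 ** n_siblings  # F = 1 - (1/2)^|S|
    return total_distinct_cm, total_distinct_cm / coverage


# Hypothetical data: the siblings overlap on chromosome 1.
sibling_1 = [(1, 10.0, 40.0), (5, 0.0, 20.0)]
sibling_2 = [(1, 30.0, 60.0)]
total, estimate = estimate_parent_cm([sibling_1, sibling_2])
print(total, round(estimate, 1))  # 70.0 93.3
```

In this sketch, a sibling who has tested but shares nothing with the match would be passed as an empty list. That raises the sibling count to three, so F becomes 0.875 and the estimate for the same 70 cM drops to 80 cM.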

Please see Amy’s blog post for more technical background.

Results

The results page outputs:

  • The total distinct cM extracted from the sibling data
  • The estimated cM that the parent might have shared with the match based on the proportion of the parent’s DNA that this number of siblings typically accounts for.
  • The extracted segment data in a format ready to be pasted into a DNA Painter chromosome map
  • A link to the shared cM project for the estimated DNA amount so that you can explore possible relationships that the parent might have with the match.
The Shared cM Investigator results screen

Shared cM Investigator accuracy

Not surprisingly, the results will vary. Comparing the amounts that known parents share with the estimates output by the Shared cM Investigator (SCI), I can offer the data below, based on a limited number of examples from my own research:

SCI cM shared output    Actual cM shared    % diff
289                     314                 -8.65%
149                     143                 4.03%
84                      76                  9.52%
66                      66                  0.00%
50                      62.6                -25.20%
101                     94.6                6.34%
322                     311.8               3.17%
315                     311.8               1.02%
728                     724.3               0.51%
1160                    1208                -4.14%
My table showing a comparison of Shared cM Investigator estimates with the actual amounts shared. I generated these estimates using segment data from four siblings.

As you can see, the output from the tool was within 5% of the actual total cM shared in 60% of examples. It was within 10% in all but one of the remaining cases.

  • The tool underestimated the total by 25% in one case where the parent shared 62cM of DNA across three segments.
  • Notably, in the example above where the parent shared 66cM across 6 segments, the tool’s output was exactly the same as the actual amount shared.
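For anyone who wants to check these summary figures, here is a short Python sketch that recomputes them from the table. One assumption to flag: the formula for the % diff column is not stated in this post, but taking the difference as a share of the SCI output (rather than of the actual amount) reproduces every published value, so that is what the sketch uses.

```python
# (SCI estimate, actual cM shared by the parent), copied from the table above.
rows = [
    (289, 314), (149, 143), (84, 76), (66, 66), (50, 62.6),
    (101, 94.6), (322, 311.8), (315, 311.8), (728, 724.3), (1160, 1208),
]


def pct_diff(estimate, actual):
    # Assumption: the difference is expressed relative to the SCI estimate.
    return (estimate - actual) / estimate * 100


diffs = [pct_diff(estimate, actual) for estimate, actual in rows]
within_5 = sum(abs(d) <= 5 for d in diffs)
within_10 = sum(abs(d) <= 10 for d in diffs)

for (estimate, actual), d in zip(rows, diffs):
    print(f"{estimate:>6} {actual:>7} {d:>8.2f}%")
print(f"within 5%: {within_5} of {len(rows)}")    # within 5%: 6 of 10
print(f"within 10%: {within_10} of {len(rows)}")  # within 10%: 9 of 10
```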

To state the obvious, the output of the tool is always an estimate:

  • The parent might have shared a large additional segment of DNA with a match that no tested children have inherited. In these cases, the total will be underestimated.
  • The parent might have passed absolutely all the DNA they share with a match down to all tested children. In these cases the tool may over-estimate the total, particularly if only two siblings are used.
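To put rough numbers on both scenarios, here is a small Python sketch using the same division by F. The 100 cM and 40 cM figures are invented purely for illustration, and the helper name is hypothetical.

```python
def estimate_from_observed(observed_cm, n_siblings):
    """Divide the distinct cM seen in the siblings by F = 1 - (1/2)^|S|."""
    return observed_cm / (1 - 0.5 ** n_siblings)


true_parent_cm = 100.0  # hypothetical amount the parent shares with the match

# Over-estimate: every tested child inherited all 100 cM.
over_two = estimate_from_observed(true_parent_cm, 2)
over_four = estimate_from_observed(true_parent_cm, 4)
print(round(over_two, 1), round(over_four, 1))  # 133.3 106.7

# Under-estimate: a 40 cM segment reached none of the two tested children,
# so only 60 cM is visible in their data.
under_two = estimate_from_observed(true_parent_cm - 40.0, 2)
print(round(under_two, 1))  # 80.0
```

With two siblings the divisor is 0.75, so a fully inherited 100 cM is inflated by a third. With four siblings the divisor is 0.9375 and the inflation shrinks to under 7%.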

Another slightly inevitable caveat:

  • With apologies to those with endogamous ancestry or pedigree collapse, the tool assumes that the shared DNA comes from just one parent. If this isn’t the case, the result is likely to be overestimated.

I hope you find the tool helpful. As ever, all feedback is welcome.

Contact info: @dnapainter / jonny@dnapainter.com