Frequently Asked Questions about WATO

Some answers to common questions about 'What are the Odds?' (WATO) are listed below. Click on a question to see the answer. I would be very grateful if you could email any suggested improvements or additions to this page to info@dnapainter.com.

Before you start, please bear in mind that WATO is an advanced tool. Before you use it:

  • You should have a clearly stated research question along the lines of “Who was the parent of X,” where X is either someone who has taken an autosomal DNA test or is a recent direct ancestor of such a person. A good idea is to state the research question in the “Notes” field below the tree.
  • You must already have done enough genealogical work to have identified a person or couple who are the Most Recent Common Ancestor(s) (MRCA) of two or more DNA matches, and
  • You need to know precisely how those DNA matches are descended from this couple.

If you're not quite at this point yet, use the family trees of your matches and the Shared Matches/in common with (ICW) feature at your testing company to try to fit the matches into a descendant tree for an MRCA couple.

Please also remember:

  • WATO is still a new and experimental 'beta' tool that may change in future. It is therefore advisable to use the results to identify the best hypotheses for further investigation and not to rely on them as indicating a definitive answer. Further work on your hypotheses might include additional DNA tests or paper-trail research.

General

WATO is designed to help you figure out where someone, called the target, might fit into a known family tree by using the amount of DNA they share with people in that tree. It provides a simple interface to build the skeletal tree in which you enter the descendants of an individual or couple. You might be researching an unknown parentage case, in which case this tree would include people who share DNA with the target. Or you might simply be curious about an unknown match and be looking for a simple way to map out possible relationships based on the amount of DNA your family members share with them.

Once you've built the tree, you can enter the number of centimorgans the target shares with any matches in the tree and then enter hypotheses, potential places where the mystery person might fit into the tree. WATO will then generate a score for each of the hypotheses. The higher the score, the more statistically likely this hypothesis is to be correct.

The scores indicate how your hypotheses compare to one another. First, any hypothesis that is impossible given the data gets a score of zero. Then the possible hypotheses are ranked, starting with a score of 1. When more than one hypothesis is possible, each higher score is a direct comparison with the hypothesis that scores 1. For example, if you have three hypotheses with scores 100, 5, and 1, the highest is 100 times more likely than the lowest and 20 times more likely than the second-place hypothesis.
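The rescaling described above can be sketched in a few lines of Python. This is only an illustration and not WATO's actual code: the function name `relative_scores` and the input values are hypothetical, and it assumes each hypothesis already has a single combined likelihood, with zero meaning the data rule it out.

```python
def relative_scores(likelihoods):
    """Rescale likelihoods so the least likely possible hypothesis scores 1.

    `likelihoods` maps a hypothesis label to an assumed combined likelihood;
    a value of 0 marks a hypothesis that the data rule out.
    """
    possible = [v for v in likelihoods.values() if v > 0]
    if not possible:
        # Every hypothesis is impossible, so every score is zero.
        return {name: 0.0 for name in likelihoods}
    baseline = min(possible)
    return {
        name: (v / baseline if v > 0 else 0.0)
        for name, v in likelihoods.items()
    }


# Made-up likelihoods chosen to reproduce the 100 / 5 / 1 example.
example = {"Hyp 1": 50.0, "Hyp 2": 2.5, "Hyp 3": 0.5, "Hyp 4": 0.0}
scores = relative_scores(example)
print(scores)
```

With these made-up inputs, Hyp 1 comes out 100 times more likely than Hyp 3 and 20 times more likely than Hyp 2, while Hyp 4 stays at zero.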

Any score of 1 or higher is considered possible, while any score of zero is not. A single incorrectly placed person in the tree can potentially nullify a hypothesis, making its score zero. To see why a score is zero, make a note of the hypothesis number and scroll down to the 'Collated Match Data' table. Any instances where the cMs shared are not statistically viable for the relationship will be shown in red.

In general, you need to consider the most likely hypothesis compared to the next one. The absolute value can be very high, but it is relative scores that matter. If you have several hypotheses and some have very high scores, then this may mean the lowest scores are very unlikely (again, the collated match data table can confirm this).

For example, if you have scores of 1800, 900, 5 and 1, the hypotheses with scores of 5 and 1 are very unlikely and can probably be ignored, but the two highest scores are very close, as one is only twice the other.
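One way to make this comparison concrete is to divide the top score by the runner-up. The helper below is a hypothetical sketch for checking your own results by hand; it is not a feature of WATO.

```python
def lead_over_runner_up(scores):
    """Return how many times more likely the top hypothesis is than the next one.

    Scores of zero (impossible hypotheses) are ignored. Returns None when
    fewer than two possible hypotheses remain, since there is nothing to compare.
    """
    possible = sorted((s for s in scores if s > 0), reverse=True)
    if len(possible) < 2:
        return None
    return possible[0] / possible[1]


print(lead_over_runner_up([1800, 900, 5, 1]))  # 2.0
print(lead_over_runner_up([100, 5, 1]))        # 20.0
```

A ratio close to 1 means the match data barely separate the two leading hypotheses, however large the absolute scores are.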

However, if you have several hypotheses but only one is possible, it will only ever have a score of 1, and this doesn't necessarily make it any less meaningful than a score of 1800!

Another thing to remember is that the calculations involve an approximation which causes the results to be too confident when several closely related individuals are included. If you have matches to siblings, it is worth testing how sensitive the results are to removing all but one of the group of siblings.

More on scores

The tool is not meant for endogamous populations. Endogamy may significantly affect the scores, and we don't yet have good probability data to figure out how.

Recent pedigree collapse (for example grandparents who are first cousins) is also likely to have a significant effect, meaning WATO will incorrectly rule out some relationships that are possible. However, more remote pedigree collapse may not affect the numbers a great deal.

The target person in WATO would normally be someone whose results you have access to. This is because you need to enter the amounts of DNA they share with people in the tree. But if you don't know who a match is, you can still use them as the target and try to fit them into your tree, if they have tested at 23andMe, MyHeritage or GEDmatch. These sites all allow you to see how much DNA your matches share with each other, so you will be able to get the relevant cM amounts as follows:

  • MyHeritage: within the shared matches area of the 'match detail' page.
  • 23andMe: by comparing your match with people in your tree via the DNA Comparison tool.
  • GEDmatch: by comparing your match's kit number with those of people in your tree via the 1:1 comparison tool.

Not in most cases. For more info please read this article.

Age and birth years

If you don't know this person's birth year, you don't have to enter it, but if you would like to use WATO's 'suggest hypotheses' feature, then you'll need to enter an estimated year.

WATO has the following assumptions built in. Hopefully some precision can be added to these in future with the addition of statistics that take into account the likelihood of people becoming parents at different ages:

  • The minimum age for a female to be considered a feasible parent is 12 and the maximum is 55
  • The minimum age for a male to be considered a feasible parent is 12 and the maximum is 75
  • The maximum age gap for half siblings is 30 years
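The limits listed above can be expressed as simple checks. The sketch below is hypothetical and is not WATO's own code: the function names are made up, ages are approximated as the difference between birth years, and treating the limits as inclusive is an assumption.

```python
# Limits taken from the list above.
MIN_PARENT_AGE = 12
MAX_MOTHER_AGE = 55
MAX_FATHER_AGE = 75
MAX_HALF_SIBLING_GAP = 30


def feasible_parent(parent_birth_year, child_birth_year, parent_sex):
    """Check whether the parent's approximate age at the child's birth is within limits.

    `parent_sex` is "F" or "M". Inclusive bounds are an assumption.
    """
    age_at_birth = child_birth_year - parent_birth_year
    max_age = MAX_MOTHER_AGE if parent_sex == "F" else MAX_FATHER_AGE
    return MIN_PARENT_AGE <= age_at_birth <= max_age


def feasible_half_siblings(birth_year_a, birth_year_b):
    """Check whether two birth years fall within the half-sibling age gap."""
    return abs(birth_year_a - birth_year_b) <= MAX_HALF_SIBLING_GAP


print(feasible_parent(1900, 1956, "F"))    # False: the mother would be 56
print(feasible_parent(1900, 1956, "M"))    # True: the father would be 56
print(feasible_half_siblings(1900, 1931))  # False: a 31-year gap
```

This is why an estimated birth year matters: without one, checks of this kind cannot narrow down which positions in the tree are feasible.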

Building the Tree

The very first thing you should do is to clearly state the question you are trying to answer. Your question will take the form of “Who is the parent of X?” where X is either the target person or a recent direct ancestor of the target person.

WATO lets you hypothesise about where someone might fit into a tree. That person is the “target”. A hypothesis is a position within that tree where you think they might fit.

You can still build the tree manually if you prefer, but you can also now import a GEDCOM via the 'Load' button in the top right menu. After you browse for your file, the site will ask you to select a person or couple whose descendants you would like to import.

There are two constraints on GEDCOM size that might affect you if you have a very big tree:

  • There's a 60MB size limit on files. This is so that the list of people to choose from doesn't get so big that the browser can't load it.
  • If you select a couple with a huge number of descendants, you may encounter problems when importing and/or navigating your tree.
    • The number that qualifies as 'huge' will vary depending on how fast your computer is and how much RAM it has available.
    • Trees with 250-500 descendants of the selected root person/couple should work fine, and more might work if you have a fast computer.
    • Once you get into thousands of descendants, things are likely to be very sluggish.

Not at this time.

These indicate different half-sibling lines. If you click 'suggest hypotheses', WATO will often create these lines to explore these possibilities. The different colours represent different groups of half siblings with each group having one different parent.

To mark/check half-siblings in the tree, hover over a node and click the 'Define half relationships' button.

You can use matches from any of the companies in WATO. You can even use matches from different sites in the same WATO tree. For AncestryDNA, 23andMe, and MyHeritage, use the shared DNA amounts as given by the company. For GEDmatch matches, do a one-to-one comparison using the default settings.

You can include both if you wish, but WATO will only consider the parent in the calculations.

Strictly speaking, there is no minimum, but the statistics underlying WATO do not go below 40 cM, so the majority of your DNA matches should share at least 40 cM. The more they share, the better WATO works.

The Ancestry simulations on which this is based only go to 9th degree relatives, meaning 4C, 3C2R, 2C4R, half-3C1R, etc., and down to 40 cM. That is the limit of relationships and sharing that it will work well for. Beyond that we have made some approximations. The analysis will not be accurate unless the majority of relationships are closer than 9th degree and the majority of matches are over 40 cM.

You can, but with only yourself as a data point, the tool will not be as useful.

A low match, or even the absence of a match, is significant in that it can correctly rule out hypotheses that might otherwise have been possible. For example, if you can confirm that you share no DNA with someone, then any hypothesis that this person is a 2nd cousin or closer would be ruled out.

You don’t have to add all of the descendants of your root couple, but it’s recommended to add all of the children of that couple who survived to adulthood, whether their descendants have tested or not.

You can add them to the tree as a visual reminder, if you like, but do not include their amount of shared DNA with you. This is because you already know how you're related to them.

Alternative: A user with a lot more experience than me showed me a trick for how to capture a sibling's information in the chart. While the sibling himself was not added to the chart, she recommended that I add the amount of DNA he shared with each match. So, for example, if Bob and Ken are brothers and they both match Diane, I would add just one of the brothers to the tree as the hypothesis… but then add the amount of DNA each man shared with Diane, as if Diane has a sibling. To help, label one Diane-Bob and the other Diane-Ken. This captures the amount of DNA both brothers share with Diane.

Because siblings inherit different amounts of DNA from each grandparent, you will share different amounts of DNA with your matches. Sometimes, this results in one hypothesis ranking highest for one sibling and another hypothesis ranking highest for another sibling. This is normal. Focus your next research efforts on the parts of the tree that rank high for both of you.

Per the requirements listed in the tool, you must have the amount of shared DNA between the target person and the other tested members in the family group.

The target person labelled with hypothesis numbers is the person who has tested. This means that for cases like this, you would need to repeat the part of the tree that includes the path from Z down to you in every place where it could fit.

For example, perhaps you have tested and your maternal grandfather's parents are unknown. If your hypotheses are that he was the son of either Mary or Martha, you need to repeat the piece of tree with him, your mother and yourself in the various positions he might fit in, and mark each repetition of yourself as a hypothesis.

Currently only one person can be the hypothesis. The easiest way to handle this situation is to build the tree and copy it for each tested person, then adjust the matching cM numbers. This may rule out some hypotheses and indicate which ones to pursue further. There is an advanced technique called “twinning” that can be used in this situation, but we strongly recommend that you be completely comfortable with WATO and how it works before trying it.

Adding hypotheses

You should try to consider any position in the tree that is technically possible based on the genealogical information you have, not forgetting to include possible unknown half-sibling relationships.

Start with your top match and plug their cM amount into the Shared cM Project Tool. Use those possibilities to guide your initial set of hypotheses. Don’t forget to include half relationships where appropriate.

Yes. You will see a 'suggest hypotheses' button at the top left. This is something of a double-edged sword, particularly if you don't have many matches entered into the tree, because it will attempt to suggest every conceivable hypothesis, even when it may be only tenuously possible from a genealogy perspective. You can right-click and select 'remove hypothesis' to remove any that are unhelpful. The tool also has a button to remove all the suggested hypotheses.

Interpreting the Scores

No. WATO uses probabilities to steer you in the most likely direction, but cannot provide proof on its own. It can help point you towards which area of the tree you should focus your research next, whether that is testing more people or compiling more documentary evidence to have a more compelling case.

It’s counterintuitive, but this is the best-case scenario for WATO. Assuming you have considered all conceivable hypotheses (including unknown descendants and half siblings), this means that only one hypothesis is possible and is as conclusive as WATO can be.

A hypothesis with a score of zero is not possible considering the data at hand. If you are sure all of the matches are in the right place and the centimorgan amounts are entered correctly, you can safely ignore that hypothesis.

If you only have one possible hypothesis, then its score will always be 1, since the score is a calculation of how likely a hypothesis is when compared to the other possible hypotheses. If you have several possible hypotheses, then your hypothesis with score 1 is considered statistically less likely.

To see the details, click the green 'View score calculation' link at the top left. This will take you to a table where you can see the individual probabilities that have been used to calculate the score.

As noted above (in 'What do the scores mean?'), the scores are relative to all other possible hypotheses. So a score of 24,000 could be compelling, but not if there's another score of 23,900!

The scores are comparing your hypotheses. This shows that the match data you have is not enough to distinguish between the possibilities.

The scores are calculated by taking all possible hypotheses and comparing them. So if you have a hypothesis that is possible, but very unlikely, this may cause the most likely hypotheses to have extremely high scores. If you scroll down to the collated match data table you can see the individual probabilities that contribute to the score. You could try removing your lowest scoring hypothesis if you think it's sufficiently unlikely.
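The effect of removing the lowest-scoring hypothesis can be illustrated with a short sketch. It assumes, based on the description of the scores earlier on this page, that the remaining scores are rescaled so the lowest possible hypothesis is 1 again. The function name and the example numbers are hypothetical.

```python
def drop_lowest_and_rescale(scores):
    """Remove the lowest possible hypothesis and rescale the rest.

    `scores` maps a hypothesis label to its score. Zero-score (impossible)
    hypotheses are left out of the result. Assumes scores are relative to
    the lowest remaining possible hypothesis.
    """
    possible = {name: s for name, s in scores.items() if s > 0}
    if len(possible) < 2:
        return dict(possible)  # nothing sensible to remove
    lowest = min(possible, key=possible.get)
    remaining = {name: s for name, s in possible.items() if name != lowest}
    baseline = min(remaining.values())
    return {name: s / baseline for name, s in remaining.items()}


before = {"Hyp 1": 24000, "Hyp 2": 12, "Hyp 3": 1}
after = drop_lowest_and_rescale(before)
print(after)  # {'Hyp 1': 2000.0, 'Hyp 2': 1.0}
```

In this made-up case the top score falls from 24,000 to 2,000, yet Hyp 1 is still 2,000 times more likely than Hyp 2 both before and after. The very high number came from the presence of a very unlikely hypothesis, not from any change in how the leading hypotheses compare.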

Other resources for help with WATO
