When you are looking for family using DNA, you can get a lot of data to sort through. Haplogroups can be interesting, but do they really help find family? I say no, not really. Here is a great article by Roberta Estes at DNA-Explained, “We Match But Are We Related?”, that addresses this question. You can follow the link to read the entire article and see the charts.
Last week, I received this question from a reader, we’ll call him Jim:
“I match Susie on the HVR1 and HVR2 regions of our mitochondrial DNA….but I was just wondering….are we related?”
Well, the answer is yes…and maybe.
You see, the answer hinges on the definition of the word “related.”
If Jim means related at any point in time, the answer is yes. If Jim and Susie share the same haplogroup, at any level, then they did indeed share an ancestor at some point in the past. The question is – how long ago? And that part of the answer isn’t easy.
Now, if what Jim means is related in the sense of “in a genealogically meaningful timeframe,” which is generally anytime from the present back in time roughly 500 or maybe as long as 800 years….the answer is a resounding maybe.
And of course, the answer differs a bit, depending on whether you’re talking about mitochondrial DNA, Y DNA or autosomal DNA.
Let’s look at all 3 types of DNA tests.
First, Jim doesn’t have enough information to make that “genealogically meaningful” determination. To do that, he and his match both need to test at the full sequence level for mitochondrial DNA. The full sequence test covers all 16,569 locations of the mitochondria, whereas the HVR1+HVR2 test covers only 1135 locations. Family Tree DNA is the only testing company to provide this level of testing.
Jim needs more information.
If Jim and Susie match at the full sequence level too, then the genealogical timeframe becomes possible. If they match with no mutations, meaning a genetic distance of zero, it becomes even more likely, but it’s certainly not a given – nor is figuring out who the common ancestor might be. For example, below are my closest full sequence matches and my most distant matrilineal ancestor was from Germany. Most of these matches are Scandinavian.
However, exact full sequence matches are where you start to look for a common ancestor. No common ancestor found? Then at least look for common geography.
One of the easiest ways to do that, for both mitochondrial DNA and Y DNA, at Family Tree DNA, is by utilizing the Matches Map, available on your toolbar.
Assuming your matches have completed their most distant ancestor’s location (which is not always the case,) it’s easy to look for match groups and clusters on the map. Your most distant ancestor’s balloon will be white, with your matches color coded. You can click on any of the balloons to see the match, their ancestor and location. These are my full sequence matches. Surprisingly, my closest matches aren’t in Germany at all!!! Hmmm….time to start looking at what happened in history that might account for this population movement.
In many cases, people will match at the HVR1 and HVR2 levels, but not match at higher levels. In fact, they may both be haplogroup H (for example) at the HVR1 and HVR2 levels, but the full sequence testing refines their haplogroups and their extended haplogroups may no longer match each other. For example, Jim’s refined haplogroup could be H2 and Susie’s H6. Both are subgroups of H, which was born roughly 12,800 years ago, according to “A ‘Copernican’ Reassessment of the Human Mitochondrial DNA Tree from its Root” by Behar et al, published in The American Journal of Human Genetics 90, 675–684, April 6, 2012.
So, yes, Jim and Susie are definitely related in the past 12,000 years – but I’m not thinking this is what Jim was really asking. I refer to this as “haplogroup cousins.”
However, a lot has happened in 12,000 years. As in, mutations happened, and subgroups emerged. So while Jim and Susie might both be members of haplogroup H, they are not both members of the same subgroup, so their ancestors both developed mutations which classify them into subgroups H2, born not long after H was born, and H6, born about 11,000 years ago.
So, the bottom line is if you don’t match at the full sequence level, you’re not related in a genealogically meaningful time frame. If you do match at the full sequence level, you might be related in a genealogically meaningful timeframe.
A couple years ago, I set about looking at mitochondrial DNA mutation rates and discovered that the only academic paper published that addressed this in the HVR1, HVR2 and coding regions was written about penguins. Not exactly what I was looking for, but it does explain why there is no TIP type calculator for mitochondrial DNA.
Family Tree DNA does provide some guidelines in their learning center.
Matching at the HVR1 level means that you have a 50% chance of sharing a common maternal ancestor within the last fifty-two generations. That is about 1,300 years.
Matching on HVR1 and HVR2 means that you have a 50% chance of sharing a common maternal ancestor within the last twenty-eight generations. That is about 700 years.
Matching exactly on the Mitochondrial DNA Full Sequence test brings your matches into more recent times. It means that you have a 50% chance of sharing a common maternal ancestor within the last 5 generations. That is about 125 years.
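The year figures in these guidelines come from multiplying the number of generations by a generation length. Here is a minimal sketch of that arithmetic, assuming 25 years per generation, which is the value the quoted figures imply. The function name is only illustrative.

```python
# Assumption: 25 years per generation, which is what the quoted figures imply
# (52 generations -> about 1,300 years).
YEARS_PER_GENERATION = 25


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a number of generations into an approximate span of years."""
    return generations * years_per_generation


# The three 50% thresholds quoted from the Family Tree DNA learning center.
thresholds = {"HVR1": 52, "HVR1+HVR2": 28, "Full sequence": 5}

for level, generations in thresholds.items():
    years = generations_to_years(generations)
    print(f"{level}: {generations} generations is about {years} years")
```

If you prefer a longer generation, say 30 years, pass it as the second argument and the estimates stretch accordingly.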
I personally think that the 5 generation estimate for an exact full sequence match is overly optimistic. In fact, a lot overly optimistic. I do find people who share common ancestors at the full sequence level, but it’s the exception and not the rule – although part of that may be because the surname changes every generation so it’s genealogically difficult to track. However, genealogical matches would be much more common if more people tested their mitochondrial DNA.
You can see a good example in this article of how mitochondrial DNA told me a story I didn’t know about my matrilineal line – and would never have known without full sequence testing.
What I didn’t include in this article is that many of my mitochondrial DNA matches shared their mutation information with me, and I created a “tree” that showed exactly where each mutation happened and who shared a common ancestor with whom.
I obviously can’t share that chart publicly, but the chart below conveys the methodology. The oldest known ancestors of these matches lived in the locations listed at the bottom of the chart.
In the above case, you can clearly see that it’s very likely that the founder lived in Scandinavia because at least some of the descendants of all three unique mutation groups, A, B and C live in Scandinavia today. However, Mutation J is found in Germany. This suggests that sometime after the common mutation, F, an individual migrated from Scandinavia to Germany. Mutation K, who also shares mutation F, is still in Scandinavia today.
It’s a bit easier to answer the “are we related” question for Y DNA because the surnames are often the same. So yes, if you match on STR markers (those are panels for 12, 25, 37, 67 and 111 markers) and you carry the same surname, you’re likely related in a genealogically relevant timeframe. Don’t you hate it when you see those weasel words like “likely?”
However, if your surname is Smith, or something else very common, and you only match at 12 markers, and you don’t match at higher levels, then again, you’re probably a haplogroup cousin. Names like Smith and Miller are occupation names and every village across continental Europe had at least one at all times. So, there are lots of Smiths and Millers that have the same base haplogroup and aren’t related in a genealogically meaningful timeframe.
You can see an example of this in my Miller-Brethren project. These are Miller families, German in origin, who belonged to the small German Brethren religious group.
I thought this would be a relatively small, easy project, but not so much. There were a lot more genetically different Miller surname groups even within the small Brethren church than I expected.
As you can see, many of these groups share haplogroups, especially major haplogroups like R-M269.
In some groups, some individuals have tested additional SNPs by taking either individual SNP tests, the Big Y or SNP panel tests, offered on their individual pages.
So, for example, you may see the haplogroup designations of R-M269 and R-CTS7822 in the same family grouping where the STR markers match exactly or nearly. Confusing? Yes, but that means that one individual had taken additional testing. If you look at the haplogroup trees, you would see that CTS7822 is downstream of M269 in haplogroup R.
The important thing for finding genealogically relevant matches is matching high numbers of STR markers. I encourage everyone to test at 67 markers, and I like to see 111 if the budget allows.
If you match someone at 67 markers, exactly, there’s a very good chance you’re very closely related.
For example, cousin Rex matches cousin Richard at 67 markers with only 3 differences. I happen to have their genealogy, and I know when these two men’s lines diverge. They descend from two different sons of Michael Miller (Mueller) who was born in 1692. Three cumulative Y STR mutations have happened since that time in these men’s two lines.
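Counting “differences” between two men can be pictured as adding up how far apart their marker values are. The sketch below assumes a simple stepwise model where each one-repeat change counts as one mutation. It is not necessarily how Family Tree DNA calculates genetic distance, and the marker values are made up for illustration, not Rex’s and Richard’s actual results.

```python
def genetic_distance(haplotype_a, haplotype_b):
    """Sum of absolute repeat-count differences over markers both men tested.

    Assumption: a simple stepwise model where each one-repeat change counts
    as one mutation. A testing company may treat some markers differently.
    """
    shared_markers = haplotype_a.keys() & haplotype_b.keys()
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared_markers)


# Hypothetical marker values, for illustration only.
man_1 = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS385a": 11}
man_2 = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10, "DYS385a": 12}

print(genetic_distance(man_1, man_2))
```

Here three markers differ by one repeat each, so the two hypothetical men are three steps apart, the same kind of count as “3 differences at 67 markers.”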
Rex’s haplogroup is R-M269, but Richard took the Big Y test, so his haplogroup is shown as R-CTS7822 and he now sits as proxy for the rest of the Michael Miller descendant group.
Y matches have access to the TIP calculator, that little orange box shown on the match page above to the right of each match’s name. The TIP calculator provides generational estimates to a common ancestor, weighted by haplogroup marker mutation frequency.
The TIP calculator shows us that, based on their mutations at 67 markers, these two men are most likely to be related between 6 and 7 generations. At the 50th percentile, they are as likely to be related sooner as later, so the 50th percentile is the number I tend to use for an estimate of the distance to the most recent common ancestor.
In fact, their common ancestor is 7 generations ago, counting their parents as generation 1.
The more markers tested, the more data you, and the TIP calculator, have to work with. I’ve found the TIP calculator to be quite accurate at 67 and 111 markers when using the 50th percentile as a predictor.
What? You say you don’t match anyone with your surname?
That’s more common than you think.
One of two things could have happened.
First, your paternal surname line may simply have not tested yet.
You may be able to search in the appropriate surname project and find a group of people who descend from “your” ancestor with different DNA. That’s a pretty big hint too, assuming the genealogy is accurate. If the genealogy is accurate, and your line is the “odd man out,” the next question is always “when did the genetic break occur,” and why. That leads us to the second scenario.
Second, there could be an undocumented adoption in your line. I’m using undocumented adoption in the most general sense here, meaning anything from a child taking a stepfather’s name to a true adoption. The surname does not match the biological line and we don’t know why – so some “adoption” of some sort took place someplace.
The question is, one or two?
I first ask people if they really want to know the answer, because once you pursue this avenue, you can’t close Pandora’s box.
If the answer is yes, they are sure, then I suggest they find a male with their surname that they know should be related and test him.
The answer will become obvious at that point, and the test plan from there forward should reflect the discovery from that test.
The question of “are we related” can be murkier when discussing autosomal DNA.
On the other hand, like with Y DNA, the answer can be very evident.
In fact, there is an entire spectrum of autosomal DNA matches and I wrote about how much confidence you should put in each type.
But let’s get down to the very basic brass tacks.
There are only two ways you can match someone’s autosomal DNA.
Either you share a common ancestor or you are matching by chance.
When you receive DNA from your parents, that DNA came from their ancestors as well. All of the DNA you receive from your parents came from some ancestor.
Then, how can you match someone by chance?
You have two strands of autosomal DNA. Think of two lanes of a street. However, the houses on both sides of that street have the same address. Your Mom’s DNA value goes in front of one house, in one lane, and your Dad’s goes in front of the house with the same address in the other lane, but we don’t know whose DNA is whose and there is no consistency in whose DNA goes in which lane.
You can see in this example that you received As in all positions from Mom and Cs in all positions from Dad. However, these alleles can be positioned in either your strand 1 or 2, so the entire roughly 700,000+ locations typically tested for genealogy is mixed between Mom and Dad. So, there is no way to tell, just by looking at your DNA, which DNA in any position (strand 1 or 2 at any address) came from whom.
You can also see, looking at the chart above, that if someone matches you on all As, they match you on your Mom’s side, and if they match you on all Cs, they match you on your Dad’s side. This is called identical by descent. This means, yes, you are related.
But what happens if someone has ACA? They match you too, by zigzagging back and forth between your Mom and Dad’s DNA. That’s called identical by chance, and it’s not a valid genealogical match. This means, no, you’re not related, at least not on this segment.
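The difference between the two kinds of match can be sketched in a few lines. This is a simplified illustration, assuming you already know which value at each address came from Mom and which from Dad (which, as noted above, your own results alone do not tell you) and treating the match as a single value per address. Real matching works on much longer segments, and the function name is hypothetical.

```python
def classify_match(mom_alleles, dad_alleles, match_alleles):
    """Classify a match over one run of addresses.

    Returns "maternal" or "paternal" when the match follows one parent's DNA
    the whole way (identical by descent), "by chance" when it only matches by
    zigzagging between the two parents, and "no match" otherwise.
    Assumption: all three strings cover the same addresses in the same order.
    """
    if match_alleles == mom_alleles:
        return "maternal"
    if match_alleles == dad_alleles:
        return "paternal"
    zigzag = all(
        value in (mom, dad)
        for mom, dad, value in zip(mom_alleles, dad_alleles, match_alleles)
    )
    return "by chance" if zigzag else "no match"


# The example from the text: all As from Mom, all Cs from Dad.
print(classify_match("AAA", "CCC", "AAA"))
print(classify_match("AAA", "CCC", "CCC"))
print(classify_match("AAA", "CCC", "ACA"))
```

The ACA case is the zigzag: every value can be found at the right address, but only by switching between Mom’s lane and Dad’s lane.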
I wrote more about this phenomenon and tools to work with your DNA in “One Chromosome, Two Sides, No Zipper.”
How can you tell the difference between identical by descent (related) and identical by chance (not related)? Therein lies the big question.
If you match someone who also matches one of your parents, then you match them through that side of your family – identical by descent from a common ancestor.
Don’t have parents to test? Then how about your parents’ siblings, aunts, uncles, first cousins….etc. Often the best way to tell if a match is a legitimate match is by who else they match that also matches you. This is why we encourage people to test all of their relatives!
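Checking a match against a tested parent or other relative amounts to comparing match lists. Here is a minimal sketch, assuming you have the names of your matches and of one tested relative’s matches as simple sets. The names are hypothetical, and a careful comparison would also confirm that the match falls on the same segment.

```python
def split_by_relative(my_matches, relative_matches):
    """Separate my matches into those a tested relative also matches and the rest."""
    shared = my_matches & relative_matches
    unconfirmed = my_matches - relative_matches
    return shared, unconfirmed


# Hypothetical match lists, for illustration only.
my_matches = {"Ann", "Bob", "Cara", "Dan"}
mom_matches = {"Ann", "Dan", "Eve"}

shared, unconfirmed = split_by_relative(my_matches, mom_matches)
print("Also match Mom:", sorted(shared))
print("Not confirmed through Mom:", sorted(unconfirmed))
```

A match left in the unconfirmed group is not necessarily false. It may come through the other side of the family, or it may be identical by chance, which is why testing more relatives helps.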