Identical by State versus Identical by Descent

Started by Justin Durand on Saturday, January 24, 2015

Showing all 30 posts
1/24/2015 at 12:20 PM

Private User, you asked a question on another thread that I'm going to answer here.
http://www.geni.com/discussions/144233?msg=997058

I hadn't wanted to get this technical, but I think there are a few of us on Geni who might like to look at this kind of problem.

Even though we loosely talk about "matches", there are really two different kinds of matches:
- Identical by Descent (IBD)
- Identical by State (IBS)

IBD matches are the matches that come from having a common ancestor. You and your relatives match on a stretch of DNA because you all inherited that piece from a common ancestor.

IBS matches are coincidental matches caused by the DNA recombining. You match someone who is not really a relative because either you or the other person inherited DNA along that stretch from different ancestors, and the combination just happens to match the other person.

They don't look any different in the databases, so there's a lot of chatter in DNA newsgroups right now about how to tell them apart.

This is where size matters. The experts say anything less than 11 centimorgans (cM) is more likely to be a false positive (IBS rather than IBD). Many matches between 5 and 10 cM are likely to be IBS. For segments less than 5 cM, very few will be IBD. There was a study published in 2014 that looked at one-step relationships (parent-parent-child). It found 67 percent of all matches in the range 2 to 4 cM were IBS rather than IBD. And that's just one generation. Imagine what can happen over many generations.

This is the reason Gedmatch and Genome Mate set the default threshold to 7 cM. And it's the reason the experts are all over the Internet begging people not to look at anything less than 5 cM.
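
To make the size bands concrete, here is a minimal Python sketch that sorts segment lengths using only the cut-offs quoted above. The function names and the wording of the labels are made up for illustration, the "11 cM or more" band is just what the quoted rule implies, and this is not how Gedmatch or Genome Mate work internally.

```python
# Rule-of-thumb size bands taken from the figures quoted in this post.
# The names and labels are illustrative assumptions, not any tool's API.

DEFAULT_THRESHOLD_CM = 7.0  # default the post attributes to Gedmatch and Genome Mate


def size_band(cm):
    """Return a rough reading of one matching segment of length `cm`."""
    if cm < 5:
        return "under 5 cM: very few are IBD"
    if cm < 11:
        return "5 to 11 cM: more likely IBS than IBD"
    return "11 cM or more: more likely IBD"


def worth_a_look(segments_cm, threshold=DEFAULT_THRESHOLD_CM):
    """Keep only the segments at or above the chosen threshold."""
    return [cm for cm in segments_cm if cm >= threshold]


if __name__ == "__main__":
    for cm in (2.5, 7.0, 12.3):
        print(cm, "->", size_band(cm))
    print(worth_a_look([2.5, 7.0, 12.3]))
```

The bands are a reading aid only. A segment in the top band can still be IBS, and one in the bottom band can still be IBD.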

There is a more thorough explanation here:
http://www.isogg.org/wiki/Identical_by_descent
http://www.isogg.org/wiki/Identical_by_state

Private User
1/24/2015 at 12:52 PM

Thank you Justin, I understand and I have read articles on this. Do they take into account what happens to DNA when there are cousin marriages or pedigree collapse and does this increase the odds of matching segments?

Private User
1/24/2015 at 12:58 PM

This is the closest article I can find on matching segments in relation to cousin marriages and pedigree collapse, and on why segments less than 3 cM should not be ruled out as IBS "noise".

http://www.fi.id.au/2013/09/why-phasing-dna-is-bad-for-valid-and.html

1/24/2015 at 12:59 PM

Applying that to the problem of Edward III, Philippa of Hainault, Richard III, John Sinclair, and all their kin, you can start to see the problem.

First, we don't have their DNA for comparison. I guess "we" actually do have Richard III but as far as I know it hasn't been published yet. That means we are just guessing at what their DNA might have been. But finding a modern Sinclair match to someone who is a descendant of John Sinclair doesn't mean that match came from John, and even if it did, we have no way to know that John got it from Edward III and Philippa. We can play with ethnic origins utilities, but even if we find a touch of French or Hungarian there is no way to know it came from Philippa rather than someone else.

Second, our matches are more likely to be IBS rather than IBD. It's unlikely that any royal DNA really came down intact. For anything as far back as the 1200s, our segments have probably eroded away over the generations. If we do have bits and pieces left, they are going to be very small, so they will be indistinguishable from IBS.

Having the paper trail go through the male line doesn't help preserve segments on any chromosome except the y. All the chromosomes are recombining independently of one another with no way to know what the others are doing.
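
The erosion can be put in rough numbers. On average a child gets half of each parent's autosomal DNA, so the average share from one ancestor halves with every generation. The Python sketch below works that out. The 30 years per generation is an assumption for illustration, and real inheritance comes in chunks, so any one distant ancestor may have left nothing at all.

```python
# Average autosomal share from a single ancestor, halving each generation.
# YEARS_PER_GENERATION is an assumed round figure, not a measured value.

YEARS_PER_GENERATION = 30


def ancestor_slots(generations):
    """Number of ancestor slots in the tree at a given generation."""
    return 2 ** generations


def expected_share(generations):
    """Average fraction of autosomal DNA from one ancestor that far back."""
    return 0.5 ** generations


def generations_back(years):
    """Rough number of generations covered by a span of years."""
    return round(years / YEARS_PER_GENERATION)


if __name__ == "__main__":
    g = generations_back(2015 - 1250)  # someone living in the 1200s
    print(g, "generations,", ancestor_slots(g), "slots,",
          "average share", expected_share(g))
```

With the assumed 30 years per generation, the 1200s are about 25 generations back, where the average share is 0.5 ** 25, roughly 3 parts in 100 million.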

1/24/2015 at 1:32 PM

Wanda, part of the problem here is that Felix wrote that article before the 2014 study highlighted how severe the problem really is. I think he makes a good point about not throwing away 3 cM matches, but that's not the same as saying they're reliable ;)

Based just on my own experience, I have tons of matches that are no more than one 2 or 3 cM segment on one chromosome. I don't spend much time on those, although I do look at them.

I also have a lot of much larger 7 to 10 cM matches where the same person might share up to maybe 10 of those little 2 and 3 cM matches with me. I always take time to look at these in depth.

For me, this second pattern is very common with people who came from the same little area of Sweden where my ancestors lived. Also, one guy who is not interested in genealogy but who was born in the same little English village as one of my 2nd great grandmothers. I'm willing to grant that these matches could be IBS rather than IBD. That is, they might be nothing more significant than having ancestors in the same population group.

Still, I always check these. The ones I find particularly interesting are the ones where 2 or 3 of those tiny segments line up exactly for maybe 3 or 4 different people all with ancestors in the same village. Those are probably IBD, but our common ancestor might be so far back that there's no chance of finding the connection.

I get a similar pattern, but a little different, with people who share 3 or 4 or 5 different sets of Colonial ancestors with me. The large segments tend to be smaller and there aren't as many small segments. I think those are interesting but I don't have enough patience to track whether they are IBS or IBD. Too much work to sort through so many common ancestors. I just mark it in my database as "Kittery, Maine" (or whatever) and move on.

I think all highly inter-married groups experience these same kinds of matches. The message boards are full of Ashkenazi Jews who have these kinds of multiple, small, indefinite matches.

The problem with all of this is that once the intermarriage with close relatives stops, normal erosion eliminates the matches very quickly. If I have an ancestor in the 15th century whose parents were fairly close cousins, they would have had significantly more matches to all of their relatives, but over the centuries, without continuing cousin marriages, the evidence of the intermarriage would slowly disappear.

They say that "In a study of a European subset of the Population Reference Sample (POPRES) dataset it was estimated that for the most part IBD blocks longer than 4 cM come from 500 to 1,500 years ago, and blocks longer than 10 cM are within the last 500 years."

Based on that, I think it's better to focus on matches greater than 5 cM, going down to 3 cM only if there are also quite a few 1 and 2 cM matches.
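
That rule of thumb could be written down like this in Python. It is only a sketch: "quite a few" is not defined above, so the `min_small` cut-off of 5 is a made-up placeholder, and a match is represented simply as a list of segment lengths in cM.

```python
# Triage rule from the post: look at matches with a segment over 5 cM,
# and go down to 3 cM only when many tiny segments come along with it.
# `min_small` is a hypothetical stand-in for "quite a few".


def keep_match(segments_cm, min_small=5):
    """Decide whether one match (a list of segment lengths) is worth a look."""
    largest = max(segments_cm, default=0.0)
    if largest > 5:
        return True
    # Count the tiny segments, here taken to be anything under 3 cM.
    small = sum(1 for cm in segments_cm if cm < 3)
    return largest >= 3 and small >= min_small


if __name__ == "__main__":
    print(keep_match([7.2]))
    print(keep_match([3.4]))
    print(keep_match([3.4, 1.1, 1.8, 2.0, 2.2, 1.5]))
```

A single 3.4 cM segment on its own is dropped, but the same segment with five tiny companions is kept.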

Private User
1/24/2015 at 3:36 PM

That makes sense Justin :) I have had quite a few of those. It was because of the current cousins who also had smaller cM matches or our shared distant ancestors and a pedigree that we could confirm our matches by both DNA and pedigree. For example Catherine Morgan Bryan, the wife of William Smith Bryan. My cousin and I share an enormous amount of DNA because of that line and because of our shared West line. However, he does not share Catherine's husband William Smith Bryan. Because we share Morgan it validates that Morgan existed. With another cousin I share Bryan farther back, which validates that Bryan existed. With yet another cousin I share a Fitzgerald, which validates the existence of the Bryan line through the Fitzgerald line, and another Fitzgerald line validates that Fitzgerald existed. It's just a way to pinpoint and validate ancestors through those DNA, surname and pedigree matches.
My curiosity is about the Beaufort line. As of yet, I have had no Beaufort matches on FTDNA although they show up in Geni. (They may simply not be on there.) So I wonder to myself: having a Sinclair match would seem to validate the existence of Beaufort, for there are no Beauforts to test match with.

1/24/2015 at 3:45 PM

I just caught the end of this discussion but I wanted to say that both your questions, Private User, and answers, Justin Durand were awesome.

Private User
1/24/2015 at 4:57 PM

Thank you Jacqueline and Justin :) Although you believe that small cM matches of so long ago should not be relevant, why do they show up, why do the surnames match and why does the pedigree match? I think it goes with Felix's line of thinking that the cousin marriages and pedigree collapses create longer chunks of DNA segments passed down in future generations. I think it is those longer pieces that have been carried down through the centuries that are the reason that they show up at all. Just my opinion. While they are only chunks or segments in comparison to YDNA, I think they can be relevant in a different way, at least for matching cousins.

1/24/2015 at 5:30 PM

Wanda, the "official answer" across the DNA forums right now is that it's because of our high degree of interrelatedness in much more recent generations.

Using Sinclair as an example, I have a Sinclair ancestor who came to America in 1660. It's not a surprise, then, that I would have matches to other people who have the same Sinclair ancestor, but it doesn't tell us anything about that Sinclair's distant ancestry.

Pushing a step further, the odds would be astronomical in favor of any of us matching many people who have some kind of Sinclair ancestry.

It wouldn't be a surprise if I match someone who has a different Sinclair ancestor. Our match might be on Mayhew, but if the other person hasn't discovered their Mayhew line it would look to them like the match could be through Sinclair.

This is particularly true for the highly interrelated Americans with Great Migration ancestry.

So, while it's true that the DNA might match, and it's true that the surnames might match, and the pedigree might even match, it's never completely certain that the surnames and pedigree match because the DNA matches.

The significant piece here is that 2014 study. They compared 2 to 4 cM segments of children to each parent and discovered that in 67 percent of the cases the segment was a composite of the two parents. Not a real match.

That's mindblowing, I think. It means that 2/3 of the time we look at a segment less than 5 cM it's a false positive. Not a segment that came down intact from someone in our ancestry. I would not have guessed that from everything I read before.

Private User
1/24/2015 at 9:34 PM

Justin, that is true and that can happen. I think what they are not taking into account is the silent X. Every woman has two X chromosomes. Those X chromosomes contain DNA information of many grandparents. The X chromosome however is not consistent. It is completely random. The two children of the parents you were talking about inherited that X information from their parents, but the X skips generations, so those kids are more likely to inherit grandparent information rather than parent information because of recombination and the silent X. Children are never supposed to be carbon copies of their parents. It's amazing how it all works.

I kind of think of it like this; Generation 1 Jim and Ellen = 8 white marbles, Generation 2 Ted and Tracy = 8 Blue Marbles, Ned and Nancy = 8 Green Marbles, Generation 3 Roxanne and Ricky = 8 Red Marbles.
Let's say Jim and Ellen are first cousins so we give them 12 white marbles.

Generation 4: a baby called Alice is born. Alice has 36 marbles to pick from, called her ancestors. Let's say recombination is throwing all the marbles into the air at complete random, but before they fall we swipe 8 marbles and call that silent X. Then we take an additional 6 random marbles away so that Alice only gets 22 marbles, because that's all she has room for.
Alice gets 22 marbles, which stand for 22 chromosomes.
What color do you think the marbles are, and which ancestors will you see?

All the marbles that are left after recombination are the ones we have to work with to match our ancestors. It's a lot more complicated than that, because in real life the marbles that stand for centimorgans have millions of base pairs and they are not just one color, but in the simplest form the recombination is similar to that. It could be why matching DNA is such a spotty process and why some people don't match when they should (silent x), or it is just assimilated DNA taken up by the more recent generations. However, the first cousin marriage produced more white marbles, which hog up more space on the chromosome, thereby improving the odds of "matching to cousins".

Yes you can totally match a Sinclair with one person but not another. (silent x syndrome) I think the more people that participate the better the odds of confirming. We all do the best we can and there certainly are errors and always will be but I think the science is always evolving and improving.

For people who just can't commit to a match there's always the term "Appears to be a valid ancestor". It's good enough for me until the science gets better. :)

1/25/2015 at 12:10 AM

Wanda, I wonder if you can find another way to explain what you mean. I'm not following your example.

I'm also not sure what you mean by the "silent x". I can't think of a way or reason to describe any part of the x as silent.

I'm also confused when you say the x chromosome is completely random. All of the chromosomes are completely random, except the y, so I'm not coming up with some particular way the x is (even more?) random.

And, I'm confused by what you mean when you say the x skips a generation. The x recombines just like the other chromosomes.

I think part of the point you are making is something to do with some unique characteristics of the x. You and I understand them, I think, but it might be worth a quick explanation for anyone who doesn't.

Women inherit two x chromosomes, one from each parent, but men inherit an x from their mother and a y from their father. Simple enough, but this has very serious consequences for genetic genealogy.

The short version is that you can never inherit anything on the x chromosome from a line that goes through a father and son.

For example, I'm a man. I got an x from my mother and a y from my father. If I have a son he will get his y from me. If I have a daughter she will get my mother's x. In other words, there is no scenario where I can pass on an x chromosome from my father's side of the family.

For my sister it's quite different, but there's still a stopping point. She's a woman so she got an x from our father and an x from our mother. However, because our father got a y from his father, my sister can't get anything on the x from our father's father. The x she gets from our father is a re-combined x from the two x's of our father's mother. The x she gets from our mother is a recombined x from the two x's of our mother.

The result is a very odd inheritance pattern. Most people use a computer program rather than trying to sort it out with pencil and paper.

For DNA purposes I use a subset of my database that goes back only 15 generations. In those 15 generations I have 2,344 known ancestors (out of a potential 32,768), but only 86 of them are potential contributors to my x chromosome. Just a little over 3 percent of my entire known ancestry.

What's happening here is that most of us are more successful at tracing male lines than we are at tracing female lines, but the inheritance of the x chromosome tends to cut off male lines (without actually following just the female line).
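
The x inheritance rule described above (a man gets his x only from his mother, a woman gets one from each parent) is easy to turn into a small counting program. This Python sketch counts ancestor slots that could have contributed to the x, generation by generation. It counts slots in the tree, not known people in anyone's database, and the function name is made up for illustration.

```python
# Count potential x-chromosome contributors per generation.
# Rule from the post: a male's x comes only from his mother;
# a female's two x's come from her mother and her father.


def x_ancestor_counts(sex, generations):
    """Return the number of x-line ancestor slots for generations 1..n.

    `sex` is "M" or "F" for the person being tested.
    """
    males, females = (1, 0) if sex == "M" else (0, 1)
    counts = []
    for _ in range(generations):
        # Every x-line person has an x-line mother;
        # only the females also have an x-line father.
        males, females = females, males + females
        counts.append(males + females)
    return counts


if __name__ == "__main__":
    print("man:  ", x_ancestor_counts("M", 8))
    print("woman:", x_ancestor_counts("F", 8))
```

The counts follow the Fibonacci sequence. For a man the 15th generation comes to 987 slots out of 32,768, which is why the x line is such a small slice of the whole tree.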

Private User
1/25/2015 at 8:03 AM

Hi Justin, maybe this will explain it better than I can. http://dna-explained.com/2014/01/23/that-unruly-x-chromosome-that-is/

1/25/2015 at 9:40 AM

I remember this article. He is saying that the x chromosome does not seem to recombine as much as other chromosomes, so it is far more common to inherit the x intact or almost intact from a particular grandparent. Is that what you get out of it?

So maybe you are saying that the odds are increased that we would have matches on the x to very distant cousins and those matches would mean it's more likely we could discover something about very distant ancestors?

That would make sense. Until now I thought you were saying that each ancestor was leaving a tiny mark on the x, so any match, however small, would be significant. But of course that made no sense to me, given the limited number of people in the x line and also this idea that the x is "resistant" to recombination.

1/25/2015 at 10:00 AM

Right after this article came out, I was curious to see if I could figure out whether my x had recombined from my mother's parents or whether I had substantially more from one or the other.

My mother, my maternal half-sister, and my half-sister's son have all tested at 23andme. It doesn't do any good to compare children to their mothers because they would naturally match her 100%.

So I compared myself to my sister and her son. My sister and I match a little over 60 percent, which tells me that at least one of us has a recombined x. My nephew and I match percent, so he has a recombined x.

I thought I would be clever and try to use the Ancestry Composition tool to see if I could figure it out by looking at the projected ethnic percentages. This was before 23andme phased my mother's results, so I can't repeat it now, and it was just an off the cuff question so I didn't take notes.

The way I remember it, I used the pieces that were labeled French & German versus the Scandinavian pieces to figure out that about 80 percent of my x came from my mother's father and 20 percent from my mother's mother. My sister seemed to be about 50 percent from our mother's mother, but I couldn't determine the other 50 percent.

The bottom line is that I was able to determine that I have a recombined x, but I couldn't tell whether my sister does. I could probably use Genome Mate or some other program to figure this out more accurately, but I haven't done it yet.

One of the funny things about trying to duplicate this process today, I run into the problem that my mother's results have been phased against "one child" -- obviously my sister, not me. 23andme currently thinks my x is about a third French & German, but my mother doesn't have any French & German on either x ;)

Private User
1/25/2015 at 12:58 PM

The unruly X seems to make matching a little tricky. I think it's why we can find a line we match, then wonder why we don't match another line that we should. We never know who, where or when the X is going to be silent, so matching is spotty for sure.
For example I match de Warenne in my Pierce/Percy line. This is all the way back to the 1200s. Shouldn't show up, right? But it does. The nice thing about it is I can say that my line appears to be legit all the way from the 1200s to present for the Percys, Pierces and Pearces, because de Warenne validates the existence of my father's YDNA line, which covers about a thousand years. If there was any hanky panky going on then there would be no Pierce/Pearce/Percy and of course, no de Warenne.

Private User
1/28/2015 at 8:09 PM

Well, friends, my preliminary report is in. I am J2a1a1b. The rest of my information will be available in about 2 weeks.

Now I get to go back and re-read this thread and the Richard III thread to see if anything enlightening was said about J2a's.

Private User
1/28/2015 at 8:50 PM

Congratulations Maria :) Are you doing haplogroup only or?

Private User
1/28/2015 at 10:04 PM

23andme autosomal test - does that help? As knowledgeable as I am about so many other areas of genealogy and geni.com, I am a complete babe in the woods when it comes to this stuff. I'm going to be a diligent student and learn from those who can teach.

1/28/2015 at 10:30 PM

The 23andme test is equivalent to the FTDNA Family Finder test. You're on the right track, Maria. I'll be excited to hear about your results.

12/26/2016 at 11:58 AM

Good Morning and Seasonal Greetings Justin.

Less than a month ago, my daughter ordered a DNA kit through Amazon, thinking that she would be receiving a 23andMe kit. But instead they mailed a basic AncestryDNA kit for ethnicity. The folks at Amazon apologized for sending this kit because they were out of the 23andMe, and so they gave her credit for the difference in pricing.

I completed the test, but I am not satisfied with the results, having realized that this DNA test is basic and generic. All it does is show you what percentage of each ethnicity is within your DNA. I am not satisfied with this, because I already knew the DNA pathways to my ancestors' countries of birth. The DNA pie is interesting, but it does not help me in connecting my ancestors and cousins to my family tree.

So, I have ordered another kit (23andMe), which I heard is one of the best rated through ISOGG, and am now waiting for this 23andMe DNA kit.

Since I have the results from AncestryDNA, are they compatible with the Geni Family Tree db, or was it a waste of time taking this test?

Now, looking at the chart (pie), it is telling me what I already know about the origins and paths of ethnicity of my ancestors. Example:

Ireland 20%, Europe West 17%, and 63% from other regions: Native American 15% (from Alaska, North America, Central America, and South America, which could be from my mother's side, since she claims to be Mexican Indian with Spanish influences), North African 3%, Asia Central 2%, Italy/Greece 15%, Iberian Peninsula 12%, Scandinavia 8%, Great Britain 5%, Europe East 1%, and West Asia (Caucasus) 1%.

So, my question is this: is this test acceptable to Geni's Family Tree data pool at this time? Or should I wait and complete the 23andMe DNA test? Am I on the right track?

Please let me know.

Respectfully,

Ken

12/26/2016 at 12:24 PM

Currently you cannot upload a 23andme test to Geni. See http://help.geni.com/hc/en-us/community/posts/245270147-DNA-uploade...

12/26/2016 at 12:43 PM

Ken, you are on the right track.

Right at this moment the only test results that are being processed immediately at Geni are FTDNA's. However, you can upload other results, then Geni will process them when their system is ready.

Each of the testing companies has strengths and weaknesses. If you have a tree at Ancestry, you should start seeing DNA connections to your distant cousins over the next few months. I think you'll like that, and decide it wasn't a waste of time after all ;)

You might also want to upload your Ancestry results to Gedmatch.com. They accept results from all the major testing companies, so it's a good way to extend your matches.

Private User
12/26/2016 at 5:52 PM

Update on Richard III: they finally got around to publishing what they'd been able to unscramble of his Y-DNA. According to these results (they're online but you really have to dig for them), he probably belonged to haplogroup G2a, and the details, while fragmentary, showed that he was *not* a close match to (the reported results for) Thomas Plummer of Anne Arundel, MD - so if there was a common ancestor there, it was probably about the time the "Merovingians" were filtering into France from wherever and long before they became an "Important Royal Family".

Same for you too, Justin? :-)

Private User
12/26/2016 at 6:01 PM

Leonid: it's not that you *can't* upload 23andme results (or anybody's except FTDNA's), it's that we have no idea when they'll get around to it.

12/26/2016 at 6:36 PM

Maven, these days when someone mentions Richard III I generally run the other way.

I try to be relaxed and detached and non-committal about these things. Really try. But this part makes me nuts.

I'm G2a myself. I don't really doubt Richard was (probably) G2a3, but the NPE debate has been so awful and amateurish I think it's kinder to say nothing. I'm G2a2..., so it doesn't affect me.

I have a running battle with a man who believes he has proved the Habsburgs were Merovingians in the male line and R1b -- because that's what he is and he believes it is self-evident his ancestors were Habsburgs under a different name. There are, however, anecdotal reports that the Habsburg-Lorraines are G2a.

With this drama, and the Bourbon drama, and the Habsburg drama, it seems there is probably something significant going on with G2a lines, but it remains to be seen how long it will take everyone to notice and focus their efforts.

Private User
12/26/2016 at 7:07 PM

I'm sticking to the facts. :-)

Sounds like you're not even as close to Richard III as Thomas Plummer - I take a personal interest in Thomas for "family reasons". ;-)

IMHO all the hoo-hah can be traced back to the "Merovingian Mystique". They've been a source of mythmaking from the get-go, what with all that megillah about Merovech being gotten on Chlodio's wife by a Quinotaur (sea-bull?). And it's only gotten worse.
:-P

12/26/2016 at 8:27 PM

Yes, the Plummers might be closer.

Geni says the Plummers are G2a3b1, as I imagine you already know. However, G2a3b1 doesn't exist in the current ISOGG tree. It's likely to be an older notation for G-P303, which is now G2a2b2a.

When articles say Richard III was probably G2a3, they mean G-S126 aka G-L30, which is now G2a2b. But really he is only confirmed G2 (G-P287).

Net result -- the Plummers belong to the same G subgroup as Richard III, but they have a more refined result. They can be shown to be three levels more specific than Richard. He could have belonged to their subgroup. Odds are he did, but the data can't confirm it.

I tried to get by with just saying I'm G2a2, but now I have to confess I'm actually G2a2b2a1a1b1a1a1a (G-L42 aka G-S146).

So, you see, the answer is actually a bit more complicated. To make it a bit easier:

G = M201 the whole group
G2 = P287 confirmed Richard III
G2a = P15
G2a2 = CTS4367
G2a2b = S126 probable Richard III
G2a2b2 = CTS2488
G2a2b2a = P303 Plummer & the majority of haplo G men in western Europe
...
G2a2b2a1a1b1a1a1a = L42 me, way down there in a tiny Swiss & Norwegian group far from the main action.
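
For anyone who wants to play with the list above, here is a small Python sketch. It relies on the way ISOGG longhand names are built: each extra letter or number is one more step down the tree, so a group is upstream of another when its name is the start of the other's name. The dictionary only holds the entries listed above, and the function name is made up for illustration.

```python
# Longhand names and defining SNPs, copied from the list in this post.
HAPLO_G = {
    "G": "M201",
    "G2": "P287",
    "G2a": "P15",
    "G2a2": "CTS4367",
    "G2a2b": "S126",
    "G2a2b2": "CTS2488",
    "G2a2b2a": "P303",
    "G2a2b2a1a1b1a1a1a": "L42",
}


def is_upstream(ancestor, descendant):
    """True if `ancestor` sits above `descendant` in the longhand tree.

    Plain prefix test. It would misfire on multi-digit branch numbers
    (e.g. a "...1" name versus a "...10" name), which do not occur in
    the entries listed here.
    """
    return ancestor != descendant and descendant.startswith(ancestor)


if __name__ == "__main__":
    print(is_upstream("G2", "G2a2b2a"))  # confirmed Richard III vs Plummer
    print(is_upstream("G2a2b2a", "G2a2b2a1a1b1a1a1a"))  # Plummer vs L42
    print(is_upstream("G2a2b2a1a1b1a1a1a", "G2a2b2a"))
```

Being upstream only means the less refined result is compatible with the more refined one. It does not show that the two men belong to the same subgroup.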

Private User
12/27/2016 at 3:29 PM

Taken STR by STR, Richard and the Plummers aren't all that close, hence my guesstimate of a *very* distant common ancestor. I may be exaggerating the extent of deviation, but probably not by all that much. :-)

At that, there are at least four major and several minor Plummer families, and some dubious links (particularly on the Anne Arundel line, which it seems everyone wants to belong to - because Frances White Wells and her presumptive gentry < nobility < royal descent).

Plummer of New England: R1b (R-M269/DF83)

Plummer of Pennsylvania: R1a (R-M512 - where'd *they* come from?)

Plummer of Anne Arundel: G2a (G-M201/P303)/R1b (R-M269/P311 - the latter lines have suspect links, and they *don't* match the New England Plummers either).

Southern ("Kemp") Plummers: I-L39

FTDNA finally got around to updating these groups to the new system - took 'em long enough!

12/27/2016 at 8:25 PM

If this is a subject that interests you, you can use Dean McGee's Y-DNA Comparison Utility to compare the STRs and derive an estimate of TMRCA using the known mutation rates for individual STRs.

http://www.mymcgee.com/tools/yutility.html?mode=ftdna_mode
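
As a rough idea of the first step in that kind of comparison, the Python sketch below counts the genetic distance between two sets of STR results by adding up the differences in repeat counts on the markers both people tested. This is a simple step-by-step count written for illustration, not a description of how the utility works internally, and it does not attempt the TMRCA estimate. The marker values are made up.

```python
# Simple STR genetic distance: sum of repeat-count differences
# over the markers both results have in common.
# The two sets of results below are invented for illustration only.


def genetic_distance(a, b):
    """Sum of absolute repeat-count differences on shared markers."""
    shared = a.keys() & b.keys()
    return sum(abs(a[marker] - b[marker]) for marker in shared)


if __name__ == "__main__":
    person_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14}
    person_b = {"DYS393": 13, "DYS390": 22, "DYS19": 15, "DYS391": 10}
    # DYS391 is ignored because only one of the two tested it.
    print(genetic_distance(person_a, person_b))
```

A larger distance over the same set of markers points to a more distant common ancestor, which is the intuition behind a very distant guesstimate.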

12/28/2016 at 12:42 PM

Thank you Justin. I'll let you know about my results.
