DNA Relationship on SA Tree - Is the standard deviation 2 more generations than geni predicts?

Started by Sharon Doubell on Monday, November 21, 2016
Showing all 22 posts
11/21/2016 at 2:27 AM

Is anyone else coming up with a consistent difference between the relationship predicted by Geni using your DNA, and the actual relationship as plotted in the tree?
I'm noticing a pattern where the people who Geni predicts are at a 3rd to 4th cousin genetic distance from me are typically my 6th cousins on the Geni SA tree.
e.g. Private (predicted 3rd to 4th – actually 6th); Christelle Horne Viljoen (predicted 3rd to 4th – actually 6th); Private User (predicted 3rd to 4th – actually 6th); Petrus Philippus Potgieter, b1c8d1e5f9g6h4i9J3 (predicted 3rd to 4th – actually 5th); Charlette Louise Hoppe (predicted 3rd to 4th – actually 6th); Kirsten-Louise Mason (predicted 3rd to 4th – actually 7th)

I'm wondering if this is the typical standard deviation that the 'white' SA tree produces because we were such a bottleneck (interbred) population?

Private User
11/21/2016 at 4:41 AM

Yes, that's correct, or intermarriage or relinking the branches. Inbreeding sounds a bit Brakpan. :-)

11/21/2016 at 4:44 AM

All my atDNA (autosomal DNA) matches are EXACTLY double the distance predicted. Pretty sure that 2x the prediction is correct! (Basically it is double the distance because the midpoint is in the middle of the path, and the suggestion is the prediction to that midpoint; I don't know how else to explain it.) This is not true for direct relationships: because there is no midpoint, the actual must be equal to the prediction.

11/21/2016 at 4:50 AM

Sharon, when you say 'actually 6th' or 'actually 5th' - you should also add 1 more for every 1 removed. So 6th twice removed becomes 8th. I have over 50 matches all 2x the revised distance.

11/21/2016 at 5:34 AM

It is impossible that this can have anything to do with your suggestion that we are all more related than we should be, as atDNA intertwines (disappears) quickly over generations... You will see the max that Geni suggests is 4th cousins (i.e. 8 steps apart, using revised distance), because I guess they realised this too. It is statistically impossible to measure this with atDNA, except by looking at a tree for duplicates: a shared ancestor must be within 4 generations, else it will not be picked up/passed on with sufficient statistical relevance.

11/21/2016 at 6:37 AM

Yes Sharon, noticed this too.

11/21/2016 at 7:50 AM

Examples: Using the GENI layout

1.

Relationship (R)

fourth cousin twice removed = 6 steps apart

Suggestion (S)

2nd Cousin Once Removed to 3rd Cousin Once Removed = 3 (2+1) to 4 (3+1) steps apart

2x suggestion = 6 to 8 steps apart; the actual relationship is 6 steps, therefore within the boundaries

2.

R: fifth cousin thrice removed = 8 steps

S: Same as 1

will be true with the above suggestion (2*S=6-8)

3.

R: 6th cousin twice removed = 8 steps

S: 3rd or 3rd Cousin Once Removed = 3 to 4 steps

TRUE (R within 2*S)

4.

R: 7th cousin once removed = 8 steps

S: 3rd to 2nd Cousin Twice Removed = 3 to 4 (2+2) steps

TRUE (R within 2*S)
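
For anyone who wants to check their own matches the same way, here is a rough sketch of the step counting used in the four examples above. The function names are made up for illustration, and the "double the suggestion" test is only the rule of thumb proposed in this thread, not anything Geni documents.

```python
def steps(cousin_degree, times_removed=0):
    """'Revised distance' as used in this thread: cousin degree plus 1 per 'removed'."""
    return cousin_degree + times_removed


def within_doubled_range(r, s_best, s_worst):
    """True when the tree relationship R lies between 2x the closest
    and 2x the furthest suggested relationship (both in steps)."""
    return 2 * s_best <= r <= 2 * s_worst


# (label, R, closest suggestion, furthest suggestion), copied from the four examples
examples = [
    ("1. 4th cousin twice removed", steps(4, 2), steps(2, 1), steps(3, 1)),
    ("2. 5th cousin thrice removed", steps(5, 3), steps(2, 1), steps(3, 1)),
    ("3. 6th cousin twice removed", steps(6, 2), steps(3), steps(3, 1)),
    ("4. 7th cousin once removed", steps(7, 1), steps(3), steps(2, 2)),
]

results = [within_doubled_range(r, sb, sw) for _, r, sb, sw in examples]
for (label, r, sb, sw), ok in zip(examples, results):
    print(f"{label}: R={r}, 2*S={2 * sb} to {2 * sw}, within range: {ok}")
```

All four examples fall inside the doubled range, which matches the TRUE results listed above.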

11/21/2016 at 8:19 AM

"you should also add 1 more for every 1 removed. So 6th twice removed becomes 8th."

Yes. Good point. Hang on - I always need a picture to figure out the removed thing - printing one out, then I'll try and follow.

11/21/2016 at 9:56 AM

Okay - there is a roughly double relationship between all my Dad's predicted relationship matches & the ones I have in reality:

Gerhardus Cornelius Hendrik Viljoen:
predicted: 3rd to 4th
actually: 6th once removed=7

C.Barry:
predicted: 3rd to 4th
actually: 6th once removed =7

Petrus Philippus Potgieter:
predicted: 3rd once removed to 4th = 4
actually: 5th twice removed=7

Charlette Louise Hoppe:
predicted: 3rd once removed to 4th =4
actually: 6th twice removed =8

Kirsten-Louise Mason:
predicted: 3rd once removed to 4th =4
actually: 7th twice removed =9

Zanne Harvey
predicted: 3rd once removed to 4th =4
actually: 6th once removed =7

The reason I left the 'removed' steps out, though, is that they don't actually equate to the same genetic distance.
i.e. You are twice as strongly related to your first cousin once removed as you are to your second cousin.
So "3rd once removed to 4th" don't actually both = 4.
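
To put rough numbers on that point, here is a small sketch of the textbook expectation for shared autosomal DNA. It assumes full cousins descended from a single ancestral couple and average inheritance (half passed on per generation); the function name is made up for illustration, and real matches vary a lot around these averages.

```python
def expected_shared_fraction(cousin_degree, times_removed=0):
    """Average fraction of autosomal DNA shared by full cousins.

    Assumption: exactly one shared ancestral couple. The path between the two
    cousins crosses 2 * (degree + 1) + removed parent-child links, each halving
    the DNA, and there are two shared ancestors, which doubles the total.
    """
    return 0.5 ** (2 * cousin_degree + times_removed + 1)


for label, degree, removed in [
    ("1st cousin once removed", 1, 1),
    ("2nd cousin", 2, 0),
    ("3rd cousin once removed", 3, 1),
    ("4th cousin", 4, 0),
]:
    pct = 100 * expected_shared_fraction(degree, removed)
    print(f"{label}: {pct:.3f}% expected")
```

Under these assumptions a 3rd cousin once removed is expected to share twice as much as a 4th cousin, even though both count as 4 steps.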

11/21/2016 at 10:24 AM

This is the picture I use to figure it out: http://isogg.org/wiki/Autosomal_DNA_statistics

11/21/2016 at 11:05 AM

You say "It is impossible that this can have anything to do with your suggestion that we are all more related than we should be, as atDNA intertwines (disappears) quickly over generations..."

I'm not following why you think it is impossible, Jan.
Geni's matching algorithm is likely to assume a single common ancestor for a pair of cousins, but the bottleneck that occurred in the white population here makes it far more likely that white SA cousins have multiple common ancestors.
I'm saying that we share more DNA with our 6th cousins than the average population group does, so Geni is predicting that we're closer cousins than we actually are (typically about two steps closer).

11/21/2016 at 12:16 PM

Good - the algorithm is good... if you look at the 4th example I gave... it says '3rd cousin to 2nd cousin twice removed' - i.e. 3rd cousin is stronger than a 2nd cousin 2 times removed...

Do you understand the midpoint part tho - it is the reason why we have to multiply by 2.

Hmmm, I would prefer to take my reasons for stating that atDNA cannot reliably predict commonness in the RSA population (where we are all 10th to 11th generations of the Progs) to a separate discussion. If you follow the midpoint argument, and that we inherit 50%, it becomes clear mathematically that this has nothing to do with the atDNA test for relationships, as it is a 'midpoint' argument, i.e. the midpoint is a midpoint of another, and if they are related but not close enough, the point becomes moot...

11/21/2016 at 1:27 PM

I had assumed it was because of multiple pathways as Sharon suggests. Carel Barry and I have found that we are related through both parents, not just the one as Geni's pathway shows, and Charlette Hoppe and I are fairly sure the same is true for us.

11/21/2016 at 1:38 PM

Just a fact that I have 50+ proofs where this is not the case. Can anyone perhaps provide an example... where R<2*(S-1)?...

Private User
11/21/2016 at 6:10 PM

The issue with SA autosomal DNA matching is that we often have double lines of descent, i.e. two parents descended from the same ancestor, and this is compounded by the fact that we then have intermarriage through the generations of those with double lines of descent. What it results in is a greater total of shared centimorgans, which makes the relationship look that much closer (by sum total of centimorgans shared).

I thought FTDNA had an algorithm, for that exact situation in Ashkenazi Jewish populations, to downgrade match closeness. Might be worthwhile investigating / integrating. I assume it works by shifting the closeness based on a fraction of shared centimorgans; as Sharon and Jan have pointed out, roughly a factor of 2.

11/22/2016 at 2:35 AM

As far as I could find... take all of the segments above 5 cM, add them together and then divide by 68. There is also a table that shows how likely it is per generation (up to 50%) that the shared centimorgans become smaller (and therefore more likely not to be included in the total). Now this is important, in that the overlap percentages are the same regardless of the relatedness of the man and woman, i.e. there is not a 2x higher probability as a result, as suggested! Again, there is the limit of 4-5 generations beyond which there will be no segments greater than 5 cM... so the fact that we came from the same 10-11th gen back progs will have no statistical impact on the atDNA test... i.e. I would expect at least a few out of 50 tests to have R<=2(S-1)
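
As a rough illustration of the "add the segments above 5 cM and divide by 68" rule just described, here is a minimal sketch. The function name and the segment list are made up for the example; only the 5 cM threshold and the divisor of 68 come from the post.

```python
def percent_shared(segments_cm, min_segment_cm=5.0, divisor=68.0):
    """Approximate percentage of shared DNA from a list of segment lengths in cM.

    Segments at or below the threshold are ignored, as described in the post.
    """
    total_cm = sum(cm for cm in segments_cm if cm > min_segment_cm)
    return total_cm / divisor


# Hypothetical segments for illustration only; the 4.0 cM segment is dropped,
# leaving a 72 cM total.
segments = [40.0, 20.0, 12.0, 4.0]
print(f"{percent_shared(segments):.2f}% shared")
```

With this rule a 72 cM total works out at just over 1%, and a 3384 cM parent match at just under 50%.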

I thought of another way to describe the midpoint argument... If every step going upwards has 2 lanes the Geni algorithm only tests the single lane to the closest mid point, i.e. we lose 50% - the other lane (at each step), up and down, which results in 2x the distance.

I would actually think that if R<=2*(S-1) then more than one lane is utilised somewhere and therefore you are more related than suggested and should check your tree!! :)

Private User
11/22/2016 at 3:23 AM

I am not going to go into calculations, I'm just going to keep in mind that it is not accurate.
In any case, a match with a total cM of less than about 200 is not worth looking for a match because the relationship is so far away that it is not significant.
It is only useful if you have a person on the tree and want to confirm the connection, and then if the shared cM is small the connection could be from other lines and combined lines. Then you need more persons to triangulate the match. That is what Zanne and I did some time ago, but that is also a long and time-consuming process.

Private User
11/22/2016 at 3:56 PM

Jan, take my second highest match at 72cM as an example (as my highest is my father 3384cM).

Charlette Louise Hoppe
R: Sixth cousin. (6)
S: Third to second cousin, twice removed. (4 or 5)

R<=2*(4-1) or R<=2*(5-1) e.g. 6th - 8th cousin.

FTDNA puts the relationship at 2nd - 4th cousin range, and Geni.com at third to second cousin, twice removed.

Something occurred to me. If my father represents 50% at-relationship at 3384cM, 72cM would comprise just 1%. Which could place the match between 6 and 7 generations ago?
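
The arithmetic in that last paragraph can be sketched as follows. The 3384 cM and 72 cM figures are the ones quoted above; converting the number of halvings into a cousin degree assumes a single shared ancestral couple and average inheritance, which is exactly the assumption a bottlenecked population is likely to break.

```python
import math

father_cm = 3384.0  # parent match, taken as 50% shared
match_cm = 72.0     # second highest match

# Scale the match against the parent: about 1% shared.
pct_shared = 50.0 * match_cm / father_cm

# Number of times 100% must be halved to reach that percentage.
halvings = math.log2(100.0 / pct_shared)

# Assumption: full cousins via one ancestral couple share 0.5 ** (2 * degree + 1),
# so the cousin degree is (halvings - 1) / 2.
cousin_degree = (halvings - 1) / 2

print(f"{pct_shared:.2f}% shared, {halvings:.2f} halvings, "
      f"cousin degree about {cousin_degree:.1f}")
```

So the "6 to 7" figure is the number of halvings rather than generations back: on average that amount of shared DNA corresponds to roughly a 3rd cousin through one couple, which is about what FTDNA and Geni predict.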

11/23/2016 at 1:19 AM

That one is not R<=2(S-1), R needs to be 5 or less... R=2S :)

11/23/2016 at 2:09 AM

To summarise (and to get all the brackets and meaning of S corrected in one post)

Get R=Relationship steps, (adding 1 for each removed)

Get SB=best case suggestion steps (adding 1 for each removed) - the first of the two suggestions

Get SW=worst suggestion steps (also adding 1 for each removed) - the second suggestion

1. If R<(2*SB)-1

This is an error, investigate both trees...

2. If 2*SB <= R <= 2*SW

This is correct - is a 'proof' that both trees are correct.

3. If R > 2*SW

It means that one or both of the trees is incomplete, or incorrect, within 5 generations. (Max 2*SW=8 using the current algorithms)
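
The three rules above can be written out as a small check. This is only a sketch of the rule of thumb from this thread, with made-up function and label names. Note that the rules as posted say nothing about R = 2*SB - 1, so the sketch reports that case separately rather than guessing.

```python
def classify_match(r, sb, sw):
    """Apply the three rules from the summary post.

    r  = relationship steps in the tree (adding 1 for each 'removed')
    sb = best case suggestion steps, sw = worst case suggestion steps
    """
    if r < 2 * sb - 1:
        return "closer than suggested: investigate both trees"
    if 2 * sb <= r <= 2 * sw:
        return "consistent: both trees look correct"
    if r > 2 * sw:
        return "further than suggested: a tree is incomplete or incorrect"
    # r == 2 * sb - 1 is not covered by the rules as posted.
    return "not covered by the posted rules"


# Examples using a 3rd cousin (3) to 3rd cousin once removed (4) suggestion
for r in (4, 5, 6, 8, 9):
    print(r, "->", classify_match(r, sb=3, sw=4))
```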

12/11/2016 at 1:20 PM

Yes, I'm having this issue and not just with my SA matches - many of my matches are predicted 3rd-4th cousins, which I doubt is correct. MyHeritage is also predicting cousin matches too close.

12/11/2016 at 1:33 PM

Is it because it's hard to predict cousin relationships past 3rd cousin level due to recombination?
