Sep 20

26 Comments

Hello out there in Genealogy Land! I have been running silent these last few months here on the blog, but continue to march yet unto that elusive tree of perfection. I have a quick item that I wanted to post today and file in Things That Make You Go, “Huh?”

When checking my matches over on MyHeritage DNA, I found one recently that is estimated at the 3rd – 5th cousin level.

DNA Match Summary
OK, I says, that’s not a bad match – although I strongly suspect that MyHeritage’s cousin-relatedness estimates run high by at least 1 or 2 levels. That would make this, at the very best, a 4th – 6th cousin match, and perhaps more distant. In fact, another factor feeds this assumption for me:

This match is from Norway. I am 50/50 Czech and Irish. While I understand those country-centric terms don’t accurately represent the mishmash of DNA we all carry, I have found that this match along with other matches from Norway have trees that go many generations back with clearly Norwegian names in them. (Go figure.)

So, OK, perhaps 5 or 6 generations ago someone from one of “my” countries headed up that way, or vice-versa and now I need to welcome my new Norwegian cousins and brush up on learning how to cook Kjøttkaker and Gravlaks. (Mmm, Gravlaks.) But, a couple of these Norwegian matches are in the 3rd to 5th cousin range, so you’d at least think we’d see some slightly similar locations on the map start to appear around the time our common ancestor would be. Nope.

But forget about all that – let’s look at the next thing that MyHeritage gives us for matches: Shared Ethnicities.

Now, I also understand that there is some algorithmic voodoo at play in mapping ethnicities. (By the way Algorithmic VooDoo is now my new band name.) Can one really attribute a particular snippet of DNA to a location absolutely? Eh, maybe in some cases, but overall I think they are smearing the lipstick a little broadly. In the case of this match, I found something else surprising.


Shared Ethnicities Chart

I pasted our “Shared DNA” numbers on this chart for reference – so assuming we share enough DNA to be in the 3rd – 5th cousin range, would we not also have at least one category of ethnicity that we both belong to? I realize our total shared is only 0.4%, but even so, if they can estimate her Scandinavian ethnicity down to a tenth of a percent, there shouldn’t really be any rounding error going on.

Yet, there is not a single ethnicity that we share.
And now I am left wondering what I should do with all of these Gravlaks?

(By the way: That Iberian % doesn’t show up at all in my 23andme results, and on AncestryDNA I have 3% Iberian in the Low Confidence Region. One of my favorite regions, doncha know.)
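
For anyone who wants to sanity-check that 0.4% figure, here is a rough sketch of the textbook averages. It assumes full cousins descended from one ancestral couple with no other relationship between the lines, and it is not MyHeritage's actual method (which I have no visibility into). The function names are my own, and real-world sharing scatters widely around these averages, especially at the distant end.

```python
import math


def expected_shared_fraction(cousin_degree: int) -> float:
    """Average autosomal fraction shared by full nth cousins (no removals)."""
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    # Each cousin sits (degree + 1) generations below the shared couple.
    # One shared ancestor contributes (1/2) ** (2 * (degree + 1));
    # the couple is two ancestors, so the total is doubled.
    return 2 * 0.5 ** (2 * (cousin_degree + 1))


def nearest_cousin_degree(shared_fraction: float, max_degree: int = 8) -> int:
    """Cousin degree whose average is closest to the observed share (log scale)."""
    return min(
        range(1, max_degree + 1),
        key=lambda d: abs(
            math.log(shared_fraction) - math.log(expected_shared_fraction(d))
        ),
    )


if __name__ == "__main__":
    for degree in range(1, 7):
        print(f"cousin degree {degree}: {expected_shared_fraction(degree):.4%}")
    # 0.004 is the 0.4% shared total quoted in the post.
    print("closest average for 0.4%: cousin degree", nearest_cousin_degree(0.004))
```

By these averages, 0.4% lands between the 3rd cousin figure (about 0.78%) and the 4th cousin figure (about 0.20%), so the number alone cannot settle whether the label is generous.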

26 Comments

  • avatar

    Comment by David — June 25, 2020 @ 3:02 am

    Hey, it is funny. I also have this problem even after they made the matching update in 2018. Most of my 3rd-5th cousin matches (at least 2 out of 4) do not really share any ethnicities with me. Moreover, I have a Norwegian match whom I share 1.1% with (shown as 3rd-4th cousin, mind you, even closer than in your case), and we only share West Asian ethnicity (and based on their extensive trees they have nothing to do with West Asia, go figure what the deal with that is). It is probably admixture on their side, since they have less than 15% of it shown. I am not sure how reliable the whole matching algorithm is even now, definitely something feels off. Not only that, but some matches on FtDNA do not appear on MyHeritage either. Supposedly due to how segments are interpreted differently or missed altogether for those that transfer vs those that buy the test on MyHeritage. I used to think that ethnicity estimates are bogus, but now I am starting to lose faith in matching. These issues are simply confusing and not helpful to me.

  • avatar

    Comment by John — June 25, 2020 @ 10:57 am

    I have to go back and look at this match again and see if the algorithm changed things up over the last 3 years.

    I do agree that it is hard not to lose faith in matching: I would have thought that by now with the boom in the sizes of each company’s database some of my harder to find parts of the tree would have had some serious matches.

    I have had quite a few closer known cousins test, which should really boost identifying things – but really, I have gotten nowhere with those ancestors.

    Meanwhile, on a couple of parts of the tree that I already had a full picture going back quite far in time: lots of matches. I think at least part of the whole process is luck of the draw on which distant cousins get into testing. My wife’s Nova Scotia Irish/Scottish side is bursting at the seams with matches, but my darn Tierney/McTiernans are still hiding in the bushes.

  • avatar

    Comment by Nikky — July 14, 2020 @ 10:00 pm

    Yup. On top of these chance-based surprises that we long for but rarely get, imagine calling at least 5 different customer reps and getting mixed feedback. I addressed this exact problem you described in your article when your shown ethnicities contradict pure ethnic matches of unfamiliar ethnicity. No clear explanation whatsoever. For example, I was told that your ethnicity inheritance is random and yet your matching is not affected by that. Say again? That’s a contradiction! What it supposedly means is that you will match a 100% ethnically pure individual probably as close as 2nd cousins, yet your link to the person in question remains intact. Sorry, I do not get it. Not one rep would explain this dead end mismatch to me. And that’s in addition to regions being displayed in unlikely overlap, like an Irish person showing Balkan region more than 30%, which is a riddle if you ask me. If you trust the ethnicity estimate you will likely fail. Even when you compare yourself with matches.

    Also, these matches rarely respond. And when they finally do, it feels as if our English alphabet got smaller in size.

  • avatar

    Comment by John — July 24, 2020 @ 10:07 am

    Nikky, I detect a little stress in your tone that I share with you. 😉

    Yeah, I think the ethnicities reports are still very much a toy for newbies to see in most cases now.

    I guess my optimistic take is if they keep making the toys better, perhaps at least some small percentage of people will start to become more interested and we’ll have more people to share and compare with. Fingers crossed!

  • avatar

    Comment by Isaac — August 3, 2020 @ 6:44 pm

    There are several reasons why this may happen. Some families are known to practice the so-called endogamy. Meaning they marry within a starting set of say 20 people per given generation. As a result, you may get double cousins on some lines. Because you may be related to them in more than one way. This may exaggerate the relationship closeness, which shows a percentage of shared dna larger than what may be normally expected from this type of relationship. To see how this works, suppose your brother married your wife’s sister. Your children would have some common segments split into more segments while the total shared dna may not decrease by a lot.

    Norway and Ireland are close geographically speaking. If one of your Norwegian matches had an Irish ancestor common to you both 7-8 generations ago, or you had a Norwegian ancestor that long ago common to the match, it may be due to the common ancestor being possibly very endogamous. Some parts of Norway are so interrelated that matches from those areas tend to be 5th-6th cousins two or even three times. Similarly, the Norwegian ethnicity would probably not show up in your estimate if your match had a remote endogamous Irish ancestor. Every country displayed endogamy at some point in time in certain areas, usually they happen in rural communities.

    Finally, an NPE (non-paternal event) may exist in your tree or theirs. This means that genetic trees may be different from paper trail ones. This happens when somewhere in your trees there is a break of some sort. It may be paternal/maternal infidelity, secret baby adoption, unintentional baby switch at birth, among other possibilities. And it is very difficult to trace these mistakes or breaks that happened so far back in time. Research shows that every tree has an NPE somewhere. Some relatively close in time, some hundreds of years back.

    There is no rule set in stone, so personal research is probably the best bet to solve these inconsistencies.
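
The double-cousin effect described in this comment can be put into rough numbers. This is a minimal sketch that assumes the shared ancestors are not related to each other and that only average sharing matters. The helper name and the path lists are illustrative, not anything a testing company publishes.

```python
def shared_fraction(paths):
    """Sum the average shared fraction over every shared ancestor.

    Each path is (generations from you, generations from your match)
    up to one shared ancestor, and contributes (1/2) ** (a + b).
    """
    return sum(0.5 ** (a + b) for a, b in paths)


# Ordinary first cousins: two shared grandparents.
first_cousins = [(2, 2), (2, 2)]

# Children of two siblings who married two siblings:
# all four grandparents are shared (double first cousins).
double_first_cousins = [(2, 2)] * 4

# 5th cousins through one ancestral couple, versus the same
# relationship repeated through three separate couples.
fifth_cousins_once = [(6, 6)] * 2
fifth_cousins_three_times = [(6, 6)] * 6

if __name__ == "__main__":
    for label, paths in [
        ("first cousins", first_cousins),
        ("double first cousins", double_first_cousins),
        ("5th cousins, once", fifth_cousins_once),
        ("5th cousins, three times", fifth_cousins_three_times),
    ]:
        print(f"{label}: {shared_fraction(paths):.4%}")
```

On these averages, being 5th cousins three times over triples the expected share, which is the kind of inflation that could make a distant match look a level or so closer than any single line of descent would suggest.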

  • avatar

    Comment by Josh Loflin — August 15, 2020 @ 6:25 pm

    Nice article. It seems the central issue addressed here is not often discussed, as most people just buy into ethnicity estimates and do not pursue genealogical relationships.

    Speaking of NPEs, I do not understand how they can influence the discrepancy shown in your article. The question everyone should be asking is how these companies take sample reference populations to compare you with in the first place. This is where the NPE may come into play and in the end distort your ethnicity results. I mean within the last couple of hundred years the adoption documents were successfully concealed. Through time most of the records are lost or nearly impossible to retrieve or verify.

    Sure, there are multiple people they collect samples from. But when they openly suggest you having an NPE only because your results include ‘foreign’ influences, how can they be so sure they themselves did not fall into the same trap? And it does not look like they can overcome it just by improving their software. Technology only reads what you give it.

  • avatar

    Comment by John — September 14, 2020 @ 9:30 am

    Josh, I agree, I don’t see the NPE issue influencing what I’m (sort of) complaining about here:

    My point was that if the service is telling me we have shared DNA at some level, then there has to be *some* overlapping shared ethnicity, even if the service’s ethnicity estimate is completely off.

    Now that I’m thinking about it again, I suppose the answer might be that the DNA we share was just not a section that is used in ethnicity estimates. In any case, the ethnicity estimates are still more entertainment than having any research value. Guess over time that might change.

  • avatar

    Comment by Gabriel — September 17, 2020 @ 11:09 am

    Hello everyone!

    These frustrations may or may not intensify soon, as MyHeritage is planning a major ethnicity update in the coming weeks.

    Frankly I wish they did it as regularly as Ancestry does. As you probably heard, Ancestry removed tons of ‘relatives’ below 8 cM shared dna, so now I have way more matches on MyHeritage (and the latter does not allow matches below 8 cM). My guess is that most of my people did not feel like coming to the US yet lol

    By the way, John, your comment about the shared section of DNA not being in the reference panel makes total sense. I got a pure Baltic match on myheritage (3rd through 5th cousins) yet they do not assign this region on my end. Ancestry found it! Throughout the last two updates the baltics region changed percentages but did not disappear.

    So let’s wait for MyHeritage to embrace or keep denying some of our valid regions lol

  • avatar

    Comment by Hugh — October 3, 2020 @ 9:33 am

    Hey there, John.

    How about using the AutoClustering tool on MyHeritage and seeing where this match shows up?

    I am surprised no one mentioned this to you.

    The tool gives clusters of matches, with each cluster pointing to a close or remote common ancestor.

    If the match is listed on a cluster, at least you will know which known ancestor or unknown ancestor this might be for your match!

    If you have done this and the match was not on there, disregard my message =)

    Regards,
    Hugh

  • avatar

    Comment by John — October 12, 2020 @ 3:54 pm

    Hi Hugh, Thanks for the tip!
    I’ve played with the Autocluster a few times, but never got anywhere useful with it. I’ll have to give it another crack and look for results where I have close known cousins… which might help me figure out the best analysis technique.

  • avatar

    Comment by Jude — November 16, 2020 @ 1:33 am

    Hi John

    I was wondering if you know much about autoclustering since you’ve mentioned it.

    I have 14 clusters only, and the max cM for them is 30 cM (min is 15 cM).

    Customer reps said myheritage decides which list of clusters is the most optimal for each user, including sizes and shared dna.

    How far would each cluster theoretically be? I know we each have 16 g-g-grandparents. So these 14 clusters are 14 ancestors out of these 16? Or further down the lines?

    I do not know much about my family history (especially dad’s side).

    At least 5 of these clusters are Finnish matches entirely, and most of them seem to overlap with grey squares – possible endogamy.

    I do not know nor recognize anyone in the clusters, haha. And my “playing” with ethnicity vs dna matches – and reported closeness – got me nowhere. So I concur with your observations. Stuck and need some help!

    Thanks
    Jude

  • avatar

    Comment by John — February 9, 2021 @ 10:19 am

    I have played with the autoclusters a bit, but I haven’t really dug into them at all. Sorry I can’t be of more help!

  • avatar

    Comment by Byron — March 7, 2021 @ 12:54 pm

    MyHeritage service does the cluster calculation for you automatically to supposedly give you the most accurate results.

    They say these clusters represent branches of your family and the report includes matches starting with 2nd cousins and further down. Is it something you know is true in your report? Do they generally include matches on all sides of the family?

  • avatar

    Comment by John — March 7, 2021 @ 1:51 pm

    Because MyHeritage results are based on autosomal DNA, they do include all sides of your tree. I am 50/50 Czech/Irish and in my own clusters I can identify a few clusters that are obviously for one side or the other. I do have a good deal more clusters that appear to be from my Irish side than my Czech side, but I just assume that is because a LOT more Irish cousins are in the database.

  • avatar

    Comment by Byron — March 11, 2021 @ 8:41 pm

    So you mean not all Czech branches are showing up. Some are missing based on your observation?

  • avatar

    Comment by John — March 12, 2021 @ 8:59 am

    Well, there are a few branches on both sides that are missing, but that is just a function of who happens to have been tested in the MyHeritage database and is related to me.
    There are undoubtedly many cousins from those Czech lines that are out there and haven’t tested. In fact, out of all of my known Czech cousins, there are a few in the US who have tested, but none of those that live in Czechia. Just don’t think testing has hit critical mass there as it seems to be in Ireland. 🙂

  • avatar

    Comment by Byron — March 12, 2021 @ 1:04 pm

    Wow.. Ok, it makes sense! Their database is full of native Europeans they claim. I guess it is not always the case for everyone.

    So you find these clusters are real and reliable/verifiable? I saw some screenshots online and was skeptical as to how they manage to use mere dna to put up your family branches – without aid of family trees, names, locations, etc. that just seemed too good to be true.

    I don’t think other companies have this feature. If it is that reliable, then actual adoptees could theoretically use these clusters to discover their heritage/ethnicity in the blink of an eye!

  • avatar

    Comment by John — March 12, 2021 @ 2:10 pm

    Just the luck of the draw I guess – even with a ton of Irish side people testing, I still have very few matches I can identify on my Tierney & McDonald lines, beyond my close cousins. (Annoying, because they are the ones I have the least info about.)

    But, for MyHeritage as a whole and not just DNA matches, I do find there are many more native Europeans. My known Czech cousins that also research are all on there, and I’ve had the most Czech tree matches there. Most likely because the multi-language support vs other sites makes it easier for them to use.

  • avatar

    Comment by Byron — March 13, 2021 @ 8:28 pm

    Interesting. I think I will take the test, curious what clusters or people I will find. Thanks for your insight!

  • avatar

    Comment by Vik — November 18, 2021 @ 3:44 pm

    Myheritage needs to update its ethnicity estimates. Or revise their algorithm for all tools they have. My dna clusters show people who do NOT share any of my ethnicities. And no way of knowing how they relate. Genetic groups look sketchy too.

  • avatar

    Comment by John — November 30, 2021 @ 8:59 am

    Vik, I agree there is weirdness in the whole ethnicities info.

    Obviously the various sites’ ethnicity assignments are based on comparisons to reference groups and do not use ALL of the DNA in our results, so it is completely understandable that some people you match will not appear in some or all of your ethnicities.

    But, if NONE of your matches share ANY of your ethnicities? That seems pretty unlikely to be accurate, I agree.

    IMHO, the ethnicity assignments are the candy store of the genetic side of family research: they are nice to look at, but don’t really add much value to our research unless we have no idea at all what our ethnicities are.

    Hopefully over time as things get more accurate there will be more value to us, but for now my research is more about paperwork, and I’m just hoping one day a few good Tierney & McDonald matches appear for me to help me focus my research in Ireland for them. (But, I’ve been waiting since early days of DNA testing and really expected that to happen once the databases started hitting the multi-million mark, so who knows how long that will be?)

  • avatar

    Comment by Daniel — December 18, 2021 @ 10:49 pm

    Hey everyone. Yes, Vik is right. Maybe some people are more lucky than others. On an individual level, some matches, including names in their trees and their ethnicities, not matching to yours is not completely unexpected. However, the reps assure us that clusters represent your family/genetic tree lines. Ok, in my case my clusters mostly consist of Finnish, German, Baltic and Slavic people (of North and South parts of Eastern Europe). Yet my estimate says I am at least 30% Greek, and more than nearly 10% Irish. And 0% Finnish, 0% Eastern European, 0% Balkan, and 0% Baltic. There is not a single person with Irish or Greek ethnicities in any of my clusters. And I got a whopping 23 of them. The rep said: “I would not trust the ethnicity estimates, because it may take a remote ancestor going as far back as 6 generations to mess up your estimate”. And he confirmed that clusters are never false positive, because the algorithm has a rigorous threshold system of selecting match groups. My verdict is this: imagine how much money you can make from people wanting to see their ethnicity estimate and then you tell them not to trust it? I am not going to call it a scam, but maybe ethnicity estimates should be held off from their selling front. They are definitely trying to swallow more than they can chew.

  • avatar

    Comment by Kiyomi — October 23, 2022 @ 8:20 pm

    Anyone know when they will update the ethnicity estimate? They’ve been the same for 5, 6 years now?

  • avatar

    Comment by Kiyomi — October 23, 2022 @ 8:42 pm

    John, I also wanted to bring up a totally obvious issue here, which actually strengthens your complaint. Based on the info coming from these companies, you will match with your grandfather with a 100% probability as a dna match, but you may get 0% of their ethnicity if the other 3 grandparents had a different ethnicity. I will never understand that! And neither company explains this 64k question ANYWHERE in their blog posts about ethnicities haha. Ripoff!

  • avatar

    Comment by Kenny — August 26, 2023 @ 10:42 am

    Myheritage is updating the Ethnicity Estimates right now for everyone! Please check out your personal page.

  • avatar

    Comment by John — August 28, 2023 @ 2:50 pm

    Thanks for the heads up! 🙂
