
A Y-DNA haplogroup defines a group of men by the shared features of their Y-chromosome. Members of the same Y-haplogroup have a patrilineal (direct paternal) ancestor in common. Y-DNA haplogroup assignments can help to identify recent and distant genetic ancestry, and provide insight into ancient human societies and their migrations. This project offers a thorough presentation on Y-DNA haplogroups, a forum for questions and discussion, and links to additional resources.

Completely new to genetic genealogy? You will find the DNA Primer project a better place to begin.

Looking for information about a particular Y-DNA haplogroup? Jump to the "Y-Haplogroup projects at Geni" section.

Have a specific question? Check the Contents below and jump right to that section. This page was specifically structured to answer questions that are commonly asked in discussion forums.

Do you just want a quick introduction to Y-DNA? This writing moves quickly through an introductory review of Y-DNA haplogroup foundations, but focuses on a more intermediate level of understanding for people who want to go deeper. However, a diverse collection of introductions is listed in the More Resources section, and here are a few that many people have found helpful.


Why is Y-DNA Testing Valuable?

Testing Y-DNA, the DNA of the Y-chromosome, reveals specific genetic information about a man and his relatives with the same patrilineal ancestor. Only genetic males carry a Y-chromosome and it is only passed from father to son. Therefore only patrilineal descendants from a common ancestor will carry the same Y chromosome (chrY) and be detected as Y-DNA matches. This is one way that Y-DNA testing is different from the more common autosomal (non-chrY, non-chrX) DNA testing, which compares many more chromosomes and can potentially identify all close cousins as DNA matches.

This is illustrated in the genealogical chart below, in which all the males in the 4th generation are closely related (2nd cousins or closer). Autosomal DNA testing will certainly report them as each other's DNA matches. However, only males with the same color symbol are related by patrilineal descent, are sure to share the same Y-chromosome, and will certainly be reported as matches by Y-DNA testing.


So why test Y? Although autosomal DNA testing is superior to Y-DNA testing for finding more close cousins, Y-DNA testing has special properties because Y-chromosomes don't typically change much between generations.

DNA swapping between autosomal chromosomes reduces relatedness by about half with each generation, so autosomal testing will reliably detect matches for only around half of 4th cousins, only around 4% of 6th cousins, and less than 0.01% of 10th cousins. In contrast, Y-DNA matches can be identified for relationships along patrilineal lineages extending over the whole range from a single generation to tens or even hundreds of thousands of years. This power makes Y-DNA testing a valuable tool for answering some questions that autosomal DNA tests can't.
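The halving described above can be turned into a small worked calculation. The sketch below uses the standard expectation for full cousins: each parent-child link halves the expected autosomal sharing. It computes average shared DNA only. It does not compute the detection percentages quoted above, which depend on how testing companies set their match thresholds. The function name is an illustrative assumption.

```python
def expected_shared_fraction(cousin_degree: int) -> float:
    """Average fraction of autosomal DNA shared by full cousins of a given degree."""
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    # Each cousin sits (degree + 1) generations below the shared ancestral couple.
    # The path between the cousins through one shared ancestor crosses
    # 2 * (degree + 1) parent-child links, each halving the expected sharing,
    # and full cousins have two such shared ancestors.
    links = 2 * (cousin_degree + 1)
    return 2 * 0.5 ** links


for degree in (1, 2, 4, 6, 10):
    print(f"cousin degree {degree}: {expected_shared_fraction(degree):.6%}")
```

Because each extra cousin degree adds two links, the expected sharing drops to a quarter at every step, which is why autosomal matching fades so quickly while a Y-chromosome passes down largely intact.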

Y-DNA for Women

Although they don't have a Y-chromosome, women do have a patrilineal lineage through their father. Those are their patrilineal ancestors, and Y-DNA testing is just as valuable for women who want to learn more about their ancestral heritage, or find distant cousins, as it is for men. A father's Y-DNA haplogroup is the daughter's patrilineal haplogroup. Women do have the inconvenience of needing to get their father, a brother, or another appropriate male relative to contribute DNA for testing, but the interpretation and usefulness of the result is the same.

What is a Y-DNA Haplogroup?

In the simplest practical terms, Y-DNA haplogroups are tools for grouping men into patrilineal ("direct paternal") lineages based on the pattern of inherited features in the DNA of their Y-chromosome. This is possible because every genetic male inherits a Y-chromosome that comes only from his father, so the presence of shared features on the Y-chromosome of each man in the group implies they have a patrilineal ancestor in common. That ancestor's descendants have shared features on their Y-chromosome because they all inherited that chromosome from him, the Y-haplogroup's founder.

The next sections go into some of the science behind Y-DNA haplogroups. This knowledge will help you gain a deeper understanding of how Y-DNA haplogroups are formed and why they are useful in practice. However, it's not needed for developing a basic understanding, and you can come back to it later if that seems helpful.

If you wish to skip to practical issues, feel free to jump to the section "Tests for Y-Haplogroup Assignment" or "What Can My Haplogroup Assignment Tell Me?" For insight into what Y-DNA haplogroups really are, please read on.

DNA and DNA Markers

DNA in Brief

DNA is a chemical molecule that living organisms use to pass genetic information from one generation to the next. DNA is structured as a chain of subunits, sometimes called "letters" after the alphabetic symbols used to name the four different kinds of subunits that are used in the chain: A, C, G, and T. Therefore a DNA molecule can be described with a very long word, like "ATGCGTGCAGAATCGGAC," which shows the sequence of letters. Human cells carry 23 pairs of linear DNA chains, called "chromosomes." The longest is about 250 million letters long, and the shortest has about 48 million letters.

The Y-chromosome is about 57 million letters long. In humans, the Y-chromosome (chrY) and X-chromosome (chrX) form a pair, called the "sex chromosomes" because they determine the sex of a child. All the other chromosomes as a group are called "autosomes." In normal circumstances, a human embryo will develop into a male if, and only if, it has inherited a Y-chromosome. Therefore only males have a Y-chromosome, which each man inherits from his father, who inherited it from his father, and so on. This pattern corresponds to what in genealogy is called a person's "patrilineal lineage."

DNA is able to serve as the medium for genetic information because its structure allows it to be copied and distributed to an organism's offspring unchanged. However, changes ("mutations") may sometimes be introduced in the DNA sequence during copying, and then transmitted to a child, becoming part of that child's subsequent hereditary lineage. These changes accumulate over generations, as illustrated in the pedigree tree below, where each new mutation is given a name, like "L26" for example. Individuals who have a new mutation are marked on the tree in red.


Mutations as DNA Markers

Each lineage, a chain of parents and offspring, will accumulate new mutations, and also continues to inherit the mutations that appeared earlier in the lineage. These mutations are testable, inherited DNA markers, and those that accumulate in the Y-chromosome's DNA sequence identify a patrilineal lineage.

When a person does DNA testing and is found to have a mutation at a particular DNA marker site, we say the person is "positive" for that marker, or has inherited the "derived" version. Those who carry the unmutated version are "negative" for that marker, which is called the "ancestral" version. Being positive for a marker is sometimes called "having" that marker, although you could also say that everyone has that marker in either its ancestral or derived version.

There are two major types of mutation that are used as DNA markers by genetic genealogists to study lineages: SNPs (single nucleotide polymorphisms) and STRs (short tandem repeats). Although each is described in more detail below (see: STRs and SNPs), for the purpose of understanding haplotypes and haplogroups it's only necessary to know that they are types of inherited DNA markers that can be tested for in individuals and used to identify lineages. When these DNA markers are on the Y-chromosome they're called Y-SNPs and Y-STRs.

Y-Haplotypes and Y-Haplogroups

A "Y-haplotype" is a set of Y-DNA markers found in a particular man or a patrilineal lineage. Because related patrilineal lineages have related sets of Y-DNA markers, they have related Y-haplotypes. A group of related Y-haplotypes is called a "Y-haplogroup." The individuals who belong to a Y-haplogroup may all have slightly different Y-haplotypes, but the Y-DNA markers they share in common define their Y-haplogroup. That set of Y-DNA markers is the genetic heritage they've received from the original person of their lineage in whom the complete set of those markers first appeared, called the haplogroup's "founder."

By the strictest definition, a haplogroup is a set of related haplotypes and not a set of people. In practice those haplotypes don't exist apart from people who exemplify them. Therefore, it is common practice and neither confusing nor unreasonable to refer to a set of people as belonging to some defined haplogroup.

Following the current best practice, Y-haplogroups are primarily named by what's called the "shorthand" system, consisting of a tree branch label from the older "hierarchical" or "longhand" system, then a hyphen and then the name of the most recently appearing mutation of those DNA markers that define the haplogroup. This marker is usually a Y-SNP, so for example the Y-haplogroup name "J-M172" means "in branch J, the Y-haplogroup defined by the appearance of the derived version of the M172 SNP marker."

The figure below shows the same pedigree and mutations as the earlier figure, but this time labelled with the haplogroup names associated with each founder and lineage. A name like J2a1b2-M166 can also be written in the simplest shorthand version as J-M166. We can write out a Y-haplogroup's accumulated mutations in "breadcrumb trail" format to illustrate the history and defining mutations of that Y-haplogroup. For example, the history of the Y-haplogroup founded by the man in the lower right corner of the figure is: J-M410 > J-L26 > J-M67 > J-M166. The breadcrumb format is sometimes simplified by specifying only SNPs rather than haplogroups, e.g. M410 > L26 > M67 > M166.
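The breadcrumb trail format is easy to generate once each defining SNP is linked to the SNP of its parent haplogroup. The following is a minimal sketch: the `PARENT_SNP` table is a hand-written assumption covering only the lineage named in the text, and the function name is illustrative.

```python
# Hypothetical lookup table: each SNP maps to the SNP that defines its parent
# haplogroup. None marks the point where this short trail starts.
PARENT_SNP = {"M410": None, "L26": "M410", "M67": "L26", "M166": "M67"}


def breadcrumb(snp, branch=None, parents=PARENT_SNP):
    """Return the trail from the oldest listed SNP down to the given SNP."""
    trail = []
    while snp is not None:
        trail.append(snp)
        snp = parents[snp]
    trail.reverse()  # oldest haplogroup first
    if branch:
        trail = [f"{branch}-{name}" for name in trail]
    return " > ".join(trail)


print(breadcrumb("M166", branch="J"))  # J-M410 > J-L26 > J-M67 > J-M166
print(breadcrumb("M166"))              # M410 > L26 > M67 > M166
```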


Y-Haplogroup Naming: Details and history

The details of why and how Y-haplogroup names are constructed may not be of interest to everyone. If you wish, you can skip ahead to Y-Phylogenetics and the Y-Phylogenetic Tree: The history of the Y-chromosome.

Hierarchical Naming System

Historically, Y-haplogroups were primarily identified by Y-STR mutation patterns, and the SNP mutations that could define Y-haplogroups were not yet known. For this reason, Y-haplogroups were named just with a Y-phylogenetic tree branch label, built by giving each branching a new letter or number appended to the name of its parent branch. The major branches known at that time were given capital letter names, like "R," and then its branches were named R1, R2, R3, etc. Each of those had branches, and the branches of R1 were named R1a, R1b, R1c, etc. This is called the "hierarchical" naming system, in which the name is constructed by lineage. Examples are given in the figure above, where the portion of each Y-haplogroup name to the left of the hyphen is the hierarchical name (e.g. "J2a1b1" in the haplogroup name "J2a1b1-M92").
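The construction rule for hierarchical names can be sketched in a few lines. This is only an illustration of the alternating number and letter pattern described above, not any naming authority's actual procedure. The function names are assumptions, and `lineage` assumes every step in the name is a single character.

```python
import string


def child_name(parent: str, index: int) -> str:
    """Name the index-th (1-based) branch under a hierarchical name."""
    if index < 1:
        raise ValueError("index must be 1 or greater")
    if parent[-1].isdigit():
        # A name ending in a number gets a lowercase letter: R1 -> R1a, R1b, ...
        if index > len(string.ascii_lowercase):
            raise ValueError("this sketch supports at most 26 lettered branches")
        return parent + string.ascii_lowercase[index - 1]
    # A name ending in a letter gets a number: R -> R1, R2, ...
    return parent + str(index)


def lineage(name: str) -> list[str]:
    """List every ancestral branch name encoded in a hierarchical name."""
    return [name[:i] for i in range(1, len(name) + 1)]


print(child_name("R", 1), child_name("R1", 2), child_name("R1a", 1))
print(lineage("J2a1b1"))
```

The second function shows why the system is called naming "by lineage": the full ancestry of a branch can be read directly off its name.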

The hierarchical names quickly grew unwieldy (e.g. "R1a1a1b1a3a1a") as more recently formed Y-haplogroups were identified. Even worse, when new branches were discovered and the tree needed to be restructured, the name for an established Y-haplogroup might change dramatically, so that it sometimes became impossible to know which Y-haplogroup a name referred to unless you also knew the date that name was given and which authority assigned it. Fortunately, the spread of SNP testing allowed a much simpler way to name haplogroups: by the name of a SNP that defines the haplogroup regardless of its history.

Note that hierarchical names are still very useful in some situations, for example where a set of related haplogroups are being compared. The name alone tells you the relationship between the groups (e.g. J2a1, J2a1a, J2a1b, J2a1b1). Problems only arise when a hierarchical name is used without any supporting statement of which naming authority and authorization version or date is being used. However, if a SNP is indicated for each Y-haplogroup then ambiguity is eliminated. So rather than deal with these issues, for common usage a shorthand name including a SNP (as described next) works fine and is always unambiguous.


The illustration above has the same structure and shows the same haplogroups as the earlier pedigree figure. It's been simplified to more clearly show the progression of new mutations that form new Y-haplogroups. Each Y-haplogroup still includes all the mutations of all the earlier Y-haplogroups in its ancestral lineage. A tree diagram showing the appearance and lineage relationships of inherited genetic traits is called a "phylogenetic tree" (sometimes abbreviated as "phylotree"). When those traits are human Y-DNA markers, the result is the human Y-phylogenetic tree. The illustration above shows a small section of the total human Y-phylogenetic tree.

Shorthand Naming System

Because naming a Y-haplogroup with a SNP name alone gives no direct information about its place in the Y-phylogenetic tree, a hybrid naming scheme called the "shorthand" system is now in common use. The shorthand system names haplogroups "by mutation" rather than solely "by lineage" as the hierarchical system does. The shorthand name begins with one or more characters from the beginning of the current hierarchical name for the Y-haplogroup, followed by a hyphen and then a SNP name. You may occasionally see the hyphen replaced with a space, but this is neither current best practice nor recommended.

  • J2a-M410
  • R1a1a1b1a3a1a-FGC11904
  • E1a2a2-Z5991

Because hierarchical names can be long and are still subject to change, particularly very long names, it's recommended for general use that only the first one, two or three characters be used within a shorthand name since those rarely change. More characters can be added if useful for a specific purpose, else the full hierarchical name can be given after the shorthand name. If a hierarchical name longer than three characters is used, it should be accompanied by the naming authority and a version and/or date that the hierarchical name was authorized.

  • J2a-M410 should be written as:
    • J2a-M410 or
    • J-M410
    • although adding the naming authority and version would be acceptable also.
  • R1a1a1b1a3a1a-FGC11904 should be written as:
    • R1a-FGC11904 or
    • R-FGC11904 or
    • R-FGC11904 (R1a1a1b1a3a1a, ISOGG 12.1 2017) or
    • R1a1a1b1a3a1a-FGC11904 (ISOGG 12.1 2017).
  • E1a2a2-Z5991 should be written as:
    • E1a-Z5991 or
    • E-Z5991 or
    • E1a-Z5991 (E1a2a2, ISOGG 12.1 2017) or
    • E1a2a2-Z5991 (ISOGG 12.1 2017).
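The abbreviation rules in the list above can be expressed as a small helper. This is a sketch under the assumption that the name contains exactly one hierarchical part and one SNP part separated by the first hyphen. The function names are illustrative.

```python
def shorten(name: str, keep: int = 1) -> str:
    """Trim the hierarchical part of a shorthand name to its first characters."""
    branch, snp = name.split("-", 1)
    return f"{branch[:keep]}-{snp}"


def with_authority(name: str, authority: str, keep: int = 1) -> str:
    """Shorten the name but record the full hierarchical name and its source."""
    branch, _ = name.split("-", 1)
    return f"{shorten(name, keep)} ({branch}, {authority})"


full = "R1a1a1b1a3a1a-FGC11904"
print(shorten(full))                            # R-FGC11904
print(shorten(full, keep=3))                    # R1a-FGC11904
print(with_authority(full, "ISOGG 12.1 2017"))  # R-FGC11904 (R1a1a1b1a3a1a, ISOGG 12.1 2017)
```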

The shorthand naming system is not without its own confusions, though. Sometimes the exact same SNP, pointing to the exact same DNA sequence location, has been given two different names by two different discoverers. These are called "synonymous SNPs" or "identical SNPs" and when both SNPs are included in a Y-haplogroup name they're typically written with a slash separating the synonymous names, e.g. E-M96/PF1823. The same Y-haplogroup might also be named as E-M96 or E-PF1823, and each name designates exactly the same haplogroup. Usually one SNP name becomes preferred over time, and is used alone.

Also, it is often the case that multiple, independently located SNPs have been identified that all define the same Y-haplogroup, because it is not yet known in which order they appeared. These are called "phylogenetically equivalent SNPs" or "equivalent SNPs." This is best indicated within a shorthand Y-haplogroup name by separating the equivalent SNP names with a comma, e.g. J-CTS900,CTS3261. In typical practice, only a single SNP from the set of equivalent ones is used in the name, or two if each one is in common use. Equivalent SNPs did not appear at the same time, and it may turn out that some people are positive for one and not the other—just not in any person tested so far. Once further research has revealed which of those SNPs appeared first, the Y-haplogroup will be split into parts, each identified by a single SNP.

You may still come across a few Y-haplogroups that are named by a combination of SNPs and STRs, when a specific identifying SNP hasn't yet been determined. For example, Y-haplogroup "J2a-PH4970,L1064 DYS391=9" is the J2a branch Y-haplogroup defined by the presence of both the PH4970 and L1064 SNPs (which are phylogenetically equivalent by current understanding), and also the DNA marker "DYS391=9" which is the STR "DYS391" with 9 repeats.
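The separator conventions described in the last few paragraphs (slash for synonymous SNPs, comma for equivalent SNPs, and a trailing STR value) can be pulled apart mechanically. The parser below is a minimal sketch that assumes names follow exactly those conventions. The function name and the shape of the returned dictionary are illustrative choices.

```python
def parse_name(name: str) -> dict:
    """Split a shorthand Y-haplogroup name into branch, SNPs, and STR values."""
    head, *str_parts = name.split()
    branch, snp_part = head.split("-", 1)
    # Commas separate phylogenetically equivalent SNPs; within each of those,
    # slashes separate synonymous names for one and the same SNP.
    snps = [group.split("/") for group in snp_part.split(",")]
    strs = {}
    for part in str_parts:
        label, repeats = part.split("=")
        strs[label] = int(repeats)
    return {"branch": branch, "snps": snps, "strs": strs}


print(parse_name("E-M96/PF1823"))
print(parse_name("J-CTS900,CTS3261"))
print(parse_name("J2a-PH4970,L1064 DYS391=9"))
```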

When a Y-haplogroup with multiple equivalent SNPs is being subdivided, or just for added clarity when defining a Y-haplogroup, you may sometimes see "x" names used. For example, "J-CTS900(xY11200)" or "J-CTS900(xCTS6804)" or "J-CTS900(xCTS6804,Y11200)" where the "x" notation means the Y-SNPs listed within the parentheses are specifically not present in members of the specified Y-haplogroup. In the last example, the designation means, "the haplogroup in branch J of people who are positive for the CTS900 SNP but negative for both the CTS6804 and Y11200 SNPs." This is a well-defined way to indicate when a person does not belong to any of a Y-haplogroup's known descendant lineages.

On the Y-phylogenetic tree, the most recently formed, currently recognized Y-haplogroup of any lineage is called the terminal haplogroup of that lineage, and if defined by a SNP then that is the lineage's terminal SNP. Terminal haplogroups form the tips of the tree. A man's Y-DNA test results may indicate that he is best assigned to a Y-haplogroup that is not terminal: he tests positive for the markers defining that Y-haplogroup, but he is negative for the markers of all the currently known lineages descending from that Y-haplogroup. In this case, he is said to belong to the paragroup for that Y-haplogroup (short for "paraphyletic haplogroup"), which is sometimes indicated by adding an asterisk to his assigned Y-haplogroup's name. In the example from the previous paragraph, a person who is CTS900+ but is negative for the markers of both known J-CTS900 descendant lineages (he is CTS6804- and Y11200-) can be said to belong to the "J-CTS900*" paragroup. However, because what is currently known about descendant lineages will advance over time, using the "x" notation is a less ambiguous way to specify a paragroup.
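The "x" notation amounts to a simple membership test: positive for the named SNPs and negative for the excluded ones. The sketch below assumes the exact spelling used in the examples above and treats an untested SNP as unknown, so it does not count as either positive or negative. The function names and the shape of the `results` dictionary are assumptions made for illustration.

```python
import re

NAME_PATTERN = re.compile(r"([A-Z][0-9a-z]*)-([^()]+)(?:\(x([^()]+)\))?")


def parse_x_name(name: str):
    """Return (branch, required SNPs, excluded SNPs) for a name like J-CTS900(xY11200)."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized haplogroup name: {name!r}")
    branch, required, excluded = match.groups()
    excluded_snps = excluded.split(",") if excluded else []
    return branch, required.split(","), excluded_snps


def belongs(name: str, results: dict) -> bool:
    """results maps a SNP name to True (derived, '+') or False (ancestral, '-')."""
    _, required, excluded = parse_x_name(name)
    has_required = all(results.get(snp) is True for snp in required)
    lacks_excluded = all(results.get(snp) is False for snp in excluded)
    return has_required and lacks_excluded


tested = {"CTS900": True, "CTS6804": False, "Y11200": False}
print(belongs("J-CTS900(xCTS6804,Y11200)", tested))  # True
```

A man for whom this check succeeds is exactly the one the text describes as belonging to the "J-CTS900*" paragroup, given the descendant lineages known at the time.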

Best Practices for Shorthand Naming

As of 2017, the only naming authority still issuing hierarchical names is the International Society of Genetic Genealogy (ISOGG), but in the past there were several others (e.g. YCC, Family Tree DNA), and they sometimes assigned different hierarchical names to the same Y-haplogroup. Also, the ISOGG hierarchical names still change as often as annually. For this reason, specifying the naming authority and authorization date is necessary to completely avoid ambiguity if a long hierarchical name is going to be included with a Y-haplogroup name. For most people in most situations, including just an abbreviated hierarchical name (the first one to three characters) within a shorthand format name is ideal.

Y-Phylogenetics and the Y-Phylogenetic Tree: The history of the Y-chromosome

DNA testing on a group of men can determine which Y-DNA markers they have. When these are examined, some will be positive for the same set of Y-DNA markers, which can define a Y-haplogroup and suggests they inherited those positive markers from the Y-chromosome of a common ancestor. Because differences accumulate over generations of time, the more DNA markers the group has in common, and the fewer the differences, the closer the relationship between members of that Y-haplogroup. A closer relationship implies that their common ancestor probably lived fewer generations ago than the common ancestor of a group of only distantly related men.

In theory, we could examine every member of a lineage to determine exactly which individual a particular mutation appeared in. In practice, we mostly only have access to living people for testing, with just a few ancient peoples' DNA being sampled at this time. However, we know mutations are cumulative, and we see that groups of related people inherit related sets of mutations. Therefore we can use logical analysis to develop a Y-phylogenetic tree that's implied by and consistent with the pattern of positive and negative Y-DNA markers we find present in those related individuals.

The figure below illustrates that development process starting with a set of individuals, each with a different but related set of mutated Y-SNP markers. We can examine each individual's pattern of Y-DNA mutations (their Y-haplotype) to see which mutant markers are shared with others and which are unique. Note that because all the individuals shown have both the M410 and L26 SNP mutations, we can't determine which of those two appeared first, based on the data presented here alone, i.e. they are "phylogenetically equivalent."
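The logical analysis described above can be sketched in code. The idea is to record who carries each SNP. SNPs carried by exactly the same people are treated as phylogenetically equivalent, and each group of SNPs is then nested under the group with the smallest strictly larger set of carriers. The four test subjects below are hypothetical, and the sketch assumes the data are consistent with a tree (no SNP has arisen twice and there are no testing errors).

```python
from collections import defaultdict


def build_tree(haplotypes: dict) -> dict:
    """Map each SNP group label to the label of its parent group (None at the top)."""
    carriers = defaultdict(set)
    for person, snps in haplotypes.items():
        for snp in snps:
            carriers[snp].add(person)

    # SNPs carried by exactly the same people cannot be ordered from this data,
    # so they are merged into one comma-separated label.
    groups = defaultdict(list)
    for snp, people in carriers.items():
        groups[frozenset(people)].append(snp)
    labels = {people: ",".join(sorted(snps)) for people, snps in groups.items()}

    parent = {}
    for people, label in labels.items():
        # Candidate ancestors are groups whose carriers strictly include ours.
        supersets = [other for other in labels if people < other]
        parent[label] = labels[min(supersets, key=len)] if supersets else None
    return parent


# Hypothetical test subjects and their derived (positive) Y-SNPs.
men = {
    "A": {"M410", "L26"},
    "B": {"M410", "L26", "M67"},
    "C": {"M410", "L26", "M67", "M92"},
    "D": {"M410", "L26", "M67", "M166"},
}
print(build_tree(men))
```

With this sample, M410 and L26 end up in one combined label because every subject carries both, which matches the observation that their order cannot be determined from these results alone.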


Because the Y-phylogenetic tree is constructed based only on patterns of newly appearing Y-DNA markers, we don't include every individual man in the human Y-phylogenetic tree. In fact, we can construct the tree without even knowing in which specific individual a mutation first appeared. The human Y-phylogenetic tree displays only patrilineal inheritance and only the appearance of new Y-DNA mutations, presenting a simplified portion of the complete human genealogical history that's convenient for further study.

Because the individual represented by each branching of the tree has inherited all the DNA markers of his past lineage, each branch point of the Y-phylogenetic tree represents the founding of a new haplogroup. So the Y-phylogenetic tree is not just a tree of DNA mutations but also a tree of the historic appearance of Y-haplogroups. When we do Y-DNA testing on a man, we use the revealed pattern of DNA markers to place him on the Y-phylogenetic tree of all men, in the most recently formed Y-haplogroup that his test results imply: his "terminal haplogroup." He is also a member of all the Y-haplogroups that came before his in that lineage. This connects him genealogically to the ancient individuals who were members of those earlier haplogroups.
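Placing a man on the tree can be illustrated with a toy version of the process. The sketch below assumes a tiny hand-written tree (`PARENTS`, covering only SNPs mentioned in this text) and assumes every SNP on the path has actually been tested. Real assignment tools must also cope with untested or failed markers, which this sketch ignores.

```python
# Hypothetical fragment of the tree: SNP -> SNP of the parent haplogroup.
PARENTS = {"M410": None, "L26": "M410", "M67": "L26", "M92": "M67", "M166": "M67"}


def path_to_root(snp, parents):
    """List the SNPs from the given haplogroup back to the top of the fragment."""
    trail = []
    while snp is not None:
        trail.append(snp)
        snp = parents[snp]
    return trail  # most recent SNP first


def terminal_haplogroup(positives, parents=PARENTS):
    """Return (terminal SNP, all ancestral SNPs oldest-first) for a set of positive SNPs."""
    supported = [
        snp for snp in parents
        if all(step in positives for step in path_to_root(snp, parents))
    ]
    if not supported:
        return None, []
    terminal = max(supported, key=lambda snp: len(path_to_root(snp, parents)))
    return terminal, path_to_root(terminal, parents)[::-1]


print(terminal_haplogroup({"M410", "L26", "M67", "M166"}))
```

The second element of the result lists every earlier haplogroup the man also belongs to, which is the membership chain described in the paragraph above.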

Human Y-phylogenetics is the development and study of the human Y-phylogenetic tree. The figure below shows that tree's "backbone," the oldest known Y-haplogroups in the human Y-phylogenetic tree, structured as it was understood in January 2017.


"Y-chromosomal Adam," shown as the root of the tree, is the most recently born male from whom all living men's Y-chromosomes are descended (the "Y-chromosomal most recent common ancestor," or Y-MRCA). If a living man is newly discovered whose Y-chromosome indicates a patrilineal ancestry tracing back to an earlier man than the current Y-chromosomal Adam, then the label will be reassigned to the earlier ancestor. The Y-MRCA label can also jump to a more recent ancestor, since it's tied to the ancestry of the current living population.

Although "Adam" is a reference to the biblical Adam of Abrahamic religions, the term "Y-chromosomal Adam" is just a humorous shorthand way to refer to a conceptual point in the patrilineal ancestry of all living human males. It is entirely possible that the current Y-chromosomal Adam lived before Homo sapiens appeared, and was a member of Homo heidelbergensis, for example. As of February 2017, the currently assigned Y-chromosomal Adam happens to have lived at roughly the same time as our species appeared, in the vicinity of 200,000 years ago.

Tests for Y-Haplogroup Assignment

Three different types of Y-DNA test are commonly used today (January 2017) for predicting and verifying an individual's Y-haplogroup assignment, and exploring further: STR tests, SNP tests, and next generation sequencing (NGS). In recent years, STR tests have been primarily used to predict a person's ancient haplogroup assignment and discover distant cousins, SNP tests are used to confirm the haplogroup prediction and extend the assignment's depth within the Y-phylogenetic tree, and next generation sequencing is used to determine a Y-haplogroup assignment with even greater precision and to discover new haplogroups.

Besides providing more information about Y-DNA tests, here are some resources that will also suggest places where you can get specific Y-DNA tests performed:

About STR Testing

Short tandem repeats (STRs) are 2-6 letter DNA sequences that repeat end-to-end, i.e. "in tandem." For example, the DNA sequence "ACGACGACGACG" is a 4-fold repeat of the "ACG" sequence. Tandemly repeated DNA sequences are prone to increasing or decreasing their number of repeats across generations, although an STR can also pass through many generations unchanged. When it does change, that change is inherited and can serve as a DNA marker.
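Counting tandem repeats is straightforward to illustrate. The function below is a minimal sketch that finds the longest run of back-to-back copies of a motif in a DNA string. Laboratory STR assays follow marker-specific conventions that this toy version does not model, and the function name is an assumption.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the length, in copies, of the longest tandem run of motif."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        position = start
        # Step forward one motif length at a time while copies keep matching.
        while sequence.startswith(motif, position):
            run += 1
            position += len(motif)
        best = max(best, run)
    return best


print(count_tandem_repeats("ACGACGACGACG", "ACG"))  # 4
```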

There are thousands of STRs in Y-DNA (Y-STRs), but in a typical Y-DNA STR test between 37 and 111 of them are assessed for how many repeats each contains. STRs have labels like "DYS391" and "DYS391=9" means a test showed a person's DYS391 STR contains 9 repeats. "DYS391=9" serves as a DNA marker, and if it mutates the result may be the derived version "DYS391=8" or "DYS391=10" for example. A rough measure of the genetic distance between two men is given by the total number of their STRs tested that show different repeat counts. A son will occasionally show one STR different from his father in a Y111 test panel (111 STRs tested), much less commonly two differences, and extremely rarely three. For any single STR, a change between generations is infrequent.

For example, if two men have a panel of 111 of their Y-STRs assessed, and for every STR tested each man has the exact same number of repeats as the other man (a genetic distance of zero), they are certainly descended from a common patrilineal ancestor, who almost certainly lived less than a dozen generations past (roughly 360 years). In contrast, if two people show a genetic distance of 7/67 (in a panel of 67 STRs, 7 show a different value, and 60 the same value), then they probably have a common patrilineal ancestor, most likely one who lived more than 12 but fewer than 24 generations ago (24 generations being roughly 720 years).
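The rough genetic distance described above, the number of tested STRs whose repeat counts differ, can be computed directly. This sketch implements only that simple count. Testing companies may apply more refined counting rules, and the repeat values in the example are made up for illustration.

```python
def genetic_distance(a: dict, b: dict) -> tuple[int, int]:
    """Return (number of differing STRs, number of STRs compared)."""
    shared = a.keys() & b.keys()  # only markers tested in both men
    differing = sum(1 for marker in shared if a[marker] != b[marker])
    return differing, len(shared)


# Hypothetical results for three markers; real panels test 37 to 111.
man_1 = {"DYS391": 9, "DYS390": 24, "DYS19": 14}
man_2 = {"DYS391": 10, "DYS390": 24, "DYS19": 14}

differing, compared = genetic_distance(man_1, man_2)
print(f"genetic distance {differing}/{compared}")  # genetic distance 1/3
```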

Y-STR testing is a simple way to get a general assessment of how close two men's patrilineal lineages are, and has been in use for many years. Because any pattern of DNA markers can be used to define a haplogroup, STR testing was once commonly used to define meaningful Y-haplogroups. However, SNPs have proven to be more useful and reliable for defining Y-haplogroups with precision. You may still occasionally see a haplogroup that is defined with a combination of STR and SNP values, like "J2a-PH4970,L1064 DYS391=9" for example.

Today, a common approach to Y-DNA testing is to first perform a Y37 or Y67 STR test, then use those results to tentatively assign that person to an ancient haplogroup based on similarity to already defined haplogroup members. These ancient haplogroups were typically formed 5,000 to 40,000 years ago. Then, optionally, an appropriate SNP panel test can be used to confirm the haplogroup prediction and extend it to more recently formed haplogroups. A Y37 STR test is usually sufficient for assigning an ancient haplogroup, but may do a poor job of determining if any matching people are related within genealogical time or only more distantly. Also see "Would testing more STRs or upgrading my STR test (e.g. from Y37 to Y111) be useful?" below.

About SNP Testing

Single nucleotide polymorphisms (SNPs) are mutations in a single letter of a DNA strand, changing one letter to another at that site. "CGTAG" becomes "CGAAG" for example. These changes happen at a very low frequency per generation for any single chromosome location, but have been accumulating in the population since Homo sapiens appeared. As of 7 Nov 2016, 154,206,854 SNPs had been identified and listed in the NCBI dbSNP database.
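A single-letter change like the one in the example can be located by comparing the two sequences position by position. The helper below is a minimal sketch for equal-length strings. It is not how SNPs are called from real sequencing data, and the function name is an assumption.

```python
def point_differences(ancestral: str, derived: str) -> list:
    """Return (1-based position, ancestral letter, derived letter) for each mismatch."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be the same length")
    return [
        (index + 1, old, new)
        for index, (old, new) in enumerate(zip(ancestral, derived))
        if old != new
    ]


print(point_differences("CGTAG", "CGAAG"))  # [(3, 'T', 'A')]
```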

SNPs can be used as DNA markers, and have proven to be excellent for defining ancient and modern Y-haplogroups as described earlier. Most SNPs are named with one or more letters indicating the discoverer followed by a number, e.g. M172. When a person has the mutated ("derived") version of the marker, we say they are positive for that marker, written M172+, or that they have the derived value rather than the ancestral value.

Of the millions of SNPs that have been identified in human DNA, a typical broad SNP test panel on a DNA microarray biochip will assess around 700,000, mostly on autosomal chromosomes. This is the basis for modern genealogical autosomal DNA testing. The biochips may include a few hundred Y-SNP sites, though unfortunately most of these are not the ones that have proven in practice to define meaningful Y-haplogroups. Some tools have been developed to use those biochip Y-SNPs to predict Y-haplogroup assignment, but they are not always successful.

It is also possible to test individual SNPs or a panel of many SNPs. Once a person's ancient Y-haplogroup has been predicted based on STR tests, an appropriate panel of SNP tests for that haplogroup can be used to confirm the assignment, and concurrently test for SNPs that define a more recently formed subclade within that haplogroup. This provides a precise and refined assignment, typically to a haplogroup that formed 1,000 to 10,000 years ago, though sometimes as recently as several hundred years ago. Confirming and extending your own predicted Y-haplogroup using SNP tests can help advance Y-haplogroup research if your haplogroup is rare, and may aid your genealogical research.

In general, after Y-STR testing and before doing SNP testing it's a good idea to first join a Y-haplogroup project based on the haplogroup predicted by your Y-STR test. The project administrators will usually be able to help you decide if learning more detail using SNP testing would be valuable, and recommend an appropriate selection of SNPs or a SNP panel.

About Next Generation Sequencing

The price of DNA sequencing has dropped precipitously during the last decade, and some people choose to skip the tests that only report STR or SNP results, and go directly to next generation sequencing (NGS). An NGS assessment of the Y-chromosome or entire genome will typically provide nearly all the results of an extensive Y-SNP plus Y-STR assessment, plus many more SNPs and STRs that aren't routinely assessed.

NGS provides the highest possible haplogroup assignment resolution, typically to a previously defined haplogroup that formed 500 to 6,000 years ago but sometimes more recently. NGS is also used to identify new SNPs and the new haplogroups defined by them, carrying the Y-phylogenetic tree's branching all the way to the present day.

However, it's worth noting that different technologies are used for NGS, STR and SNP testing, and sometimes a particular SNP or STR can't be successfully assessed using NGS technology. This is rarely an obstacle to successful Y-haplogroup assignment, but in some cases a supplementary individual STR or SNP test is needed to answer specific questions.

Although STRs and SNPs are the most common types of mutation used as DNA markers, there are many other types of DNA changes that are also inherited and can serve as genetic genealogy tools. We don't commonly use them because there aren't relatively inexpensive tests for them like there are for SNPs and STRs, and because SNPs and STRs have demonstrated properties (like mutation rate) that make them especially useful for genealogy. But DNA sequencing reveals every type of DNA change, such as simple insertions and deletions (together called "indels"), inversions, multi-nucleotide polymorphisms (MNPs), multiallelic SNPs, recombinant loss of heterozygosity (RecLOH), and copy number changes in minisatellite DNA. There is an entire zoo of mutational classes that DNA sequencing can reveal, few of which are routinely being used as DNA markers at this time.

The results generated by NGS are sizable and complex. For this reason, services have appeared that will perform further analysis on your NGS data to provide clearer information and insight beyond what the original testing company offers. Two commercial examples that charge a modest fee are YFull analysis and the "Interpretation of BAM files" product by Full Genomes. Some volunteer-run Y-haplogroup projects also help by providing additional analysis, which may be even more accurate and insightful because they are specialists for that haplogroup. In general, most people who do NGS testing find it helpful to consult with their appropriate Y-haplogroup project administrators, who may sometimes recommend submitting the BigY results for YFull analysis.

In the future, DNA sequencing may be so inexpensive that it will routinely become the first DNA test done for genealogy. As of 2017, however, a combination of Y-STR and Y-SNP tests is most frequently used for Y-haplogroup assignment, with NGS used to discover new SNPs and more recent haplogroups, and to better define the less well characterized parts of the Y-chromosome's phylogenetic tree. NGS can also identify very recently appearing SNPs ("private SNPs") that will help to precisely define the Y-phylogenetic tree of a family within genealogical time.

What Can My Haplogroup Assignment Tell Me?

I've just been assigned to a Y-haplogroup. Now what?

Congratulations! A haplogroup assignment is your key to a personal journey into the past—the stories of your distant patrilineal ancestors and the long path their Y-chromosome took that led to your father. This is a journey that will require some effort on your part, but you'll meet others on similar quests who will help, including distant relatives—your fellow haplogroup members.

Your first step should be to join at least one Y-haplogroup project, selected to match your core haplogroup assignment. For example, if you have taken a Y37 or Y67 Y-STR test and based on the results are assigned to Y-haplogroup R-M269 (also called R1b-M269), then you would look for an "R" Y-haplogroup project at the place where you did Y-DNA testing. You may find there are several haplogroup R projects, but hopefully their descriptions will help you figure out which project or projects fit your Y-DNA test result the best. The primary goal of Y-haplogroup projects is to better understand the structure of the human Y-phylogenetic tree for that branch, as described below.

Second, you may wish to join a regional, surname, or ethnic Y-DNA project depending on your own specific circumstances. These projects will usually include people with several or many different Y-haplogroups. The projects typically focus on issues of anthropology or genealogy. However, you'll find in every project people who are interested in a variety of questions and the associated discussions can range widely.

Some primary areas where a deeper exploration of your haplogroup might take you are:

  • Genetic anthropology, looking mostly at the period from 200,000 to 2,000 years ago.
  • Y-chromosome phylogenetics, looking mostly at the period from 20,000 to 200 years ago.
  • Your distant genealogy and surname's history, looking mostly at the period from 1000 years ago to today.

Depending on your interests, any one of these can become a major journey of learning, discovery and insight.

What can you tell me about my Y-haplogroup?

There are a few general things that are certainly known about the Y-haplogroup to which you've been assigned. This includes the list of mutations that define it, which all members of your Y-haplogroup share, and which indicate that you all have a common patrilineal ancestor who first displayed this set of mutations. We also know which Y-haplogroup your assigned one was derived from, and which that one derived from, all the way back to Y-chromosomal Adam. Your patrilineal lineage includes all those ancestral haplogroups too—you have ancestors that belonged to them and they are part of your family's ancient history.

We know approximately how long ago your Y-haplogroup was formed, i.e. roughly when that common ancestor lived, based on the number of accumulated mutations. Depending on how many people are in your haplogroup and the data they've provided, we may have some clues about the region in which your Y-haplogroup's most recent common ancestor lived. We probably know more about the people of your more ancestral Y-haplogroups, including the geographical region in which they lived and when, and may even have some clues about which ancient culture or cultures the members of those ancient Y-haplogroups belonged to.

All that, of course, is speaking generally. The specific answers for your particular assigned Y-haplogroup are for you to discover through your own research and by asking others.

Anthropology: The lives of your ancient ancestors

Exploring your haplogroup through anthropology can give you an understanding of your ancestors' ancient societies, and the roles that migration, culture, conflict, technology, and regional circumstances may have played in their lives. Genetic anthropology is on the cutting edge of science research, so new discoveries are being made regularly, and old theories are constantly being revised.

As of 2017, a major application of Y-haplogroup information to anthropology has been to answer questions about ancient human population migration. For traditional archeological artifacts like stone blades, archeologists can get clues about the expansion and migration (and trade networks) of an ancient culture by looking at the time and place where a particular culture's characteristic artifacts can be found. Similarly, genetic anthropologists look at the modern distribution of Y-haplogroups, the estimated time those haplogroups first appeared, and the Y-phylogenetic tree of haplogroup evolution. Combined, these make possible some reasonable speculations about where and when ancient migrations may have carried the ancestral haplogroups' Y-chromosomes around the world to generate the patterns we see today.

It is only in the last few years that archeologists have been able to reliably do Y-DNA testing on ancient bones to assign Y-haplogroups to the men who actually lived thousands of years ago in societies previously known only by their skeletons and artifacts. The haplogroups of these ancient people can be placed on the same tree where your own haplogroup appears, and haplogroups ancestral to yours are those of your ancestral relatives. Have any of your ancient relatives been discovered by archeologists yet? As this research proceeds, the answer for more people will be yes, but this work has just begun.

Genetic Anthropology Questions to Pursue

Here are some questions that may motivate and advance your exploration. But DO NOT just paste these questions into a discussion forum and expect other people to do your research for you! Learning the answers to these questions, and understanding why the answers are valuable, is part of your personal journey. They lead not just into the past but right up to the genetic anthropology research being conducted today.

  • What is the ancestry of my Y-haplogroup—the Y-haplogroups and associated DNA markers leading from Y-chromosomal Adam to my own Y-haplogroup?
  • Are any of the Y-haplogroups in my ancestry particularly well studied, and characterized by existing anthropology research?
  • Around what time did my assigned Y-haplogroup and its ancestral haplogroups appear?
  • In what geographical place or region did those Y-haplogroups appear?
  • What human civilizations or cultures were in that place at that time, to which my ancestors might have belonged?
  • What were their lives like? What did they eat? What distinguished them from their neighboring cultures, and from earlier or later ones in the same region?
  • Have any ancient Y-DNA samples been recovered from these cultures yet?
  • If so are any of their Y-haplogroups the same, ancestral to, or closely related to mine?
  • Do any of the ancient Y-DNA samples (anywhere) that have been analyzed so far belong to my haplogroup or a haplogroup in my Y-chromosome's ancestry?
  • Are any of the Y-haplogroups in my ancestry a subject of intensive current debate by the genetic anthropology community, and why?
  • Is my haplogroup associated with, or found among, many ancient cultures or mostly restricted to just a few or only one?
  • Where do people with my haplogroup live today? Is there a particular place most seem to have originated from?
  • What migration paths did ancient people with my haplogroup follow?
  • What cultures arose later in places along those paths?
  • Do I trace my heritage to any of the places where those migrations led, or the cultures that arose there?

Why is there so little information about my assigned Y-haplogroup?

In part due to exaggerated or outright fraudulent claims by some DNA testing companies, people may have an inflated expectation that DNA testing will tell them the specific ancient tribes or cultures their ancestors belonged to. The reality is rarely that simple. Having spent the effort and money on testing to learn your Y-haplogroup assignment, you may do an Internet search and discover that very little information seems to exist about it. What is going on?

Here are four considerations that will help provide a useful perspective and improve your research: (1) the study of ancient Y-DNA is only recently expanding, (2) you are a member of all the ancestral haplogroups of your lineage, (3) much more is known about older Y-haplogroups than newer ones, and (4) because of human migration and admixture, a Y-haplogroup is rarely found in just one ancient society or culture except shortly after its formation.

Ancient DNA. A great deal of the speculation about ancient "tribal" Y-haplogroups that you'll find in online discussions has used evidence from Y-DNA testing of living people to make inferences about the presence and migrations of ancient peoples. Although this approach generates some intriguing hints, it is not a substitute for actual ancient DNA, and discussions with weak supporting data too easily turn into arguments over whose speculation is more believable for all the wrong reasons.

Relatively few ancient Y-DNA samples have been collected so far (fewer than 200 as of January 2017), and mostly they've just provided crude data focusing on the oldest Y-haplogroups. Until very recently, intact ancient Y-DNA was rarely collected successfully. In contrast, ancient mitochondrial DNA (mtDNA) is much easier to collect, so genetic anthropologists have been able to get more reliable information about mtDNA, and have generally focused on it preferentially. However, technical innovations have changed this in the last couple of years, and we can expect much more reliable data about ancient Y-DNA haplogroups to appear in scientific research papers.

Ancient Haplogroup Membership. Be aware that you are not just a member of your assigned Y-haplogroup. Your assigned Y-haplogroup is the known Y-haplogroup that matches your test results, and that has formed most recently in your Y-phylogenetic tree lineage. But you are also a member of every Y-haplogroup ancestral to your assigned haplogroup. To learn as much as you can about your patrilineal anthropology, you need to research your assigned Y-haplogroup to learn what Y-haplogroups are ancestral to it, and then explore what is known about each of them too.
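
The idea of belonging to every ancestral haplogroup can be pictured as walking up a tree from your assigned haplogroup to the root. The following is only a minimal sketch: the haplogroup labels (HG-A, HG-B, and so on) and the PARENT table are hypothetical placeholders, not real haplogroup names or data from any testing company.

```python
# Toy parent map: each haplogroup label points to the haplogroup it was derived from.
# All labels are hypothetical placeholders, not real haplogroup names.
PARENT = {
    "HG-D": "HG-C",
    "HG-C": "HG-B",
    "HG-B": "HG-A",
    "HG-A": "Y-Adam",
    "HG-X": "HG-B",  # a side branch that is not in HG-D's lineage
}


def ancestral_lineage(assigned, parent=PARENT, root="Y-Adam"):
    """Return the assigned haplogroup followed by every ancestral one, ending at the root."""
    lineage = [assigned]
    while lineage[-1] != root:
        lineage.append(parent[lineage[-1]])
    return lineage


print(ancestral_lineage("HG-D"))
# ['HG-D', 'HG-C', 'HG-B', 'HG-A', 'Y-Adam']
```

Every label in the returned list is a haplogroup worth researching. The side branch HG-X shares an ancestor with HG-D but is not part of its lineage, so it does not appear.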

Ancient Haplogroup Knowledge. Much more is known about the older Y-haplogroups in your lineage than the newer ones. This is partly because haplogroups that formed long ago have had more time to expand and develop a larger descendant population today, and also because ancient DNA samples have mostly provided information about the older Y-haplogroups. As you work backward in your Y-haplogroup lineage, you will find more research has been done and more information is available.

Migration and Admixture. When a new Y-haplogroup is formed, it's represented by one person in whom a new mutation is found, who is living in a particular location and society. But to still exist today, the Y-haplogroup will probably have expanded to become represented by a considerable population. Some of those people migrated, and some joined and had descendants in societies and cultures other than the one they were born into. This appears to be a significant aspect of human nature, then and now.

It's important to know that by the beginning of the Bronze Age (roughly 5,500 years ago) and afterward, there was so much migration in Europe and western Asia that most cultures in that area will have included several different major Y-haplogroups. Although less well studied, the same may be true for other human populations worldwide. In this case, within a few dozen generations after any Y-haplogroup was formed, there would be only a few ancient Y-haplogroups limited to only one location or culture. At best we can say "Of the ancient DNA samples that have been tested from various cultures, these are the cultures that included this particular Y-haplogroup of interest." We will be able to say more when we have data from thousands of ancient DNA samples.

Genetic Anthropology Resources

The resources below and in the section "How Can I Learn More or Get Help?" all offer paths for fruitful exploration, both for information and contact with others on a journey similar to yours. The Y-DNA discussion forums especially are full of anthropology discussions, and they are well-populated by professional and skilled amateur genetic anthropologists (and also some people with odd personal agendas they promote!).


Phylogenetics: Your place in Y-chromosome history

As described in detail earlier, "Y-chromosome phylogenetics" is the scientific study of Y-chromosome history using the DNA changes that it accumulates over time. Because those changes accumulate, a diagram of the relationships between groups (haplogroups) with the same set of accumulated changes takes the form of a tree or network. When someone talks of the "tree" of Y-chromosome history, they're referring to Y-phylogenetics. This is sometimes also called "haplogroup research."

Y-phylogenetic research has already identified the major ancient branches in the tree of human Y-chromosome history, and is now discovering the more recent branches through testing by people like you. The history of your own patrilineal Y-chromosome is part of the history of your patrilineal relatives, modern and ancient. By comparing your Y-DNA test results with those of others, your patrilineal line can be placed within the ever more detailed and developing Y-phylogenetic tree, and even help to extend it.

The Y-chromosome's phylogenetic tree can be determined by testing modern and ancient DNA, without needing to know who the exact person was in which a new Y-DNA change appeared, where they lived, or anything else about them. But we can also combine that tree with what we do know of the history of our family, our societies, and humankind, to link those histories and begin aligning them into a composite picture. This is the grand story that includes our family as part of all humanity.

You may have noticed there is a very large gap between the anthropology story of ancient human migrations, tens of thousands of years ago, and the genealogical time period starting 500 years ago. The leading edge of Y-phylogenetics research is working to fill in that gap. For example, the current goal of ISOGG's Y-DNA phylogenetic tree project is to define all the Y-haplogroups that arose up until around 1500 C.E., and their ancestry back to Y-chromosomal Adam. More deeply exploring your Y-haplogroup's phylogenetics can give you:

  • An assignment to your patrilineal lineage's most recently formed known Y-haplogroup in the human phylogenetic tree.
  • The opportunity to learn more about and contribute to current research, discovering new haplogroups and extending the Y-phylogenetic tree to the genealogical time period.
  • A tool for more accurately assessing patrilineal relationships in your genealogical research.
  • Potential new relatives identified, if your refined haplogroup was very recently founded.
  • Potentially a more specific association with a particular genetically characterized cultural group.

Why does Y-phylogenetic research matter? Knowing the phylogenetic history of your Y-chromosome points to the historical populations to which your patrilineal ancestors belonged. Although it may be true now that little is known about any particular Y-haplogroup population as it existed in the year 500 BCE, for example, ongoing anthropological and historical research is going to change that as more data accumulates. Your test results, and the information you provide to Y-phylogeny research projects about your patrilineal family history, are essential elements of a process that is making everyone's data more meaningful as research continues. Y-phylogenetic research creates the foundation for ongoing advances in both genetic anthropology and distant genealogy.

Y-DNA haplogroup projects are hotbeds of research into chrY phylogenetics. The project administrators are commonly either doing cutting edge research or are in contact with and providing data to scientists who do that work. They can be terrific sources of additional information, and can advise you on what additional DNA testing (if any) could be helpful. They may be able to direct you toward people that specialize in the study of your Y-haplogroup or those close to it. However, you should first do as much of your own research as you can, because they are volunteers and there are often many people asking for their time.

Y-Phylogenetics Questions to Pursue

Here are some questions you may want to answer as part of your personal Y-haplogroup exploration of phylogenetics. Please DO NOT just copy the questions below into an email to a project administrator—do your own research first! Understanding why the answers to these questions are valuable is also part of your journey.

  • Is my haplogroup assignment only predicted by STR testing or has it been confirmed by SNP testing or next generation sequencing (NGS)?
  • If my haplogroup assignment is only predicted, what SNP tests or SNP panel would best be used to confirm and extend my haplogroup assignment?
  • What haplogroups are ancestral to my assigned haplogroup, going back to Y-chromosomal Adam?
  • How old is my haplogroup—when did it first appear? Is the range estimated for that appearance wide (many thousands of years) or narrow (hundreds to a few thousand years)?
  • Are there more recently appearing haplogroups, derived from my assigned haplogroup, that have already been discovered?
  • If there are, was I not assigned to one of those because I don't have their defining SNPs or because those SNPs haven't yet been tested on my DNA?
  • If I might belong to one of those more recent haplogroups but haven't had the appropriate SNP testing done to determine that, what SNP tests or SNP panels will best help me discover a more recent haplogroup assignment? Or should I do NGS testing?
  • Are there a lot of other people known to be in my assigned haplogroup and its subclades, or does it appear to be relatively rare?
  • If my haplogroup appears to be very rare, is it because there really aren't many people alive who belong to it, or because most of them live in a place or culture that hasn't done much DNA testing?
  • Has at least one person in my assigned haplogroup already done next generation sequencing, or would it be valuable for me to do that and contribute the results to ongoing haplogroup research?
  • Have my haplogroup, its defining SNP, or any of its phylogenetically equivalent SNPs been placed on the official ISOGG Y-DNA tree yet, or is the haplogroup still considered experimental by ISOGG?
  • If my haplogroup is not yet on the ISOGG Y-DNA tree, what additional research will be needed to meet ISOGG's strict standards, and can my DNA testing contribute to that?

Y-Phylogenetics Resources

Because Y-phylogenetics is a subset of the more general fields of phylogenetics and phylogeny, some resources about those are also provided below.

Genealogy and Surname Studies

A great power of Y-DNA testing is that it allows us to peer far back in time, even to the first appearance of our species. However, Y-DNA also has applications relevant to genealogical time, commonly considered to be within the last 500 years or so (some would say 1000 years). The two most common such uses are for identifying cousins whose relationship distance is outside the reliable reach of current autosomal testing (5th cousins and beyond), and to get data supporting or disproving suspected patrilineal relationships.

Will Y-DNA testing help me discover new relatives?

Almost certainly, although most or all will probably be very distant relatives, and your most recent common ancestor will have lived long ago. Sometimes people do get lucky with finding closer cousins through Y-DNA testing, but an autosomal DNA test is much more likely to be fruitful. However, if you want to search for distant cousins just beyond the range of reliable autosomal testing (5th cousins and farther), as many people do, then Y-DNA STR testing is still the best tool.

Note that finding cousins this way is dependent on whether any of your distant relatives have done Y-STR testing, and it may take years before a relative in the range you're looking for takes a Y-STR test and shows up among your matches. People do sometimes get lucky from the start, if their cousins tested first. Usually patience is required, and while you wait an exploration of your Y-haplogroup's anthropology or phylogenetics can be rewarding.

Y-DNA testing places my ancestor in a particular ethnic or cultural group—so am I one too?

There is widespread misunderstanding about this issue, fueled in part by misleading advertising and the expectation that DNA testing will "tell you who or what you are." There are three major problems with this idea.

First, note that Y-DNA testing only reveals information about your patrilineal lineage. If your patrilineal 7th great grandfather was Irish, but his children migrated elsewhere, then you have a very tiny claim to "being Irish" if your only connection is that one of your 512 7th great grandparents was Irish. Y-DNA can provide a very powerful telescope for looking into the past, but its field of vision is tiny—the patrilineal lineage only. Patriarchal notions that the male line defines you are outdated, and recognized now as historical social constructs with toxic consequences for women.
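
The figure of 512 follows from the number of ancestor slots doubling with each generation. As a quick check, here is a small sketch; the function name is made up for illustration, and it counts slots in a pedigree chart, ignoring the fact that in real families the same person can fill more than one slot.

```python
def nth_great_grandparents(n):
    """Number of ancestor slots among the nth great grandparents (n = 0 means grandparents)."""
    # Parents = 2, grandparents = 4, great grandparents = 8, and so on.
    return 2 ** (n + 2)


slots = nth_great_grandparents(7)
print(slots)      # 512
print(1 / slots)  # 0.001953125, the share of those slots held by the patrilineal ancestor
```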

Second, how much some particular culture or ethnicity from long ago might be contributing to "who or what you are" today can be highly debatable. Even if you could show that your entire ancestry from centuries back all comes from the same place and culture, you are a product of the modern version of that culture. For example, just because all your ancestors of centuries ago were Vikings doesn't make you a Viking. It may make you a modern Dane, born of a culture that still shows some connections to an ancient Viking past. But even if you've grabbed your armor and sword and sailed off to pillage English coastal villages, the idea that you're a Viking because your ancient ancestors were Vikings is rather ridiculous.

Third, "who or what you are" is a complex combination of your social and cultural history, your personal and family story, your character, and your genetics, all seen through the potentially distorting lenses of whatever social biases people might happen to be applying. DNA history is just a record of genetic parentage. The history of your DNA is probably the least important component of "who or what you are," although it's part of the total picture and can inspire you to look more deeply into what's really a much larger issue of identity. That journey into identity can be fascinating and wonderful, but don't be quick to oversimplify it, particularly based just on DNA.

Will Y-DNA testing tell me where my ancestors were from?

No need to DNA test ... your ancestors came from Africa like all of ours, if you look back far enough. But if you want to know about recent, genealogical time, then you may get lucky and discover a lot of living, close Y-DNA matches who all come from the same place, implying that your patrilineal ancestors may have come from there too. Most people's results are rarely so simple, and typically only provide hints for further research. But hints are good!

However, at intermediate time spans—6,000 to 40,000 years ago—most of the major haplogroup lineages alive today were founded in specific, known regions. And some more recently formed but still ancient haplogroups are associated with fairly defined ancient societies. This is the basis for genetic anthropology.

Does Y-DNA testing prove paternity?

No. Even fairly distant relatives with the same patrilineal ancestor can and often do all show the same Y-DNA test results. However, Y-DNA tests can disprove paternity if tests show only a distant relationship. Matching Y-DNA provides only supporting evidence, which must be interpreted and combined with other evidence to reach an informed judgement.

How related am I to my STR test matches?

The closeness of your relationship can be very roughly estimated based on how many DNA differences there are between the two Y-chromosomes. This measure of genetic distance according to the "stepwise model" is calculated as the total number of mutational steps between the results of two STR tests. If at a single STR site the number of repeats differs by one then the genetic distance is one; if by two then the distance is two. This rests on the (usually valid) assumption that the number of STR repeats goes up or down by only one in a generation. The total genetic distance is the sum of the count differences across all STR locations where the counts differ.
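
The stepwise calculation described above can be sketched in a few lines. This is an illustration only: the repeat counts are invented, the function name is hypothetical, and the sketch applies the simple stepwise rule to every marker, whereas real comparison tools may treat some markers differently.

```python
def genetic_distance(haplotype_a, haplotype_b):
    """Stepwise genetic distance: sum of repeat-count differences over shared markers."""
    shared_markers = haplotype_a.keys() & haplotype_b.keys()
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared_markers)


# Invented repeat counts for four STR markers, for illustration only.
person_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
person_b = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 9}

print(genetic_distance(person_a, person_b))  # 3
```

Here DYS390 differs by one repeat and DYS391 by two, so the total is three. Only markers present in both results are compared, which is why two people need to have tested the same STRs for the number to be meaningful.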

TiP analysis is a somewhat more refined method of estimating relationship difference, and gives probability estimates for the number of generations between two people who've taken an STR test. In practice it tends to overestimate people's relatedness, outrageously in the case of Y37 or lower STR tests, and suggest a precision greater than this type of analysis can actually deliver. However, it can be used comparatively, to better identify which are the closer relationships among a set of matches that have an identical or very near genetic distance. When used with a Y67 test or higher, TiP analysis can distinguish matches who are probably related within genealogical time from those with only an ancient relationship.

In addition to incorporating the number of STR changes, the TiP analysis tool at Family Tree DNA also takes into account the characteristic frequency of change for each individual STR to give a more accurate result. Running a TiP analysis on a pair of STR test results gives a cumulative probability estimate for the number of generations back that two people's common ancestor lived. The predicted range tends to be rather broad, e.g. for a pair with a genetic distance of 4, the TiP analysis might report a 50% likelihood that the common ancestor lived within the last 8 generations, and a 95% likelihood that he lived within the last 16 generations.


Three real TiP analyses are presented in the above graph, comparing a person of interest to three different individuals all tested to 67 STR markers, but one with a genetic distance of 2, one at 4, and one at 7. For the pair with a genetic distance of 4 (the green line), the TiP analysis reports a 20% likelihood their common ancestor is within 5 generations or less, 40% within 7, 60% within 9, 75% within 11, and 90% within 14 generations or less (420 years, at 30 years per generation). Comparing the three lines, the graph shows that if the genetic distance between two people is higher (more STR mismatches), then their most recent common ancestor is predicted to be more generations back in time. Also, the more distant the relationship, the broader the range and more uncertain is the predicted number of generations back to a common ancestor.

One effective way to do pairwise comparisons among a set of distantly related people is to note for every pair at what number of generations the TiP analysis reports more than 50% likelihood of a common ancestor. Another measure is to note for each pair the percentage likelihood that their most recent common ancestor is within 5 generations. Be aware that these calculated probabilities tend to be very loose estimates despite their apparent precision. Taken as guides rather than exact answers, these two measures can serve reasonably well to compare relatedness between people among the group.
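
Both measures are easy to tabulate once the percentages from a TiP report have been copied down by hand. The sketch below uses the genetic-distance-4 figures quoted earlier; the function names and the dictionary layout are assumptions made for illustration and do not correspond to any tool or export format offered by a testing company.

```python
# Cumulative probabilities copied from the genetic-distance-4 example in the text:
# generations back -> probability that the common ancestor lived within that many generations.
gd4_curve = {5: 0.20, 7: 0.40, 9: 0.60, 11: 0.75, 14: 0.90}


def generations_to_exceed(curve, threshold=0.50):
    """First tabulated generation count whose cumulative probability exceeds the threshold."""
    for generations in sorted(curve):
        if curve[generations] > threshold:
            return generations
    return None  # the tabulated points never exceed the threshold


def probability_within(curve, generations):
    """Tabulated probability for an exact generation count, or None if it was not recorded."""
    return curve.get(generations)


print(generations_to_exceed(gd4_curve))  # 9
print(probability_within(gd4_curve, 5))  # 0.2
print(14 * 30)                           # 420 years at 30 years per generation
```

Because only a few points of the curve are recorded here, the first function reports the first tabulated point above 50%, not the exact crossing. Repeating this for each pair in a group gives two columns of numbers that can be sorted to rank relatedness, keeping in mind that the underlying probabilities are loose estimates.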

Would testing more STRs or upgrading my STR test (e.g. from Y37 to Y67) be useful?

Testing more STRs may allow a more accurate haplogroup prediction, or assignment to a more recent haplogroup, in phylogenetic branches with poorly defined or complex structure. Testing more STRs will also allow a slightly more precise TiP estimation of the number of generations to a common ancestor, distinguishing closer from very distant relatives, but only if the person you're comparing to has tested that same number of STRs or more. A Y-STR test may be best followed by doing Y-SNP testing to confirm and extend the haplogroup assignment predicted by the STR test.

Generally, testing fewer STRs than 37 is discouraged as too crude by current standards. A Y37 test is usually good enough for a basic haplogroup prediction, but if you do get close Y37 matches then it doesn't always distinguish well between distant and ancient cousin relationships. Y67 is more reliable for haplogroup prediction and much better for finding cousins related within genealogical time. Y111 is mostly only useful for establishing fine distinctions between people who are related within a dozen generations or so. These are very general guidelines, and the results for individual lineages will vary.

Therefore, for example, if you do a Y37 test and get no people who are close matches (roughly 3/37 genetic distance or less), then upgrading to Y67 usually won't provide much insight or many additional matches. If you do get some close Y37 matches, then upgrading to Y67 will let you distinguish the distant, possibly ancient, matches from those who might be related to you within genealogical time. If you have several matches that are predicted by a Y67 test to be related to you within genealogical time (roughly 5/67 or less), then upgrading to Y111 can be valuable, but only if at least some of those matches have Y111 tested already or will do so. For most people, a Y67 test is going to be the most useful.
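
These rules of thumb can be written out as a small decision function. It is a sketch under stated assumptions: the cutoffs of 3 (at Y37) and 5 (at Y67) are the rough figures given above, the function and parameter names are invented, and treating "several" as at least two matches is an arbitrary choice made only so the example runs.

```python
def suggest_upgrade(current_test, match_distances, matches_tested_at_y111=0, several=2):
    """Apply the rough upgrade guidelines to a list of genetic distances to your matches."""
    if current_test == "Y37":
        # Roughly 3/37 or less counts as a close Y37 match.
        if any(d <= 3 for d in match_distances):
            return "upgrade to Y67"
        return "Y67 upgrade unlikely to help"
    if current_test == "Y67":
        # Roughly 5/67 or less suggests a relationship within genealogical time.
        close = sum(1 for d in match_distances if d <= 5)
        if close >= several and matches_tested_at_y111 > 0:
            return "consider Y111"
        return "stay at Y67"
    raise ValueError("only Y37 and Y67 are covered by this sketch")


print(suggest_upgrade("Y37", [5, 6]))                               # Y67 upgrade unlikely to help
print(suggest_upgrade("Y37", [2, 6]))                               # upgrade to Y67
print(suggest_upgrade("Y67", [3, 4, 9], matches_tested_at_y111=1))  # consider Y111
```

As with the guidelines themselves, the output is a starting point for discussion with a project administrator, not a rule.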

Y-SNP testing is often an appropriate followup to Y-STR testing, but it's a good idea to first join a Y-haplogroup project based on the haplogroup predicted by your Y-STR test. The project administrators should be able to help you decide if an upgraded STR test, SNP testing, or both would be valuable given your specific circumstances and priorities.


If you do have matches after a Y37 DNA test, upgrading to test additional Y-STRs can lead to a more accurate prediction of the distance to your common ancestor. In two example graphs above, TiP analysis has been used to compare one person to a closer relative (left) or a more distant one (right). For the closer relative, as the number of STRs tested goes up, the increasingly accurate TiP analysis predicts a shorter and shorter distance to their most recent common ancestor. In contrast, for the more distant relative, increasing the number of STRs tested predicts that person as being even farther away than the initial prediction based on fewer STRs. But these two cases can't be generalized for all close vs. distant relatives. It's not possible to know in which direction, if any, the prediction will move for a particular comparison unless the higher STR test is actually done.

Why do I have so few Y-STR matches?

There are two main reasons why some people discover few Y-STR matches in their test results. One is if you have a very rare Y-haplogroup, which is not an uncommon situation. There are a few haplogroups that are very common, and then a lot of haplogroups that are today found only with relatively moderate or low frequency.

Another possibility is that your Y-haplotype is common in a particular population, but hardly anyone from that population has done DNA testing or lives in a place where many people have done DNA testing. As of 2017, most people taking DNA tests are located in North America and Europe other than France. Self-purchased DNA testing is illegal in France, so the only French lineages represented in DNA databases are from emigrant families. The likelihood of getting Y-STR matches depends on both the rarity of your Y-haplotype and the testing frequency of your root population.

Although it can be disappointing to see few or no matches, what this ultimately means is that you are a DNA pioneer. Your DNA test and any further testing you do will explore and characterize a part of the Y-phylogenetic tree that is little studied so far. What is learned will increase the scientific body of knowledge for Y-chromosome studies, and also help your future DNA test matches more quickly find their place on the tree of all Y-chromosomes.

How related am I to other members of my haplogroup?

All members of a Y-haplogroup share a common patrilineal ancestor whose birth marked the first appearance of that haplogroup. For any two living members of a haplogroup, their most recent common ancestor may have lived as long ago as the time when the haplogroup was founded (the founder himself), but could be closer (a patrilineal descendant of the founder).

If the Y-haplogroup founder lived only a hundred years ago then all his living descendants are close cousins, but if he lived thousands of years ago then the relationship between two haplogroup members could be very distant or very close. Additional precision would require more information than just knowing their shared haplogroup. Y-STR testing with TiP analysis can suggest a rough estimate of how many generations back the common ancestor probably lived, or additional Y-SNP testing can place people in a more recent haplogroup and narrow the possible range.

  • YFull tree, which calculates a rough estimated age for each haplogroup with a range (95% confidence interval) @ YFull
  • Haplogroup ages @ ISOGG
  • Also, your Y-haplogroup project administrators may have specific information for the haplogroups in your branch, either from their own calculations or by reference to the latest published scientific research.

Why don't my Y-DNA test matches have the same surname as I do?

Your assigned Y-haplogroup probably arose many hundreds to thousands of years ago, but hereditary patrilineal surnames have only been in use for a few hundred years. Also, even if your haplogroup was founded within the last few hundred years, in a region that used hereditary surnames, there are a lot of circumstances that lead to surname changes.

Does everyone with my surname descend from a common patrilineal ancestor?

Unless you have an extraordinarily rare surname, probably not. But Y-DNA testing can be used to group patrilineal lineages into those probably related within genealogical time and those whose relationship is only ancient. This is the work of Y-DNA "surname projects," which seek to answer the questions, "Are all people with this surname descendants of a common patrilineal ancestor, and if not then how many unique lineages with this surname exist?"

Within a surname lineage whose branches are already characterized as close by STR and standard SNP testing, but whose common ancestors are unidentified by traditional genealogy, NGS tests can create an ordered phylogenetic tree for that surname using new ("private") Y-SNPs that appear in a lineage on average once every 3-6 generations.
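
To give a feel for the numbers, here is a minimal sketch of the arithmetic implied by "once every 3-6 generations". The function name and the 12-generation example are illustrative assumptions. Real lineages will vary because mutations occur at random.

```python
def expected_private_snps(generations, fastest=3, slowest=6):
    """Return (low, high) expected counts of new Y-SNPs in one lineage.

    Assumes one new SNP per `slowest` generations at the low end and
    one per `fastest` generations at the high end, as stated in the text.
    """
    return generations / slowest, generations / fastest


low, high = expected_private_snps(12)
print(low, high)  # 2.0 4.0
```

So two branches that separated about 12 generations ago might each be expected to carry roughly two to four private SNPs of their own, which is what makes it possible to order the branches of a surname tree.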

Can Y-DNA testing prove I'm descended from <insert famous person's name here>?

No. Even if you do show a close Y-DNA match and the same Y-haplogroup as living people proven to be descendants of that famous person, you might be descended from their brother, uncle, great grand-uncle, or any of the dozens to thousands of men living at that time who also shared that same haplogroup. However, Y-DNA can disprove your relationship to that famous person, if the results show you are not in their Y-haplogroup or a close DNA match.

How Can I Learn More or Get Help?

There are many places where you can find help or learn more about a particular haplogroup or Y-haplogroups in general. For general information, see the "More Resources" section.

Besides the encyclopedic resources, haplogroup projects, and discussion forums listed below, you may also be interested in online blogs dedicated to genetic genealogy or anthropology. The Eurogenes Blog and Dienekes' Anthropology Blog are noted sources for relatively reliable anthropological information. Finally, don't forget that your local genealogy society branch may include people with expertise in using Y-DNA for genealogy.

Encyclopedic resources

Online encyclopedic resources can be excellent sources of summary information about Y-haplogroups, although it is important to note how old an article's references are, if they are even provided. If they aren't provided, the information may be out of date. Articles for the major haplogroups at Eupedia and Wikipedia are among the most frequently updated, but can still be many months behind the latest research. Be aware that Eupedia is a personal project, so it may sometimes be a bit biased toward the author's personal opinions. The individual Y-DNA haplogroup projects here at Geni should (ideally) provide links to specific, relevant articles at Eupedia and Wikipedia. Here are two general gateway pages:

Y Haplogroup projects at Geni

Y-DNA haplogroup projects at Geni can be an excellent place to learn more about a specific haplogroup, make contact with others in that haplogroup who may know more about it, and to ask questions and discuss the latest research. Y-DNA haplogroup projects are automatically created at Geni when any person is assigned a haplogroup that doesn't already have an existing project. These newly created projects have only bare descriptions, and it is up to the Geni community to expand their texts appropriately. All Geni projects include a discussion forum.

Geni members who have done DNA testing are identified with a DNA helix symbol on their pedigree diagram profile, and a "DNA Markers" line on their profile page. If a person's Y-haplogroup can be inferred because a relative of theirs has done DNA testing, then they won't have the DNA helix symbol, but will have a DNA Markers line on their profile page. The DNA Markers line shows the associated terminal Y-haplogroup for that person, meaning the known Y-haplogroup that is consistent with the Y-DNA markers that have been tested, and which has formed most recently. Note that a person is also a member of all the Y-haplogroups ancestral to their terminal one. Depending on the specific markers that have been tested, two people who are actually in the same Y-haplogroup can be assigned to different haplogroups within a single lineage of the Y-phylogenetic tree, and the two different haplogroup assignments are both correct.
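
The idea of a terminal haplogroup, and why two people in the same haplogroup can carry different assignments, can be sketched in a few lines. The lineage below is entirely hypothetical: `Hg1` to `Hg4` and `S1` to `S4` are placeholder names, not real haplogroups or SNPs, and this is not how Geni computes the DNA Markers line.

```python
# Hypothetical single lineage, oldest haplogroup first.
# Each entry is (haplogroup name, defining SNP); all names are placeholders.
LINEAGE = [("Hg1", "S1"), ("Hg2", "S2"), ("Hg3", "S3"), ("Hg4", "S4")]


def terminal_haplogroup(results, lineage=LINEAGE):
    """Return the most recently formed haplogroup supported by the test results.

    results maps SNP name -> True (derived, "positive") or
    False (ancestral, "negative"). Untested SNPs are simply absent.
    """
    terminal = None
    for haplogroup, snp in lineage:
        if results.get(snp) is True:
            terminal = haplogroup
        elif results.get(snp) is False:
            # Tested ancestral: not in this haplogroup or any younger one.
            break
    return terminal


# Two men who are actually in the same haplogroup (Hg4),
# but the first has only been tested for the two oldest SNPs.
shallow_test = {"S1": True, "S2": True}
deep_test = {"S1": True, "S2": True, "S3": True, "S4": True}

print(terminal_haplogroup(shallow_test))  # Hg2
print(terminal_haplogroup(deep_test))     # Hg4
```

Both assignments are correct: `Hg2` is ancestral to `Hg4`, so the first man's result is simply less specific, and further testing could move his terminal haplogroup down the same lineage.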

The Y-phylogenetic tree below provides links to the projects for ancient Y-haplogroups here at Geni. These (soon!) should in turn provide links to more recently founded subclades. A comprehensive "Y-DNA Geni Totals" page lists all the individual Y-haplogroups created as of the date it was last updated. A simplified Y-phylogenetic tree backbone diagram is found in the Y-Phylogenetics section of this writing. There are also good diagrams of the Y-chromosome phylogenetic backbone structure at Wikipedia, the Genographic Project and YFull.

Technical notes for this tree

The automatically generated Geni projects provide links to profiles of all people here who have been tested or are inferred to be in that terminal haplogroup. However, it's important to note that those people are also therefore members of every haplogroup ancestral to that one, and with further testing may be re-assigned to a more recently formed haplogroup. Also, there are at this time haplogroup projects with only hierarchical names (e.g. "J2a"), that are essentially duplicates of projects with SNP-based, shorthand names (e.g. J2a-M410). Over time these duplications will be resolved, and in the meantime projects with only hierarchical names will typically carry a text that points to the properly named project.

The tree above shows haplogroups that are now considered part of the "trunk" of the Y-chromosome's phylogenetic tree. Some intermediate haplogroups have been omitted for simplicity. This tree is based on ISOGG v.12.4, 4-Jan-2017. The hierarchical names are from that tree version (e.g. "K2b" in the haplogroup name K2b-M1221), and the defining SNPs for each haplogroup are considered representative as of that date (e.g. "M1221" in K2b-M1221). Hierarchical names can change with tree structure updates. Note that "haplogroup A" is not defined by a specific mutation, but is commonly used to mean "not-BT and its subclades" even though BT-M91 arose from a subclade of A.

Y-Haplogroup projects elsewhere

Besides Geni, Y-haplogroup projects are hosted by several of the companies that perform Y-DNA testing or analysis. These projects are generally only accessible to that company's paying customers, although Family Tree DNA allows the public to view project descriptions and some projects make their data public. If you are a customer of any of these companies you should consider joining the relevant haplogroup projects there, particularly because their administrators are often very knowledgeable.

Discussion forums

There are several online discussion forums, other than those at Geni or as part of haplogroup projects, where people typically discuss issues of genetic anthropology and Y-chromosome phylogenetics.

Haplogroup Research

Y-Haplogroup research is very active and ongoing. Archeologists have now collected several dozen ancient DNA samples to better understand Y-chromosome phylogenetics and the history of ancient peoples, and this dataset is sure to expand. The dropping price of next generation sequencing is allowing regular people to obtain the total sequence of their Y-chromosome or entire genome, share it with phylogenetics researchers, and participate as citizen scientists.

Why Did My Haplogroup Assignment Change?

It is not uncommon to do a Y-DNA test, get assigned a Y-haplogroup and then to later learn the assigned haplogroup is no longer named the same. Why is this? It's primarily because ongoing research is constantly refining our understanding of Y-chromosome phylogenetics and how to best understand, present data, and improve testing for the Y-chromosome. Some reasons why your haplogroup assignment might change (or seem to) are:

  • The hierarchical name, but not the defining SNP, for your haplogroup has changed when new research indicated the phylogenetic tree needed to be updated. You are still in the same haplogroup.
  • A synonymous name for the SNP that defines your haplogroup has gained wider usage and everyone is standardizing on the new name. You are still in the same haplogroup.
  • A different, but phylogenetically equivalent SNP has been chosen to define your haplogroup. For now you are in the same haplogroup, but additional research may reveal the fine structure of the haplogroup and it will be split up.
  • New research has determined that your haplogroup should be split into two, and you've been assigned to one of the two segments. You are in a newly discovered haplogroup.
  • You've extended your STR testing to include more STRs (e.g. upgraded from a Y37 to a Y111 STR test), and the additional information allowed a more precise prediction of your haplogroup assignment. Your predicted haplogroup assignment has changed.
  • You've done only STR testing and received a predicted haplogroup assignment based on similarity to other people's STR tests, but with new results from others a revised prediction has been made. Your predicted haplogroup assignment has changed.
  • You've done additional SNP or NGS testing, revealing new SNP results that place you into a more recently formed or precisely identified haplogroup than your earlier results could reveal. Your haplogroup assignment has changed, and you may be in a newly discovered haplogroup.

How Can I Participate in Research?

Some of the ways you can participate in and help advance Y-DNA research are:

  • Get Y-DNA tests done on yourself (or a male relative if you are female), to the depth you are able. Identify your assignment to the youngest Y-haplogroup that's characterized.
  • At the place where your Y-DNA tests were done and elsewhere, complete the associated data forms that provide additional information about your family history, including surnames, most distant known patrilineal ancestor, and geographical origins of the paternal lineage. This gives your test results important contextual information.
  • Join the relevant haplogroup projects at the places you've had Y-DNA testing or analysis done, to share your results with the administrators there who are, or who collaborate with, Y-DNA researchers.
  • Get next generation sequencing (NGS) done to reveal new SNPs and STRs that will help phylogenetics researchers identify new Y-haplogroups.
  • Participate in online discussion forums, including Y-haplogroup projects at Geni and elsewhere, to learn more and help others understand the results of their own tests.
  • Find out what work has already been done to bring your branch of the chrY phylogenetic tree up to the genealogical time period. If it's not there yet, then work with a project administrator to help make that happen.
  • Contribute money to Y-haplogroup project funds to help pay for people with particularly valuable chrY phylogenetic placement to get advanced testing.
  • Find out if your local genealogical society branch has been encouraging their members to consider Y-DNA testing, and if not then help them understand why they might be interested.
  • Ask questions! If you have a question then there are probably many other people with the same question but who have been too shy to ask. Questions don't always get answered; sometimes no one on the forum knows the answer. But seeing what questions get asked and what answers are available may point out gaps where research needs to be done, and always helps guide the creation of writings like this page.

More Resources

These are online educational materials, providing introductions to Y-DNA genealogy and related subjects at various levels of depth.


  • ancestral (haplogroup). A haplogroup that arose earlier within a lineage. Other terms used are parent haplogroup (sometimes meaning the immediately ancestral known haplogroup) or superclade.
  • ancestral (SNP). Having the SNP letter value that matches the earliest human ancestors. Any later changes are "derived" values. Also called "testing negative" for that SNP, i.e. not having a mutation there.
  • autosomal. Any of the 22 chromosome pairs that aren't the Y or X sex chromosomes. Also describes a gene on an autosomal chromosome.
  • back mutation (SNP). Within a lineage defined by SNPs, a mutation from a derived value back to the ancestral value, making that test result suggest the person isn't in that lineage even though they are.
  • chromosome. A circular or linear length of DNA that is characteristic of an organism and carries genetic information. Humans normally have 23 pairs of linear chromosomes in each cell nucleus, and one circular chromosome in each of a cell's mitochondria.
  • clade. From the field of cladistics, a group that shares a common set of lineage-defining characteristics. Has a parent (ancestral) superclade, and may have child (derived) subclades. When defined by a haplotype, a clade is a haplogroup.
  • cladistics. A technique of classifying biological organisms into groups ("clades"), based on their shared characteristics. Originally a method used within phylogenetics, now mostly just represented by the use of terms like "superclade" and "subclade" to refer to the ancestral or derived phylogenetic haplogroups of a haplogroup. A haplogroup is a type of clade.
  • daughtered out. In patrilineal lines, when a line goes extinct because a man's children are all daughters. In matrilineal lines, when a line goes extinct because a woman's children are all sons.
  • deletion (DNA mutation). The removal of one or a continuous span of several letters from the DNA sequence.
  • derived (SNP). A SNP value that is not the same letter as the one found in the earliest human ancestors, because of a mutation. Also called "testing positive" for that SNP, i.e. having the mutation.
  • divergence time. The time, usually in years before present, when two phylogenetic lineages diverged, which is usually taken as the founding time of the younger lineage. The year when a new lineage's founder was born. When a haplogroup was formed.
  • DNA. Deoxyribonucleic acid. The chemical name for the primary molecule that stores genetic information preserved across generations of Earth life.
  • DNA marker. A DNA sequence variation that's reliably inherited between generations, and therefore can define a lineage. Examples include a single letter change, an insertion or deletion (indel), or an inversion.
  • equivalent SNP. See: phylogenetically equivalent SNP.
  • extra mutation/extra SNP. A mutation or SNP beyond those used to define a particular haplogroup. Includes private SNPs but also those mutations or SNPs not used to define haplogroups because they are too unstable.
  • genetics. The scientific study of heredity, and the mechanisms and molecules that underlie it.
  • haplogroup. A group of related haplotypes, or, casually, the group of people who carry those haplotypes. Because haplotypes are composed of DNA markers, which are inherited, haplogroups are formed as a result of related lineages sharing a common ancestor in which all those markers were first found.
  • haplogroup assignment. A prediction or confirmation that determines to which haplogroup a person belongs, done by assessing their DNA markers.
  • haplogroup project. A group project that seeks to better define the structure of a haplogroup (the structure of the phylogenetic tree), as part of genealogical and anthropological research.
  • haplotype. A linked set of genes or DNA markers inherited together from a single parent. Because the Y-chromosome is a single chain of DNA on which all Y-DNA markers are connected, and is inherited from only one parent (the father), any man's particular pattern of Y-DNA markers is a haplotype.
  • hierarchical name (haplogroup). The assigned name of a haplogroup as given in the tree hierarchy style of LETTERnumberLetterNumber, etc. E.g. R1a1a1b1a3a1a.
  • identical SNP. Within genetic genealogy, another term for synonymous SNP.
  • indel (DNA mutation). A DNA mutation that is either a simple insertion or deletion.
  • insertion (DNA mutation). The insertion of one or more additional letters into the DNA sequence at a single point.
  • ISOGG. The International Society of Genetic Genealogy (website). Maintains a major Y-phylogenetic tree and serves as an authority for hierarchical haplogroup names.
  • inversion (DNA mutation). The excision, inversion, and replacement of a continuous span of letters in a DNA sequence. When inverted, the sequence is reversed and also changed such that each letter becomes its complementary opposite (A>T, T>A, C>G, G>C). For example, if the sequence AGCGTCC gets inverted, then it is replaced by the sequence GGACGCT.
  • longhand name (haplogroup). See: hierarchical name.
  • matrilineal. A hereditary lineage extending through females. Your matrilineal lineage is your mother, your mother's mother, your mother's mother's mother, etc. Sometimes called a "direct maternal" lineage.
  • most recent common ancestor (MRCA). In the genealogical history of two people, the ancestor they have in common that lived most recently. For a haplogroup, the most recently living person with all the DNA markers that define the haplogroup. Note that the older the haplogroup, the more likely it is that the haplogroup's founder (divergence time), who was the first person with that set of DNA markers, lived long before the MRCA, since most Y-lineages eventually go extinct.
  • most distant known ancestor (MDKA). The earliest born person who's known in a specific patrilineal or matrilineal lineage.
  • MDKA. See: most distant known ancestor.
  • minisatellite DNA. A DNA feature in which a sequence 10-60 letters long is repeated, typically 5-50 times in a row. In contrast to microsatellite DNA (aka STRs).
  • microsatellite DNA. Another term for short tandem repeats (STRs), in which there are 2-6 letters per repeat, in contrast to minisatellites.
  • MRCA. See: most recent common ancestor.
  • multiallelic SNP. A SNP for which more than two different variants have been seen in the human population, as opposed to biallelic SNPs which are more commonly used as DNA markers for reasons of simpler analysis.
  • multi-nucleotide polymorphism (DNA mutation) (MNP). A change to the DNA sequence of several adjacent "letters" (nucleotide bases), usually 2-3. In contrast to SNPs, where only a single letter is changed.
  • next generation sequencing (NGS). Any of a set of techniques that allows high-throughput DNA sequencing. Extends the older, slower, and more expensive Sanger sequencing technique.
  • NGS. See: next generation sequencing.
  • patrilineal. A hereditary lineage extending through males. Your patrilineal lineage is your father, your father's father, your father's father's father, etc. Sometimes called a "direct paternal" lineage.
  • phylogeny. In biology, the inferred evolutionary relationship between biological entities (e.g. individuals, species, Y-chromosomes, haplotypes) based upon similarities and differences in their inheritable traits. Groups of related entities are called clades. There are subtle differences between analyses by phylogenetics, cladistics, and phenetics. In common usage these distinctions are mostly ignored, and the term "phylogenetics" is frequently used to conflate all of them.
  • phylogenetically equivalent SNP. Also called equivalent SNP. One of a set of SNPs that define a haplogroup but are never present separately in that haplogroup's individual members tested to date. Therefore it's not possible to determine their order of appearance, which is the detailed phylogenetics sometimes called the "fine structure" of the haplogroup. Among equivalent SNPs, one may be chosen by an authority to represent and define each haplogroup, typically based on it giving reliable results in testing, but different authorities sometimes choose different equivalent SNPs to define the same haplogroup.
  • phylogenetics. A branch of phylogeny that uses genetic information to define clades.
  • private SNP. A mutation that happened later in a lineage than the currently defined haplogroups.
  • recombinant loss of heterozygosity (RecLOH). An unusual type of DNA change that can happen when near-identical duplicates of a DNA sequence exist, such that a long segment from one copy can replace the equivalent segment of the other copy. Several adjacent DNA markers can be changed in this single event, resulting in two identical copies of that segment.
  • recurrent SNP. A SNP that appears in more than one phylogenetic tree branch, and therefore is not by itself diagnostic of any haplogroup. Typically named with an extension (e.g. P7_1 or P7.1 instead of just P7), and sometimes distinguished by specifying the haplogroup branch to provide context (e.g. B-P7.1).
  • Sanger sequencing. An older DNA sequencing technology than NGS, sometimes used to confirm the results of NGS and other DNA sequence assessment technologies.
  • shorthand name (haplogroup). Haplogroup name that incorporates only part of the full hierarchical haplogroup name, plus a characteristic defining SNP for that haplogroup.
  • single nucleotide polymorphism (SNP). A single DNA letter (nucleotide base) location that is in some people different from the ancestral one found in most humans. Also see: phylogenetically equivalent SNP, synonymous SNP, recurrent SNP, multiallelic SNP, and private SNP.
  • SNP. See: single nucleotide polymorphism.
  • SNP test. An assessment of which letter (A, G, C, or T) is present at a SNP site.
  • STR. Short Tandem Repeat. A repeating DNA sequence with 2-6 letters per repeat.
  • STR test. An assessment of the number of repeats found at one or more STR sites.
  • subclade. A group within a haplogroup, representing a more recently appearing haplogroup. Also can be referred to as a "child" or "derived" haplogroup.
  • synonymous SNP. In the context of genetic genealogy, another name for the same exact SNP but named by an independent discoverer, in contrast to a phylogenetically equivalent SNP. In a broader context, the term is used to refer to a coding-region single-base mutation resulting in a different codon that translates to the same amino acid, in contrast to the term identical SNP that refers to another name for the same exact mutation.
  • terminal haplogroup. The identified haplogroup of a person's patrilineal lineage with the youngest estimated age, i.e. that was formed most recently. Note that further testing and research may extend the Y-phylogenetic tree or the person's assignment in the tree, in which case an even more recently formed Y-haplogroup will become that person's new "terminal" haplogroup.
  • terminal SNP. The SNP that defines the most recently formed haplogroup that a person has been assigned to (their terminal haplogroup).
  • TMRCA. Time to Most Recent Common Ancestor. See: most recent common ancestor.
  • UEP. See: unique event polymorphism.
  • unique event polymorphism (UEP). A DNA marker that only has occurred once since humans appeared, and therefore reliably identifies a lineage. For example, a SNP that is never found outside the phylogenetic lineage in which it was first discovered (i.e. is not a recurrent SNP). A term that's somewhat falling out of favor because it presents an ideal, and in practice any DNA marker can only be considered provisionally unique to a lineage.
  • Y-chromosome. One of the two sex chromosomes (the other is the X-chromosome) that together make up one of the 23 pairs of chromosomes humans normally carry. Normally found only in males.
  • Y-chromosomal Adam. The most recent common patrilineal ancestor of all living men. The male to whom all living men's Y-chromosomes can be traced.
  • Y-haplogroup. A haplogroup defined with a set of DNA markers on the Y-chromosome. The patrilineal haplogroup, as contrasted with the matrilineal mitochondrial DNA haplogroup (mtDNA-haplogroup, or mt-haplogroup).
  • Y-DNA. The DNA sequence of the Y-chromosome.
  • Y-STR. An STR on the Y-chromosome.
  • ybp. Years Before Present. Years ago.
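
The "inversion (DNA mutation)" entry above describes reversing a span of letters and swapping each letter for its complement. A minimal sketch of that operation, using the entry's own example, follows. The function names `invert` and `invert_span`, and the longer test sequence, are made up for illustration.

```python
# Complementary base pairs, as listed in the glossary entry.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def invert(sequence):
    """Reverse a DNA sequence and replace each letter with its complement."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def invert_span(sequence, start, end):
    """Invert only the letters from index start up to (not including) end."""
    return sequence[:start] + invert(sequence[start:end]) + sequence[end:]


print(invert("AGCGTCC"))                # GGACGCT
print(invert_span("TTAGCGTCCAA", 2, 9))  # TTGGACGCTAA
```

Applying `invert` twice returns the original sequence, which is one way to check that the complement table and the reversal are both applied correctly.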