Monday, February 1, 2021

An Analysis of the Ackley Y-DNA Haplotype

This post is maybe a little geekier than normal, but hopefully those readers who have been following the Y-DNA articles will find this of interest.

Surname projects on Family Tree DNA exist in large part to help members answer genealogical questions about their surname. The main tools available are Y-DNA tests such as STR tests (Y-12, Y-25, Y-37, Y-67, and Y-111) and SNP tests (Big Y-700). One challenge for surname project administrators is to recommend testing options for members that will help answer their questions at the least cost. In general, the more STR markers tested, the better the information gained, but the cost for testing goes up as the number of markers increases. Some projects have identified markers and values that seem to uniquely define the haplotype for their surname and thus can recommend testing levels that have a high probability of identifying members who are likely to be related to others who share their surname.

The goal of this analysis is to determine if there are any Unique STR Patterns (USPs) [1] that help to identify Ackley men. The approach will be to compare the ancestral values for Ackley men to ancestral values for upstream haplogroups, specifically R-M269, R-L21, and R-S1051, to determine if any of the values are markedly more prevalent in Ackley men than in those larger groups.

Y-DNA STR Testing and Convergence

Before taking an in-depth look at individual markers, a short discussion about the various levels of testing is in order. As mentioned above, STR testing at 12, 25, 37, 67, and 111 markers has been available at various times from Family Tree DNA. Currently only 37, 67, and 111 marker tests can be purchased. Although 12 and 25 marker tests are no longer available, there were many men who tested at those levels in the past, so tools exist to look at matches for all levels. The reason for discontinuing the lower marker tests is that in many cases they just don’t provide enough information to draw useful genealogical conclusions. For example, there is a non-Ackley man in the Ackley Surname Project who has nearly 4,000 Y-12 matches. Not one of those matches has his surname, there are nearly 3,000 different surnames among his matches, and the most frequently occurring surname occurs only 52 times. Moving up to 25 markers, this man has almost 1,600 matches with almost 1,000 surnames, none of which are his surname. This information is decidedly unhelpful in answering genealogical questions.

One of the reasons for this issue is a phenomenon called convergence, which is the idea that “over long periods of time a series of markers can through random change adopt patterns that make them look more closely related than they actually are.” [2] Fortunately for members of the Ackley Surname Project, this does not appear to be an issue. Of the 15 men in the project, only one has more than three non-Ackley matches at the 12-marker level. The one man who does not fit this scenario has 55 non-Ackley matches, and he differs from the other 14 men in that he is the only one in the project who has a value of 30 for marker DYS389ii, while all other men in the project have a value of 31. At 25 markers, all of the 55 “excess” matches drop off of his match list, and for the other 14 men one of the three “excess” matches drops off. The two remaining “excess” matches show up on some of the match lists all the way up to and including 111 markers, suggesting that these two men might have an NPE somewhere in their ancestry and are in fact closely related to the Ackley men. Otherwise, all of the matches for every member at 37, 67, and 111 markers have the Ackley surname. This situation seems to indicate that the Ackley haplotype is unique enough even at 12 markers to be useful in answering Ackley genealogical questions, and a Y-37 test is likely to be sufficient to determine if an individual is related to Ackley men in the project.

The Ackley Haplotype

The following are the STR values for the Ackley men who have taken Y-37 (5 men), Y-67 (2 men), or Y-111 (8 men) DNA tests. For ease of presentation, the values are split up by panel. Given the number of men who have tested at each marker level, the numbers of samples for each panel are:

Panel 1 – 15

Panel 2 – 15

Panel 3 – 15

Panel 4 – 10

Panel 5 – 8
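The panel counts above follow from the test levels: a man contributes to a panel only if his test reaches the last marker in that panel. Here is a minimal Python sketch of that tally. The dictionary names are illustrative choices, not anything used by the project.

```python
# Number of Ackley men by highest STR test taken (from the text above).
testers_by_level = {37: 5, 67: 2, 111: 8}

# Last marker covered by each panel.
panel_last_marker = {
    "Panel 1": 12,
    "Panel 2": 25,
    "Panel 3": 37,
    "Panel 4": 67,
    "Panel 5": 111,
}

# A tester counts toward a panel when his test level reaches its last marker.
samples_per_panel = {
    panel: sum(n for level, n in testers_by_level.items() if level >= last)
    for panel, last in panel_last_marker.items()
}

for panel, n in samples_per_panel.items():
    print(f"{panel}: {n}")
```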

  

Panel 1 - Markers 1-12

 
Panel 2 - Markers 13-25

Panel 3 - Markers 26-37

 

Panel 4 - Markers 38-52

Panel 4 (cont) - Markers 53-67

 

Panel 5 - Markers 68-82

 
Panel 5 (cont) - Markers 83-97

 

Panel 5 (cont) - Markers 98-111

 

The mode of a set of data is the most frequently occurring value. For STR markers it is assumed that the mode is the “original” value, i.e., the value of that marker for the common ancestor of the testers in a particular group. Modal values are also referred to as ancestral values. The values do not mean much by themselves but take on meaning when compared to other haplotypes for analysis purposes.
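As a minimal sketch of how a modal value can be found, the Python snippet below counts the values reported for one marker and returns the most frequent one. The function name is an illustrative assumption. The sample list reflects the DYS389ii situation described earlier (14 men at 31, one at 30). Note that when two values tie, this sketch simply returns the one it encountered first, so a tie would need a closer look by hand.

```python
from collections import Counter

def modal_value(values):
    """Return the most frequently occurring value (the mode) and its count."""
    counts = Counter(values)
    value, count = counts.most_common(1)[0]
    return value, count

# DYS389ii in the Ackley project: 14 men at 31, one man at 30.
dys389ii = [31] * 14 + [30]
print(modal_value(dys389ii))  # (31, 14)
```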

As mentioned above, Ackley data will be compared to data from upstream haplogroups. R-M269 data (also known as R1b) was captured from the “R_R1b ALL Subclades” haplogroup project at Family Tree DNA (FTDNA) [3], R-L21 data was captured from the “R L21 and Subclades” haplogroup project at FTDNA [4], and R-S1051 data was captured from the “R-S1051” haplogroup project at FTDNA [5]. The relationships of these haplogroups are shown in the table below. Note that R-M269 is the oldest haplogroup, having formed about 13,000 years ago. R-L21, a subclade of R-M269, formed about 4,500 years ago, while R-S1051 formed about 3,900 years ago. Six of the 15 Ackley men who have done STR tests have also done SNP tests that place them in the last two haplogroups in the table, R-FGC52286 and R-FGC52300.

 

SNP Formation Dates [6]

As with the Ackley surname project, the men who have joined each of the haplogroup projects mentioned above have tested at different marker levels, from Y-12 through Y-111. As a result, the number of data points available for each marker differs depending on how many men in the project tested at each level. There were 25,631 members of the R1b project, 9,213 members of the R-L21 project, and 264 members of the R-S1051 project. It should be pointed out that there is some overlap between the projects: since there is a hierarchical relationship between the haplogroups, there are men who joined more than one project. This was accounted for in the analysis, and only unique data points were used in the calculations that will be presented later.
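One way to keep only unique data points is to merge the project rosters on an identifier that is the same wherever a man appears. The sketch below assumes a kit number serves as that identifier. That is an assumption for illustration, and the kit numbers and values shown are made up.

```python
def unique_records(*rosters):
    """Merge project rosters, keeping one record per kit number."""
    merged = {}
    for roster in rosters:
        for kit, markers in roster.items():
            # Keep the first record seen for a kit; later duplicates are skipped.
            merged.setdefault(kit, markers)
    return merged

# Hypothetical rosters: kit K2 joined two projects, kit K3 joined two projects.
r1b = {"K1": {"DYS393": 13}, "K2": {"DYS393": 13}}
l21 = {"K2": {"DYS393": 13}, "K3": {"DYS393": 12}}
s1051 = {"K3": {"DYS393": 12}}

combined = unique_records(r1b, l21, s1051)
print(len(combined))  # 3
```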

A Closer Look At Some of the Markers

The table below gives examples of the data collected for each of the markers 1-111. The main part of the table gives the frequency of occurrence for the values for each marker for each haplogroup project. The ancestral value (mode) for each marker for each project is highlighted in green. The bottom part of each table gives the min, max, and mode for each marker for the Ackley men who have been tested for that marker. The numbers in the mode row give the difference between the Ackley ancestral value and the haplogroup ancestral values; 0 means they match, a positive number means the Ackley ancestral value is above the haplogroup ancestral value, and a negative number means the Ackley ancestral value is below the haplogroup value.


 

In these examples, the ancestral values for the haplogroup projects are well-defined, i.e., they are clearly the most frequently occurring value for that marker, and other possible values are much less frequent. This is not always the case, and some specific examples will be discussed later. For the first marker, DYS393, MIN=MAX=MODE, indicating that all 15 Ackley project members had the value 13. The Ackley ancestral value of 13 matches each of the haplogroup project ancestral values. This situation holds true for 81 of the 111 markers; these markers will not be analyzed further as they would not contribute to the stated goal of identifying USPs for Ackley men.

Marker DYS390 is an example where the Ackley ancestral value of 25 is 1 greater than each of the haplogroup ancestral values. This situation, where the Ackley ancestral value differs from the haplogroup ancestral values (which are equal to each other), is considered a good candidate for further analysis. Of the 30 markers where the Ackley ancestral value differed from the haplogroup ancestral values in some way, 17 fit this criterion and will be discussed further below.
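The numbers in the mode row are simple subtractions, sketched below in Python for the two markers just discussed. The function name is an illustrative assumption.

```python
def mode_differences(ackley_mode, haplogroup_modes):
    """Ackley ancestral value minus each haplogroup ancestral value."""
    return {name: ackley_mode - mode for name, mode in haplogroup_modes.items()}

# DYS393: the Ackley value of 13 matches all three haplogroup values.
print(mode_differences(13, {"R-M269": 13, "R-L21": 13, "R-S1051": 13}))
# {'R-M269': 0, 'R-L21': 0, 'R-S1051': 0}

# DYS390: the Ackley value of 25 is one above the haplogroup value of 24.
print(mode_differences(25, {"R-M269": 24, "R-L21": 24, "R-S1051": 24}))
# {'R-M269': 1, 'R-L21': 1, 'R-S1051': 1}
```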

The table below shows examples of markers where the Ackley ancestral values differ from the ancestral values for R1b and R-L21 (the more distant haplogroups) but are equal to the ancestral value for R-S1051. This pattern suggests that the mutation in this marker could be a USP for the R-S1051 haplogroup rather than the Ackley haplotype. As such, these markers will not be researched further.


There are four markers in the table below with a similar pattern, but the Ackley ancestral values differ only with the R1b ancestral value and are equal to the ancestral values for both R-L21 and R-S1051. This would indicate that the mutation in the marker could be a USP for R-L21, and the R-S1051 group and Ackley both stayed at that value. Likewise, these markers will not be investigated further.


The markers in the next example are interesting in that they show a somewhat unusual mutation pattern. In both cases, the Ackley ancestral value matches the ancestral value for the oldest haplogroup, R1b, and in the case of DYS712 also matches the ancestral value for R-L21 but does not match the more recent haplogroup R-S1051. This could be the result of a back mutation, where the Ackley men mutated from the more recent R-S1051 value back to the original R1b value. It is also worth noting that for CDYa, there is not a clear modal value; the frequencies for the values of 36 and 37 are very close in all three haplogroups. Even the Ackley men have some members with a value of 36 and some with a value of 37. For DYS712 the mode is a little clearer, but in this case the Ackley men have several values: two men at 19, five at 20, and one at 21. For this reason, these markers will not be analyzed further.


 

The final two markers that have not yet been discussed are similar to the above example in that the Ackley ancestral values are closer to the more ancient haplogroups (R1b and R-L21) than they are to the more recent R-S1051. In both cases, the Ackley values are +1 compared to the values for R1b and R-L21 and are +2 compared to R-S1051, and the +2 occurred because the R-S1051 marker mutated in the opposite direction from the Ackley marker. These markers would not lend themselves to the type of analysis discussed above and thus will not be included in further analysis.
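The sorting of markers described in this section can be sketched as a small Python function. The category labels and the function name are illustrative assumptions. The DYS393 and DYS390 values come from the discussion above, and the remaining values are made up to show each pattern.

```python
def classify_marker(ackley, r1b, l21, s1051):
    """Sort a marker into one of the patterns discussed in this section."""
    if ackley == r1b == l21 == s1051:
        return "matches all haplogroups"
    if r1b == l21 == s1051:
        # All three haplogroups agree and Ackley differs: candidate for study.
        return "candidate"
    if ackley == s1051 and ackley != r1b and ackley != l21:
        return "possible R-S1051 pattern"
    if ackley == l21 == s1051:
        # Ackley differs only from R1b.
        return "possible R-L21 pattern"
    return "other (for example, a possible back mutation)"

print(classify_marker(13, 13, 13, 13))  # DYS393: matches all haplogroups
print(classify_marker(25, 24, 24, 24))  # DYS390: candidate
print(classify_marker(10, 11, 11, 10))  # hypothetical: possible R-S1051 pattern
print(classify_marker(10, 11, 10, 10))  # hypothetical: possible R-L21 pattern
print(classify_marker(20, 20, 20, 19))  # hypothetical: other
```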

Analysis

To summarize what has been discussed so far: the Ackley ancestral values are equal to the ancestral values for all three upstream haplogroups on 81 of the 111 Y-STR markers, meaning these 81 markers would not provide any insights into the Ackley haplotype. As discussed above, for various reasons 13 of the remaining 30 markers would also not be good candidates for further study. That leaves the 17 markers in the table below that will be the subject of the remainder of this study.



To reiterate, for all 17 markers in this table, R1b value = R-L21 value = R-S1051 value ≠ Ackley value, so the Ackley value likely represents a true variation from the value found in the rest of the upstream haplogroups. Note that the “Total Records” column gives the number of unique records having values for that particular marker; any tester who was a member of multiple projects was only counted once. The figures in the “No. at Ackley Value” column give the numbers of testers for which the value of each marker is equal to the Ackley ancestral value. Likewise, the figures in the “No. at Ancestral Value” column give the number of records for which the value of each marker is equal to the more ancient haplogroup values (which are equal to one another).
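The three columns just described could be tallied along the following lines. This is a sketch only: it assumes each unique tester is stored as a dictionary of marker values, and the four records shown are made up.

```python
def marker_summary(records, marker, ackley_value, ancestral_value):
    """Count records with a value for the marker, and how many sit at each value."""
    values = [rec[marker] for rec in records if marker in rec]
    return {
        "total_records": len(values),
        "at_ackley_value": sum(v == ackley_value for v in values),
        "at_ancestral_value": sum(v == ancestral_value for v in values),
    }

# Hypothetical unique records; the third man tested below 67 markers,
# so he has no DYS617 value and is not counted for that marker.
records = [
    {"DYS390": 24, "DYS617": 12},
    {"DYS390": 25, "DYS617": 11},
    {"DYS390": 24},
    {"DYS390": 23, "DYS617": 12},
]

print(marker_summary(records, "DYS617", ackley_value=11, ancestral_value=12))
# {'total_records': 3, 'at_ackley_value': 1, 'at_ancestral_value': 2}
```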

The approach for this analysis is to look for marker values or combinations of marker values that are present in Ackley men but appear to be relatively rare in the larger R1b population. An obvious example of this is marker DYS617. From the table above we can see that the Ackley value for that marker is 11, which is present in only 187 (0.86%) of R1b men, while over 91% of R1b men have a value of 12 for that marker. Another way to look at this is to observe that while Ackley men make up only about 0.05% of the R1b men who have tested to at least 67 markers (10/21,647), they make up over 5% of the men who have a value of 11 for marker DYS617 (10/187). The practical implication of this situation is that in cases where a Y-37 test is not conclusive in determining whether a tester is related to the Ackley men but is “close”, a Y-67 test can be recommended to see if the tester has a value of 11 for DYS617. None of the other markers appear to be unique enough on their own, but in combination with other markers they could provide some insights.
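For readers who want to check the DYS617 arithmetic, here is a short Python sketch using the counts quoted above. The variable names are illustrative.

```python
total_records = 21647   # R1b men with a DYS617 value (tested to at least 67 markers)
at_value_11 = 187       # of those, men with DYS617 = 11
ackley_men = 10         # Ackley men tested to at least 67 markers

share_with_11 = at_value_11 / total_records        # how rare the value 11 is
ackley_share_overall = ackley_men / total_records  # Ackley men among all testers
ackley_share_of_11 = ackley_men / at_value_11      # Ackley men among those at 11

print(f"{share_with_11:.2%}")         # 0.86%
print(f"{ackley_share_overall:.3%}")  # 0.046%
print(f"{ackley_share_of_11:.1%}")    # 5.3%
```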

The table below shows the number of testers whose values were equal to the Ackley values for all possible pairs of the 17 markers being studied. Each cell in the table gives the number of testers who had the Ackley values for both the marker in that row and the marker in that column. For example, 384 testers had a value of 25 for DYS390 (the row) and 15 for DYS19a (the column).
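A pairwise count of this kind can be sketched in Python with `itertools.combinations`. The function name and the four sample records are illustrative assumptions. Only three of the 17 markers are shown, using the Ackley values mentioned in the text.

```python
from itertools import combinations

def pair_counts(records, ackley_values):
    """For each pair of markers, count records matching the Ackley value on both."""
    counts = {}
    for a, b in combinations(ackley_values, 2):
        counts[(a, b)] = sum(
            1
            for rec in records
            # rec.get returns None for an untested marker, which never matches.
            if rec.get(a) == ackley_values[a] and rec.get(b) == ackley_values[b]
        )
    return counts

ackley_values = {"DYS390": 25, "DYS19a": 15, "DYS617": 11}

# Hypothetical records; the second man tested only to 37 markers.
records = [
    {"DYS390": 25, "DYS19a": 15, "DYS617": 11},
    {"DYS390": 25, "DYS19a": 15},
    {"DYS390": 24, "DYS19a": 14, "DYS617": 11},
    {"DYS390": 25, "DYS19a": 14, "DYS617": 12},
]

for pair, n in pair_counts(records, ackley_values).items():
    print(pair, n)
```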


 

Note that these numbers are typically much smaller than the numbers for single markers presented in the previous table, implying that combinations of these marker values are much less common and could provide insights into the Ackley haplotype. Of particular interest are the values for DYS617; the combination of a value of 11 for this marker with the Ackley value for any of the other 16 markers in the study looks to be extremely rare when compared to the R1b population. For example, a value of 25 for DYS390 and 11 for DYS617 occurred in only 24 of the 21,647 men who had values for both of those markers, and 10 of those 24 men were Ackleys. As a percentage, only 0.1% of R1b men had a value of 25 for DYS390 and 11 for DYS617, and about 42% of those were Ackley men. This situation seems to confirm that a value of 11 for DYS617, combined with the Ackley value for any of the other 16 markers, is a good candidate for a USP. Again, the practical implication of this is that an upgrade from Y-37 to Y-67 would be a reasonable suggestion for a tester whose relationship to other Ackley men is “close” but over the FTDNA threshold of 5 for 37 markers (say 6 or 7).

While there is undoubtedly more that could be learned by further study of the Ackley haplotype, there are two main takeaways from what has been learned so far: (1) a Y-37 test is probably sufficient to establish relatedness for most testers, and (2) for testers who are “close”, a Y-67 test, with particular attention to the DYS617 marker, could provide more information to draw a more definitive conclusion.

A note of thanks

Thank you to Dave Vance for being kind enough to review this information before I published it. Dave is an expert in using Y-DNA testing for genealogy -- in fact he wrote a book on it -- and I appreciate his willingness to answer my questions and keep me straight.

Sources


1. Gleeson, Maurice. "How to group Surname Project members", DNA and Family Tree Research, 28 Jan 2021.
2. Vance, David. The Genealogist’s Guide to Y-DNA Testing for Genetic Genealogy. Self-published, 2020, p. 73.
4. Family Tree DNA. "R L21 and Z290 Subclades Project - Y-DNA Classic Chart", accessed online 12 Jan 2021.
5. Family Tree DNA. "R-S1051 Project - Y-DNA Classic Chart", accessed online 12 Jan 2021.
6. Spencer, Rob. "SNP Tracker", a tool on the website Tracking Back: a website for the exploration of genetic genealogy and population genetics, accessed online 29 Jan 2021.

Link of the Day


This is a link to the results page for the Ackley Surname Project at Family Tree DNA:

Quote of the Day


“I have no special talent. I am only passionately curious.”

-- Albert Einstein
