5. Historical note on L555 SNP and BigY Tests


Contents

1 Introduction: SNP tests and haplotrees

2 L555 and the single L555 SNP test

3 The Geno2 test

4 BigY test results and analysis

5 The L555 SNP Pack (or Panel) test

1. Introduction: SNP tests and haplotrees

SNPs and haplotrees are introduced in “Interpreting yDNA test Results” section 3, and findings relevant to this Study are given in “Latest Results Analysis”, especially section 3.1. Recommendations on taking SNP tests are given in “Choosing further tests” section 3. This page gives some additional background to this subject.

2. L555 and the single L555 SNP test

The SNP L555 was first identified in 2011 in the WTY (Walk the Y) test by William Rector Irwin II (65048), which showed it to be one of an increasing number of "descendants" of R-L21. Thanks to him then making a generous General Fund donation we learnt that most if not all members of our Borders branch were likely to test L555+. At the same time an independent study by Robert Casey showed that many of the "sons" of L21 have tested L555-, confirming that L555 was probably another son of L21. Robert suggested that we seek ISOGG qualification for L555. For this we had to show that two criteria were met:

  1. A representative of all of the other recognised "sons" of L21 tested L555-.

  2. Within the L555+ testees a diversity of 15% had to be shown. At 37 markers a Genetic Distance of 6 was required (37 x 0.15 = 5.55, rounded up), and within our L555+ 37-marker testers four pairs already have a GD of 6. At 67 markers a GD of 11 was required (67 x 0.15 = 10.05, rounded up), and Robert has shown 78458 and 74798 to have a GD of 12.
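
The arithmetic behind criterion 2 can be sketched in a few lines. This is only an illustration: it assumes the required Genetic Distance is 15% of the marker count rounded up to the next whole number, which reproduces the figures of 6 and 11 quoted above. The function name is my own and is not part of any ISOGG or FTDNA tool.

```python
import math

DIVERSITY = 0.15  # the 15% diversity criterion quoted above


def required_gd(markers: int, diversity: float = DIVERSITY) -> int:
    """Smallest whole Genetic Distance that meets the diversity criterion.

    Assumption: the threshold is rounded up to the next whole number.
    """
    # round() removes floating-point noise before taking the ceiling
    return math.ceil(round(markers * diversity, 6))


for markers in (37, 67):
    print(markers, "markers -> GD of at least", required_gd(markers))
```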

For details see http://www.rcasey.net/DNA/R_L21/R_L21_Private.html >Analysis > L555. L555 is now recognised by ISOGG, FTDNA and National Genographic.

This development brought the following benefits:

  1. All members of our Borders branch with the surname Irwin or similar and a TiP Score of over 60% could now reasonably assume they are L555+, and other Irwin members of our Study with a TiP Score of less than 20% could assume that they are L555-. It followed that no further Irwin members need order the single L555 test.

  2. All members of our Borders branch who do not have the surname Irwin or a variant spelling (i.e. NPEs), and Irwins with a TiP Score of between 20% and 60%, should take an L555 test. One non-Irwin member with a TiP Score of 72% has been found to be L555- and has had to be reclassified as a False Positive, at least for the time being.
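
The two rules above amount to a small decision procedure, sketched below. It is a hedged illustration only: the function and parameter names are hypothetical, the TiP Score is assumed to be entered as a percentage, and the rules are taken to apply to a member already placed in our Borders branch.

```python
def l555_advice(is_irwin_surname: bool, tip_score: float) -> str:
    """Return the guidance described above for one member.

    is_irwin_surname: True for Irwin or a variant spelling (assumed input).
    tip_score: TiP Score as a percentage, 0 to 100 (assumed input).
    """
    if not is_irwin_surname:
        # Non-Irwin matches (possible NPEs) should always be tested
        return "take the L555 test"
    if tip_score > 60:
        return "assume L555+"
    if tip_score < 20:
        return "assume L555-"
    # Irwins scoring between 20% and 60% are too uncertain to assume
    return "take the L555 test"


# The non-Irwin member with a 72% TiP Score mentioned above
print(l555_advice(False, 72))
```

The last line shows why the surname check comes first: a high TiP Score alone did not prevent that member from testing L555-.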

At present no one outside our Study has tested L555+. Whether this situation will endure is unclear. Guidance by Robert Casey and a donation by John Hamblen helped to identify the boundary values of this SNP, in other words to confirm whether some of the non-Irwin close matches with other surnames are L555+ or L555-, for example FTDNA testees with surnames such as Dunbar or Blackburn. However, to date, alas, no progress has been made on this feature.

3. The Geno2 test

The Geno1 test was launched by National Geographic in 2005. Testees could transfer their ySTR test results to FTDNA's database, and 17 members of our Study took advantage of this facility, identifiable by test kit nos. prefixed by "N".

The Geno2 test was launched by National Geographic's Genographic Project in late 2012. It costs $199.95 (+$20 shipping for Europe). It is much more powerful than Geno1 and the former Deep Clade and WTY tests of FTDNA, all of which it superseded, but it is aimed primarily at those interested in Deep Ancestry, and so has no ySTR data. It tests for about 146,000 SNPs, but only (!!) about 12,000 of these are y-SNPs. Of these 6,153 have been placed on a new phylogenetic tree but the remainder are "novel", although not necessarily "young". Geno2 has been taken by several members of our Study and includes L555, but it does not discover SNPs downstream of L555. As I don’t think it represents value for money, I cannot recommend it.

4. BigY test results and analysis

FTDNA launched this test in September 2013. It now costs $449, and less during sales or if the tester has already had an STR test. In mid-2015 BigY covered 7,000 SNPs; by mid-2019 it covered over 160,000 SNPs, including many "novel" SNPs younger than L555. BigY had a "rocky" start, for several reasons, but has now matured into an increasingly popular test, and I can strongly recommend it for those members who are in our Borders branch and can afford its now much more reasonable price.

Although initially I did not recommend that our members order these tests, results relevant to our Study started to become available during 2014. It was quickly apparent that the tools provided by FTDNA for their analysis (their haplotree, Matches, .csv files and .vcf files) were of little use to the layman, and that the BAM files, from which all this secondary data was derived, were very cumbersome to handle and, even when opened, not easy to comprehend. Nor were the analyses of BigY results by FGC and YFull for c.$50 much better, and I was not impressed by their probabilistic STR data.

Intelligible and informative analyses were available from Alex Williamson’s BigTree, albeit only for SNPs downstream of P312 (i.e. including L555), but Alex was threatening to withdraw his service. I therefore felt I ought to try to learn how to handle BigY BAM data, and with help from Dennis Wright and support from Alex and Mike Walsh I eventually learned the procedures and by late 2014 was able to produce a haplotree containing our Study’s L555 BigY results.

Concurrent with this learning process it became apparent that the expense of BigY tests was such that few members could afford such tests, while the cheaper bespoke Pack tests that FTDNA were planning would be of little use for our Study if we had not first identified the SNPs that characterised our Border Irwin sub-groups. So early in 2015 the main outstanding Border Irwin sub-groups were identified and a further six relevant BigY tests were funded by generous donations from other members of the Study.

With 12 L555 BigY test results available by mid-2015 several features were becoming clear:

  • I was able to replicate Alex Williamson’s Big Tree.

  • Our 12 L555 results formed one of the bigger surname blocks in Alex’s tree.

  • The L555 SNP still seemed unique to Irwins (including variant spellings, plus some NPEs).

  • With the exception of a single intermediate variant (21368012-G-A) the L555 sub-clade has a very horizontal structure, aka a “starburst”, with the immediate “sons” of L555 being near-contemporaries. This mirrors the structure emerging from our STR results, in which the Border Irvings comprise several “old” sub-groups, and it agrees with the picture emerging from conventional genealogy of the Irvings of Dumfries, Eskdale, Gretna, Hoddam, Luce and Pennersax being contemporaries of the Irvings of Bonshaw. One day, hopefully, we’ll be able to match more of these genealogical families with our genetic sub-groups, as identified by STRs and defined by BigY data.

  • TMRCAs (Time to Most Recent Common Ancestor). It is possible to estimate the age of SNPs, though care has to be taken over how they are counted; for example the same rate should not be used for SNPs identified by BigY and by FGC tests, nor, by implication, both for those identified under the “loose” quality criteria probably inherent in BigY .vcf files and for those retained after more stringent assessment of .bam files. However I have largely followed the quality criteria adopted by Dennis Wright, to whom I am immensely indebted, and he uses 130 years per SNP. Applying this rate to the L555 phylogenetic tree as above gives

- the age of the Border Irwin “starburst” as c.AD1330;

- the duration of the “bottleneck” that includes L555 as 3,000 years, a remarkably long time;

- the age of L21 as 4050bp (before present).

  • Given the nature of the calculations these ages must be regarded as very approximate, but the 4050bp date is within what I understand to be the 4,000-5,500bp bracket popularly associated with the age of L21, and it is also noteworthy that the c.AD1300 date is remarkably consistent with calculations from STR data (see Supplementary Paper 3), and with genealogical research being conducted by Kent Irvin. I believe it entirely coincidental that this date is also compatible with the Dr Christopher Irvin/Robert the Bruce traditions, but readers may of course dissent!

  • The mix in the Border Irwin phylogenetic tree of multifurcation (aka starbursts) (at L251 and post L555), large blocks aka boxes aka bottlenecks (including L555 itself, and the “private” boxes of individual testees), and the more expectable simple bifurcations between Z251 and L555 (shown in the Clan Irwin Phylogenetic tree) is typical of many other sub-clades. The significance of these features is unclear; some suggest starbursts can be associated with population explosions and bottlenecks with plagues or economic hard times, but factors such as paucity of data and phylogenetics are probably also relevant.

  • SNPs derived by NGS tests are probabilistic, and there is little consensus on what constitutes a phylogenetically significant SNP, or on whether a marginal SNP should be included on a haplotree or in a TMRCA calculation. For example the ISOGG haplotree criteria are more stringent than the FTDNA criteria, which in turn are more stringent than the YFull criteria. Alarming as this may seem, in practice it is not of great consequence except when calculating TMRCAs.

  • Thanks to tuition by Dennis Wright I started doing BAM analyses of Irwin BigY results in November 2014.

  • Between May 2014 and December 2015 I issued five detailed Bulletins for our BigY testees. I can e-mail a copy of the last on request. But now that the FTDNA haplotree is more comprehensive these Bulletins are no longer critical. See also Supplementary Paper 9.

  • In early 2016 FTDNA radically improved their haplotree and the need for BAM analyses became much less critical, and I gave up doing them in April 2017. I did however still differ from FTDNA on details of a few of the most downstream SNPs.

  • Meanwhile by November 2015 it was possible to use our L555 BigY findings to propose to FTDNA the SNPs that should be included in a new L555 SNP Pack test (see below).

  • In September 2018 FTDNA published their haplotree and this is being updated and refined continuously, so the need for BAM analyses has lapsed.
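
The TMRCA arithmetic described in the list above reduces to multiplying a count of SNPs by a rate in years per SNP. The sketch below is illustrative only: it uses the 130 years per SNP attributed above to Dennis Wright, the function names are my own, and the example count of 5 SNPs is hypothetical rather than taken from the L555 tree.

```python
YEARS_PER_SNP = 130  # rate quoted above, following Dennis Wright


def years_for_snps(snp_count: int, years_per_snp: int = YEARS_PER_SNP) -> int:
    """Estimated time span represented by a run of SNPs."""
    return snp_count * years_per_snp


def snps_for_years(years: float, years_per_snp: int = YEARS_PER_SNP) -> float:
    """Inverse calculation: SNPs implied by a time span at the same rate."""
    return years / years_per_snp


# Hypothetical branch carrying 5 SNPs (illustrative count only)
print(years_for_snps(5))  # 650

# The 3,000-year bottleneck quoted above, expressed in SNPs at this rate
print(round(snps_for_years(3000)))  # 23
```

The second figure is simple arithmetic from the stated duration and rate, not a count read from the haplotree, and a different rate would be needed for SNPs selected under different quality criteria.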

5. The L555 SNP Pack (or Panel) test

FTDNA launched their SNP Pack tests in 2015. They combine about 100 SNPs that have been identified as being relevant to a selected part of the haplotree on the basis of BigY findings. Each SNP included is given a simple binary, i.e. yes/no, test.

The L555 Pack was designed from the findings of 12 L555 BigY tests. It includes 86 SNPs that have been identified as being contemporary with or downstream of L555. At $119 this works out at about $1.40 per SNP. The Pack test is thus much cheaper than a BigY test and, per SNP, also much cheaper than a single SNP test. Pack tests cannot discover “new” SNPs or identify the private SNPs unique to each tester, but an unexpected bonus is that they can, from negative results, reveal the presence of hitherto unknown SNPs and show where on the haplotree these occur. We have also now learnt that the 86 SNPs it tests for do not provide a comprehensive picture of the L555 haplotree. For this reason I now recommend the BigY700 test for those who can afford its price.
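
The per-SNP cost quoted above can be checked with a short calculation. This is a minimal sketch using only the price and SNP count given in the text; the function name is my own.

```python
PACK_PRICE_USD = 119  # L555 Pack price quoted above
PACK_SNPS = 86        # SNPs included in the L555 Pack


def cost_per_snp(price_usd: float, snp_count: int) -> float:
    """Average price paid for each SNP tested."""
    return price_usd / snp_count


print(f"${cost_per_snp(PACK_PRICE_USD, PACK_SNPS):.2f} per SNP")  # $1.38 per SNP
```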

I issued analyses of our L555 Pack test results in May and September 2016 for our BigY and L555 Pack testees, but apart from demonstrating how the SNPs within the Pack were selected and how the test results make up the L555 haplotree, their content is now included on this website.


The following websites give additional background information: