CONTENTS
1 Introduction
2 L555 and the single L555 SNP test
3 The Geno2 test
4 BigY results and analysis
5 L555 SNP Panel Test
1. Introduction:
SNP tests and haplotrees
SNPs and haplotrees have been introduced at “Interpreting yDNA test Results” section 3, and findings relevant to this Study at “Latest Results Analysis”, especially section 3.1. Recommendations on taking SNP tests are given in “Choosing further tests” section 3. This page gives some additional background to this subject.
2. L555 and the single L555 SNP test
The SNP L555 was first identified in 2011 in the WTY (Walk the Y) test of William Rector Irwin II (65048), which showed it to be one of an increasing number of "descendants" of R-L21.
Thanks to his then making a generous General Fund donation, we learnt that most if not all members of our Borders genetic family were likely to test L555+. At the same time an independent study by Robert Casey showed that many of the "sons" of L21 had tested L555-, confirming that L555 was probably another son of L21. Robert suggested that we seek ISOGG qualification for L555. For this we had to show that two criteria were met:
- A representative of each of the other recognised "sons" of L21 had tested L555-.
- Within the L555+ testees a diversity of 15% had to be shown. At 37 markers a Genetic Distance (GD) of 6 was required (37 x 0.15 = 5.55, rounded up), and within our L555+ 37-marker participants four pairs already have a GD of 6. At 67 markers a GD of 11 was required (67 x 0.15 = 10.05, rounded up), and Robert has shown 78458 and 74798 to have a GD of 12.
For details see http://www.rcasey.net/DNA/R_L21/R_L21_Private.html > Analysis > L555. L555 is now recognised by ISOGG, FTDNA and National Geographic's Genographic Project.
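The diversity arithmetic in the second criterion can be checked with a short sketch. It assumes, as the figures in the text imply, that the required Genetic Distance is 15% of the marker count rounded up to the next whole number; the function names are illustrative only and are not part of any ISOGG tool.

```python
def required_gd(markers: int, diversity_percent: int = 15) -> int:
    """Smallest whole-number Genetic Distance meeting the diversity threshold.

    Assumption: the threshold is rounded UP, as implied by the text
    (6 at 37 markers, 11 at 67 markers).
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return -(-markers * diversity_percent // 100)  # ceiling division


def meets_diversity(observed_gd: int, markers: int) -> bool:
    """True if an observed GD between two testees satisfies the threshold."""
    return observed_gd >= required_gd(markers)


print(required_gd(37))          # threshold at 37 markers
print(required_gd(67))          # threshold at 67 markers
print(meets_diversity(12, 67))  # the GD of 12 reported for 78458 and 74798
```

On these assumptions the GD of 12 at 67 markers clears the threshold with one step to spare.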
This development brought the following benefits:
- All participants in our Borders genetic family with the surname Irwin or similar and a TiP Score of over 60% could now reasonably assume they are L555+, and other Irwin participants in our Study with a TiP Score of less than 20% could assume that they are L555-. It followed that no further Irwin participants in these two groups need order the single L555 test.
- All participants in our Borders genetic family who do not have the surname Irwin or a variant spelling (i.e. NPEs), and Irwins with a TiP Score of between 20% and 60%, should take an L555 test. One non-Irwin participant with a TiP Score of 72% has been found to be L555- and has had to be reclassified as a False Positive, at least for the time being.
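The guidance in these two points amounts to a simple decision rule, sketched below. The function name and return strings are illustrative assumptions, and the rule is only meant for participants already matched to the Study; it treats scores of exactly 20% or 60% as falling in the "should test" band.

```python
def l555_advice(has_irwin_surname: bool, tip_score_percent: float) -> str:
    """Return the Study's L555 guidance for one participant.

    has_irwin_surname: True for Irwin or a variant spelling.
    tip_score_percent: the participant's TiP Score, 0-100.
    """
    if not has_irwin_surname:
        # Non-Irwin close matches (possible NPEs) should always test,
        # however high their TiP Score.
        return "take L555 test"
    if tip_score_percent > 60:
        return "assume L555+"
    if tip_score_percent < 20:
        return "assume L555-"
    return "take L555 test"


print(l555_advice(True, 85))   # Irwin, high score
print(l555_advice(True, 40))   # Irwin, intermediate score
print(l555_advice(False, 72))  # non-Irwin, like the False Positive case above
```

The non-Irwin branch is placed first because the 72% False Positive shows that a high TiP Score alone is not enough without the surname.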
At present no one outside our Study has tested L555+. Whether this situation will endure is unclear. Guidance by Robert Casey and a donation by John Hamblen helped to identify the boundary values of this SNP, in other words to confirm whether some of the non-Irwin close matches, for example FTDNA testees with surnames such as Dunbar or Blackburn, are L555+ or L555-. However, to date, alas, no progress has been made on this feature.
3. The Geno2 test
The Geno1
test was launched by National Geographic in 2005. Testees could transfer
their ySTR test results to FTDNA's database, and 17 participants of
our Study took advantage of this facility, identifiable by test kit nos.
prefixed by "N".
The Geno2 test was launched by National Geographic's Genographic Project in late 2012. It costs $199.95 (+$20 shipping for Europe). It is much more powerful than Geno1 and the former Deep Clade and WTY tests of FTDNA, all of which it superseded, but it is aimed primarily at those interested in Deep Ancestry, and so has no ySTR data. It tests for about 146,000 SNPs, but only (!!) about 12,000 of these are y-SNPs. Of these, 6,153 have been placed on a new phylogenetic tree, but the remainder are "novel", although not necessarily "young". Geno2 has been taken by several participants in our Study, and includes L555, but it does not discover SNPs downstream of L555. As I don’t think it represents value for money, I cannot recommend it.
4. BigY test results and analysis
FTDNA launched this test in September 2013. It now costs $449, and less during sales. It covers over 30,000 SNPs, including many "novel" SNPs younger than L555. Although initially I did not recommend that our participants order these tests, results relevant to our Study started to become available during 2014. It was quickly apparent that the tools provided by FTDNA for their analysis (their haplotree, Matches, .csv files and .vcf files) were of little use to the layman, and the BAM files, from which all this secondary data was derived, were very cumbersome to handle and, even when opened, not easy to comprehend. Nor were the analyses of BigY results by FGC and YFull for c.$50 much better, and I was not impressed by their probabilistic STR data.
Intelligible and informative analyses were available from Alex Williamson’s Big Tree, albeit only for SNPs downstream of P312 (i.e. including L555), but Alex was threatening to withdraw his service. I
therefore felt I ought to try to learn how to handle BigY BAM data, and with
help from Dennis Wright and support from Alex and Mike Walsh I eventually
learned the procedures and by late 2015 was able to produce a haplotree containing
our Study’s L555 BigY results.
Concurrent with this learning process it became apparent that the expense of BigY tests was such that few participants could afford them, while the cheaper bespoke Pack tests that FTDNA were planning would be of little use for our Study if we had not first identified the SNPs that characterised our Border Irwin sub-groups. So early in 2015 the main outstanding Border Irwin sub-groups were identified and a further six relevant BigY tests were funded by generous donations from other participants in the Study.
With 12 L555 BigY test results available by mid 2015, several features were becoming clear:
a) I was able to replicate Alex Williamson’s Big Tree.
b) Our 12 L555 results formed one of the bigger surname blocks in Alex’s tree.
c) The L555 SNP still seemed unique to Irwins (including variant spellings, plus some NPEs).
d) With the exception of a single intermediate variant (21368012-G-A) the L555 sub-clade has a very horizontal structure, aka a “starburst”, with the “descendants” of L555 being near contemporaries. This mirrors the structure emerging from our STR results, in which the Border Irvings comprise several “old” sub-groups, and the picture emerging from conventional genealogy, in which the Irvings of Dumfries, Eskdale, Gretna, Hoddam, Luce and Pennersax were contemporaries of the Irvings of Bonshaw. One day, hopefully, we’ll be able to match more of these genealogical families with our genetic sub-groups, as identified by STRs and defined by BigY data.
e) TMRCAs (Time to Most Recent Common Ancestor). It is possible to estimate the age of SNPs, though care has to be taken over how they are counted; for example, the same rate should not be used for SNPs identified by BigY and by FGC tests, nor, by implication, for SNPs accepted under the “loose” quality criteria probably inherent in BigY .vcf files and for those retained after more stringent assessment of .bam files. However, I have largely followed the quality criteria adopted by Dennis Wright, to whom I am immensely indebted, and he uses 130 years per SNP. Applying this rate to the L555 phylogenetic tree as above gives
- the age of the Border Irwin “starburst” as c.AD1330;
- the duration of the “bottleneck” that includes L555 as 3,000 years, a remarkably long time;
- the age of L21 as 4050bp (before present).
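The conversion behind these figures is a single multiplication, sketched below under the 130 years per SNP rate quoted above. The SNP count of 5 is a purely hypothetical input, and the SNP count printed for the bottleneck is only what the arithmetic implies, not a count taken from the Study's tree.

```python
YEARS_PER_SNP = 130  # rate attributed in the text to Dennis Wright


def snps_to_years(snp_count: int, years_per_snp: int = YEARS_PER_SNP) -> int:
    """Estimated elapsed time for a run of SNPs on one line of descent."""
    return snp_count * years_per_snp


def years_to_snps(years: float, years_per_snp: int = YEARS_PER_SNP) -> float:
    """Number of SNPs that a given time span implies at the same rate."""
    return years / years_per_snp


# Hypothetical example: a branch carrying 5 qualifying SNPs.
print(snps_to_years(5))

# The 3,000-year bottleneck corresponds to roughly this many SNPs.
print(round(years_to_snps(3000)))
```

Because each SNP added or dropped moves an estimate by 130 years, the choice of quality criteria discussed above has a direct effect on every date.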
Given the
nature of the calculations these ages must be regarded as very approximate, but
the 4050bp date is within what I understand to be the 4,000-5,500bp bracket
popularly associated with the age of L21, and it is also noteworthy that the
c.AD1300 date is remarkably consistent with calculations from STR data (see
Supplementary Paper 3), and with genealogical research being conducted by Kent
Irvin. I believe it entirely coincidental that this date is also
compatible with the Dr Christopher Irvin/Robert the Bruce traditions, but
readers may of course dissent!
f) The mix in the Border Irwin phylogenetic tree of multifurcations, aka starbursts (at L251 and post L555), large blocks, aka boxes or bottlenecks (including L555 itself, and the “private” boxes of individual testees), and the more expectable simple bifurcations between Z251 and L555 (shown in the Clan Irwin Phylogenetic tree) is typical of many other sub-clades. The significance of these features is unclear; some suggest starbursts can be associated with population explosions and bottlenecks with plagues or economic hard times, but factors such as paucity of data and phylogenetics are probably also relevant.
g) SNPs derived by NGS tests are probabilistic, and there is little consensus on what constitutes a genealogically relevant SNP, and on whether or not a marginal SNP should be included on a haplotree or in a TMRCA calculation. For example, the ISOGG haplotree criteria are more stringent than the FTDNA criteria, which in turn are more stringent than the YFull criteria. Alarming as this may seem, in practice it is not of great consequence except when calculating TMRCAs.
h) In early 2016 FTDNA radically improved their
haplotree and the need for BAM analyses became much less critical. I do however differ from FTDNA on details of a few of the most downstream SNPs.
i) Between May 2014 and December 2015 I
issued five detailed Bulletins for our BigY testees. I can e-mail a copy of the last on request. But now that the FTDNA haplotree is more
comprehensive these Bulletins are no longer critical. See also Supplementary Paper 9.
Meanwhile by November 2015 it was possible to use our L555 BigY findings to propose to FTDNA the SNPs that should be included in a new L555 SNP Pack test.
5. The L555 SNP Pack (or Panel) test
FTDNA launched their SNP Pack tests in 2015. They combine about 100 SNPs that have been identified, on the basis of BigY findings, as being relevant to a selected part of the haplotree. Each SNP included is given a simple binary (i.e. yes/no) test.
The L555 Pack was designed from the findings of 12 L555 BigY tests. It includes 86 SNPs that have been identified as being contemporary with or downstream of L555. At $119 this works out at about $1.40 per SNP. The Pack test is thus much cheaper than a BigY test and, per SNP, also much cheaper than a single SNP test. And although Pack tests cannot identify “new” SNPs, an unexpected bonus is that they can, from negative results, detect the presence of hitherto unknown SNPs and show where on the haplotree these unknown SNPs occur.
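One way to read the "negative results" point is sketched below: a testee who is positive for a SNP but negative for every known branch beneath it presumably descends through a branch not yet on the tree. This is an assumed interpretation, not FTDNA's procedure, and the tree and the names BRANCH_A, BRANCH_A1 and BRANCH_B are hypothetical placeholders; only L555 comes from the text.

```python
from typing import Dict, List, Tuple

# Hypothetical haplotree fragment: parent SNP -> known child SNPs.
HYPOTHETICAL_TREE: Dict[str, List[str]] = {
    "L555": ["BRANCH_A", "BRANCH_B"],
    "BRANCH_A": ["BRANCH_A1"],
    "BRANCH_A1": [],
    "BRANCH_B": [],
}


def place_testee(results: Dict[str, bool],
                 tree: Dict[str, List[str]],
                 root: str = "L555") -> Tuple[str, bool]:
    """Return (most downstream positive SNP, unknown-branch flag).

    results maps SNP name -> True (positive) / False (negative).
    The flag is True when the testee is negative for ALL known children
    of that SNP, hinting at an undiscovered branch at that point.
    """
    if not results.get(root, False):
        raise ValueError(f"testee is not positive for {root}")
    node = root
    while True:
        positive = [c for c in tree.get(node, []) if results.get(c, False)]
        if not positive:
            break
        node = positive[0]  # follow the positive branch downstream
    children = tree.get(node, [])
    unknown_branch = bool(children) and all(
        results.get(c) is False for c in children
    )
    return node, unknown_branch


# Positive for L555 but negative for both known branches below it.
print(place_testee({"L555": True, "BRANCH_A": False, "BRANCH_B": False},
                   HYPOTHETICAL_TREE))
```

A SNP with no result recorded is deliberately not counted as negative, so the flag is raised only when every known branch was actually tested.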
I issued
analyses of our L555 Pack test results in May and September 2016 for our BigY
and L555 Pack testees, but apart from demonstrating how the SNPs within the
Pack were selected and how the test results make up the L555 haplotree, their
content is now included on this website.
The
following websites give additional background information: