
MAJOR CHANGE TO THE CLADOGRAM

Posted: Thu, 2020-Jul-23 8:16 pm
by Webmaster
After correspondence with Thomas Krahn of YSEQ, I have decided to remove all STR mutations from the DCG Cladogram over the next few weeks, with one very special exception. That exception is the case where a unitary segment in the reference genome gets one or more repeats inserted; e.g., xxxxxxxx-1A-2A, or xxxxxxxx-1ATC-3ATC, etc.

The rationale behind this change is that ALL STRs are volatile in nature, even unnamed or non-standard ones, and as such are not good for building a stable Haplotree. This will affect a few clades from The Big Tree; FTDNA already does not use STRs on their Haplotree. This is one of the very few situations where I agree with how FTDNA does things. It will mainly affect individuals who have done NGS or WGS testing and have a unique terminal clade. These people may see one or two variants disappear from their phylogenetic nodes.
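
For anyone who wants to see the rule spelled out, here is a rough Python sketch of how I read the exception. It is only an illustration, not the actual DCG tooling: the function name, the pattern, and the numeric positions are made up, and it assumes STRs are written in the position-count-count shorthand (e.g., 10001000-5A-6A).

```python
import re

# Hypothetical pattern for the STR shorthand: <position>-<count><unit>-<count><unit>
STR_PATTERN = re.compile(r"^(\d+)-(\d+)([ACGT]+)-(\d+)([ACGT]+)$")


def keep_in_cladogram(variant: str) -> bool:
    """Return True if the variant would stay on the tree under the new rule."""
    match = STR_PATTERN.match(variant)
    if match is None:
        # Not written in STR shorthand (SNP, INDEL, SUB): not affected here.
        return True
    _, ref_count, ref_unit, alt_count, alt_unit = match.groups()
    # The one exception: a unitary reference segment (count of 1)
    # that gained one or more repeats of the same unit.
    return ref_unit == alt_unit and int(ref_count) == 1 and int(alt_count) > 1


for name in ["10001000-1A-2A", "10001000-1ATC-3ATC", "10001000-5A-6A", "10002000-C-T"]:
    print(name, "keep" if keep_in_cladogram(name) else "remove")
```

Under these assumptions only the 5A-6A style entries get dropped; the 1A-2A and 1ATC-3ATC cases stay.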

Re: MAJOR CHANGE TO THE CLADOGRAM

Posted: Sat, 2020-Jul-25 3:54 pm
by Webmaster
UPDATE 1:

The first round has been completed. All STRs on the main page of The Big Tree should now be eliminated from the DCG Cladogram. The next step will be to do the same for all unique terminal clades. That will take a while to go through all ~775 men.

Re: MAJOR CHANGE TO THE CLADOGRAM

Posted: Sun, 2020-Jul-26 8:33 am
by Webmaster
UPDATE 2:

The "DCG" named mutations have ALWAYS been intended as temporary names until more "official" names are assigned. In order to simplify the maintenance of the DCG database, the DCG names have been renumbered to keep all active mutations in a continuous sequence. This will be the procedure from now on, so please bear in mind that any DCG named mutation is subject to change at any time.

Typically FTDNA assigns an "FT" name to all new SNPs they discover and reports them to the YBrowse database maintainers, Thomas and Astrid Krahn. However, they do not typically do that for INDELs and SUBs, so if you have one of those and you want a more permanent name, I recommend you contact Thomas or Astrid at YSEQ and request they assign an "A" name to the mutation so it will have a permanent "official" name.

Re: MAJOR CHANGE TO THE CLADOGRAM

Posted: Tue, 2020-Jul-28 12:53 pm
by Geoff Melloy
Well, I don't understand all that, but I'm sure it's a good idea!

Re: MAJOR CHANGE TO THE CLADOGRAM

Posted: Tue, 2020-Jul-28 1:59 pm
by Webmaster
Geoff,

There are 4 main types of mutations on the Y chromosome.
  1. The best known are STRs - Short Tandem Repeats, where a single nucleotide or a short sequence of nucleotides is repeated consecutively multiple times. E.g., 10001000-AAAAA-AAAAAA, which means that beginning at location 10001000 on the Y chromosome there was a sequence of 5 consecutive A nucleotides that changed to 6 consecutive A nucleotides. This is written as 10001000-5A-6A.
  2. The next best known are SNPs - Single Nucleotide Polymorphisms, where a single nucleotide is replaced with one of the other 3 nucleotides. E.g., 10002000-C-T, which means that at location 10002000 on the Y chromosome the C nucleotide changed to a T nucleotide.
  3. Then there are INDELs - insertions or deletions, where a single nucleotide or a short sequence of nucleotides is either inserted into the Y chromosome OR deleted from it. E.g., 10003000-ATGC-del, which means that beginning at location 10003000 on the Y chromosome there was originally a short sequence of ATGC nucleotides that was completely deleted from the Y chromosome. This is obviously a deletion. An insertion is the opposite of that - 10004000-del-ATGC. So originally the sequence of ATGC nucleotides at location 10004000 did not exist on the Y chromosome, but was inserted.
  4. The SUB is a substitution of a short sequence of nucleotides for a completely different sequence. E.g., 10005000-ACGT-TCCGAAT; so where originally at location 10005000 on the Y chromosome there was a sequence of ACGT nucleotides, it was completely replaced by the new TCCGAAT sequence of nucleotides.

    Of these 4 types of mutations, SNPs, INDELs, and SUBs VERY rarely change back to their original sequence. STRs are much more volatile and can change their repeat count fairly rapidly in genealogical time frames. This is why they are good for checking the closeness of recent cousin lineages; however, because of their volatility, they are NOT good for establishing the basic Haplotree spanning centuries.
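
If it helps, the four types can be told apart mechanically from the notation used in the examples above. The following is a minimal Python sketch, assuming every variant is written as position-original-new exactly as in those examples; the function names are hypothetical and this is not how any testing company's software actually works.

```python
import re

STR_SHORTHAND = re.compile(r"^\d+[ACGT]+$")  # e.g. 5A or 3ATC
BASES = re.compile(r"^[ACGT]+$")             # e.g. C or ATGC


def repeat_unit_and_count(seq):
    """Return the smallest repeating unit of seq and how many times it repeats."""
    for size in range(1, len(seq) + 1):
        if len(seq) % size == 0 and seq[:size] * (len(seq) // size) == seq:
            return seq[:size], len(seq) // size


def classify(variant):
    """Classify a 'position-original-new' string as STR, SNP, INDEL, or SUB."""
    _position, ref, alt = variant.split("-")
    if STR_SHORTHAND.match(ref) and STR_SHORTHAND.match(alt):
        return "STR"    # shorthand form such as 10001000-5A-6A
    if ref == "del" or alt == "del":
        return "INDEL"  # sequence deleted from, or inserted into, the chromosome
    if not (BASES.match(ref) and BASES.match(alt)) or ref == alt:
        raise ValueError(f"unrecognized variant: {variant}")
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"    # one nucleotide replaced by another
    ref_unit, ref_count = repeat_unit_and_count(ref)
    alt_unit, alt_count = repeat_unit_and_count(alt)
    if ref_unit == alt_unit and ref_count != alt_count:
        return "STR"    # same unit, different repeat count (AAAAA to AAAAAA)
    return "SUB"        # one sequence replaced by a different one


for example in ["10001000-AAAAA-AAAAAA", "10002000-C-T",
                "10003000-ATGC-del", "10005000-ACGT-TCCGAAT"]:
    print(example, "->", classify(example))
```

Note that this sketch treats any change in repeat count of the same unit as an STR, including a single nucleotide gaining a repeat.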

Re: MAJOR CHANGE TO THE CLADOGRAM

Posted: Wed, 2020-Aug-12 7:11 pm
by ChrisMcLain132906
I think BY82279's disappearance from the blocktree coincided with this. Also, as I've seen A5902 written as A5902/03, are these two different SNPs or different names for the same mutation (and likewise FGC55176/7)? I'm basically trying to figure out if all the individuals on the blocktree for A5902 descend from the *A5902 Individual* or from someone 3 SNPs down the road from the *A5902 Individual*, which, if my ballparking of average SNP ages is correct (at least for what FTDNA decides is significant), could be as much as 250 years.

Re: MAJOR CHANGE TO THE CLADOGRAM

Posted: Wed, 2020-Aug-12 8:18 pm
by Webmaster
Chris,

There are 4 distinct SNPs, so the R1b-A5902 phylogenetic node spans ~249 years. I hope that helps.