BigY 700 Results Analysis

This forum is for general discussion about the Dál Cuinn.
Webmaster
Site Admin
Posts: 1574
Joined: Wed, 2019-Jun-26 2:47 pm

BigY 700 Results Analysis

Post by Webmaster »

It is well known that FTDNA heavily filters the results from BigY 700 tests, so the Private Variants it posts are often incomplete. SNPs are not the only type of mutation on the Y chromosome; there are also INDELs (insertion or deletion of one or more nucleotides), non-standard STRs, and SUBs (substitution of one fragment of nucleotides with a completely different sequence). ALL of these are important for developing a complete and accurate Haplotree. FTDNA typically ignores everything except SNPs. That leaves most people with incomplete Private Variant calls from FTDNA, and with an incomplete Haplotree as well. It is a shame not to extract all of the available information, or at least more than FTDNA posts, from an investment as expensive as the BigY 700.
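To make the mutation types above concrete, here is a minimal Python sketch that sorts the REF/ALT alleles of a VCF record into SNP, INDEL, or SUB. The length-and-prefix rules, the function names, and the sample records are my own assumptions for illustration; this is not how FTDNA, YFull, or The Big Tree actually call variants, and it does not try to detect STRs.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify one REF/ALT allele pair using simple length rules (an assumption)."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "NONE"  # identical alleles, so no mutation
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"  # one base replaced by one base
    if len(ref) != len(alt) and (ref.startswith(alt) or alt.startswith(ref)):
        return "INDEL"  # bases added to or removed from a shared anchor
    return "SUB"  # a fragment replaced by a different sequence


def classify_vcf_line(line: str) -> list[str]:
    """Classify every ALT allele on one tab-separated VCF data line."""
    fields = line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4]  # VCF columns 4 and 5 are REF and ALT
    return [classify_variant(ref, alt) for alt in alts.split(",")]


# Made-up records for illustration only; not real Y-chromosome positions.
sample_lines = [
    "chrY\t1000\t.\tA\tG\t.\tPASS\t.",
    "chrY\t2000\t.\tA\tAT\t.\tPASS\t.",
    "chrY\t3000\t.\tATG\tA\t.\tPASS\t.",
    "chrY\t4000\t.\tATG\tCC\t.\tPASS\t.",
]
for line in sample_lines:
    print(line.split("\t")[1], classify_vcf_line(line))
```

A real pipeline would normalize the alleles first and would need separate logic for STRs, so treat this only as a way to see the categories side by side.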

Until recently, Alex Williamson provided the free service The Big Tree, where he re-analyzed BigY results and recovered the missing mutation information from the BAM file, falling back on the less complete VCF and BED files when those were all that was available. That is why The Big Tree has mutations that FTDNA does not. Unfortunately, Alex announced a few months ago that he was going to scale back his efforts on The Big Tree, and I have noticed dramatic delays in getting some kits finalized there.

One of the more crippling moves that FTDNA has made in recent months was the decision to charge an additional US$99 for the BAM file from a BigY 700 test. This is a more "raw" data file of the BigY 700 test results that can be used to better analyze and find ALL mutations. FTDNA still provides the more processed VCF and BED files for free, and these can still be used to look for the mutations FTDNA filters out, just not as thoroughly as the BAM file allows. As YFull notes on their website:
NOTE: The data extracted from the VCF file is incomplete. ~ 50-70% of data that can be taken from the BAM file.
I have been searching for alternatives to The Big Tree that can use the VCF and BED files, although, as seen above, these are not as good as the BAM file that costs an additional US$99. YFull is one of the better-known services, but it is fee-based: US$49. Most people who have used the service have noticed that it calls mutations that FTDNA does not. Again, these additional mutations are important for building the most accurate Haplotree possible. YFull also provides clade age estimates, a service that FTDNA has announced it will offer in the near future. I, personally, am hesitant to use YFull's service since they are a Russian-based company, but many people use them quite happily, and I think their technical expertise is good.

I have a query in to YSEQ about taking the free BigY 700 VCF and BED files and doing what Alex has been doing on The Big Tree, for a nominal fee. I have not heard back from them yet. I have done some searching and found these 2 other possible American-based alternatives:
  1. Genetic Genie
  2. Enlis Genomics
If anyone knows anything about these 2 services, or knows of other services, please post a reply. It is critical that we find an acceptable alternative to The Big Tree as soon as possible so that we can maintain the most accurate Haplotree possible for R1b-DF104+ men. Thank you.
zackdaugherty
Site Admin
Posts: 75
Joined: Thu, 2019-Jul-18 8:57 pm

Re: BigY 700 Results Analysis

Post by zackdaugherty »

Yes, there is gold still in those BAMs, and FTDNA doesn’t even cover INDELs. This is understandable when they may be in problematic recombining areas of the Y or adjacent to STRs (STRs are technically an insertion or deletion of base pairs). However, there are actually a number of solid single-point INDELs that can be very clade-defining and on even ground with SNPs. Since FTDNA doesn’t call them, but will add them if you point them out, having access to the BAMs becomes important, and will become very important as the testing numbers grow over time.
Webmaster
Site Admin
Posts: 1574
Joined: Wed, 2019-Jun-26 2:47 pm

Re: BigY 700 Results Analysis

Post by Webmaster »

Thanks, Zack!

Further investigation has revealed that YFull appears to be the only service that reanalyzes the free VCF / BED files FTDNA provides for BigY 700 tests. But again, as their website points out:
NOTE: The data extracted from the VCF file is incomplete. ~ 50-70% of data that can be taken from the BAM file.
Both YSEQ and FGC, as well as YFull, offer fee-based services that appear to extract ALL unique mutations from the BAM file. But again, FTDNA charges an extra US$99 to access the BigY 700 BAM file, and the third-party companies charge US$25 to US$50 for their services.
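For anyone who wants the arithmetic spelled out, here is a small Python sketch that adds the BAM fee to the third-party fee range quoted above. The function name is just something I made up for illustration, and the result covers only these two fees, not the cost of the BigY 700 test itself.

```python
def extra_cost_range(bam_fee: int, service_fee_low: int, service_fee_high: int) -> tuple[int, int]:
    """Return the (low, high) added cost of a full BAM-based re-analysis."""
    return bam_fee + service_fee_low, bam_fee + service_fee_high


# Fees as quoted in this thread: US$99 for the BAM, US$25 to US$50 for analysis.
low, high = extra_cost_range(99, 25, 50)
print(f"Extra cost on top of the BigY 700 test: US${low} to US${high}")
```

So a full BAM-based analysis adds roughly US$124 to US$149 on top of the original test price.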

Compare this to most 30X WGS tests, which not only test the Y chromosome but all the other chromosomes as well, and which provide AT-DNA, MT-DNA, and medically relevant data, plus ALL of your raw data, from the FASTQ file to the BAM, VCF, and BED files. All of this is included in the initial test cost. However, there is typically no ancestral analysis of your 30X WGS test results; that requires a third party. There are the fee-based ones like FGC and YFull, but there are also free project websites independent of FTDNA that can help with such analyses. Also, at least one of the 30X WGS testing companies has offered a portal to export your data to FTDNA. It is unclear at this time what fee FTDNA may charge to import your 30X WGS Y-DNA data into their database.

It may be time to rethink the DNA testing paradigm.