Polygenic info used to identify risk of AD

Post by **MarcR** » Wed Oct 17, 2018 4:40 pm

circular wrote:Another thing to keep in mind is that Ancestry and 23andMe data files don't include the whole genome, so I have to wonder what effect the SNPs not included could have were they also included?

Presumably Dash is doing imputation to fill in the holes. I wonder how its method compares to those available via the University of Michigan imputation service. Perhaps xingxu can comment?

How long does it take to get the report once the data is uploaded?

Results are immediately available.


Post by **circular** » Wed Oct 17, 2018 4:41 pm

MarcR wrote:
circular wrote: I don't think downloading the same data from 2013 would give a different result. Did you test again using the more recent version and then use that data?
I have only tested once, in 2013. The Change Log subarea of the Raw Data Download area clarifies. For example, here's the entry pertaining to the 7/27/17 update:
23andMe wrote:As part of our continuous efforts to improve the quality of data present in your raw data download, the number of SNPs available in your download may have changed.
I used the Unix diff program to compare the 2013 and 2018 downloads and confirm that the 2018 download contains more information.

Thanks Marc, that is enlightening! While containing more SNPs, do you know if the later version contains all the SNPs in the earlier one?

Post by **MarcR** » Wed Oct 17, 2018 4:55 pm

NF52 wrote:Can you clarify what this second Hazard Score is?? Is this from another individual who is now 78, or is this a projection of you at age 78?

That entry is for my father. I had hoped his would be much higher than mine. If Dash's methodology stands up to our collective scrutiny, only lifestyle factors offer me much hope.

Dad damaged his heart by training for and running marathons continuously in his 40s and early 50s until idiopathic atrial fibrillation sidelined him. After drugs failed to control his arrhythmia, he underwent ablation, received a pacemaker, and commenced warfarin therapy to prevent the otherwise inevitable strokes. A nurse friend says that she and her coworkers refer to warfarin as "rat poison for people".

There are other lifestyle differences, but I think warfarin is the one that offers the most plausible reason why my fate might be much different from his.

Post by **MarcR** » Wed Oct 17, 2018 4:57 pm

circular wrote:While containing more SNPs, do you know if the later version contains all the SNPs in the earlier one?

I was just skimming, but I only noticed additions.

Post by **Jlhughette** » Wed Oct 17, 2018 6:15 pm

Although 23andMe now exhibits more SNPs, I believe they also removed some previously contained in earlier reports. I was told this when I uploaded my 23andMe data to Ben Lynch’s “Strategene” program, which features seven important gene complexes elaborated in his book, ‘Dirty Genes’. I was wondering if the polygenic SNPs predicting development of AD were the same as, or related to, at least some of Ben Lynch’s picks. As a lay person trying both to understand some of the biochemistry involved in AD development and to help myself and others avoid getting it, I wonder what value there is in learning our risks relative to others with similar genetics without an accompanying strategy for moving forward. I understand that the polygenic data analysis stands alone as a piece of the puzzle. It seems to be crying out for experts to make something more of the information, and perhaps for the rest of us to become even more motivated to improve our lifestyles.

Post by **MarcR** » Wed Oct 17, 2018 7:13 pm

Jlhughette wrote:Although 23andMe now exhibits more SNPs, I believe they also removed some previously contained in earlier reports.

After a more careful comparison of my 2013 and 2018 raw data downloads, I can confirm. I see all permutations - additions, subtractions, and changes.
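For anyone who wants to repeat this kind of comparison without reading raw diff output, here is a minimal Python sketch. It assumes the tab-separated 23andMe layout (rsid, chromosome, position, genotype, with `#` comment lines); the function names and the sample rows are made up for illustration.

```python
from typing import Dict, Iterable, Set, Tuple


def parse_raw_data(lines: Iterable[str]) -> Dict[str, str]:
    """Map each rsid to its genotype, skipping blank and '#' comment lines."""
    genotypes: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # not a data row in the assumed 4-column layout
        rsid, _chrom, _pos, genotype = fields[:4]
        genotypes[rsid] = genotype
    return genotypes


def compare_downloads(
    old: Dict[str, str], new: Dict[str, str]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (added, removed, changed) rsids between two downloads."""
    added = set(new) - set(old)
    removed = set(old) - set(new)
    changed = {rsid for rsid in set(old) & set(new) if old[rsid] != new[rsid]}
    return added, removed, changed


# Made-up sample rows standing in for two real download files.
old_rows = ["# rsid\tchromosome\tposition\tgenotype",
            "rs1\t1\t100\tAA", "rs2\t1\t200\tCT", "rs3\t2\t300\tGG"]
new_rows = ["rs1\t1\t100\tAA", "rs2\t1\t200\tCC", "rs4\t3\t400\tTT"]

added, removed, changed = compare_downloads(
    parse_raw_data(old_rows), parse_raw_data(new_rows)
)
print(len(added), "added,", len(removed), "removed,", len(changed), "changed")
```

With real files, pass an open file handle to `parse_raw_data` in place of the sample lists.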

Post by **xingxu** » Sat Oct 20, 2018 9:15 pm

We talked to the researchers behind the newly improved algorithm. They plan to have a white paper out in Nov. Stay tuned.

circular wrote:
This model is based on research that originally looked at 33 loci near 28 genes and the list of genes/loci evaluated as part of this research can be found in the work published by UC San Diego and UC San Francisco. The HealthLytix team has since expanded the algorithm to look at up to approximately 600,000 high quality loci that are commonly found in 23andMe and Ancestry.com data files. [Emphasis added]
It appears this substantial update hasn't been peer reviewed? They link to the science but I'm not the one to digest it. I think I'll hold back until smarter minds assess this, and I wish they would report each locus with its weighted effect (is that the right word?). If they did that I'd be interested to run my parent's 3/3 info, since that side of the family has the AD, and look for the sites with the strongest effect to see if I have the same and whether there's any way to intervene.

Post by **xingxu** » Sat Oct 20, 2018 9:21 pm

Dash Genomics is able to process all the versions of raw data files from 23andMe and Ancestry.com.

To accommodate differences among raw data files from different versions and platforms, and to meet our quality standards, our algorithm uses loci that are generally shared by both the 23andMe and Ancestry platforms and that were present in our library of raw data test files. We also require that the allele frequencies of those loci be consistent with the allele frequencies in the 1000 Genomes Project and between the 23andMe and Ancestry platforms. We have thus identified 600,000 high-confidence loci as the input for the Alzheimer's risk model. Because we use only those high-confidence loci for our analysis, instead of every single locus in the raw data file, we don't believe minor changes between versions should affect the result significantly.
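As a rough illustration of the selection rule described above (not Dash Genomics' actual code), the filter could be sketched like this in Python. The function name, the dictionary inputs, and the `max_diff` tolerance of 0.05 are all assumptions made for the example; the post does not say what threshold is used.

```python
from typing import Dict, Set


def select_high_confidence_loci(
    freq_23andme: Dict[str, float],
    freq_ancestry: Dict[str, float],
    freq_reference: Dict[str, float],
    max_diff: float = 0.05,  # hypothetical tolerance, not from the post
) -> Set[str]:
    """Keep loci present in all three sources with similar allele frequencies."""
    shared = freq_23andme.keys() & freq_ancestry.keys() & freq_reference.keys()
    keep: Set[str] = set()
    for rsid in shared:
        freqs = (freq_23andme[rsid], freq_ancestry[rsid], freq_reference[rsid])
        # Consistent means the spread across sources stays within tolerance.
        if max(freqs) - min(freqs) <= max_diff:
            keep.add(rsid)
    return keep


# Made-up allele frequencies for three illustrative loci.
platform_a = {"rs1": 0.30, "rs2": 0.10, "rs3": 0.50}
platform_b = {"rs1": 0.31, "rs2": 0.40}
reference = {"rs1": 0.32, "rs2": 0.12, "rs3": 0.50}

print(select_high_confidence_loci(platform_a, platform_b, reference))
```

Here only `rs1` survives: `rs2` disagrees too much between sources and `rs3` is missing from one platform.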

For individuals who have both 23andMe and Ancestry data, we recommend using the 23andMe file. The primary reason is that the Ancestry.com raw data file often fails to include, or may misreport, the genotypes of rs7412 and rs429358, the two key loci for APOE status (see details of this known issue in the “Word of caution” section at https://www.snpedia.com/index.php/APOE). Thus, for Ancestry files, we first have to impute (infer) the genotypes of these loci from other loci in the genome and identify the correct APOE status before performing the analysis. We have compared the results for several users with both 23andMe and Ancestry raw data, and have observed only marginal differences.
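To show why those two loci matter, here is a small Python sketch of how APOE status is commonly read from them when both genotypes are present. It assumes plus-strand alleles as in 23andMe files, where each C at rs429358 marks an ε4 allele and each T at rs7412 marks an ε2 allele. The function name is made up, and the sketch does not attempt the imputation step described above: it simply returns `None` when a genotype is missing or inconsistent.

```python
from typing import Optional


def apoe_genotype(rs429358: str, rs7412: str) -> Optional[str]:
    """Infer APOE status (e.g. 'e3/e4') from two unphased SNP genotypes."""
    for genotype in (rs429358, rs7412):
        # Reject no-calls such as '--' and anything other than two C/T alleles.
        if len(genotype) != 2 or not set(genotype) <= {"C", "T"}:
            return None
    n_e4 = rs429358.count("C")  # each C at rs429358 marks an e4 allele
    n_e2 = rs7412.count("T")    # each T at rs7412 marks an e2 allele
    if n_e4 + n_e2 > 2:
        # Only possible with the very rare e1 haplotype or a genotyping error.
        return None
    n_e3 = 2 - n_e4 - n_e2
    return "/".join(["e2"] * n_e2 + ["e3"] * n_e3 + ["e4"] * n_e4)


print(apoe_genotype("TT", "CC"))  # e3/e3
print(apoe_genotype("CT", "CC"))  # e3/e4
print(apoe_genotype("--", "CC"))  # None
```

A double heterozygote (`"CT"`, `"CT"`) is reported as e2/e4, which ignores the rare alternative of e1/e3.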

circular wrote:
xingxu wrote:While the researchers have not yet published an updated journal article regarding their use of ~ 600,000 high-quality SNPs based on 23andMe and Ancestry, this updated algorithm was developed by the same UCSF and UCSD researchers using the same peer-reviewed methodology that has been published in ...
Thanks very much for joining the conversation xingxu! It sounds like there's some hope that publication of the additional data is planned. I hope so.

Can you also comment on the question about how uploading different versions of results from Ancestry or 23andMe might affect the outcome?

For those who don't know, every so often these companies test with a newly updated version. I'm not an expert on this, but I think this might mean they are using a different genotyping chip and/or testing a different but largely overlapping set of SNPs. I don't think re-downloading one's data obtains results from the latest version; i.e., the version applies to the front end of what SNPs were tested and on what genotyping chip, not to the results when you request a new download. To get the latest version one would need to be tested afresh on it. Is that an accurate statement?

The problem I'm having with the test is that without a report of the specific loci that most increase my risk, say the top three or top five, there's not much to do with the result?

Part of me is tempted to run it now in the event the FDA puts the quash on it?

Post by **xingxu** » Sat Oct 20, 2018 9:28 pm

@MarcR @circular

It’s worth noting that in the context of the polygenic score, a high-impact locus does NOT necessarily imply causation. Such loci only indicate a high correlation between genotype and phenotype (the risk of Alzheimer’s disease in this case).

Although the current model does not include all the loci from the original published paper, because of the coverage of 23andMe and Ancestry.com data, the researchers used 600,000 high-confidence loci to retrain the model and revalidate the results. The new approach only imputes APOE status if necessary (e.g. Ancestry data). The new model is actually considered more extensive and robust than the previous one because it covers more of the genome and can tolerate more missing loci.
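For readers unfamiliar with polygenic scores, the general idea can be sketched in a few lines of Python: each locus contributes its weight times the number of copies of the effect allele, and loci missing from the file are simply skipped. This is a generic illustration under made-up rsids and weights, not the retrained model itself, whose weights and exact form are not given in this thread.

```python
from typing import Dict, Tuple


def polygenic_score(
    genotypes: Dict[str, str],
    weights: Dict[str, Tuple[str, float]],
) -> Tuple[float, int]:
    """Sum weight * effect-allele count over loci that have a usable genotype.

    Returns the score and the number of loci that contributed to it.
    """
    score = 0.0
    used = 0
    for rsid, (effect_allele, beta) in weights.items():
        genotype = genotypes.get(rsid)
        if genotype is None or len(genotype) != 2 or "-" in genotype:
            continue  # tolerate loci that are absent or not called
        score += beta * genotype.count(effect_allele)
        used += 1
    return score, used


# Hypothetical weights: rsid -> (effect allele, per-allele weight).
example_weights = {"rs1": ("A", 0.5), "rs2": ("G", -0.25), "rs3": ("T", 1.0)}
example_genotypes = {"rs1": "AA", "rs2": "AG", "rs3": "--"}

score, used = polygenic_score(example_genotypes, example_weights)
print(score, used)  # 0.75 2
```

Reporting the number of loci used alongside the score makes it easier to judge how much a result depends on missing data.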

MarcR wrote:
circular wrote:Another thing to keep in mind is that Ancestry and 23andMe data files don't include the whole genome, so I have to wonder what effect the SNPs not included could have were they also included?
Presumably Dash is doing imputation to fill in the holes. I wonder how its method compares to those available via the University of Michigan imputation service. Perhaps xingxu can comment?
How long does it take to get the report once the data is uploaded?
Results are immediately available.

Post by **circular** » Sat Oct 20, 2018 9:38 pm

Many thx xingxu!
