Polygenic info used to identify risk of AD

Insights and discussion from the cutting edge with reference to journal articles and other research papers.
J11
Contributor
Posts: 3351
Joined: Sat May 17, 2014 4:04 pm

Re: Polygenic info used to identify risk of AD

Post by J11 »

Yes, this appears to be the impute2 group.
I have downloaded the software before and ran the program.
The problem is that the imputation files needed are humongous and getting more humongous all the time.
Best to simply upload the 23andme file to their server or the cloud.
The VCF might not even be needed. Most of the research, including that in the PHS article, only imputes off a gene chip.

I have been on the record on this forum for years advocating for a commercial imputation service.
With this PHS in AD, I suspect that there might finally be a big push for it.
The article notes that EVERYONE is at risk for AD: it is only a question of when.
When this concept fully penetrates mainstream consciousness, then quite a few more people will want to quantify their AD risk (i.e. PHS) and begin thinking about how to modify their future dementia risk.

It would probably be best to run the imputation through the same servers and procedures that the AD researchers are using.
Going directly to the research community for the imputation makes sense. Yet, it was somewhat troubling that in the url you gave they appeared worried that too much non-research imputation would stretch their servers. This seems odd. Pushing it to the cloud would give you unlimited CPU.

The scientists have really not been as out front of this as they probably should have been.
I mean how many million people have now been genotyped?
How tough would it be to go to say 23andme and run their million files?
Or people could simply upload their gene chip files to the Oxford server (or similar), do the imputation, and then run the PHS algorithm.


Governments might now have had enough of carrying the load on GWAS studies.
We might need a million person GWAS to have a really good grasp of AD genetics.
This is probably not going to happen.
Time for We the People to step up and make this happen.
We could upload gene chip files and provide phenotype info perhaps in exchange for an imputation run and a PHS.
J11
Contributor
Posts: 3351
Joined: Sat May 17, 2014 4:04 pm

Re: Polygenic info used to identify risk of AD

Post by J11 »

For uploading to GenomeBrowser I did that so long ago that it is difficult for me to remember how I accomplished it.

In fact just last night it took me a bit of figuring to add a new track.
It seems that under File at the top left you click Add, then possibly click the local folder, and then select a path to your VCF or BAM files.
You can always experiment with the Public Annotations subfolders (e.g. choose Variation and Function and perhaps GWAS Catalog 2015 then click Plot on the lower right of the Add Data Sources Window). Golden Helix is actually quite good about documenting their products so they probably have a video that describes this in detail.
Last edited by J11 on Sun Mar 26, 2017 3:54 pm, edited 2 times in total.
J11
Contributor
Posts: 3351
Joined: Sat May 17, 2014 4:04 pm

Re: Polygenic info used to identify risk of AD

Post by J11 »

Been messing around with the numbers from the article.

Ran some simulations and it appears that the straight mean of the betas is about 0.59.
I have attached a file with a simulation run of 1000 and a chart to show the distribution.

The second column of figures is an ordered polygenic hazard score for the individuals simulated in the first column.
I then marked off the bottom 1% and 10% of risk and top 1% and 10% of risk.

So, the median (50%) risk person has a PHS of 0.59, the bottom 1% is at -0.59, the bottom 10% is at -0.07, the top 10% is at 1.19 and the top 1% is at 1.74.

The assumption made was that the SNPs are independently inherited which seems fairly reasonable.

I am not sure about how the beta log used in the article would change the simulation. I just started this off with a simple addition.
PHS run 1.ods
Reran the simulations using 10,000 person sets.

1% lowest risk was below -0.51.
10% lowest risk was below -0.03.
50% median risk was at 0.59.
10% highest risk was above 1.24.
1% highest risk was above 1.78.
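
A minimal sketch of this kind of simulation, assuming each SNP is inherited independently and the score is a plain sum of allele count times beta. The allele frequencies and betas below are made-up placeholders, not the values from the article or from the attached spreadsheet, so the printed percentiles will not match the figures above.

```python
import random

def simulate_phs(snps, n_people, seed=0):
    """Simulate additive scores for n_people.

    snps is a list of (allele_frequency, beta) pairs. Each person draws
    0, 1 or 2 copies of each scored allele independently, and the score
    is the sum of copies * beta over all SNPs.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_people):
        total = 0.0
        for freq, beta in snps:
            # Two independent draws, one per chromosome copy.
            copies = (rng.random() < freq) + (rng.random() < freq)
            total += copies * beta
        scores.append(total)
    return sorted(scores)

def percentile(sorted_scores, pct):
    """Nearest-rank percentile of an already sorted list."""
    idx = round(pct / 100 * (len(sorted_scores) - 1))
    return sorted_scores[idx]

# Hypothetical (allele frequency, beta) pairs, for illustration only.
example_snps = [(0.10, 0.50), (0.30, 0.30), (0.20, -0.20),
                (0.45, 0.10), (0.25, -0.10)]

scores = simulate_phs(example_snps, n_people=10_000)
for pct in (1, 10, 50, 90, 99):
    print(f"{pct:>2}th percentile: {percentile(scores, pct):.2f}")
```

Swapping in the real frequencies and betas from the article's table would be needed before comparing against the percentile cutoffs listed above.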

Wanted to make sure that what I did above was correct.
Looks like this might be OK.
They probably want a simple method to pop out of all the theoretical calculations.
Weighting the allele count by the log hazard and then summing across all SNPs for each individual is quite manageable.

Here's what another study using a polygenic approach had to say:
"In the target sample, we calculated the total score for each individual as the number of score alleles weighted by the log of the odds ratio from the discovery sample. Scores are additive across SNPs on the log odds scale and therefore multiplicative on the odds of disease scale."
PMID: 19571811
Supplementary page 20
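
To illustrate the quoted sentence, here is a small sketch of a weighted allele score and of why adding on the log scale is the same as multiplying on the odds scale. The SNP names and odds ratios are hypothetical and are not taken from either paper; whether the PHS article uses exactly this form is an assumption.

```python
import math

def polygenic_score(allele_counts, log_weights):
    """Sum of (copies of the scored allele) * (log odds or log hazard ratio)."""
    return sum(allele_counts[snp] * log_weights[snp] for snp in log_weights)

# Hypothetical per-allele odds ratios, for illustration only.
odds_ratios = {"snp_a": 1.5, "snp_b": 0.8}
log_weights = {snp: math.log(ratio) for snp, ratio in odds_ratios.items()}

# One person's count of scored alleles at each SNP (0, 1 or 2).
counts = {"snp_a": 2, "snp_b": 1}

score = polygenic_score(counts, log_weights)

# Additive on the log scale means multiplicative on the odds scale:
# exp(score) equals 1.5 * 1.5 * 0.8.
combined_odds_ratio = math.exp(score)
print(round(score, 4), round(combined_odds_ratio, 4))
```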
Brugman4
Contributor
Posts: 13
Joined: Mon Oct 17, 2016 2:39 pm

Re: Polygenic info used to identify risk of AD

Post by Brugman4 »

J11 wrote: For uploading to GenomeBrowser I did that so long ago that it is difficult for me to remember how I accomplished it.

In fact just last night it took me a bit of figuring to add a new track.
It seems that under File at the top left you click Add, then possibly click the local folder, and then select a path to your VCF or BAM files.
You can always experiment with the Public Annotations subfolders (e.g. choose Variation and Function and perhaps GWAS Catalog 2015 then click Plot on the lower right of the Add Data Sources Window). Golden Helix is actually quite good about documenting their products so they probably have a video that describes this in detail.


Thanks for your suggestions, J11, I will give it a go. Also need to generate an index file for the BAM/SAM, so lots to learn.

Turns out the VCF file includes genomic location and variant name, so I manually compared the table of risk genes to my VCF results. Nary a match. To do the comparison, then, I suppose I'd need to find proxy genes.
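
The manual comparison could be scripted along these lines. This is only a sketch: it assumes a standard tab-separated VCF whose third column holds the variant ID, and the rsIDs and records below are invented for the example. Two things worth checking when nothing matches: unannotated records carry "." in the ID column, and a VCF from sequencing often lists only the sites that differ from the reference.

```python
def rsids_in_vcf(vcf_lines, wanted):
    """Return {rsid: (chrom, pos, ref, alt)} for wanted IDs found in the VCF ID column."""
    found = {}
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        chrom, pos, ids, ref, alt = fields[:5]
        # The ID column may hold several IDs separated by semicolons.
        for rsid in ids.split(";"):
            if rsid in wanted:
                found[rsid] = (chrom, pos, ref, alt)
    return found

# Invented records, for illustration only.
sample_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\trs111\tA\tG\t50\tPASS\t.",
    "2\t2000\t.\tC\tT\t50\tPASS\t.",
    "3\t3000\trs333;rs444\tG\tA\t50\tPASS\t.",
]
wanted = {"rs111", "rs444", "rs999"}

hits = rsids_in_vcf(sample_vcf, wanted)
missing = sorted(wanted - set(hits))
print(hits)
print("not found:", missing)
```

With a real file, `vcf_lines` could be an open file handle, and `wanted` the set of rsIDs from the article's table.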
e3/e4
LG1
Senior Contributor
Posts: 364
Joined: Fri Dec 18, 2015 10:32 am
Location: Dallas, TX

Re: Polygenic info used to identify risk of AD

Post by LG1 »

J11 wrote: Why wasn't the table in a text friendly format?
Surely no one could be so lazy as to complain about that! (I wouldn't say no one!)

Here's what I am seeing.
The beta log HR column is giving you the risk weightings.
Epsilon 4 and 2 have the largest betas in absolute magnitude.

rs543293 in PICALM has the next highest absolute magnitude.
The SNP should indicate increased risk. So, the variant should be the minor allele and it should increase risk (0.30 is positive and should move in the same risk direction as epsilon 4).

I checked my 23andme file and rs543293 is there (GG).
dbSNP has A=0.2923/1464. So, A is the minor allele and it should confer risk.
GG looks like the low risk combo.

When I tried the imputation with rs1237999 it went in the same direction.
rs1237999 is the one before rs543293 in our 23andme file.
http://archive.broadinstitute.org/mpg/snap/ldsearch.php
J11, have you been able to find any other variants of your own other than these couple? If so, which raw data source and which software did you use? We've discussed quite a lot about different options but I wasn't clear on whether you were able to get results. Obviously if you are unable to I might as well forget it. ;)
ε4/ε4
LG1
Senior Contributor
Posts: 364
Joined: Fri Dec 18, 2015 10:32 am
Location: Dallas, TX

Re: Polygenic info used to identify risk of AD

Post by LG1 »

This was posted under the article in the comments section yesterday:

http://journals.plos.org/plosmedicine/a ... 0a8935934e


PHS in clinical use

Posted by rdesikan122 on 26 Mar 2017 at 23:56 GMT

Over the past few days, I have received personal emails from individuals asking about getting our polygenic hazard score (PHS) for themselves and physicians asking about clinical availability of this test for their patients. On the other hand, in several news stories, I have seen comments from researchers and clinicians stating that our PHS 'is nowhere near ready for clinical use' and the need for 'additional validation in large independent cohorts'.

As a practicing clinician, my main concern would be improper use and interpretation of our new PHS. However, I think it is important to better understand whether and in what context our PHS can be currently used, next steps needed for broad clinical use and dispel any false claims about usability/non-usability of our test.

First, our research study likely represents one of the largest genetic studies ever conducted in the United States and we used all available genetic data from US-based, individuals of European descent. Importantly, we developed our PHS in one sample and independently replicated our score in three independent samples (replication n > 20,000 individuals). Although our findings need replication in non-US, non-Caucasian populations (see below), a blanket statement that 'these findings still need to be replicated in larger independent samples' is incorrect and misleading.

Second, perhaps the most important issue for clinical use is prospective validation. That is, can you use our genetic score prospectively in non-demented older individuals to predict cognitive and clinical decline? We have just completed our follow-up study suggesting that beyond APOE, we can indeed use our PHS to prospectively identify nondemented older folks (from US based memory clinics) who decline cognitively and clinically over time.

Third, another important point is the need for evaluating our score in community based samples. As we mention in our paper, our PHS was developed and validated on individuals from specialized memory clinics from the US. Given that folks from specialized clinics have faster rates of progression to AD than older folks from the general community, our score needs to be validated on a prospective sample from the general community. We are in the process of working on this.

Finally, we need to replicate our findings on non-US, non-Caucasian cohorts. For non-US cohorts, we actually have preliminary evidence suggesting that our PHS works well at predicting Alzheimer's age of onset in older individuals from the general population from Iceland and Norway. For non-Caucasian populations, this is much harder as we currently do not have access to large US or non-US studies with genotype, Alzheimer's age of onset and clinical decline data.

In conclusion, our PHS right now is primarily intended for research and clinical trial use. However, given the results within our PLOS Medicine paper and our recent findings in using PHS to identify non-demented older individuals at greatest risk for clinical decline, statements such as 'we're so far away from a test that can be used clinically' (http://www.cbc.ca/news/he...) or 'Warning: False health news?' (http://www.cbc.ca/news/he...) are misleading and incorrect.

Bottom line, if you are an individual who is similar in make-up to the older, US-based, Caucasian participants of European descent we evaluated in our study, we are working on options for offering PHS scores for Alzheimer's disease in clinical routine, and through executive health clinics.

Rahul S. Desikan, MD, PhD
ε4/ε4
LG1
Senior Contributor
Posts: 364
Joined: Fri Dec 18, 2015 10:32 am
Location: Dallas, TX

Re: Polygenic info used to identify risk of AD

Post by LG1 »

...By the way I emailed him anyway this morning. Don't ask - don't get. :D
ε4/ε4
SusanJ
Senior Contributor
Posts: 3059
Joined: Wed Oct 30, 2013 7:33 am
Location: Western Colorado

Re: Polygenic info used to identify risk of AD

Post by SusanJ »

Great find, LG! Let us know what you hear back.
J11
Contributor
Posts: 3351
Joined: Sat May 17, 2014 4:04 pm

Re: Polygenic info used to identify risk of AD

Post by J11 »

LG1, if possible try and ask about the statistical properties of the PHS. Are my numbers above correct? This is very important because only having the 31 genotypes is not sufficient. We need to move from genotypes to the PHS to a percentile score. From what I have read these polygenic scores typically sum up weighted alleles, though the model used in the current article might use a different method.

Also we need to be very clear about which alleles they are using for increased and decreased risk.

The article probably did not want to make these items transparent as they are still developing it.
J11
Contributor
Posts: 3351
Joined: Sat May 17, 2014 4:04 pm

Re: Polygenic info used to identify risk of AD

Post by J11 »

Obviously there is (will be) a large amount of interest in the results from this article.

Here are a few points that have not as yet been raised.

- In the article there were some individuals shown in a Figure that appeared to be quite far off the regression line.
Future research might do full genome sequencing on these individuals to establish why they deviate so much. It is not impossible to imagine that the fit using the PHS could tighten up quite a bit more.

- The PHS could also help to explain why some ethnic groups do not appear to be at high AD risk even when epsilon 44.
{G might debate this point.} Simply going to dbSNP and looking up genotype frequencies for the PHS SNPs in different human populations might give a first impression of whether this idea is correct.

- Epsilon 44s start off with 2 * 1.03 = 2.06 in their PHS. The PHS ranges from about -0.5 to 1.8 in the general population (if the above number crunching is accurate). So, it is possible that some epsilon 44s would luck out with perhaps a PHS of 1.5. (This would also apply to the comment above about different human populations and AD risk.) Clearly many on the thread would be extremely interested in determining whether they were in the lucky group or not.
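
The arithmetic in the last point can be written out as a short sketch. It assumes the total PHS is a plain sum of the epsilon 4 contribution and the contributions of the other SNPs; the 1.03 per-allele weight and the 1.5 target are the figures quoted in the post, and the function name is made up for the example.

```python
APOE4_BETA = 1.03  # per-allele weight quoted in the post

def required_remainder(target_total, fixed_part):
    """Score the remaining SNPs must contribute to reach target_total."""
    return target_total - fixed_part

e44_contribution = 2 * APOE4_BETA  # two copies of epsilon 4
remainder = required_remainder(1.5, e44_contribution)

print(round(e44_contribution, 2))  # 2.06
print(round(remainder, 2))         # -0.56
```

So under this assumption an epsilon 44 would need the other SNPs to sum to about -0.56 to land at a total of 1.5. Whether that is reachable depends on whether the simulated range above already includes the epsilon 4 term, which the posts do not state.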

Might be able to work the numbers backwards by asking how many epsilon 44s in the community do not develop dementia (or at least by using the instantaneous risk approach from the article). The 31 SNPs should be independent of epsilon 4 status,