Talk:Polygenic score

From Wikipedia, the free encyclopedia

30th October, 2016. Page created

I wrote this page today because it seemed important to have and was missing. I have filled in a broad selection of literature, both from human studies and from animal studies. There are many things that could be improved:

  1. Adding more example papers. In particular, there are more relevant papers about various SES measures that show that these can be predicted from polygenic scores, in line with the non-zero heritability of such traits. Likewise, there are related findings for cognitive ability.
  2. There are other uses of polygenic scores rather than prediction/breeding: Examining the genetic architecture of traits, in particular the degree of polygenicity, estimating genetic correlations directly by correlating the weights vectors, examining genetic causes of group differences, and detecting (natural) selection.
  3. One could add more about the methods used. For many of the standard methods, there exist Bayesian equivalents which also tend to fare well in comparisons.
  4. Discussion of how the polygenic prediction relates to heritability. The heritability sets the upper limit of what is possible using polygenic scores. Since heritabilities are in units of variance, one should take the square root to get the maximum correlation between score and trait. Note that because variance explained is the square of the correlation, 50% of the maximum predictive validity is attained already when one can only account for 25% of the genetic variance. One can also discuss the nature of the polygenic score construction, e.g. additive vs. non-additive and nonlinear models.

--Deleet (talk) 21:01, 30 October 2016 (UTC)
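
The arithmetic in point 4 can be checked with a minimal sketch. It assumes a purely additive model in which the best possible correlation between a polygenic score and the trait is the square root of the heritability; the function names and the example heritability of 0.5 are hypothetical and chosen only for illustration.

```python
import math


def max_prediction_correlation(heritability):
    """Upper bound on corr(score, trait), assuming the score captures
    all additive genetic variance (hypothetical helper)."""
    if not 0.0 <= heritability <= 1.0:
        raise ValueError("heritability must lie in [0, 1]")
    return math.sqrt(heritability)


def fraction_of_max_correlation(fraction_of_genetic_variance):
    """Share of the maximum correlation reached when the score accounts
    for only part of the genetic variance (hypothetical helper)."""
    if not 0.0 <= fraction_of_genetic_variance <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    return math.sqrt(fraction_of_genetic_variance)


# Illustrative value only, not an estimate for any real trait.
h2 = 0.5
r_max = max_prediction_correlation(h2)

# A score that accounts for 25% of the genetic variance.
r_partial = math.sqrt(0.25 * h2)

print(round(r_max, 3))                     # about 0.707
print(round(r_partial / r_max, 3))         # 0.5
print(fraction_of_max_correlation(0.25))   # 0.5
```

Under these assumptions the ratio does not depend on the heritability chosen: accounting for a quarter of the genetic variance always gives half of the maximum attainable correlation.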


Mendelian randomisation?

Mendelian randomisation is interesting, but why is it mentioned as something that PRS can be used for? I mean sure, you can use "many SNPs as instruments", which is kinda like PRS, but PRS is still not really an MR tool in itself, is it? Yinwang888 (talk) 01:18, 8 February 2020 (UTC)

Unclear sentence

The discussed sentence appears to have been removed or edited at this point --Stal potaten (talk) 22:52, 13 April 2021 (UTC)

In a genome-wide association study (GWAS), polygenic scores having substantially higher predictive performance than the genome-wide statistically-significant hits indicates that the trait in question is affected by a larger number of variants than just the hits and larger sample sizes will yield more hits; a conjunction of low variance explained and high heritability as measured by genome-wide complex trait analysis (GCTA), twin studies or other methods, indicates that a trait may be massively polygenic and affected by thousands of variants.

To start with: "hit indicates" or "hits indicate", but the problem clearly runs deeper than that. — MaxEnt 01:00, 7 May 2020 (UTC)

Edited the above quote to reflect my last quick edit (properly glossing GCTA). The text above dates almost all the way back to article inception. Flagged that sentence as unclear and the article as confusing. But this is not my kettle of fish, so that's a drive-by. Feel free to revert those flags if annoyed. — MaxEnt 01:09, 7 May 2020 (UTC)
I was wrong. There is a correct parse, it's just not a happy one. "having [more] hits indicates that the trait in question is affected by a larger number of variants than just the hits ... and [that] larger sample sizes will yield more hits". You can't cue up a massive "having" clause in that way outside of an academic journal, as it's tremendously subjunctive and the common reader has no bearings yet. — MaxEnt 01:16, 7 May 2020 (UTC)
Well, I took another run at this having finally made my way through parsing the above, and not much joy. While I don't have a graduate degree in this field, I'm not an unsophisticated reader in this subject area. It really should not be this difficult. Additionally flagged for expert-subject genetics. — MaxEnt 01:30, 7 May 2020 (UTC)
I'd be inclined to agree that that sentence is very hard to parse, but also that it's perhaps not even appropriate to appear so early in the introductory text of the article. The point it's making is important to a technical understanding of polygenic scores, but does not relate directly to explaining what the score means, where it comes from, and other obvious lede-relevant facts, and so the point itself may not belong in the lede at all. I'll see if I can revise it a bit.
I'm not responsible for that bit of unparsable jargon but I have contributed a fair amount to the rest of the article, so please feel free to point out any other concepts or phrasings that might be difficult for nonprofessionals. -Ferahgo the Assassin (talk) 19:50, 7 May 2020 (UTC)

Larger page edit, April 2021

In a series of edits I have added a few sections and (in my eyes) improved many of the existing ones. The page may still not be an entirely coherent read, but it is more comprehensive, somewhat easier to read, and contains more up-to-date results. Although it is not perfect, I think the page no longer merits the flags.

Unless a discussion here starts within a week, I intend to resolve the old flags. --Stal potaten (talk) 22:59, 13 April 2021 (UTC)

Sounds good to me. I'm happy to try and tackle any sections that are still considered less than coherent, too. -Ferahgo the Assassin (talk) 23:38, 23 April 2021 (UTC)

Updating Polygenic Score wiki

As part of my wiki course, I am proposing to make the following changes. Restructure the article to include the following sections:

  • History
  • Background (GWAS, and trait associated loci)
  • Method for constructing a PRS
  • X remove validation methods
  • Research application of polygenic risk scores
  • Clinical potential of polygenic risk scores

Details of the changes I am thinking of:

  1. Rework the very first section introducing the polygenic risk score: simplify the language and increase the accuracy of the statements; provide a brief outline of the remaining sections; detail the future potential of PRSs.
  2. Update the history section with more contemporary findings: avoid claims of primacy; remove jargon and esoteric detail.
  3. Simplify the methods of constructing a PRS and remove unnecessary detail.
  4. Discuss various non-clinical applications of polygenic risk scores.
  5. Discuss the clinical application of polygenic risk scores.
  6. Include a paragraph on generalizability across ancestries.
  7. Discuss limitations.

I am considering removing the references to non-human polygenic risk scores and moving them to a separate page.

69.138.56.118 (talk) 17:04, 2 November 2021 (UTC) 02:41, 2 November 2021 (UTC) — Preceding unsigned comment added by Aa2021dna (talk • contribs)

I think it sounds like a good idea, and I'd love to help. Let me know if you'd like me to work on anything specific. Yinwang888 (talk) 13:26, 6 November 2021 (UTC)
So, are you going to remove that "validation" section as you suggest above? I think it's a good idea. It'd kill a lot of the research-paper vibe that the article currently has. Yinwang888 (talk) 13:05, 11 November 2021 (UTC)
Yeah, I am thinking of removing the validation section and either (1) adding it to the methods paragraph or (2) making it a section after the description of the potential uses of PRS. — Preceding unsigned comment added by Aa2021dna (talk • contribs) 21:47, 11 November 2021 (UTC)
I saw you did already, and also restructured. Good work. I think the overall structure is much better now. Yinwang888 (talk) 05:09, 12 November 2021 (UTC)

Storyminusthes (talk) 17:12, 15 November 2021 (UTC)

Peer review

Lead:

It does not appear you have edited the lead at all, so I did not devote a lot of attention to it. However, it does look like the last paragraph of the lead could use some work, as there are some somewhat arbitrarily chosen references to primary papers that probably aren't necessary.

Content:

  • Is the content added relevant to the topic? Overall, it appears to be mostly relevant. It looks like you mostly focused on the "Calculating a polygenic risk score" and "Methods for developing polygenic scores" sections. I think one thing that could be useful is better explaining why those two things are different sections. I think a casual reader would see the calculation section and assume that was how to build a score and not realize why the more complicated methods are necessary. I think a simple way to do that would be to change the phrasing of "A key consideration in developing polygenic scores is which SNPs and the number of SNPs to include" to explicitly say that is what you mean when you talk about developing a risk score.
  • Is the content added up-to-date? It looks like you've done a good job of finding pretty recent articles without relying too heavily on primary sources - you mostly appear to have review articles, which is good.
  • Is there content that is missing or content that does not belong? I'm not sure if you added the background section on DNA, but I'm not sure it needs to be quite as basic as it is (for example, I definitely don't think it needs to discuss the four bases found in DNA). The background section is also very human-focused, and could probably be edited to just include a basic overview of GWAS.

Tone and Balance

  • Is the content added neutral? Overall, the content is fairly neutral, though there are some areas where I think there are subtle "judgments". For example, saying "Many other creative applications" is a bit biased, because it is your own personal opinion that these approaches are creative.
  • Are there any claims that appear heavily biased toward a particular position? I don't see any heavily biased sections.
  • Are there viewpoints that are overrepresented, or underrepresented? Overall, the article is pretty human-focused, but I know those are the sections you were most interested in editing, so I think that's okay to leave the non-human things for other people to update.
  • Does the content added attempt to persuade the reader in favor of one position or away from another? Nothing you've added seems to be trying to argue in any particular direction.

Sources and References

  • Is all new content backed up by a reliable secondary source of information? I think you've done a pretty good job of using secondary sources of information. The methods section does have some primary sources, but I think that's probably appropriate for the topic. I would encourage you to re-use some citations that are good secondary sources in the methods section. For example, I think the Choi et al. article you have as the current reference 24 has a lot of general information on different approaches and could probably be re-cited in the methods section as a secondary source to reduce the reliance on primary articles only. I also think the end of the "Calculating a polygenic risk score" section could use some additional citations, particularly the sentences starting with "Most often, SNPs have two possible alleles".
  • Does the content accurately reflect what the cited sources say? (You'll need to refer to the sources to check this.) Are the sources thorough - i.e. Do they reflect the available literature on the topic? Are the sources current? Are the sources written by a diverse spectrum of authors? Do they include historically marginalized individuals where possible? I think the sources are very appropriate - they are current articles, and you've linked to a lot of the more general secondary sources/reviews where available.
  • Are there better sources available, such as peer-reviewed articles in place of news coverage or random websites? (You may need to do some digging to answer this.) Again, I think you've done a really good job of using review articles, which are peer-reviewed but secondary sources.
  • This is a minor thing, but there are a lot of instances in which the citation appears before the period at the end of the sentence, and I think Wikipedia formatting standards are typically to put the citation after the punctuation.

Organization

  • Is the content added well-written - i.e. Is it concise, clear, and easy to read? This is a pretty technical topic, so it can definitely be hard to break it down for a non-technical audience. I think you've done a good job of trying to keep it basic and of providing a simple explanation of techniques. I did find the sentence "The effect size, or weight, must therefore be specified as a difference from the other allele, the non-risk increasing allele, as is usually done in regression analysis" a bit confusing - I'm not entirely sure what you were trying to say in that sentence. I also think the last two sentences in the methods section are a little vague and could use a little more explanation.
  • Does the content added have any grammatical or spelling errors? I did not notice any glaring errors. I do think this sentence in the history section is a little awkward grammatically - "an individual's breeding value was the sum of single nucleotide polymorphism weight by their effect on a trait". I think you can replace "single nucleotide polymorphism" with SNP, since that abbreviation has already been introduced, and I think you need to restructure the end of the sentence a little to clarify what you mean by "SNP weight by their effect on a trait".
  • Is the content added well-organized - i.e. broken down into sections that reflect the major points of the topic? I like the organization. It looks like you've added a separate section for the applications in non-human species, which I think makes a lot of sense.

Scientific jargon

It’s highly likely this article is incomprehensible to the average reader. This is an encyclopedia, not a scientific journal. It needs to be rewritten top to bottom in vernacular English. A good example of this can be found here: http://polygenicscores.org/explained/index.html Alcmaeonid (talk) 06:40, 2 October 2021 (UTC)

I agree. I feel I kinda know a little about the field, but this is way beyond me. Why does it start by saying that advances in machine learning were a prerequisite for polygenic risk scores? Those simple clump-and-threshold formulas are hardly machine learning. Sure, it's been made more complicated later, but claiming that machine learning and Bayesian statistics are prerequisites is just needlessly complicating. So I fully agree on the need for simplification. Yinwang888 (talk) 13:22, 6 November 2021 (UTC)

Wiki Education Foundation-supported course assignment

This article was the subject of a Wiki Education Foundation-supported course assignment, between 26 October 2021 and 19 November 2021. Further details are available on the course page. Student editor(s): Aa2021dna. Peer reviewers: Storyminusthes.

Above undated message substituted from Template:Dashboard.wikiedu.org assignment by PrimeBOT (talk) 02:31, 18 January 2022 (UTC)