Understand the basis of genome-wide association studies
Understand how polygenic risk scores can be derived from genome-wide association studies
What is a ‘candidate gene’?
Any single gene thought likely to cause a disease.
Chosen because of the gene's role in a distinct biological pathway or because of findings from previous studies
A priori selection (hypothesis driven)
Candidate genes: Critiques
A priori selection: doesn’t allow for novel gene discovery
Very small effect sizes on mental health: a single gene does not tell you very much about the disorder
Studies tend to be small and do not replicate: evidence is inconsistent
Caspi et al. (2003) – Candidate gene study of depression
HOWEVER...
Border et al.'s study (2023):
Analyzed data from over 600,000 participants using advanced genetic methods.
Found no significant link between 5-HTTLPR and depression or other psychiatric conditions.
Highlighted flawed methodologies in earlier studies with smaller sample sizes.
Pharmacogenomic Testing:
Genetic tests for antidepressant response (e.g., GeneSight) rely on debunked markers like 5-HTTLPR.
Tests lack transparency and independent validation, raising concerns about their scientific validity.
What is a genome-wide association study (GWAS)?
A study of a genome-wide set of genetic variants (500,000+), examining which of those variants are associated with a mental or physical health problem
GWAS compare the DNA of participants having a disorder to those who do not have the disorder (i.e., case-control design)
Because consortia combine researchers' individual studies, a GWAS can include over 1,000,000 individuals. This results in better replication than candidate gene studies (i.e., evidence is more consistent)
GWAS: Critiques
Most studies have been done in WEIRD samples (Western, Educated, Industrialized, Rich, Democratic), largely of European ancestry, so generalizability to other populations (e.g., Asian or African genetic backgrounds) is not known
In very large consortia, not all the independent studies have used the same method or questionnaire to measure the outcome (e.g., depression)
Howard et al. (2019) - Depression GWAS on >800,000 people
Authors note that depression is partially heritable (evidence from twin studies)
Risk genes are difficult to find without very large samples because depression is polygenic (many variants of small effect, not a single candidate gene)
N = 807,553 (246,363 cases, 561,190 controls)
102 independent variants identified; results point to the importance of prefrontal brain regions.
In an independent replication sample of 1,306,354 individuals (414,055 cases and 892,299 controls)
87 of the 102 replicated
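The figures above can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers quoted on this slide; the variable names are illustrative.

```python
# Sample sizes and variant counts as quoted above (Howard et al., 2019)
discovery_cases = 246_363
discovery_controls = 561_190
replication_cases = 414_055
replication_controls = 892_299

discovery_n = discovery_cases + discovery_controls
replication_n = replication_cases + replication_controls

variants_discovered = 102
variants_replicated = 87
replication_rate = variants_replicated / variants_discovered

print(f"Discovery N: {discovery_n:,}")
print(f"Replication N: {replication_n:,}")
print(f"Replication rate: {replication_rate:.1%}")  # 85.3%
```

Roughly 85% of the discovered variants replicated, which contrasts with the inconsistent evidence from candidate gene studies.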
How do we get to the genes in a GWAS?
Collecting Genetic Material
Genotyping with Gene Chips
Genome-Wide Association Studies (GWAS)
Identifying Genes
Collecting Genetic Material
Saliva Sample Collection: Participants provide saliva samples, which contain epithelial cells. These cells are a source of DNA used for genotyping.
DNA Extraction: DNA is extracted from the saliva sample using a purification process. This involves breaking down the cell membranes and isolating DNA for analysis.
Genotyping with Gene Chips
Gene Chip Technology: A gene chip (or SNP array) contains pre-selected single nucleotide polymorphisms (SNPs), specific genetic markers across the genome.
DNA Hybridization: The extracted DNA is fragmented and tagged with fluorescent labels. These fragments are then hybridized (matched) to the SNPs on the chip.
Reading Genetic Data: A scanner detects which SNPs are present in the individual’s genome based on fluorescence, generating a genetic profile.
Genome-Wide Association Studies (GWAS)
Comparing Groups: Researchers collect genetic data from two groups: those with a specific trait/disease and those without it.
SNP Association: Statistical analyses identify SNPs that are significantly more common in the group with the trait/disease, suggesting a potential genetic association.
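The case-control comparison for a single SNP can be sketched as a chi-square test on allele counts. This is an illustrative simplification, not the method of any particular study: the counts are made up, the function name is hypothetical, and real GWAS typically use regression models with covariates. The genome-wide significance threshold of 5e-8 is a common convention and is not taken from these notes.

```python
import math

def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases/controls, columns are alternate/reference allele.
    Returns (chi2, p_value, odds_ratio).
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_totals = [case_alt + case_ref, control_alt + control_ref]
    col_totals = [case_alt + control_alt, case_ref + control_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return chi2, p_value, odds_ratio

# Hypothetical counts: 1,000 cases and 1,000 controls, two alleles each
chi2, p_value, odds_ratio = allelic_chi_square(600, 1400, 500, 1500)

GENOME_WIDE_THRESHOLD = 5e-8  # conventional cut-off (assumption, see above)
print(f"chi2={chi2:.2f}, p={p_value:.2g}, OR={odds_ratio:.2f}")
print("genome-wide significant:", p_value < GENOME_WIDE_THRESHOLD)
```

In this made-up example the allele is more common in cases, but the p-value is nowhere near a genome-wide threshold. Because the same test is repeated across 500,000+ SNPs, very large samples are needed before small effects reach significance.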
Identifying Genes
Mapping SNPs to Loci: The SNPs identified in the study are mapped to their corresponding loci (genomic regions).
Candidate Genes: Genes located near the associated SNPs are flagged as candidates that may influence the trait/disease.
Functional Validation: Further experiments or studies are conducted to confirm the role of these genes in the trait/disease.
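The step of flagging genes near an associated SNP can be sketched as a simple distance lookup. Everything here is hypothetical: the gene names, coordinates, and the 50,000 base-pair window are placeholders chosen for illustration, and real analyses use curated gene annotations and more sophisticated mapping.

```python
from dataclasses import dataclass

@dataclass
class Gene:
    name: str
    chrom: str
    start: int
    end: int

def distance_to_gene(position, gene):
    """Distance in base pairs from a SNP position to a gene (0 if inside it)."""
    if gene.start <= position <= gene.end:
        return 0
    return min(abs(position - gene.start), abs(position - gene.end))

def nearby_genes(chrom, position, genes, window=50_000):
    """Names of genes on the same chromosome within `window` bp, nearest first."""
    hits = [
        (distance_to_gene(position, g), g.name)
        for g in genes
        if g.chrom == chrom
    ]
    return [name for dist, name in sorted(hits) if dist <= window]

# Hypothetical annotation, not real gene coordinates
genes = [
    Gene("GENE_A", "chr1", 100_000, 150_000),
    Gene("GENE_B", "chr1", 300_000, 320_000),
    Gene("GENE_C", "chr2", 100_000, 120_000),
]

print(nearby_genes("chr1", 160_000, genes))  # ['GENE_A']
```

Proximity only nominates a gene as a candidate. As the Functional Validation step notes, further work is needed to confirm that the gene actually influences the trait.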
However...
What about out-of-sample predictions?
How can GWAS help other (smaller) studies?
What is a polygenic risk score?
A genetic score based on the results of a GWAS, which can combine thousands of genetic variants that associate with an outcome (e.g., depression).
These loci individually have very small effect sizes, but when they are summed into a single score, the effect size increases (the score explains more about the mental health problem).
These scores can be created in smaller independent studies, but they are informed by larger studies, so they replicate better than candidate genes (show better consistency)
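Summing many small effects into one score can be sketched as a weighted sum: each person's count of risk alleles at a SNP (0, 1, or 2) is multiplied by that SNP's effect size from the GWAS, keeping only SNPs below a chosen p-value threshold. This is a minimal sketch under stated assumptions: the SNP identifiers, effect sizes, and thresholds are invented, the function name is hypothetical, and real pipelines also handle correlated SNPs and allele alignment.

```python
def polygenic_score(dosages, gwas_results, p_threshold=0.05):
    """Weighted sum of allele counts for SNPs passing a p-value threshold.

    dosages: {snp_id: number of effect alleles the person carries (0, 1, 2)}
    gwas_results: {snp_id: (effect_size, p_value)} from the large GWAS
    """
    score = 0.0
    for snp_id, (effect_size, p_value) in gwas_results.items():
        if p_value <= p_threshold and snp_id in dosages:
            score += effect_size * dosages[snp_id]
    return score

# Hypothetical GWAS summary statistics: (effect size, p-value)
gwas_results = {
    "rs_a": (0.020, 1e-9),
    "rs_b": (-0.010, 0.03),
    "rs_c": (0.015, 0.40),
}

# Hypothetical genotypes for one participant in a smaller study
person = {"rs_a": 2, "rs_b": 1, "rs_c": 2}

strict = polygenic_score(person, gwas_results, p_threshold=0.05)
lenient = polygenic_score(person, gwas_results, p_threshold=1.0)
print(f"score at p<=0.05: {strict:.3f}")
print(f"score at p<=1.0:  {lenient:.3f}")
```

The weights come from the large GWAS while the genotypes come from the smaller study, which is how a small independent sample can borrow strength from a consortium-sized one.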
Polygenic risk score: Critique
The same critiques apply as for the GWAS from which these scores are derived (see GWAS critiques)
Computed as a weighted sum of loci, including loci below the usual threshold of significance
Many uses, e.g., predicting onset of depression
BUT: how good is the prediction of depression liability?