Part I - Introduction
Josie, a sophomore biology major at Tufton University, was currently enrolled in a genetics class. The class had recently
been learning about genome-wide association studies (GWAS) and Josie was intrigued by the massive amount of
information that could be obtained from such studies. In the last lecture, her professor talked about several companies
that offered genomic testing to the general public. Josie was surprised to find out that one of the U.S. companies was
shut down by the FDA in November 2013, but then resumed its services in 2015.
Josie's professor assigned the students a poster presentation on how current knowledge of DNA has impacted modern
life. Fascinated by GWAS studies and the technology that enables the general public to access their genomes, Josie
decided to focus on these topics for the poster presentation. She planned on researching a few of these companies in
depth to find out the different traits that each company tests for. Knowing that she had a family history of Alzheimer's
disease, Josie went a step further and decided to have her genome sequenced by one of these companies.
Genome-Wide Association Studies (GWAS)
GWAS studies are rapidly becoming the norm for identifying genes or polymorphisms that may be involved in human
diseases and phenotypes. These studies rely on single nucleotide polymorphisms (SNPs) (Figure 1). Although some
SNPs are in genes, the majority of SNPs are found in non-coding regions of the genome.
Person 1: AATACGTGAGTGAGGCCTTA
Person 7: AATACGTGAGTGAGGCCTTAA
Figure 1. Example of a SNP. DNA sequences from seven individuals are
pictured above. Notice in the third nucleotide of the sequence, Person 3 has
a C and Person 6 has a G whereas everyone else has a T at this position.
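The comparison that Figure 1 illustrates can be sketched in a few lines of Python. The sequences below are hypothetical: they reuse the 20-nucleotide sequence shown for Person 1 and apply the substitutions described in the caption (a C for Person 3, a G for Person 6). The function name `find_variable_sites` is an illustrative assumption, not part of any genomics library.

```python
# Hypothetical aligned sequences rebuilt from the Figure 1 caption:
# everyone matches Person 1 except Person 3 (C) and Person 6 (G) at position 3.
REFERENCE = "AATACGTGAGTGAGGCCTTA"

sequences = {f"Person {i}": REFERENCE for i in range(1, 8)}
sequences["Person 3"] = "AAC" + REFERENCE[3:]
sequences["Person 6"] = "AAG" + REFERENCE[3:]


def find_variable_sites(aligned):
    """Return {1-based position: sorted nucleotides seen} for positions that vary."""
    sites = {}
    # zip(*...) walks the alignment one column (position) at a time.
    for index, column in enumerate(zip(*aligned.values())):
        alleles = sorted(set(column))
        if len(alleles) > 1:
            sites[index + 1] = alleles
    return sites


variable_sites = find_variable_sites(sequences)
print(variable_sites)  # {3: ['C', 'G', 'T']}
```

Only position 3 is reported because it is the only column in which the seven sequences disagree.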
SNPs are present in more than 1% of the population. All known SNPs are catalogued in a database called dbSNP
and given a unique identifier called an rs number (e.g., rs11998853). The rs
number is an accession number used by researchers to refer to a specific SNP.
Case copyright held by the National Center for Case Study Teaching in Science, University at Buffalo, State University of New York. Originally
published December 31, 2017. Please see our usage guidelines, which outline our policy concerning permissible reproduction of this work.
Image in title block is a GWAS representation of all SNP-trait associations on chromosome 3 with p-value 5.0 × 10^-8, published in the GWAS
1. If the human genome is three billion base pairs long, estimate the average number of SNPs in the human genome.
2. Predict the different effects that a SNP would have if it were located in an exon, an intron, or an intergenic region.
GWAS studies compare the frequency of SNPs in a control population against the frequency of the same SNPs in
an affected population. Typical GWAS studies recruit thousands of individuals and look at thousands to millions of
genetic variants in each individual. The amount of data generated from such a study is immense and has been termed
"big data." Microarrays, whole genome sequencing, and exome sequencing are all technologies that are used to analyze
vast numbers of variants from many individuals.
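The case-versus-control comparison described above can be illustrated for a single SNP. The sketch below is a minimal example under stated assumptions: the allele counts are invented, the function name `allelic_association` is hypothetical, and a real GWAS would repeat a test like this for every variant with dedicated software and a correction for multiple testing.

```python
def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Compare allele counts for one SNP between cases and controls.

    Returns (odds_ratio, chi_square) for the 2x2 table of allele counts.
    """
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)

    total = case_alt + case_ref + control_alt + control_ref
    # Pearson chi-square statistic for a 2x2 table (1 degree of freedom).
    numerator = total * (case_alt * control_ref - case_ref * control_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (control_alt + control_ref)
        * (case_alt + control_alt)
        * (case_ref + control_ref)
    )
    return odds_ratio, numerator / denominator


# Made-up allele counts: 1,000 cases and 1,000 controls, two alleles each.
odds_ratio, chi_square = allelic_association(
    case_alt=600, case_ref=1400, control_alt=500, control_ref=1500
)
print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi_square:.2f}")
```

An odds ratio above 1 means the variant allele is more frequent among affected individuals than among controls; the chi-square statistic indicates how unlikely such a difference would be by chance alone.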
3. Compare and contrast these three technologies (microarrays, whole genome sequencing, exome sequencing) in
terms of their methodologies and the information that they can provide. The following resources may be helpful:
Genetic Science Learning Center. 2013. DNA microarray.
My46 (University of Washington). n.d. Whole genome and exome sequencing.
NISC. 2015. Whole exome sequencing and analysis.
Microarray:
Whole genome sequencing:
Exome sequencing:
4. Direct-to-consumer companies, like 23andMe and Dante Labs, offer genomic screening to individuals for fees
ranging from several hundreds of dollars to several thousands of dollars. The technology used, the number of
SNPs tested, and the number of traits for which they screen vary by company. Research these two companies and
complete the chart below.
                                            23andMe        Dante Labs
Genotyping technology used
What traits/conditions do they test for?
"Living in a Genomic World" by Glaser and Zimmer
Below is some information on four traits: warfarin sensitivity, Alzheimer's disease, asparagus metabolite detection, and
alcohol flush reaction.
Warfarin is a blood-thinning medication used to prevent the formation of blood clots and is one of the most commonly
used medications worldwide. Blood clots can block the flow of blood to important tissues and organs in the body,
which can ultimately lead to tissue or organ damage. The dose of warfarin that an individual receives needs to be
monitored accurately. Since warfarin is a blood-thinning medication, too much can cause excessive bleeding, while
too little may not be effective in preventing blood clots. People vary in their sensitivity to warfarin: for those who are
warfarin sensitive, a lower dose of medication is needed than for those who are warfarin resistant.
Variants in two genes, CYP2C9 and VKORC1, have been shown to affect individuals' sensitivity to warfarin. However,
this sensitivity varies by ethnic group. Caucasians tend to be more sensitive to warfarin than Africans or Asians. For
European Americans, about 30-40% of the variation is due to the genetic variants in CYP2C9 and VKORC1.
1. If 30-40% of the variation in warfarin sensitivity among European Americans is due to the genetic variants in
CYP2C9 and VKORC1, then what is the remaining variation due to?
2. How do you explain the difference in sensitivities to warfarin among different ethnic groups?
CYP2C9 encodes a P450 2C9 isoenzyme, which normally acts to inhibit warfarin's anti-clotting properties.
Isoenzymes are enzymes that differ in amino acid sequence but catalyze the same reaction. This gene is found on
chromosome 10. Two variants, CYP2C9*2 and CYP2C9*3, cause this enzyme to function suboptimally, which in
turn influences how readily warfarin is metabolized. For someone with either of these variants, warfarin is metabolized
much more slowly and stays in the body longer. Therefore, individuals with the CYP2C9*2 or *3 variants need less warfarin
than individuals who do not have these variants and instead carry the CYP2C9*1 variant.
The CYP2C9*2 variant is found in exon 3. The normal nucleotide at this position is a C. The *2 variant has a T at this
same position and results in a 30-40% decrease in enzyme activity.
Genomic sequence with C variant: ATT GAG GAC CGT GTT CAA GAG
Genomic sequence with T variant: ATT GAG GAC TGT GTT CAA GAG
3. Translate each variant.
4. What type of mutation is this?
5. Hypothesize why the T variant would result in a 30-40% decrease in enzyme activity.
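For readers who want to check their answers to questions 3 and 4, the translation can be sketched in Python. This is a minimal sketch that assumes the two sequences above are written as the coding strand and in reading frame, as the triplet spacing suggests; the helper name `translate` is hypothetical. The table is the standard genetic code, with one-letter amino acid symbols and `*` for stop codons.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code: DNA codon (coding strand) -> one-letter amino acid.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding-strand DNA sequence, ignoring spaces."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


c_variant = translate("ATT GAG GAC CGT GTT CAA GAG")
t_variant = translate("ATT GAG GAC TGT GTT CAA GAG")

# List (codon position, amino acid in C variant, amino acid in T variant).
differences = [
    (position + 1, a, b)
    for position, (a, b) in enumerate(zip(c_variant, t_variant))
    if a != b
]
print(c_variant, t_variant, differences)
```

Comparing the two peptides codon by codon shows whether the nucleotide change alters an amino acid, which is the information needed to classify the mutation.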
The CYP2C9*3 variant is found in exon 7. The normal (reference) nucleotide at this position is an A. The *3 variant
refers to the presence of a C instead of an A in this position. This change results in an 80-90% decrease in enzyme
activity. Both of these variants decrease the body's ability to metabolize warfarin. If the body cannot metabolize warfarin
as well, warfarin will stay in the body longer. Therefore, people with these variants need lower doses of warfarin.
6. Another variant is found in the middle of the first intron of the CYP2C9 gene. The reference nucleotide is a T and the
variant is a C. Would this variant change an amino acid? Why?
7. Of the three variants mentioned above, 23andMe only tests for the CYP2C9*2 and *3 variants, but not the variant
in the intron. Why?
Interestingly, as a member of the cytochrome P450 gene family, CYP2C9 is involved in metabolizing a large number of
medications such as NSAIDs (aspirin, ibuprofen, naproxen), some seizure medications, and some diabetic medications.
The VKORC1 gene, found on chromosome 16, codes for the enzyme vitamin K epoxide reductase, which helps
convert vitamin K into a form that activates the clotting factors in the blood. Warfarin works by blocking this enzyme.
If the enzyme cannot activate vitamin K, then clotting factors cannot be made. One variant in this gene, -1639G>A,
also known as rs9923231 and located in the promoter region of VKORC1, has been correlated with warfarin
sensitivity. For this locus, the normal nucleotide is a G. The variant that reduces the amount of enzyme that is made is an
A. Individuals with the A variant need less warfarin than individuals with the G variant. Each A variant increases a
person's sensitivity to warfarin.
Proper dosing of warfarin depends on knowing an individual's genetic status at these different loci. Aside from the
three different SNPs mentioned here, there are other SNPs in these two genes that affect the action of warfarin in the
body. Different companies will test for a different number of these SNPs. For example, Gentle Labs tests for 13
different SNPs within the CYP2C9 and VKORC1 genes while 23andMe tests for CYP2C9*2, *3, and the VKORC1 SNP
mentioned above. However, knowledge of one's genetic status alone is not enough. Individuals on warfarin therapy
still need regular blood tests to monitor their clotting ability.
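To make the three loci concrete, the sketch below simply tallies how many sensitivity-associated alleles a person carries at CYP2C9 (*2, *3) and VKORC1 (the A allele of rs9923231). It is an illustration only: the function name, the string encoding of star alleles, and the example person are assumptions, and the count is not a dosing rule, since dosing also depends on the blood tests mentioned above.

```python
# Variants named in the text that are associated with needing less warfarin.
# Star alleles are written as plain strings; "*1" stands for the allele
# without the *2 or *3 change.
REDUCED_FUNCTION_CYP2C9 = {"*2", "*3"}
VKORC1_SENSITIVE_ALLELE = "A"  # rs9923231 (-1639G>A)


def count_sensitivity_alleles(cyp2c9_alleles, vkorc1_genotype):
    """Count alleles linked to increased warfarin sensitivity (0 to 4)."""
    cyp2c9_count = sum(
        allele in REDUCED_FUNCTION_CYP2C9 for allele in cyp2c9_alleles
    )
    vkorc1_count = vkorc1_genotype.upper().count(VKORC1_SENSITIVE_ALLELE)
    return cyp2c9_count + vkorc1_count


# Hypothetical person: CYP2C9 *1/*2 and VKORC1 rs9923231 genotype GA.
example_count = count_sensitivity_alleles(("*1", "*2"), "GA")
print(example_count)  # 2
```

A higher count suggests greater sensitivity under the reasoning in the text, but it does not translate directly into a dose.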
8. What other factors may influence an individual's ability to clot properly?
Late-Onset Alzheimer's Disease
Alzheimer's disease is the most common form of dementia worldwide. Alzheimer's disease currently affects an
estimated 2.5 to 4 million Americans, and this number is expected to increase in the coming decades as more people
are living longer. In general, individuals are estimated to have a 10% lifetime risk of developing dementia, with 60%
of dementia cases being caused by Alzheimer's disease. Alzheimer's disease starts with subtle memory loss, increasing
forgetfulness or mild confusion as the only symptoms. However, over time, memory loss progressively worsens to a
point where it interferes with most aspects of daily life.
Alzheimer's disease can be classified as early-onset or late-onset, depending on whether symptoms appear before or
after age 65, respectively. Early-onset Alzheimer's disease accounts for less than 5% of all cases and results from clearly
inherited mutations. Conversely, the development of late-onset Alzheimer's disease is thought to result from a
combination of genetic, lifestyle and environmental factors.
9. Late-onset Alzheimer's disease is considered a multifactorial trait. What are the other factors (apart from genetic
causes) that contribute to an individual's overall risk?
The exact cause of late-onset Alzheimer's disease is currently unknown. This form of the disorder probably results
from a combination of genetic and environmental factors. A variant of the apolipoprotein E gene, APOE, has been
recognized and extensively studied as a genetic risk factor for the development of late-onset Alzheimer's disease. Three
common variants of the APOE gene exist in the general population, called APOE ε2, ε3 and ε4. APOE ε3 is the most
frequent variant in the general population. APOE ε4 is common in Northern Europe, whereas APOE ε2 is a rare
variant. The APOE ε4 variant has been consistently associated with an increased risk of developing late-onset Alzheimer's
disease. More specifically, carrying a single copy of the APOE ε4 allele increases the risk approximately threefold,
while carrying two copies of this allele increases the lifetime risk more than tenfold. On the other hand, carrying the
APOE ε2 allele may have a small protective effect against developing late-onset Alzheimer's disease in certain
populations. Exactly how the different APOE alleles affect the risk of developing late-onset Alzheimer's disease is still unclear.
The APOE gene encodes a protein that is a major constituent of lipoproteins, which are responsible for packaging
and transporting cholesterol and other fats through the bloodstream and cerebrospinal fluid. Defects in lipid and
cholesterol trafficking, or a defect in the production, aggregation or clearance of amyloid beta plaques are mechanisms
proposed to contribute to the role of the APOE protein in Alzheimer's disease risk.
It is important to note that the presence of an APOE ε4 allele only changes the risk of developing late-onset
Alzheimer's disease. Not all carriers of one or two APOE ε4 alleles will develop Alzheimer's disease, and not having any APOE
ε4 allele still confers the general population lifetime risk of developing late-onset Alzheimer's disease. This is due to other, as yet
unknown, risk factors.
10. Based on the function of the APOE protein, do you think there are any other diseases that may be associated with
defects in APOE?
APOE is located on chromosome 19. Two SNPs, rs429358 and rs7412, are used to determine the three APOE
variants ε2, ε3, and ε4. Below are two examples of how the SNPs are used to determine APOE variants. The
fourth variation is extremely rare and therefore not included.
Example 1: An individual that has the CT genotype for both SNPs is almost always ε2/ε4.
Example 2: An individual that has the CT genotype at rs429358 and CC genotype at rs7412 is almost always ε3/ε4.
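The two examples above follow from a simple lookup, sketched here in Python. The allele definitions in `APOE_ALLELES` are not stated in the text; they are the commonly reported nucleotide combinations at (rs429358, rs7412) and are consistent with both examples. The names `apoe_status` and `GENOTYPE_TO_APOE` are hypothetical, the alleles are written as e2, e3 and e4, and the rare fourth combination is left out, as in the text.

```python
from itertools import combinations_with_replacement

# Assumed definition of each common APOE allele by its nucleotides
# at (rs429358, rs7412).
APOE_ALLELES = {
    "e2": ("T", "T"),
    "e3": ("T", "C"),
    "e4": ("C", "C"),
}


def _unphased(nucleotide_a, nucleotide_b):
    """Order-independent genotype string, e.g. ('T', 'C') -> 'CT'."""
    return "".join(sorted(nucleotide_a + nucleotide_b))


# Map (rs429358 genotype, rs7412 genotype) -> pair of APOE alleles.
GENOTYPE_TO_APOE = {}
for (name_1, hap_1), (name_2, hap_2) in combinations_with_replacement(
    APOE_ALLELES.items(), 2
):
    key = (_unphased(hap_1[0], hap_2[0]), _unphased(hap_1[1], hap_2[1]))
    GENOTYPE_TO_APOE[key] = f"{name_1}/{name_2}"


def apoe_status(rs429358, rs7412):
    """Look up the most likely APOE allele pair from two unphased genotypes."""
    key = (_unphased(*rs429358.upper()), _unphased(*rs7412.upper()))
    return GENOTYPE_TO_APOE.get(key, "unresolved")


print(apoe_status("CT", "CT"))  # e2/e4
print(apoe_status("CT", "CC"))  # e3/e4
```

Genotype combinations that cannot be formed from the three common alleles are returned as "unresolved" rather than guessed.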
Asparagus Metabolite Detection
Some individuals are able to detect a strong odor in their urine after eating asparagus while other individuals are not.
The odor is due to an excreted sulfur-containing metabolite called methanethiol. It has yet to be determined whether
smelling the asparagus scent in urine is due to the ability to produce methanethiol or the ability to detect it. In any
case, there seems to be a genetic component to this strange trait. There is a strong association with a genetic marker,
rs4481887, where each copy of an A allele increases the likelihood of being able to smell asparagus in urine 1.67-fold
compared with individuals who are GG at this marker.
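The per-allele figure can be turned into a genotype-level comparison. The sketch below assumes that the effect of each A allele multiplies (so two copies give 1.67 × 1.67); the text gives only the per-copy value, so this multiplicative model and the function name `relative_likelihood` are assumptions.

```python
PER_ALLELE_FACTOR = 1.67  # from the text, relative to genotype GG


def relative_likelihood(genotype, factor=PER_ALLELE_FACTOR):
    """Likelihood relative to GG, assuming each A allele multiplies it by `factor`."""
    return factor ** genotype.upper().count("A")


for genotype in ("GG", "AG", "AA"):
    print(genotype, round(relative_likelihood(genotype), 2))
```

Under this assumption an AG individual is 1.67 times, and an AA individual about 2.79 times, as likely as a GG individual to detect the odor.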
Alcohol Flush Reaction
Alcohol can have an immediate, unpleasant reaction in some individuals. Common signs of alcohol intolerance are
skin flushing (redness), nasal congestion, headache, low blood pressure, nausea and vomiting. This condition is often
referred to as alcohol flush reaction and is caused by a genetic condition in which the body cannot break down alcohol. The
only way to prevent the reaction is avoidance. The reaction is due to variations in two genes that encode proteins for
breaking down alcohol in the bloodstream.
Alcohol flush reaction is determined by genotype in the genes ALDH2 and ADH1B. ALDH2 codes for an enzyme
called aldehyde dehydrogenase. This enzyme is involved in processing highly toxic acetaldehyde into harmless acetic acid.
Having an A allele at the SNP inactivates the enzyme and prevents the conversion of acetaldehyde to acetic acid. If an individual has the
AA genotype, the individual is highly sensitive to alcohol and the acetaldehyde is removed very slowly from the body.
The other gene is not tested; however, alcohol sensitivity may be altered even further depending on an individual's
genotype at ADH1B.
11. One possible treatment for alcoholics is the drug called disulfiram (Antabuse). Research the mechanism of action of
disulfiram and explain why it would be a potential treatment.
Part III - Lab Results
Below is a partial list of Josie's results showing the gene that was tested, the chromosomal location of each gene and the
genotype of the SNPs within that gene.
1. Using the information you learned about the various traits in Part II, fill in the final column of the chart above by
determining Josie's phenotype or risk based on her genotype. Explain how you interpreted the results.
a. What other factors will alter Josie's risk of developing Alzheimer's disease?
b. Based on Josie's genotype, what dose of warfarin would be best for her (high, low, or intermediate dose)?
c. One of Josie's favorite vegetables is asparagus. She noticed that she could detect a strong sulfur smell in her
urine after eating asparagus. Explain why she is able to do this after looking at her data.
d. By looking at the data, what does Josie's genotype tell you about her sensitivity to alcohol? Should this
information result in any behavior modifications?
2. Typically, genomic testing services group the traits they test for into categories such as (1) recessive disorders, (2)
drug response, (3) risk factors, and (4) non-health related traits. Consider the four traits mentioned in Part II
(Alzheimer's disease, warfarin sensitivity, asparagus metabolite detection, and alcohol flush reaction). In which
category does each trait belong?
3. Would you have liked to know your APOE status regardless of the results? Why or why not?
4. Do you think that Josie should be obligated to tell her family members the results? Why or why not? Would you
feel differently based on different APOE results? Would you feel differently if you were a carrier for a recessive
disease? If you had the allele for Huntington's disease, an autosomal dominant disorder characterized by progressive
Q1. Number of SNPs in the human genome.
The average frequency of SNPs is about 1 in 1,000 bp.
Human genome = 3x10^9 bp, hence the number of SNPs in the human genome = 3x10^9 / 10^3 = 3x10^6.
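The same estimate as a short calculation. The spacing of one SNP per 1,000 bp is the assumption used in this answer, and the variable names are illustrative.

```python
GENOME_LENGTH_BP = 3_000_000_000  # three billion base pairs, from question 1
BP_PER_SNP = 1_000                # assumed average spacing: 1 SNP per 1,000 bp

estimated_snps = GENOME_LENGTH_BP // BP_PER_SNP
print(f"{estimated_snps:.1e}")  # 3.0e+06
```

Changing `BP_PER_SNP` shows how sensitive the estimate is to the assumed SNP density.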
Q2. A SNP may have a variety of effects.
When present in an exon, a SNP might change the amino acid sequence and thus yield a protein that is more active, less effective, or inactive. It might leave the amino acid sequence unchanged and alter only the nucleotide sequence. It might introduce a premature stop codon so that a truncated protein is produced, or it might disrupt the gene so that a null allele or pseudogene results.
When present in an intron or an intergenic region, a SNP may affect gene regulation, DNA replication, the rate of transcription, etc. The SNP might also cause local topological changes in the DNA, affecting its properties.
Q3. DNA microarray: In this technique, an array of unique oligonucleotide probes, each present in multiple copies, is arranged in a defined pattern on a solid matrix such as a glass surface. The sequence at each spot is known and is stored in a computer. The genome of an organism is fragmented and labelled, and the fragments are then allowed to pair with the array. Complementary sequences pair with their corresponding spots, and the change in fluorescence is measured. From the computer database for that array, the sequence of the DNA bound at each spot is identified. Hence we can learn which sequences are present in the genome as well as their relative amounts (concentrations) based on the fluorescence change, which in turn depends on how many DNA fragments have bound to a particular spot. DNA microarrays of two individuals can reveal differences in their genome sequences, such as SNPs, mutations, deletions, etc. If performed with cDNAs isolated from an individual, a microarray can also reveal the relative abundance/expression of transcripts...
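The pairing step described above can be mimicked with a toy model. This is a simplified sketch under stated assumptions: the probes are hypothetical 6-nucleotide sequences (real probes are longer), a fragment binds only if it is the exact reverse complement of a probe (real hybridization tolerates some mismatch), and the count of bound fragments stands in for fluorescence intensity. All names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the strand that would base-pair with `sequence`."""
    return sequence.translate(COMPLEMENT)[::-1]


def hybridize(probes, fragments):
    """Count, per spot, the labelled fragments complementary to that spot's probe."""
    spot_by_target = {
        reverse_complement(probe): spot for spot, probe in probes.items()
    }
    signal = {spot: 0 for spot in probes}
    for fragment in fragments:
        spot = spot_by_target.get(fragment)
        if spot is not None:
            signal[spot] += 1
    return signal


# Two hypothetical spots that differ at a single nucleotide (a SNP).
probes = {"spot_T": "AATACG", "spot_C": "AACACG"}
fragments = ["CGTATT", "CGTATT", "CGTGTT", "GGGCCC"]

signal = hybridize(probes, fragments)
print(signal)  # {'spot_T': 2, 'spot_C': 1}
```

The spot with more bound fragments gives the stronger signal, and the last fragment binds nowhere because no probe is complementary to it.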