updates to chapter 1
cooplab committed Sep 12, 2016
1 parent e62aeb4 commit 1f5be1f
Showing 5 changed files with 140 additions and 83 deletions.
59 changes: 45 additions & 14 deletions chapter-01.tex
Expand Up @@ -2,7 +2,27 @@
\section{Allele and genotype frequencies}

\subsection{Allele frequencies}
Consider a diploid autosomal locus segregating at two alleles ($A_1$ and $A_2$). Let $N_{11}$ and $N_{12}$ be the number of $A_1A_1$ homozygotes and $A_1A_2$ heterozygotes, respectively. Moreover, let $N$ be the total number of diploid individuals in the population. We can then define the relative frequencies of $A_1A_1$ and $A_1A_2$ genotypes as $f_{11} = N_{11}/N$ and $f_{12} = N_{12}/N$, respectively. The frequency of allele $A_1$ in the population is then given by
Loci and alleles are the basic currency of population genetics, and
indeed of genetics. A locus (the singular of loci) is a specific location in the genome,
e.g. a particular base-pair position in a gene. An allele is the genetic
information (variant) carried at that position, for example an A-T
base pair. As DNA base pairs are complementary, it will suffice to say
that there is an A allele at this locus. All of the individuals in a
population (or a sample) may carry the same
genetic information at this locus, in which case we will say that the
locus is monomorphic. Alternatively, individuals may differ in the genetic
information they carry at a locus; for example, there may be an A
allele and a T allele at our locus. When multiple alleles are present at a locus,
the locus is said to be polymorphic. In our example, some individuals
will be homozygous for the A allele, some will be homozygous for the
T allele, and some will be heterozygotes
(A/T). \\


Consider a diploid autosomal locus segregating at two alleles ($A_1$
and $A_2$). We'll use these arbitrary labels for our alleles, merely
to keep this general. Let $N_{11}$ and $N_{12}$ be the number of $A_1A_1$ homozygotes and $A_1A_2$ heterozygotes, respectively. Moreover, let $N$ be the total number of diploid individuals in the population. We can then define the relative frequencies of $A_1A_1$ and $A_1A_2$ genotypes as $f_{11} = N_{11}/N$ and $f_{12} = N_{12}/N$, respectively. The frequency of allele $A_1$ in the population is then given by
\begin{equation}
p = \frac{2 N_{11} + N_{12}}{2N} = f_{11} + \frac{1}{2} f_{12}. % Modified by Simon
\end{equation}
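As a quick numerical check of this formula, consider a hypothetical sample (the counts are made up purely for illustration) of $N=100$ diploid individuals with $N_{11}=30$ and $N_{12}=50$, so that $f_{11}=0.3$ and $f_{12}=0.5$. Both ways of writing the allele frequency then agree:
\begin{equation}
p = \frac{2 \times 30 + 50}{2 \times 100} = \frac{110}{200} = 0.55, \qquad
p = 0.3 + \frac{1}{2} \times 0.5 = 0.55. \nonumber
\end{equation}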
Expand Down Expand Up @@ -80,7 +100,30 @@ \subsection{Hardy--Weinberg proportions}
%figure/QT1.eps


\subsection{Coefficient of kinship}
\subsection{Identity by Descent \& The Coefficient of Kinship}
All of the individuals in a population are related to each other by a giant pedigree (family
tree). For most pairs of individuals in a population these relationships are very
distant (e.g. distant cousins). However, some individuals will be
more closely related, e.g. siblings or first cousins.
Pairs of individuals therefore differ in their level of relatedness (kinship). Related individuals
share alleles, with closer relatives sharing more alleles. In
Figure \ref{fig:IBD_cousins_chr_cartoon} we show the sharing of
chromosomal regions between two cousins. As we
are interested in the genetics of populations,
we need some way to quantify the degree of kinship among individuals.\\

\begin{figure}
\begin{center}
\includegraphics[width= 0.5 \textwidth]{figures/Cousins_IBD_chromo_cartoon.png}
\end{center}
\caption{First cousins sharing a stretch of chromosome identical by
descent. The different grandparental diploid chromosomes are coloured so we
can track them and recombinations between them across the
generations. Notice that the identity by descent between the cousins persists for a long
stretch of chromosome due to the limited number of generations for
recombination.} \label{fig:IBD_cousins_chr_cartoon}
\end{figure}

We will define two alleles to be identical by descent (IBD) if they are
identical due to a common ancestor in the past few generations. For
the moment, we ignore mutation,
Expand Down Expand Up @@ -160,18 +203,6 @@ \subsection{Coefficient of kinship}
\end{question}


\begin{figure}
\begin{center}
\includegraphics[width= 0.5 \textwidth]{figures/Cousins_IBD_chromo_cartoon.png}
\end{center}
\caption{First cousins sharing a stretch of chromosome identical by
descent. The different grandparental diploid chromosomes are coloured so we
can track them and recombinations between them across the
generations. Notice that the identity by descent between the cousins persists for a long
stretch of chromosome due to the limited number of generations for
recombination.} \label{fig:IBD_cousins_chr_cartoon}
\end{figure}


\subsection{Inbreeding}
We can define an inbred individual as an individual whose parents are
27 changes: 26 additions & 1 deletion chapter-02.tex
Expand Up @@ -674,6 +674,23 @@ \subsection{The fixation of neutral alleles}
transmissions. Thus the average number of neutral mutational
differences separating our pair of species is simply $2\mu (1-C) T$.\\

\begin{question}
For this and the next question, assume that humans and chimps diverged
around $5.5\times 10^6$ years ago, that the generation time is $\sim$20 years, that the speciation occurred instantaneously in allopatry with no subsequent gene flow, and that the ancestral effective population size of the human and chimp common ancestor population was 10,000 individuals.\\
Nachman and Crowell sequenced 12 pseudogenes in human and chimp and found substitutions at 1.3\% of sites. \\
{\bf A)} What can you say about the mutation rate per site per generation at these genes, and how does it compare to other estimates of the human mutation rate?\\
{\bf B)} All of the pseudogenes they sequenced are on the autosomes. What
would your prediction be for pseudogenes on the X and Y chromosomes,
given that there are fewer rounds of replication in the female
germline than in the male germline?
\end{question}

\begin{question}
The gorilla lineage split from the human-chimp lineage $\sim$7 million years ago. Let's assume that this speciation event occurred instantaneously in allopatry with no subsequent gene flow. \\
{\bf A)} What is the probability that gorilla is not an outgroup to human and chimp at a single locus?\\
{\bf B)} It has been estimated that the gorilla lineage is not an outgroup at around 30\% of autosomal loci. What effective population size would you need to assume to explain this observation? Is that the only plausible explanation?\\
{\bf C)} The gorilla lineage is an outgroup for large portions of the X chromosome. What is a plausible explanation for this finding?
\end{question}


\subsection{Neutral diversity and population structure}
Expand Down Expand Up @@ -832,6 +849,14 @@ \subsection{Neutral diversity and population structure}
island. Therefore, treating our island as our sub-population, we have
derived another simple model of $F_{ST}$.


\begin{question}
You are investigating a small river population of sticklebacks, which receives infrequent migrants from a very large marine population. At a set of (putatively neutral) biallelic markers the freshwater population has allele frequencies:\\
0.2, 0.7, 0.8\\
At the same markers the marine population has frequencies:\\
0.4, 0.5, 0.7\\
From studying patterns of heterozygosity at a large collection of markers, you have estimated that the long-term effective size of your freshwater population is 2000 individuals.\\
What is your estimate of the migration rate from the marine
population into the river?
\end{question}
\newpage

107 changes: 54 additions & 53 deletions chapter-04.tex
Expand Up @@ -5,6 +5,7 @@ \section{The phenotypic resemblance between relatives}
quantitative phenotypes. We can then use this to understand the
evolutionary change in quantitative phenotypes in response to selection. \\

\subsection{A simple additive model of a trait}
Let's imagine that the genetic component of the variation in our trait
is controlled by $L$ autosomal loci that act in an additive manner. The frequency of allele $1$ at locus $l$ is $p_l$, with each copy of
allele $1$ at this locus increasing your trait value by $a_l$ above
Expand Down Expand Up @@ -84,7 +85,7 @@ \section{The phenotypic resemblance between relatives}
(i.e. the additive effect of the alleles that their genotype consists of).


\subsubsection{Additive genetic variance and heritability}
\subsection{Additive genetic variance and heritability}
As we are talking about an additive genetic model we'll talk about the
additive genetic variance ($V_A$), the variance due to the additive
effects of segregating genetic variation. This is a subset of the total genetic
Expand Down Expand Up @@ -144,7 +145,7 @@ \subsubsection{Additive genetic variance and heritability}
that relatives also share an environment, so may resemble each other
due to shared environmental effects. \\

\subsubsection{The covariance between relatives}
\subsection{The covariance between relatives}
So we'll go ahead and calculate the covariance in phenotype between two individuals
($1$ and $2$) who have phenotypes $X_1$ and $X_2$, respectively.
\begin{equation}
Expand Down Expand Up @@ -313,58 +314,8 @@ \subsubsection{The covariance between relatives}
This is sometimes called the ``animal model'', and also goes by the
name of variance components analysis.

\paragraph{Multiple traits.}

Traits often covary with each other, due to both environmentally
induced effects (e.g. due to the effects of diet on multiple traits)
and due to the expression of underlying genetic covariance between
traits. In turn this genetic covariance can reflect pleiotropy, a
mechanistic effect of an allele on multiple traits (e.g. variants that
affect skin pigmentation often affect hair color), or the genetic
linkage of loci independently affecting multiple traits. If we are
interested in evolution over short time-scales we can (often) ignore
the genetic basis of this correlation.

Consider two traits $X_{1,i}$ and $X_{2,i}$ in an individual $i$; these could be,
say, the individual's leg length and nose length. As before we can write
these as
\begin{eqnarray}
X_{1,i} &=& \mu_1 + X_{1,A,i} + X_{1,E,i} \nonumber \\
X_{2,i} &=& \mu_2 + X_{2,A,i} + X_{2,E,i} \nonumber
\end{eqnarray}
As before we can talk about the total phenotypic variance ($V_1,V_2$),
environmental variance ($V_{1,E}$ and $V_{2,E}$), and the additive genetic variance in trait one and two
($V_{1,A}$, $V_{2,A}$). But now we also have to consider the
total covariance $V_{1,2}=Cov(X_{1},X_{2})$, the environmentally induced covariance between the traits ($V_{E,1,2}=Cov(X_{1,E}
,X_{2,E} )$) and the additive genetic covariance ($V_{A,1,2}
=Cov(X_{1,A} ,X_{2,A} )$) between trait one and two.

We can store these values in matrices
\begin{equation}
\mathbf{V}= \left( \begin{array}{cc}
V_{1} & V_{1,2} \\
V_{1,2} & V_{2} \\
\end{array} \right) \label{P_matrix}
\end{equation}
and
\begin{equation}
\mathbf{G}= \left( \begin{array}{cc}
V_{1,A} & V_{A,1,2} \\
V_{A,1,2} & V_{2,A} \\
\end{array} \right) \label{G_matrix}
\end{equation}
We can generalize this to an arbitrary number of traits.

We can estimate these quantities, in a similar way to before, by
studying the covariance in different traits between relatives:

\begin{equation}
Cov(X_{1,i},X_{2,j}) = 2 F_{i,j} V_{A,1,2}
\end{equation}



\subsubsection{The response to selection}
\subsection{The response to selection}
Evolution by natural selection requires:
\begin{enumerate}
\item Variation in a phenotype
Expand Down Expand Up @@ -521,6 +472,56 @@ \subsubsection{The response to selection}

{\bf C)} What do you have to assume to perform the calculations in B? Assuming we only have fossils from the founding population and the population after 6000 years, should we assume that the calculations accurately reflect what actually occurred within our population?
\end{question}

\subsection{Multiple traits}

Traits often covary with each other, due to both environmentally
induced effects (e.g. due to the effects of diet on multiple traits)
and due to the expression of underlying genetic covariance between
traits. In turn this genetic covariance can reflect pleiotropy, a
mechanistic effect of an allele on multiple traits (e.g. variants that
affect skin pigmentation often affect hair color), or the genetic
linkage of loci independently affecting multiple traits. If we are
interested in evolution over short time-scales we can (often) ignore
the genetic basis of this correlation.

Consider two traits $X_{1,i}$ and $X_{2,i}$ in an individual $i$; these could be,
say, the individual's leg length and nose length. As before we can write
these as
\begin{eqnarray}
X_{1,i} &=& \mu_1 + X_{1,A,i} + X_{1,E,i} \nonumber \\
X_{2,i} &=& \mu_2 + X_{2,A,i} + X_{2,E,i} \nonumber
\end{eqnarray}
As before we can talk about the total phenotypic variance ($V_1,V_2$),
environmental variance ($V_{1,E}$ and $V_{2,E}$), and the additive genetic variance in trait one and two
($V_{1,A}$, $V_{2,A}$). But now we also have to consider the
total covariance $V_{1,2}=Cov(X_{1},X_{2})$, the environmentally induced covariance between the traits ($V_{E,1,2}=Cov(X_{1,E}
,X_{2,E} )$) and the additive genetic covariance ($V_{A,1,2}
=Cov(X_{1,A} ,X_{2,A} )$) between trait one and two.

We can store these values in matrices
\begin{equation}
\mathbf{V}= \left( \begin{array}{cc}
V_{1} & V_{1,2} \\
V_{1,2} & V_{2} \\
\end{array} \right) \label{P_matrix}
\end{equation}
and
\begin{equation}
\mathbf{G}= \left( \begin{array}{cc}
V_{1,A} & V_{A,1,2} \\
V_{A,1,2} & V_{2,A} \\
\end{array} \right) \label{G_matrix}
\end{equation}
We can generalize this to an arbitrary number of traits.

We can estimate these quantities, in a similar way to before, by
studying the covariance in different traits between relatives:

\begin{equation}
Cov(X_{1,i},X_{2,j}) = 2 F_{i,j} V_{A,1,2}
\end{equation}
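
To make this concrete, consider a parent $i$ and its child $j$, assuming a non-inbred pedigree so that their coefficient of kinship is $F_{i,j}=1/4$ (this pair is chosen purely as an illustration). The expression above then gives
\begin{equation}
Cov(X_{1,i},X_{2,j}) = 2 \times \frac{1}{4} \times V_{A,1,2} = \frac{1}{2} V_{A,1,2}, \nonumber
\end{equation}
so, under this additive model and ignoring any environment shared by relatives, doubling the observed covariance between trait one in parents and trait two in their children gives an estimate of the additive genetic covariance $V_{A,1,2}$.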

\paragraph{The response of multiple traits to selection, the
multivariate breeder's equation.}
We can generalize these results for multiple traits, to ask how selection on