Commit

auto commit
chrispiech committed Nov 18, 2023
1 parent 5766d65 commit 3f0764e
Showing 22 changed files with 201 additions and 33 deletions.
4 changes: 2 additions & 2 deletions chapters/examples/mle_pareto/index.html
@@ -27,15 +27,15 @@

<p>Derive a formula for the MLE estimate of $\alpha$ based on the data you have collected.</p>

<h3>Writing the Log Likelihood Function</h3>

<p>The first major objective in MLE is to come up with a log likelihood expression for our data. To do so we start by writing how likely our dataset looks, if we are told the value of $\alpha$:
\begin{aligned}
L(\alpha) = f(x_1\dots x_n) = \prod_{i=1}^n\frac{\alpha}{x_i^{\alpha+1}}
\end{aligned}
</p>

<p>Optimization will be much easier if we instead try to optimize the log likelihood:
\begin{aligned}
LL(\alpha)
&= \log L(\alpha) = \log \prod_{i=1}^n\frac{\alpha}{x_i^{\alpha+1}} \\
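Where the derivation is truncated above, a numerical sketch can illustrate the idea. Assuming a small, hypothetical sample from a Pareto distribution (with minimum value 1 — both the sample and the grid bounds are our own assumptions, not from the text), the log likelihood $LL(\alpha)$ can be evaluated and maximized by grid search:

```python
import math

# Hypothetical sample, assumed drawn from a Pareto distribution with x_min = 1
samples = [1.3, 2.1, 1.1, 4.5, 1.8]

def log_likelihood(alpha, xs):
    # LL(alpha) = sum over i of [log(alpha) - (alpha + 1) * log(x_i)]
    return sum(math.log(alpha) - (alpha + 1) * math.log(x) for x in xs)

# The MLE is the alpha that maximizes LL(alpha); a coarse grid search shows it
alphas = [a / 100 for a in range(1, 501)]
alpha_hat = max(alphas, key=lambda a: log_likelihood(a, samples))
print(round(alpha_hat, 2))
```

The grid maximizer lands (to within the grid spacing) on the value the closed-form derivation produces for this sample.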
19 changes: 18 additions & 1 deletion chapters/part2/categorical/index.html
@@ -4,4 +4,21 @@
<center><h1>Categorical Distributions</h1></center>
<hr/>

<p>The Categorical Distribution is a fancy name for random variables that take on values <i><b>other than numbers</b></i>. As an example, imagine a random variable for the weather today. A natural representation for the weather is one of a few categories: {sunny, cloudy, rainy, snowy}. Unlike in past examples, these values are not integers or real-valued numbers! Are we allowed to continue? Sure! We can represent this random variable as $X$, where $X$ is a categorical random variable.</p>

<p>There isn't much that you need to know about Categorical distributions. They work the way you might expect. To provide the Probability Mass Function (PMF) for a categorical random variable, you just need to provide the probability of each category. For example, if $X$ is the weather today, then the PMF should associate all the values that $X$ could take on, with the probability that $X$ takes on those values. Here is an example PMF for the weather Categorical:</p>
<table class="table">
	<thead>
		<tr>
			<th>Weather Value</th>
			<th>Probability</th>
		</tr>
	</thead>
	<tbody>
		<tr><td>Sunny</td><td>$\p(X = \text{Sunny}) = 0.49$</td></tr>
		<tr><td>Cloudy</td><td>$\p(X = \text{Cloudy}) = 0.30$</td></tr>
		<tr><td>Rainy</td><td>$\p(X = \text{Rainy}) = 0.20$</td></tr>
		<tr><td>Snowy</td><td>$\p(X = \text{Snowy}) = 0.01$</td></tr>
	</tbody>
</table>

<p>Notice that the probabilities must sum to 1.0. This is because (in this version) the weather must be one of the four categories. Since the values are not numeric, this random variable does <b>not</b> have an expectation or a variance, and its PMF is expressed as a table rather than as a function.</p>
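A minimal sketch of this table-style PMF in Python, using the probabilities from the table above (storing the table as a dictionary is our own representation choice):

```python
import random

# PMF of the weather Categorical, stored as a table (dictionary)
pmf = {"Sunny": 0.49, "Cloudy": 0.30, "Rainy": 0.20, "Snowy": 0.01}

# The probabilities over all categories must sum to 1.0
assert abs(sum(pmf.values()) - 1.0) < 1e-9

# Sampling a value of X according to the PMF
weather = random.choices(list(pmf), weights=pmf.values(), k=1)[0]
print(weather)  # one of "Sunny", "Cloudy", "Rainy", "Snowy"
```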

<p>Note to your future self: A categorical distribution is a simplified version of a <a href="{{pathToRoot}}part3/multinomial/">multinomial distribution</a> (one where the number of experiments $n$ is 1).</p>
85 changes: 74 additions & 11 deletions chapters/part3/multinomial/index.html
@@ -4,24 +4,87 @@
<center><h1>Multinomial</h1></center>
<hr/>

<p>The multinomial is an example of a <i>parametric</i> distribution for multiple random variables, and a gentle introduction to joint distributions. It is an extension of the binomial. In both cases, you have $n$ independent experiments. In a binomial each outcome is a "success" or "not success". In a multinomial there can be more than two outcomes (hence "multi"). A great analogy for the multinomial: we are going to roll an $m$-sided die $n$ times, and we care about reporting the number of times each side of the die comes up.</p>

<p>Here is the formal definition of the multinomial. Say you perform $n$ independent trials of an experiment where each trial results in one of $m$ outcomes, with respective probabilities $p_1, p_2, \dots, p_m$ (constrained so that $\sum_i p_i = 1$). Define $X_i$ to be the number of trials with outcome $i$. A multinomial distribution is a closed-form function that answers the question: What is the probability that there are $c_i$ trials with outcome $i$? Mathematically:
\begin{align*}
P(X_1=c_1,X_2 = c_2, \dots , X_m = c_m) &= { {n} \choose {c_1,c_2,\dots , c_m} }\cdot p_1^{c_1} \cdot p_2^{c_2}\dots p_m^{c_m} \\
&= { {n} \choose {c_1,c_2,\dots , c_m} }\cdot \prod_i p_i^{c_i}
\end{align*}
</p>

<p>This is our first joint random variable model! We can express it in a card, much like we would for random variables:</p>

<div class="bordered">
<p><b>Multinomial Joint Distribution</b></p>


<table>
<tbody class="rvCardBody">
<!-- <tr>
<th style="width:150px">Notation:</td>
<td>$X \sim \Bin(n, p)$</td>
</tr> -->
<tr>
<th>Description:</th>
<td>Number of outcomes of each possible outcome type in $n$ identical, independent experiments. Each experiment can result in one of $m$ different outcomes.</td>
</tr>
<tr>
<th>Parameters:</th>

<td>$p_1, \dots, p_m$ where each $p_i \in [0,1]$ is the probability of outcome type $i$ in one experiment.<br/>$n \in \{0, 1, \dots\}$, the number of experiments</td>
<!-- <td>$n \in \{0, 1, \dots\}$, the number of experiments.<br/>$p \in [0, 1]$, the probability that a single experiment gives a "success".</td> -->
</tr>


<tr>
<th>Support:</th>
<td>$c_i \in \{0, 1, \dots, n\}$, for each outcome $i$. It must be the case that $\sum_i c_i = n$.</td>
</tr>
<tr>
<th>PMF equation:</th>
<td class="mathLeft">\begin{align*}
P(X_1=c_1,X_2 = c_2, \dots , X_m = c_m) = { {n} \choose {c_1,c_2,\dots , c_m} } \prod_i p_i^{c_i}
\end{align*}</td>
</tr>

</tbody>
</table>
</div>
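The PMF equation in the card can be sketched in Python using only the standard library (the function name `multinomial_pmf` and the small example values are our own, not from the text):

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """P(X_1 = c_1, ..., X_m = c_m) with n = sum(counts) independent trials."""
    n = sum(counts)
    # Multinomial coefficient: n! / (c_1! * c_2! * ... * c_m!)
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)
    # Multiply by p_i ** c_i for every outcome type
    return coefficient * prod(p ** c for c, p in zip(counts, probs))

# Small illustration: n = 2 trials, m = 3 outcome types
print(multinomial_pmf([1, 1, 0], [0.5, 0.3, 0.2]))  # 2 * 0.5 * 0.3 = 0.3
```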



<h3>Examples</h3>


<div class="purpleBox">
<b><i>Standard Dice Example:</i></b> A 6-sided die is rolled 7 times. What is the probability that you roll: 1 one, 1 two, 0 threes, 2 fours, 0 fives, and 3 sixes (disregarding order)?
\begin{align*}
\P(X_1=1,X_2 = 1&, X_3 = 0,X_4 = 2,X_5 = 0,X_6 = 3) \\&= \frac{7!}{2!3!}\left(\frac{1}{6}\right)^1\left(\frac{1}{6}\right)^1\left(\frac{1}{6}\right)^0\left(\frac{1}{6}\right)^2\left(\frac{1}{6}\right)^0\left(\frac{1}{6}\right)^3\\
&=420\left(\frac{1}{6}\right)^7
\end{align*}
</div>
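As a quick sanity check (a sketch, not part of the original text), the multinomial coefficient and probability from the dice example can be computed directly:

```python
from math import factorial

# Counts for the dice example: 1 one, 1 two, 0 threes, 2 fours, 0 fives, 3 sixes
counts = [1, 1, 0, 2, 0, 3]
coefficient = factorial(7)
for c in counts:
    coefficient //= factorial(c)   # 7! / (1! 1! 0! 2! 0! 3!)
probability = coefficient * (1 / 6) ** 7

print(coefficient)  # 420, matching the text
```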

<div class="purpleBox">
<b><i>Weather Example:</i></b>

<p>Each day the weather in Bayeslandia can be {Sunny, Cloudy, Rainy}, where $p_\text{sunny} = 0.7$, $p_\text{cloudy} = 0.2$ and $p_\text{rainy} = 0.1$. Assume each day is independent of the others. What is the probability that over the next 7 days we have 6 sunny days, 1 cloudy day and 0 rainy days?
\begin{align*}
\P(X_{\text{sunny}}=6,X_{\text{cloudy}} = 1&, X_{\text{rainy}} = 0) \\
&= \frac{7!}{6!\,1!\,0!} (0.7)^6 \cdot (0.2)^1 \cdot (0.1)^0 \\
&\approx 0.16
\end{align*}
</p>

<p>How does that compare to the probability that every day is sunny?

\begin{align*}
\P(X_{\text{sunny}}=7,X_{\text{cloudy}} = 0&, X_{\text{rainy}} = 0) \\
&= \frac{7!}{7!\,0!\,0!} (0.7)^7 \cdot (0.2)^0 \cdot (0.1)^0 \\
&\approx 0.08
\end{align*}
</p>
</div>
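The two weather probabilities above can be checked numerically (a sketch using the probabilities assumed in the example):

```python
from math import factorial

p_sunny, p_cloudy, p_rainy = 0.7, 0.2, 0.1

# P(6 sunny, 1 cloudy, 0 rainy) over 7 independent days
p_mixed = (factorial(7) // (factorial(6) * factorial(1) * factorial(0))) \
    * p_sunny ** 6 * p_cloudy ** 1 * p_rainy ** 0

# P(all 7 days sunny); the multinomial coefficient is 1 here
p_all_sunny = p_sunny ** 7

print(round(p_mixed, 2), round(p_all_sunny, 2))  # 0.16 0.08
```

Even though 6-sunny-1-cloudy is the more likely of the two, note that both probabilities depend on the same independence assumption across days.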

<p>The multinomial is especially popular because of its use as a model of language. For a full example see the <a href="{{pathToLang}}examples/federalist">Federalist Paper Authorship</a> example.</p>
Binary file modified en/ProbabilityForComputerScientists.pdf
4 changes: 2 additions & 2 deletions en/examples/mle_pareto/index.html
@@ -379,15 +379,15 @@

<p>Derive a formula for the MLE estimate of $\alpha$ based on the data you have collected.</p>

<h3>Writing the Log Likelihood Function</h3>

<p>The first major objective in MLE is to come up with a log likelihood expression for our data. To do so we start by writing how likely our dataset looks, if we are told the value of $\alpha$:
\begin{aligned}
L(\alpha) = f(x_1\dots x_n) = \prod_{i=1}^n\frac{\alpha}{x_i^{\alpha+1}}
\end{aligned}
</p>

<p>Optimization will be much easier if we instead try to optimize the log likelihood:
\begin{aligned}
LL(\alpha)
&= \log L(\alpha) = \log \prod_{i=1}^n\frac{\alpha}{x_i^{\alpha+1}} \\
19 changes: 19 additions & 0 deletions en/part2/categorical/index.html
@@ -356,6 +356,25 @@
<center><h1>Categorical Distributions</h1></center>
<hr/>

<p>The Categorical Distribution is a fancy name for random variables that take on values <i><b>other than numbers</b></i>. As an example, imagine a random variable for the weather today. A natural representation for the weather is one of a few categories: {sunny, cloudy, rainy, snowy}. Unlike in past examples, these values are not integers or real-valued numbers! Are we allowed to continue? Sure! We can represent this random variable as $X$, where $X$ is a categorical random variable.</p>

<p>There isn't much that you need to know about Categorical distributions. They work the way you might expect. To provide the Probability Mass Function (PMF) for a categorical random variable, you just need to provide the probability of each category. For example, if $X$ is the weather today, then the PMF should associate all the values that $X$ could take on, with the probability that $X$ takes on those values. Here is an example PMF for the weather Categorical:</p>
<table class="table">
	<thead>
		<tr>
			<th>Weather Value</th>
			<th>Probability</th>
		</tr>
	</thead>
	<tbody>
		<tr><td>Sunny</td><td>$\p(X = \text{Sunny}) = 0.49$</td></tr>
		<tr><td>Cloudy</td><td>$\p(X = \text{Cloudy}) = 0.30$</td></tr>
		<tr><td>Rainy</td><td>$\p(X = \text{Rainy}) = 0.20$</td></tr>
		<tr><td>Snowy</td><td>$\p(X = \text{Snowy}) = 0.01$</td></tr>
	</tbody>
</table>

<p>Notice that the probabilities must sum to 1.0. This is because (in this version) the weather must be one of the four categories. Since the values are not numeric, this random variable does <b>not</b> have an expectation or a variance, and its PMF is expressed as a table rather than as a function.</p>

<p>Note to your future self: A categorical distribution is a simplified version of a <a href="../../../part3/multinomial/">multinomial distribution</a> (one where the number of experiments $n$ is 1).</p>




2 changes: 2 additions & 0 deletions en/part2/more_discrete/index.html
@@ -356,6 +356,8 @@
<center><h1>More Discrete Distributions</h1></center>
<hr/>

<p>Stub: Chapter coming soon!</p>

<p>
<!-- Template that renders a single random variable card -->

4 changes: 4 additions & 0 deletions en/part3/marginalization/index.html
@@ -455,6 +455,10 @@ <h3>Example: Favorite Number</h3>

</p>

<h3>Marginalization with More Variables</h3>

<p>Stub: Section coming soon!</p>




85 changes: 74 additions & 11 deletions en/part3/multinomial/index.html
@@ -356,26 +356,89 @@
<center><h1>Multinomial</h1></center>
<hr/>

<p>The multinomial is an example of a <i>parametric</i> distribution for multiple random variables, and a gentle introduction to joint distributions. It is an extension of the binomial. In both cases, you have $n$ independent experiments. In a binomial each outcome is a "success" or "not success". In a multinomial there can be more than two outcomes (hence "multi"). A great analogy for the multinomial: we are going to roll an $m$-sided die $n$ times, and we care about reporting the number of times each side of the die comes up.</p>

<p>Here is the formal definition of the multinomial. Say you perform $n$ independent trials of an experiment where each trial results in one of $m$ outcomes, with respective probabilities $p_1, p_2, \dots, p_m$ (constrained so that $\sum_i p_i = 1$). Define $X_i$ to be the number of trials with outcome $i$. A multinomial distribution is a closed-form function that answers the question: What is the probability that there are $c_i$ trials with outcome $i$? Mathematically:
\begin{align*}
P(X_1=c_1,X_2 = c_2, \dots , X_m = c_m) &= { {n} \choose {c_1,c_2,\dots , c_m} }\cdot p_1^{c_1} \cdot p_2^{c_2}\dots p_m^{c_m} \\
&= { {n} \choose {c_1,c_2,\dots , c_m} }\cdot \prod_i p_i^{c_i}
\end{align*}
</p>

<p>This is our first joint random variable model! We can express it in a card, much like we would for random variables:</p>

<div class="bordered">
<p><b>Multinomial Joint Distribution</b></p>


<table>
<tbody class="rvCardBody">
<!-- <tr>
<th style="width:150px">Notation:</td>
<td>$X \sim \Bin(n, p)$</td>
</tr> -->
<tr>
<th>Description:</th>
<td>Number of outcomes of each possible outcome type in $n$ identical, independent experiments. Each experiment can result in one of $m$ different outcomes.</td>
</tr>
<tr>
<th>Parameters:</th>

<td>$p_1, \dots, p_m$ where each $p_i \in [0,1]$ is the probability of outcome type $i$ in one experiment.<br/>$n \in \{0, 1, \dots\}$, the number of experiments</td>
<!-- <td>$n \in \{0, 1, \dots\}$, the number of experiments.<br/>$p \in [0, 1]$, the probability that a single experiment gives a "success".</td> -->
</tr>


<tr>
<th>Support:</th>
<td>$c_i \in \{0, 1, \dots, n\}$, for each outcome $i$. It must be the case that $\sum_i c_i = n$.</td>
</tr>
<tr>
<th>PMF equation:</th>
<td class="mathLeft">\begin{align*}
P(X_1=c_1,X_2 = c_2, \dots , X_m = c_m) = { {n} \choose {c_1,c_2,\dots , c_m} } \prod_i p_i^{c_i}
\end{align*}</td>
</tr>

</tbody>
</table>
</div>



<h3>Examples</h3>


<div class="purpleBox">
<b><i>Standard Dice Example:</i></b> A 6-sided die is rolled 7 times. What is the probability that you roll: 1 one, 1 two, 0 threes, 2 fours, 0 fives, and 3 sixes (disregarding order)?
\begin{align*}
\P(X_1=1,X_2 = 1&, X_3 = 0,X_4 = 2,X_5 = 0,X_6 = 3) \\&= \frac{7!}{2!3!}\left(\frac{1}{6}\right)^1\left(\frac{1}{6}\right)^1\left(\frac{1}{6}\right)^0\left(\frac{1}{6}\right)^2\left(\frac{1}{6}\right)^0\left(\frac{1}{6}\right)^3\\
&=420\left(\frac{1}{6}\right)^7
\end{align*}
</div>

<div class="purpleBox">
<b><i>Weather Example:</i></b>

<p>Each day the weather in Bayeslandia can be {Sunny, Cloudy, Rainy}, where $p_\text{sunny} = 0.7$, $p_\text{cloudy} = 0.2$ and $p_\text{rainy} = 0.1$. Assume each day is independent of the others. What is the probability that over the next 7 days we have 6 sunny days, 1 cloudy day and 0 rainy days?
\begin{align*}
\P(X_{\text{sunny}}=6,X_{\text{cloudy}} = 1&, X_{\text{rainy}} = 0) \\
&= \frac{7!}{6!\,1!\,0!} (0.7)^6 \cdot (0.2)^1 \cdot (0.1)^0 \\
&\approx 0.16
\end{align*}
</p>

<p>How does that compare to the probability that every day is sunny?

\begin{align*}
\P(X_{\text{sunny}}=7,X_{\text{cloudy}} = 0&, X_{\text{rainy}} = 0) \\
&= \frac{7!}{7!\,0!\,0!} (0.7)^7 \cdot (0.2)^0 \cdot (0.1)^0 \\
&\approx 0.08
\end{align*}
</p>
</div>

<p>The multinomial is especially popular because of its use as a model of language. For a full example see the <a href="../../examples/federalist">Federalist Paper Authorship</a> example.</p>


10 changes: 5 additions & 5 deletions print/hash_values.json
@@ -43,7 +43,7 @@
"../en/examples/curse_of_dimensionality/index.html": "fac70cbb46786f8792da2a807697b8a8",
"../en/examples/algorithmic_art/index.html": "107ed32a10cd20d14999fbf543f94b44",
"../en/part3/joint/index.html": "dbffbdcff9636a82dd909bb1466e9b41",
"../en/part3/multinomial/index.html": "8463e9a1ef3066c79e7386c3817fb770",
"../en/part3/multinomial/index.html": "c581d6289185d73a6ff6651ffa39c8f8",
"../en/part3/continuous_joint/index.html": "8468ebeaa8c44eb7c2059cff4764cca5",
"../en/part3/inference/index.html": "8c26b53e77d8fdbff3379332ab9a4c04",
"../en/part3/bayesian_networks/index.html": "4e4c5711fbb7fa358080161e26612989",
@@ -76,10 +76,10 @@
"../en/part5/naive_bayes/index.html": "1acd433f159bf1c94b1b7de2abdc2ddc",
"../en/part5/log_regression/index.html": "bff4bfd7290ffce644df5d9f9fbf076d",
"../en/examples/mle_demo/index.html": "b3161db15c9f254fe419a1125f0b3cbd",
"../en/examples/mle_pareto/index.html": "72d8ca58d5c1e9f088517acfa1893972",
"../en/examples/mle_pareto/index.html": "1456ee37e433b2996496a0370c7f9380",
"../en/examples/mixture_models/index.html": "dee0c5dfca1250ce1820aea33b02397e",
"../en/part2/geometric/index.html": "c0fba1c8af6bfea7717ffecd3428a3f4",
"../en/part3/marginalization/index.html": "a4b4dcbd6812fd335903de5c26a03dbe",
"../en/part2/more_discrete/index.html": "6271b236252bed6fc6067cfacf34c4de",
"../en/part2/categorical/index.html": "e676baa3874a8783ef49f89497fbb7ef"
"../en/part3/marginalization/index.html": "2c30e95538f2b7fe217c3d0991e392b5",
"../en/part2/more_discrete/index.html": "bca8c30051815138af7aaa63327a9507",
"../en/part2/categorical/index.html": "1a1ccbcbbef54d12377ff1e789a4268f"
}
Binary file modified print/pdfs/categorical.pdf
Binary file modified print/pdfs/marginalization.pdf
Binary file modified print/pdfs/mle_pareto.pdf
Binary file modified print/pdfs/more_discrete.pdf
Binary file modified print/pdfs/multinomial.pdf
Binary file modified print/pdfs/separator_intro.pdf
Binary file modified print/pdfs/separator_part1.pdf
Binary file modified print/pdfs/separator_part2.pdf
Binary file modified print/pdfs/separator_part3.pdf
Binary file modified print/pdfs/separator_part4.pdf
Binary file modified print/pdfs/separator_part5.pdf
2 changes: 1 addition & 1 deletion searchIndex.json

