From e29de382df79811b2bbac3a9d5faf0c659223d6a Mon Sep 17 00:00:00 2001
From: Carlos Scheidegger
Date: Thu, 7 Feb 2019 13:42:17 -0700
Subject: [PATCH] typos

---
 book/perc.tex | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/book/perc.tex b/book/perc.tex
index 0f85682..b3388f0 100644
--- a/book/perc.tex
+++ b/book/perc.tex
@@ -47,7 +47,7 @@ \section{Bio-inspired Learning}
 \concept{activations}). Based on how much these incoming neurons are
 firing, and how ``strong'' the neural connections are, our main
 neuron will ``decide'' how strongly it wants to fire. And so on through the
-whole brain. Learning in the brain happens by neurons becomming
+whole brain. Learning in the brain happens by neurons becoming
 connected to other neurons, and the strengths of connections adapting
 over time.
 
@@ -66,7 +66,7 @@ \section{Bio-inspired Learning}
 being a positive example and not firing is interpreted as being a
 negative example. In particular, if the weighted sum is positive, it
 ``fires'' and otherwise it doesn't fire. This is shown
-diagramatically in Figure~\ref{fig:perc:example}.
+diagrammatically in Figure~\ref{fig:perc:example}.
 
 Mathematically, an input vector $\vx = \langle x_1, x_2, \dots, x_D
 \rangle$ arrives. The neuron stores $D$-many weights, $w_1, w_2,
@@ -74,7 +74,7 @@ \section{Bio-inspired Learning}
 \begin{equation} \label{eq:perc:sum}
   a = \sum_{d=1}^D w_d x_d
 \end{equation}
-to determine it's amount of ``activation.'' If this activiation is
+to determine its amount of ``activation.'' If this activation is
 positive (i.e., $a > 0$) it predicts that this example is a positive
 example. Otherwise it predicts a negative example.
 
@@ -84,7 +84,7 @@ \section{Bio-inspired Learning}
 this feature. So features with zero weight are ignored. Features with
 positive weights are indicative of positive examples because they
 cause the activation to increase. Features with negative weights are
-indicative of negative examples because they cause the activiation to
+indicative of negative examples because they cause the activation to
 decrease.
 
 \thinkaboutit{What would happen if we encoded binary features like
@@ -264,7 +264,7 @@ \section{Error-Driven Updating: The Perceptron Algorithm}
 between $20\%$ and $50\%$ of your time, are there any cases in which
 you might \emph{not} want to permute the data every iteration?}
 
-\section{Geometric Intrepretation}
+\section{Geometric Interpretation}
 
 \begin{mathreview}{Dot Products}
   \parpic[r][t]{\includegraphics[width=1.5in]{figs/perc_dotprojection}}
@@ -343,7 +343,7 @@ \section{Geometric Intrepretation}
 projected onto $\vw$. Below, we can think of this as a
 one-dimensional version of the data, where each data point is placed
 according to its projection along $\vw$. This distance along $\vw$ is
-exactly the \emph{activiation} of that example, with no bias.
+exactly the \emph{activation} of that example, with no bias.
 
 From here, you can start thinking about the role of the bias term.
 Previously, the threshold would be at zero. Any example with a
@@ -545,7 +545,7 @@ \section{Perceptron Convergence and Linear Separability}
   after the \emph{first update}, and $\vw\kth$ the weight vector after
   the $k$th update. (We are essentially ignoring data points on which
   the perceptron doesn't update itself.) First, we will show that
-  $\dotp{\vw^*}{\vw\kth}$ grows quicky as a function of $k$. Second,
+  $\dotp{\vw^*}{\vw\kth}$ grows quickly as a function of $k$. Second,
   we will show that $\norm{\vw\kth}$ does not grow quickly.
 
   First, suppose that the $k$th update happens on example $(\vx,y)$.
@@ -698,7 +698,7 @@ \section{Improved Generalization: Voting and Averaging}
 \end{equation}
 The only difference between the voted prediction,
 Eq~\eqref{eq:perc:vote}, and the averaged prediction,
-Eq~\eqref{eq:perc:avg}, is the presense of the interior $\sign$
+Eq~\eqref{eq:perc:avg}, is the presence of the interior $\sign$
 operator. With a little bit of algebra, we can rewrite the test-time
 prediction as:
 \begin{equation}
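
Note (not part of the patch itself): the passages touched above describe the perceptron's activation, `a = sum_d w_d x_d` (eq:perc:sum), a sign-based prediction with a bias term, and error-driven updates on mistakes. A minimal sketch of that rule, for readers following along; all function names and the toy data here are illustrative, not from the book's code:

```python
def activation(w, x):
    """Weighted sum of inputs: a = sum_d w_d * x_d (eq:perc:sum, no bias)."""
    return sum(wd * xd for wd, xd in zip(w, x))

def predict(w, b, x):
    """Predict positive iff activation plus bias exceeds zero."""
    return 1 if activation(w, x) + b > 0 else -1

def perceptron_train(data, max_iter=10):
    """Error-driven updating: on a mistake (y * (a + b) <= 0),
    update w <- w + y*x and b <- b + y; otherwise do nothing."""
    D = len(data[0][0])
    w, b = [0.0] * D, 0.0
    for _ in range(max_iter):
        for x, y in data:
            if y * (activation(w, x) + b) <= 0:  # mistake on this example
                w = [wd + y * xd for wd, xd in zip(w, x)]
                b += y
    return w, b

# Linearly separable toy data: the label matches the sign of the first feature.
data = [([1.0, 0.5], 1), ([2.0, -1.0], 1),
        ([-1.5, 0.3], -1), ([-0.5, -0.2], -1)]
w, b = perceptron_train(data)
print([predict(w, b, x) for x, _ in data])  # → [1, 1, -1, -1]
```

On separable data like this, the updates stop once every example satisfies `y * (a + b) > 0`, which is the situation the convergence proof in the patched section analyzes.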