d5748a460e4b9280d495c1ced002870572fe360f — Martin Hafskjold Thoresen 5 years ago
Fix label spacing and replace $$ with $

1 file changed, 19 insertions(+), 21 deletions(-)

M book.tex

book.tex => book.tex  +19 -21

@@ -74,7 +74,7 @@ However, encoding a totally random function takes O(u \log m) space.
This is a problem since n = U, which is large.
Therefore we settle on a family \mathcal{H} of hash functions of small size which is \emph{Universal}.
Universality means that the probability of two non-equal keys having the same hash value is O(1/m):
-$$\forall x \neq y\ P_{h \in \mathcal{H}}[h(x) = h(y)] = O(1/m)$$
+$\forall x \neq y\ P_{h \in \mathcal{H}}[h(x) = h(y)] = O(1/m)$
\begin{example}
h(x) = ((a x) \mod p) \mod m, where 0 < a < p, p is prime, and p > m, is a universal hash function.

@@ -82,7 +82,7 @@ $$\forall x \neq y\ P_{h \in \mathcal{H}}[h(x) = h(y)] = O(1/m)$$
\begin{definition}{k-independent}
A family of hash functions \mathcal{H} is k-independent if
-$$\forall x_1, \dots x_k\ \Pr[\bigwedge\limits_{i} h(x_i) = t_i] = O(1/m^k)$$
+$\forall x_1, \dots x_k\ \Pr[\bigwedge\limits_{i} h(x_i) = t_i] = O(1/m^k)$
\end{definition}

\begin{example}

@@ -100,7 +100,7 @@ Tabulation hashing is a hashing scheme.
We view x as a vector of bit blocks x_1, \dots, x_c, and use a totally random hash table on \emph{each} block.
Then we \texttt{xor} the blocks together to get the final value:
-$$h(x) = T_1(x_1) \oplus T_2(x_2) \oplus \dots \oplus T_c(x_c)$$
+$h(x) = T_1(x_1) \oplus T_2(x_2) \oplus \dots \oplus T_c(x_c)$
Since a block is (\log u)/c bits, the universe for one block is of size u^{1/c}.
One table is of size u^{1/c} (we assume the hash values are machine words, as in~\cite{Patrascu:2011:PST:1993636.1993638}), so the total space used is O(cu^{1/c}).
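The tabulation scheme in the hunk above is easy to sketch. A minimal Python illustration, assuming 32-bit keys split into four 8-bit blocks; the constants and table names are illustrative, not from the text:

```python
import random

C = 4           # number of blocks c
BLOCK_BITS = 8  # bits per block, so keys are C * BLOCK_BITS = 32 bits
M_BITS = 16     # hash values are 16-bit words

random.seed(0)
# One totally random table per block, each of size 2^BLOCK_BITS = u^(1/c).
tables = [[random.getrandbits(M_BITS) for _ in range(1 << BLOCK_BITS)]
          for _ in range(C)]

def tab_hash(x):
    """h(x) = T_1(x_1) xor T_2(x_2) xor ... xor T_c(x_c)."""
    h = 0
    for i in range(C):
        block = (x >> (i * BLOCK_BITS)) & ((1 << BLOCK_BITS) - 1)
        h ^= tables[i][block]
    return h
```

The total space is c tables of u^{1/c} machine words each, matching the O(cu^{1/c}) bound quoted above.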
@@ -121,7 +121,7 @@ so \E[C_t] = O(1), for m = \Omega(n)
Since we need to traverse the list for all operations, the cost is quadratic in C_t, so we are interested in \E[C_t^2]:
-$$ \E[C_t^2] = 1/m \sum_{s \in [m]} E[C_s^2] = 1/m \sum_{i \neq j} \Pr[h(x_i) = h(x_j)] $$
+$\E[C_t^2] = 1/m \sum_{s \in [m]} \E[C_s^2] = 1/m \sum_{i \neq j} \Pr[h(x_i) = h(x_j)]$
If we have universal hashing this is \frac{1}{m} n^2 O(\frac{1}{m}) = O(n^2/m^2) = O(1) for m = \Omega(n).
With a totally random hash function, C_t = O(\frac{\log n}{\log\log n}) with probability 1 - 1/n^c for any c.
This also holds for simple tabulation hashing.

@@ -171,7 +171,7 @@ If the hashing functions are fully random or \log n-wise independent we get O
-\chapter{Static Tree Queries}
+\chapter{Static Tree Queries}%
\label{ch:statictrees}

\topics{Range Minimum Query (\code{RMQ}{i, j}), Lowest Common Ancestor (\code{LCA}{x, y}), Level Ancestor (\code{LA}{x, n}), Range Queries, Range Trees.}

@@ -216,7 +216,7 @@ in the array. However, since we are not interested in the actual minimum, but on
the two arrays act exactly the same.
A consequence of this is that we get \emph{Universe reduction}, where we have mapped the universe \mathcal{U} to [n-1].
-\section{Constant time \LCA{} (\RMQ{})}
+\section{Constant time \LCA{} (\RMQ{})}%
\label{sec:constant-lca}

\subsubsection{Step 1: Reduction}

@@ -312,7 +312,7 @@ Since all nodes have constant access to a leaf node (by its ladder, of which the
only leaves need to store the jump pointers.
In other words, we make all queries start at leaves.
-\subsubsection{Step 6: Leaf Trimming}
+\subsubsection{Step 6: Leaf Trimming}%
\label{sec:first-leaf-trim}

\begin{definition}{Maximally Deep Node}
A node with \geq \frac{1}{4} \log n descendants.
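The chaining analysis in the hashing chapter above (expected chain length n/m under a universal family) can be checked empirically. A small sketch using the h(x) = ((ax) mod p) mod m family from the earlier example; the parameters and key set are illustrative:

```python
import random

p = 2**31 - 1   # prime, p > m
m = 1024        # number of buckets, m = Omega(n)
n = 1024

random.seed(1)
a = random.randrange(1, p)  # 0 < a < p

def h(x):
    """Universal hash from the family h(x) = ((a*x) mod p) mod m."""
    return ((a * x) % p) % m

keys = random.sample(range(10**6), n)
chains = [[] for _ in range(m)]
for k in keys:
    chains[h(k)].append(k)  # collisions are chained in a list per bucket

sizes = [len(c) for c in chains]
avg = sum(sizes) / m        # the average load is always exactly n/m
```

The average chain length is deterministically n/m; universality is what keeps the *squared* chain lengths, and hence the traversal cost, at O(1) expected for m = Omega(n).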
@@ -435,7 +435,7 @@ Table~\ref{tab:trie-node-repr} sums up all approaches, with query time and space
4 & Weight-balanced BST & O(P + \log k) & O(T)\\
5 & Leaf Trimming + (1) + (4) & O(P + \log\Sigma) & O(T)
\end{tabular}
-  \caption{Table over node representation, query time, and space requirement.}
+  \caption{Table over node representation, query time, and space requirement.}%
\label{tab:trie-node-repr}
\end{table}

@@ -512,7 +512,7 @@ This allows for faster searching, as we know how much of the adjacent suffix mat
5 & \texttt{a\$}\\
6 & \texttt{\$}
\end{tabular}
-  \caption{The suffixes of T
+  \caption{The suffixes of T%
\label{tab:suffixes}}
\end{subfigure}
\begin{subfigure}{0.45\textwidth}

@@ -527,7 +527,7 @@ This allows for faster searching, as we know how much of the adjacent suffix mat
4 & \texttt{na\$} & 2 \\
2 & \texttt{nana\$} & -
\end{tabular}
-  \caption{The Suffix Array of T.
+  \caption{The Suffix Array of T.%
\label{tab:suffix-array}
}
\end{subfigure}

@@ -586,7 +586,7 @@ Note that the number of suffixes when recursing is O(2/3) of what we started w
\subsubsection{Step 5}
Now we want to use the suffix array just obtained to radix sort the suffixes in T_2.
-Note that$$T_2[i..] = T[3i+2..] = \big<T[3i+2],\ T[3i+3..]\big> = \big<T[3i+2],\ T_0[i+1..]\big>.$$
+Note that $T_2[i..] = T[3i+2..] = \big<T[3i+2],\ T[3i+3..]\big> = \big<T[3i+2],\ T_0[i+1..]\big>.$
We can take off the first letter of the suffix, and get a suffix whose sorted order we already know, since it is in T_0, which we sorted in Step 4.
This is like radix sort, but with only two values, both of which can be compared in O(1).
This allows us to sort T_2.

@@ -654,7 +654,7 @@ We can use the \emph{Potential Method} for amortized analysis.
Remember that \emph{Amortized cost} is actual cost + change in potential.
Let \Phi = c \cdot \Sigma, where \Sigma is the number of modifications in all nodes' latest versions.
Observe that when we make a change, the amortized cost is
-$$\leq c + c\ [- 2cp + p \cdot \text{recursions}]\text{?}$$
+$\leq c + c\ [- 2cp + p \cdot \text{recursions}]\text{?}$
The first c is for computation, the second c is the added potential, since we just added one modification to a node,
and [-2cp + p \cdot \text{recursions}]\text{?} are for the recursions, if there is one.

@@ -733,7 +733,7 @@ If we want to delete the +3, we can add a -3 at the end.
\subsection{Full Retroactivity}
What do we need for full retroactivity?
-\begin{definition}{Decomposable Search Problem}
+\begin{definition}{Decomposable Search Problem}%
\label{sec:dsp}
\text{query}(x, A\cup B) = f(\text{query}(x, A), \text{query}(x, B))
\end{definition}

@@ -768,7 +768,7 @@ To see why this is the case, consider a very simple computer with two registers
X=x, Y\mathrel{+}=\Delta, and Y = XY.
On query, the machine returns Y, and all operations are O(1).
The operation sequence
-$$\big<Y \mathrel{+}= a_n,\ Y=XY,\ Y\mathrel{+}= a_{n-1},\ \dots,\ Y\mathrel{+}= a_0\big>$$
+$\big<Y \mathrel{+}= a_n,\ Y=XY,\ Y\mathrel{+}= a_{n-1},\ \dots,\ Y\mathrel{+}= a_0\big>$
computes the polynomial \sum^{n}_{i=0} a_{i}x^{i}.
We can use this to compute the polynomial for any X by retroactively inserting X=x at t=0.
However, it is not possible to re-evaluate such a polynomial using field operations any faster than to just evaluate it again.

@@ -800,20 +800,18 @@ In order to make things easier we define a \emph{Bridge}.
\end{definition}
Now if t' is the bridge preceding t, we see that
-$$\max\{k'\mid k' \text{\ deleted at time}\geq t\} = \max\{k'\notin Q_{\text{now}} \mid k' \text{\ inserted at time}\geq t'\}$$
+$\max\{k'\mid k' \text{\ deleted at time}\geq t\} = \max\{k'\notin Q_{\text{now}} \mid k' \text{\ inserted at time}\geq t'\}$
In other terms, the largest key deleted after a time t is the largest key inserted after the previous bridge that is not in the final set.
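The two-register machine in the retroactivity hunk above is just Horner's rule in disguise. A sketch that encodes the operation sequence and replays it with a retroactive X=x inserted at t=0; the operation encoding and helper names are mine, not from the text:

```python
def run(ops):
    """Simulate the two-register machine; the query result is Y."""
    X, Y = 0, 0
    for op in ops:
        if op[0] == "X=":
            X = op[1]
        elif op[0] == "Y+=":
            Y += op[1]
        else:  # ("Y=XY",)
            Y = X * Y
    return Y

def poly_ops(coeffs):
    """<Y += a_n, Y = XY, Y += a_{n-1}, ..., Y += a_0> for coeffs [a_n, ..., a_0]."""
    ops = []
    for i, a in enumerate(coeffs):
        if i > 0:
            ops.append(("Y=XY",))
        ops.append(("Y+=", a))
    return ops

coeffs = [2, -1, 3]  # the polynomial 2x^2 - x + 3
ops = poly_ops(coeffs)
# Retroactive Insert(t=0, "X=5"): the whole sequence is replayed from scratch,
# which is the lower-bound point -- no shortcut beats re-evaluating the polynomial.
result = run([("X=", 5)] + ops)  # 2*25 - 5 + 3 = 48
```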
Now we can store Q_{\text{now}} as a BBST on values, and all insertions in a BBST on time.
The latter tree's nodes are also augmented with the value of the largest insert in their subtree that is not in Q_{\text{now}}.
At last, we store a third BBST with \emph{all} the updates, ordered on time, and also augmented as follows:
-$$
-\text{Augmentation} =
+$\text{Augmentation} =
\begin{cases}
0 \text{\ if \code{insert}{k}, and\ }k \in Q_\text{now}\\
+1 \text{\ if \code{insert}{k}, and\ }k \notin Q_\text{now}\\
-1 \text{\ if \algo{Delete-Min}}\\
-\end{cases}
-
+\end{cases}$
In addition the internal nodes store subtree sums and min/max prefix sums.
We do this in order to detect bridges, since a bridge is a prefix summing to 0.
When we have to find out which element to insert into $Q_{\text{now}}$ we can walk up from the node in the insertion BST,

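The bridge characterization above (a time is a bridge iff the 0/+1/-1 augmentations of all earlier updates sum to 0) can be illustrated with a linear scan; the real structure answers this in O(log m) via the subtree sums and min/max prefix sums stored in the BBST:

```python
def bridges(weights):
    """Return the times whose prefix sums to 0, i.e. the bridges.
    weights[t] is the augmentation of the update at time t:
      0  for insert(k) with k in Q_now,
      +1 for insert(k) with k not in Q_now,
      -1 for Delete-Min."""
    total, out = 0, []
    for t, w in enumerate(weights):
        total += w
        if total == 0:
            out.append(t)  # every insert up to time t is still in Q_now
    return out

# insert(7) [survives], insert(3) [later deleted], Delete-Min, insert(5) [survives]
ws = [0, 1, -1, 0]
```

A prefix sum of 0 means no pending not-in-Q_now insert is outstanding at that time, which is exactly the bridge condition used in the max-equation above.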
@@ -836,7 +834,7 @@ with $O(\log^2 m)$ overhead. However, it is also possible to get full retroactiv
Before tackling the real problem, we look at an easier one.

\subsection{List Labeling}
-We want to store integer labels in a list, such that the list, such that insert/delete here queries are constant,
+We want to store integer labels in a list, such that insert/delete here queries are constant,
and that the list is in a strictly monotone ordering.
\emph{Label Space} is the size of the labels as a function of the number of elements in the list we want to store.
Table~\ref{tab:list-labeling} shows the best known updates for different sizes of the label space.
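The problem can be sketched directly: keep the labels strictly increasing, insert by taking the midpoint of the neighbours' labels, and relabel when a gap runs out. A minimal Python sketch assuming a polynomial label space; the naive global rebuild here is only to illustrate the invariant, not any of the update bounds in the table:

```python
LABEL_SPACE = 1 << 20  # polynomial in the number of elements we intend to store

class LabeledList:
    def __init__(self):
        self.items = []  # list of (label, value), labels strictly increasing

    def _relabel(self):
        """Spread all elements evenly across the label space."""
        n = len(self.items)
        step = LABEL_SPACE // (n + 1)
        self.items = [(step * (i + 1), v) for i, (_, v) in enumerate(self.items)]

    def insert_after(self, pos, value):
        """Insert value after index pos (pos = -1 inserts at the front)."""
        lo = self.items[pos][0] if pos >= 0 else 0
        hi = self.items[pos + 1][0] if pos + 1 < len(self.items) else LABEL_SPACE
        if hi - lo < 2:  # no free label left in the gap: relabel everything
            self._relabel()
            return self.insert_after(pos, value)
        self.items.insert(pos + 1, ((lo + hi) // 2, value))
```

Smarter relabeling (rebuilding only a suitably dense range instead of everything) is what achieves the amortized bounds in the table.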

@@ -874,7 +872,7 @@ Dynamic partial sums, dynamic connectivity.

van Emde Boas, x-fast trees, y-fast trees, fusion trees.

-\section{van Emde Boas Tree}
+\section{van Emde Boas Tree}%
\label{sec:vEB}

\chapter{Succinct Structures}