~mht/ads-book

882dda0271204d6eee7bb53904bdc303a26db3a3 — Martin Hafskjold Thoresen 5 years ago 61e0d72
Finish LCA and LA chapter
1 files changed, 59 insertions(+), 1 deletions(-)

M book.tex
M book.tex => book.tex +59 -1
@@ 4,6 4,7 @@
]{geometry}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{mathtools}
\usepackage{tikz}
\usetikzlibrary{positioning,calc,shapes}
\usepackage{todonotes}


@@ 29,6 30,8 @@
\newenvironment{definition}[1]{{\parindent=0em \textbf{Definition} --- \emph{#1}:\\}}{\vspace{1em}}

\DeclareMathOperator*{\argmin}{arg\,min} % thin space, limits underneath in displays
\DeclarePairedDelimiter{\ceil}{\lceil}{\rceil}
\DeclarePairedDelimiter{\floor}{\lfloor}{\rfloor}

% Expectation symbol
\DeclareMathOperator*{\E}{\mathbb{E}}


@@ 204,6 207,7 @@ the two arrays act exactly the same.
A consequence of this is that we get \emph{Universe reduction}, where we have mapped the universe $\mathcal{U}$ to $[n-1]$.

\section{Constant time \LCA{} (\RMQ{})}
\label{sec:constant-lca}

\subsubsection{Step 1: Reduction}
We start by reducing the problem to $\pm 1 \textsc{RMQ}$, in which adjacent elements of the array differ by exactly 1.
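
As an illustrative sketch of the \LCA{}-to-$\pm 1 \textsc{RMQ}$ direction of this reduction (the function and variable names are assumptions, not the book's code): an Euler tour records every visit to a node together with its depth, adjacent depths then differ by exactly 1, and the \LCA{} of $u$ and $v$ is the minimum-depth entry between their first occurrences.
\begin{verbatim}
# Sketch of the LCA -> +-1 RMQ reduction (illustrative, not the book's code).
def euler_tour(children, root):
    """Return (tour, depth, first); depth is a +-1 array."""
    tour, depth, first = [], [], {}
    stack = [(root, 0, iter(children[root]))]
    while stack:
        node, d, it = stack[-1]
        if node not in first:
            first[node] = len(tour)
        tour.append(node)            # record every visit, with its depth
        depth.append(d)
        child = next(it, None)
        if child is None:
            stack.pop()              # go back up: depth changes by -1
        else:
            stack.append((child, d + 1, iter(children[child])))
    return tour, depth, first

def lca(children, root, u, v):
    tour, depth, first = euler_tour(children, root)
    lo, hi = sorted((first[u], first[v]))
    # A linear scan stands in for the O(1) RMQ structure built below.
    return tour[min(range(lo, hi + 1), key=depth.__getitem__)]

# Example: lca({0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}, 0, 3, 4) == 1
\end{verbatim}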


@@ 251,7 255,7 @@ $[3,4,5,4] \rightarrow [0,1,2,1] \rightarrow [-, 1, 1, 0]$.
There are $2^{n'} = \sqrt{n}$ possible such blocks,
${(1/2 \log n)}^2$ possible queries, and each answer requires $\log \log n$ bits,
so storing a lookup table for all possible blocks, over all possible queries with all possible
answers take $\sqrt{n} {(1/2 \log n)}^2 \log \log n = o(n)$ bits.
answers take $\sqrt{n}\ {(1/2 \log n)}^2\ \log \log n = o(n)$ bits.
Now each bottom block can simply store a pointer into the table, and we get $O(1)$ query for the bottom groups.
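
To spell out why the table size is $o(n)$ (a quick check, not part of the original argument): with block size $n' = 1/2 \log n$,
\[
  \sqrt{n}\ {(1/2 \log n)}^2\ \log \log n
  = O\!\left(\sqrt{n}\, \log^2 n \, \log \log n\right)
  = o(n),
\]
since $\log^2 n \, \log \log n = o(\sqrt{n})$.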




@@ 263,11 267,65 @@ We want $O(1)$.
\subsubsection{Step 1: Jump Pointers}
Instead of storing only its parent, each node stores a pointer to its $2^i$th ancestor for every $i \geq 0$.
Each node has $O(\log n)$ such pointers, making the space requirement $O(n \log n)$.
Query time is $O(\log n)$, since every jump at least halves the remaining distance $k$.
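
A minimal sketch of the jump pointers, assuming the tree is given as a parent array (the function and array names are illustrative, not taken from the text):
\begin{verbatim}
# Jump pointers ("binary lifting"): up[i][v] is the 2^i-th ancestor of v,
# or -1 if it does not exist.  parent[root] = -1.
def build_jump_pointers(parent):
    n = len(parent)
    up = [list(parent)]                  # up[0] = parents
    for i in range(1, max(1, n.bit_length())):
        prev = up[-1]
        up.append([prev[prev[v]] if prev[v] != -1 else -1 for v in range(n)])
    return up                            # O(n log n) space

def level_ancestor_jump(up, v, k):       # assumes 0 <= k < n
    # Repeatedly jump by the largest power of two not exceeding the
    # remaining distance; each jump at least halves what is left.
    while k and v != -1:
        i = k.bit_length() - 1
        v = up[i][v]
        k -= 1 << i
    return v

# Example: on the path 0-1-2-3 (parent = [-1, 0, 1, 2]),
# level_ancestor_jump(build_jump_pointers([-1, 0, 1, 2]), 3, 2) == 1.
\end{verbatim}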

\subsubsection{Step 2: Long-path Decomposition}
Decompose the tree into a set of paths:
find the longest downward path starting at the root, and store its nodes in an array, topmost node first.
Each node on the path stores a reference to this array together with its own index in it.
Then recurse on the subtrees hanging off the path.

With this scheme, a query for the $k$th ancestor of $x$ is answered as follows:
if $k$ is at most $x$'s index in its path, the answer lies in the same array and we jump to it directly.
Otherwise we move to the parent of the path's topmost node, subtract the $i + 1$ steps this took from $k$ (where $i$ is $x$'s index), and repeat from there.
We visit at most $O(\sqrt{n})$ paths: when we leave the top of a path of length $\ell$, the path through its parent continues downward into the tallest subtree below that parent, which is at least as tall as the path we just left, so that path has length at least $\ell + 1$.
The lengths of the visited paths therefore grow as $1, 2, 3, \dots$; since the paths are disjoint, their lengths sum to at most $n$, giving $O(\sqrt{n})$ paths.
We end up using $O(n)$ space, and $O(\sqrt{n})$ query time.
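
A sketch of the decomposition and of the path-hopping query, again assuming a parent array (all function and variable names are illustrative):
\begin{verbatim}
# Long-path decomposition sketch.  parent[root] = -1; each path is stored
# top-down, and every node remembers which path it is on and at what index.
def long_path_decomposition(parent):
    n = len(parent)
    root = parent.index(-1)
    children = [[] for _ in range(n)]
    for v, p in enumerate(parent):
        if p != -1:
            children[p].append(v)
    order = []                           # parents before children
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])
    height = [1] * n                     # nodes on the longest downward path
    for v in reversed(order):
        for c in children[v]:
            height[v] = max(height[v], height[c] + 1)
    paths, path_of, index_in_path = [], [None] * n, [0] * n
    for v in order:
        if path_of[v] is None:           # v becomes the top of a new path
            path = []
            paths.append(path)
            while v is not None:
                path_of[v], index_in_path[v] = len(paths) - 1, len(path)
                path.append(v)           # continue into the tallest subtree
                v = max(children[v], key=lambda c: height[c], default=None)
    return paths, path_of, index_in_path

def level_ancestor_longpath(parent, paths, path_of, index_in_path, x, k):
    # Hop from path to path until the k-th ancestor lies on x's own path.
    while True:
        i = index_in_path[x]
        if k <= i:
            return paths[path_of[x]][i - k]
        x = parent[paths[path_of[x]][0]]    # parent of the path's topmost node
        if x == -1:
            return -1                       # k exceeds the start node's depth
        k -= i + 1

# Example: with parent = [-1, 0, 0, 1], the 2nd ancestor of node 3 is node 0.
\end{verbatim}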

\subsubsection{Step 3: Ladder Decomposition}
Extend each path upwards by its own length (or until it reaches the root), so that each resulting \emph{ladder} is at most twice as long as the path it came from.
Now the arrays overlap, but each node still stores only its original array and its index in it.
This at most doubles the space of Step 2, so we are still linear.
The improvement of this step is that a node of height $h$ lies on a path of length at least $h$,
so its ladder contains at least $h$ of the node's ancestors (or reaches the root), all reachable directly by indexing into the array.
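
A sketch of the ladder construction on top of the decomposition sketched above (names still illustrative):
\begin{verbatim}
# Turn each long path into a ladder by extending it upwards by its own
# length, or until the root.  ladders[p][j] is the j-th node from the top,
# so ladders[p][j - 1] is always the parent of ladders[p][j].
def build_ladders(parent, paths):
    ladders, offset = [], []             # offset[p]: how far the top moved up
    for path in paths:
        up, v = [], parent[path[0]]
        for _ in range(len(path)):       # climb at most len(path) steps
            if v == -1:
                break
            up.append(v)
            v = parent[v]
        ladders.append(list(reversed(up)) + list(path))
        offset.append(len(up))
    return ladders, offset

def ladder_jump(ladders, offset, path_of, index_in_path, x, k):
    # k-th ancestor of x, provided it lies on x's ladder; otherwise None.
    p = path_of[x]
    j = offset[p] + index_in_path[x]     # x's position within its ladder
    return ladders[p][j - k] if k <= j else None
\end{verbatim}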

\subsubsection{Step 4: Step 1 + Step 3}
We combine the jump pointers and the ladder decomposition.
Jump pointers are great for long jumps, and ladders are great for short jumps.
To find the $k$th ancestor of $x$, we first follow a single jump pointer, going $2^{\floor{\log k}}$ steps up, where $k/2 \leq 2^{\floor{\log k}} \leq k$.
The node we land on is at distance at least $k/2$ above $x$, so it has height at least $k/2$ and lies on a path of length at least $k/2$.
By Step 3 its ladder therefore reaches at least $k/2$ further up (or to the root),
which covers the at most $k - 2^{\floor{\log k}} \leq k/2$ remaining steps, so a single ladder jump finishes the query.
Hence, we get $O(1)$ query, but still $O(n \log n)$ space (and preprocessing).
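
Combining the two sketches above (illustrative; it assumes the $k$th ancestor of $x$ exists):
\begin{verbatim}
# One jump pointer, then one ladder jump.
def level_ancestor_fast(up, ladders, offset, path_of, index_in_path, x, k):
    if k == 0:
        return x
    i = k.bit_length() - 1               # k/2 <= 2^i <= k
    y = up[i][x]                         # jump 2^i steps up; height(y) >= k/2
    return ladder_jump(ladders, offset, path_of, index_in_path, y, k - (1 << i))
\end{verbatim}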

\subsubsection{Step 5: Only Leaves Store Jump Pointers}
Since every node has constant-time access to a descendant leaf (the bottom node of its path, and hence of its ladder, is a leaf by the maximality of the path),
only leaves need to store the jump pointers.
In other words, we make all queries start at leaves: a query from an internal node is rerouted through its descendant leaf, with the distance increased by the extra depth.
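
Writing $\mathrm{LA}(x, k)$ for the $k$th ancestor of $x$ and $\ell$ for the leaf at the bottom of $x$'s path, the rerouting is simply
\[
  \mathrm{LA}(x, k) = \mathrm{LA}\bigl(\ell,\; k + \operatorname{depth}(\ell) - \operatorname{depth}(x)\bigr),
\]
since $x$ itself is an ancestor of $\ell$.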

\subsubsection{Step 6: Leaf Trimming}
We call a node \emph{maximally deep} if it has at least $1/4 \log n$ descendants but none of its children do.
Split the tree into two layers: the \emph{top} tree consists of all nodes with at least $1/4 \log n$ descendants (equivalently, the maximally deep nodes and their ancestors),
and the \emph{bottom} trees are the maximal subtrees lying entirely outside the top.
The leaves of the top tree are exactly the maximally deep nodes; their subtrees are disjoint and each contains at least $1/4 \log n$ nodes,
so the top tree has at most $n / (1/4 \log n) = O(n / \log n)$ leaves.
If we now use Step 5 on the top tree, we get $O(n)$ space.
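
Spelling out the space bound for the top tree (a quick check, not part of the original): $O(n / \log n)$ leaves, each storing $O(\log n)$ jump pointers, give
\[
  O\!\left(\frac{n}{\log n}\right) \cdot O(\log n) = O(n)
\]
pointers in total.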

\subsubsection{Step 7: Lookup Table}
For the bottom trees, we use lookup tables.
Each bottom tree is rooted at a node with fewer than $1/4 \log n$ descendants, so it has size $n' \leq 1/4 \log n$.
The number of rooted tree shapes on $n'$ nodes is bounded by
$2^{2n'} \leq \sqrt{n}$, by encoding an Euler tour as a string of $\pm 1$, as in Section~\ref{sec:constant-lca}.
There are ${(n')}^2 = O(\log^2 n)$ possible queries (if $k$ is so large that the answer lies above the bottom tree, we go through the top structure instead),
and an answer takes $O(\log \log n)$ bits,
so a lookup table for all possible trees, with all possible queries takes only
$\sqrt{n}\ O(\log^2 n)\ O(\log \log n) = o(n)$ bits.
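
A sketch of the tree-shape encoding that indexes the lookup table, reusing the Euler-tour idea from Section~\ref{sec:constant-lca} (function and variable names are illustrative):
\begin{verbatim}
# Encode the shape of a rooted tree as two bits per edge: write 1 when the
# Euler tour goes down an edge and 0 when it comes back up.  Two bottom
# trees share a lookup-table row exactly when their encodings are equal.
# Recursion is fine here: bottom trees have at most 1/4 log n nodes.
def tree_shape_key(children, root):
    bits = []
    def walk(v):
        for c in children[v]:
            bits.append(1)               # go down
            walk(c)
            bits.append(0)               # come back up
    walk(root)
    return tuple(bits)                   # at most 2^(2n') <= sqrt(n) keys

# Example: a path on 3 nodes and a star on 3 nodes get different keys:
#   tree_shape_key({0: [1], 1: [2], 2: []}, 0)    == (1, 1, 0, 0)
#   tree_shape_key({0: [1, 2], 1: [], 2: []}, 0)  == (1, 0, 1, 0)
\end{verbatim}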

We end up with $O(1)$ time queries, using $O(n)$ space!



\chapter{Strings}


String search, suffix trees, suffix arrays.

\chapter{Temporal Structures}