b066471bd3add214218193370fa16a00436adfdd — Martin Hafskjold Thoresen 5 years ago
Write geometry and connectivity

1 files changed, 308 insertions(+), 5 deletions(-)

M book.tex

M book.tex => book.tex +308 -5
@@ 219,7 219,8 @@ A consequence of this is that we get \emph{Universe reduction}, where we have ma
\section{Constant time \LCA{} (\RMQ{})}%
\label{sec:constant-lca}

-\subsubsection{Step 1: Reduction}
+\subsubsection{Step 1: Reduction}%
+\label{sec:first-euler-tour}
We start by reducing the problem to $\pm 1 \textsc{RMQ}$, in which adjacent elements of the array differ by at most 1.
Walk an \emph{Euler Tour} of the tree:

@@ 601,7 602,8 @@ Since all operations are linear, with the exception of the recursive call, we ge
$T(n) = T(\frac{2}{3} n) + O(n) = O(n)$.
Linear time!

-\chapter{Temporal Structures}
+\chapter{Temporal Structures}%
+\label{ch:time}

\topics{Partial persistency, full persistency, functional persistency, partial retroactivity, full retroactivity.}

@@ 834,11 836,13 @@ with $O(\log^2 m)$ overhead. However, it is also possible to get full retroactiv
Before tackling the real problem, we look at an easier problem.

\subsection{List Labeling}
-We want to store integer labels in a list, such that insert/delete here queries are constant,
+We want to store integer labels in a list, such that insert/delete operations around a given node in the list take constant time,
and that the list is in a strictly monotone ordering.
-\emph{Label Space} is the size of the labels as a function of the number of elements in the list we want to store.
+Let \emph{Label Space} be the size of the labels as a function of the number of elements in the list we want to store.
Table~\ref{tab:list-labeling} shows the best known updates for different sizes of the label space.

+\todo{Come back to this}
+
\begin{table}[b]
\centering
\begin{tabular}{c l}

@@ 857,10 861,309 @@ Table~\ref{tab:list-labeling} shows the best known updates for different sizes o

\chapter{Geometry}
+\topics{Orthogonal Range Search, Range Trees, Layered Range Trees, Fractional Cascading}
+
+We look at problems involving geometry, for instance queries in 2D space:
+given a set of points, which points lie in an axis-aligned rectangle?
+
+In general, geometric data structures are all about data in higher dimensions.
+We differentiate between static structures and dynamic structures.
+
+\section{Line Sweep}
+We look at the problem of maintaining line segments in 2D\@; we would like to store the order of the segments and their intersections.
+In Chapter~\ref{ch:time} we looked at time traveling data structures.
+We can use these to shave off a single dimension on geometry problems by pretending that one axis is time.
+
+By walking along the x-axis we can maintain a BBST of the segments currently crossing the sweep line, ordered by $y$.
+At each time $t$ where something happens, that is, a segment starts, a segment ends, or a crossing occurs, we can
+translate this to an operation on the BBST\@.
+For the static version we need a persistent BBST\@.
+This allows queries at a specific time, which corresponds to an $x$ coordinate.
+Now if we would like to know which segment is above a point $(x, y)$, we can translate the $x$ coordinate to a time $t$,
+and query for the successor of $y$ in the BBST at time $t$.
+We get $O(\log n)$ query after $O(n \log n)$ preprocessing (building the persistent BBST).
+
+The dynamic version is similar, but we need a retroactive BBST, since we need to insert segment-begin and segment-end events at earlier times.
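+
+As a concrete illustration of the static version, the sketch below replaces the persistent BBST with full snapshot copies of a sorted list,
+and restricts itself to horizontal segments so that the $y$-order never changes; the names and data are made up, and the point is only to
+show the ``query at time $t$'' pattern, not the $O(\log n)$ bounds.
+
+\begin{verbatim}
+import bisect
+
+# Horizontal segments given as (x1, x2, y) with x1 < x2.
+segments = [(0, 10, 5), (2, 6, 1), (4, 12, 8)]
+
+# Sweep along x: at x1 a segment's y becomes active, at x2 it disappears.
+events = []
+for x1, x2, y in segments:
+    events.append((x1, 'insert', y))
+    events.append((x2, 'delete', y))
+events.sort()
+
+# "Persistence" by brute force: store a full copy of the active set after
+# every event.  A persistent BBST would share structure instead of copying.
+times, snapshots, active = [], [], []
+for x, kind, y in events:
+    if kind == 'insert':
+        bisect.insort(active, y)
+    else:
+        active.remove(y)
+    times.append(x)
+    snapshots.append(list(active))
+
+def first_segment_above(x, y):
+    """Return the y of the lowest active segment strictly above (x, y)."""
+    t = bisect.bisect_right(times, x) - 1     # last event time <= x
+    if t < 0:
+        return None
+    snap = snapshots[t]
+    i = bisect.bisect_right(snap, y)          # successor of y in the snapshot
+    return snap[i] if i < len(snap) else None
+
+print(first_segment_above(5, 3))              # 5, from the segment (0, 10, 5)
+\end{verbatim}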
+
+\section{Orthogonal Range Search}%
+\label{sec:range-tree}
+In this problem we want to maintain $n$ points in $d$-dimensional space,
+and answer queries where we ask for the points in a $d$-dimensional box $[a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_d, b_d]$.
+We want existence, count, or the actual points.
+The time bound we are aiming for initially is $O(\log^d n + k)$ where $k$ is the number of points we return (trivial bound).
+Again we differentiate between the dynamic and the static version.
+
+\subsection{$d=1$}
+We store the points in a BBST where all the points are leaves (this only doubles the space).
+For a range search $[a, b]$ we find
+the nodes $a' = pred(a)$, $b' = succ(b)$, and \code{LCA}{a', b'}, and report the points in the subtrees in between $a'$ and $b'$.
+Since both $pred$ and $succ$ take $O(\log n)$ time and the total size of the subtrees between $a'$ and $b'$ is $O(k)$, we get queries in $O(\log n + k)$ time.
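+
+A minimal sketch of the $d=1$ query, using a sorted array and binary search in place of the BBST (the reported set is the same;
+what is lost is only the ability to update the structure):
+
+\begin{verbatim}
+import bisect
+
+def range_search(points, a, b):
+    """Report all points p with a <= p <= b from a sorted list."""
+    lo = bisect.bisect_left(points, a)    # first point >= a
+    hi = bisect.bisect_right(points, b)   # one past the last point <= b
+    return points[lo:hi]                  # O(log n) to locate, O(k) to report
+
+points = sorted([3, 1, 4, 1, 5, 9, 2, 6])
+print(range_search(points, 2, 5))         # [2, 3, 4, 5]
+\end{verbatim}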
+
+\subsection{$d=2$}
+We store a 1-dimensional tree on the $x$ coordinate of all points, similar to the $d=1$ case.
+This time however, we augment the nodes of the tree with a new tree, containing the same points, but sorted on $y$.
+That is, any node is itself the root of a tree sorted on $x$ which contains some set of points $P$ as leaves.
+The node also points to a new tree which stores $P$ sorted on $y$.
+The y-trees are independent, and have no cross links.
+
+We note that each point is in $O(\log n)$ trees: the main tree, and one for each ancestor node, of which there are $O(\log n)$.
+On a query, we find the subtrees of the main tree containing all points with $x$ coordinates in the range.
+Then we go through all y-trees and do a range search on $y$.
+This gives us $O(\log^2 n + k)$ query time.
+
+The space requirement is only $O(n \log n)$, and construction time is also $O(n \log n)$.
+Observe that the space recurrence is $S(n) = 2S(n/2) + O(n)$, the same as the time recurrence for \algo{Merge-Sort}.
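+
+The following is a small static sketch of the two-dimensional structure: an implicit balanced tree over the points sorted by $x$,
+where every node also stores its points sorted by $y$ (counting version; reporting is the same except that covered nodes return slices).
+The names are illustrative.
+
+\begin{verbatim}
+import bisect
+
+class Node:
+    def __init__(self, pts):                 # pts is sorted by x
+        self.ys = sorted(p[1] for p in pts)  # same points, sorted by y
+        self.xmin, self.xmax = pts[0][0], pts[-1][0]
+        self.left = self.right = None
+        if len(pts) > 1:
+            mid = len(pts) // 2
+            self.left, self.right = Node(pts[:mid]), Node(pts[mid:])
+
+def count(node, x1, x2, y1, y2):
+    """Count points inside [x1, x2] x [y1, y2]."""
+    if node is None or node.xmax < x1 or x2 < node.xmin:
+        return 0
+    if x1 <= node.xmin and node.xmax <= x2:
+        # Subtree completely covered on x: binary search its y-sorted list.
+        return (bisect.bisect_right(node.ys, y2)
+                - bisect.bisect_left(node.ys, y1))
+    return (count(node.left, x1, x2, y1, y2)
+            + count(node.right, x1, x2, y1, y2))
+
+pts = sorted([(1, 5), (2, 1), (3, 8), (5, 4), (7, 2), (8, 9)])
+root = Node(pts)
+print(count(root, 2, 7, 1, 5))               # 3: (2,1), (5,4), (7,2)
+\end{verbatim}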
+
+\subsection{$d=D$}
+The approach taken in the $d=2$ case generalizes to any dimension.
+We end up with $O(\log^D n + k)$ query, $\Theta(n \log^{D-1} n)$ space and construction,
+and $\Theta(\log^D n)$ update (in the dynamic setting).
+
+
+\section{Layered Range Tree}%
+\label{sec:layered-range-tree}
+We observe that the approach taken in the previous section is wasteful: when $d=2$ we search for the same y-intervals in $O(\log n)$ trees.
+We want to take advantage of this by reusing the searches.
+
+Instead of having the nodes in the x-tree store another tree, this time they only point to a sorted array on y.
+The idea is that we only want to do a single search on y, which will be in the root node (the array containing all points).
+Now, when we walk down from the root to \code{LCA}{a', b'} (the pred.\ and succ.) we can follow pointers into the child array,
+so that we know at once where we are.
+Now when we get to the subtrees we want to output we have the y-interval of points that we are interested in,
+and since the subtree is completely covered on $x$, these are exactly the points we are looking for,
+so we get this search ``for free''.
+
+Note that we depend on the fact that a child node has a subset of the points that its parent has.
+
+We start out with querying the points in the $y$ array which takes $O(\log n)$ time, and then we walk down to the leaves,
+which is a walk of length $O(\log n)$.
+On each step we output points, of which there are $k$ in total.
+Hence we end up with $O(\log n + \log n + k) = O(\log n + k)$ query time ($O(\log^{d-1} n + k)$ in general).
+The space is the same as previously: $O(n \log n)$ ($O(n \log^{d-1} n)$ in general).
+The construction time is also the same, since the pointer setup can be done in linear time, so we get the same recurrence as \algo{Merge-Sort}, yet again.
+
+Unfortunately, this does not generalize to higher dimensions: we can only shave off one $\log$ factor using this approach.
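+
+A small sketch of just the cascading-pointer idea, for one $y$-sorted parent array and its two children (the arrays and names below are made up):
+after a single binary search in the parent, the corresponding position in either child follows from a precomputed pointer instead of a new search.
+
+\begin{verbatim}
+import bisect
+
+# Parent's y-sorted array; the two children partition it (split by x in
+# the real range tree).
+parent = [1, 2, 4, 5, 7, 8, 9]
+left, right = [2, 5, 8], [1, 4, 7, 9]
+
+# For each position i in the parent, precompute where parent[i] would land
+# in each child: the index of the first child element >= parent[i].
+ptr_left = [bisect.bisect_left(left, v) for v in parent] + [len(left)]
+ptr_right = [bisect.bisect_left(right, v) for v in parent] + [len(right)]
+
+y_lo = 4
+i = bisect.bisect_left(parent, y_lo)   # the only binary search we do
+
+# The positions in the children now come for free by following pointers.
+assert ptr_left[i] == bisect.bisect_left(left, y_lo)
+assert ptr_right[i] == bisect.bisect_left(right, y_lo)
+print(ptr_left[i], ptr_right[i])       # 1 1: left[1] = 5, right[1] = 4
+\end{verbatim}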
+
+\section{Weight-Balanced Trees}
+\newcommand{\bba}{BB[$\alpha$]}
+We would like to use range trees in a dynamic setting.
+The tree we look at is the \bba{} tree.
+A weight-balanced tree is similar to a \emph{height}-balanced tree, which we already know: AVL trees and Red-Black trees are examples of height-balanced trees.
+With weight-balanced trees we would instead like to balance the weight --- the number of nodes --- of the subtrees, rather than the height.
+
+More formally, we have the following invariant:
+\begin{align*}
+    \forall x\ &size(left(x)) \geq \alpha\ size(x)\\
+               &size(right(x)) \geq \alpha\ size(x)\quad\text{where }\alpha \in {[0, 1/2]}
+\end{align*}
+
+A curious property of this invariant is that it implies height balance: $h \leq \log_{\frac{1}{1 - \alpha}} n$.
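+
+To see why, note that the invariant forces each child to have size at most $(1 - \alpha)\,size(x)$,
+so going $h$ steps down any root-to-leaf path leaves at most ${(1 - \alpha)}^h n$ nodes, and this must be at least 1:
+\begin{align*}
+    {(1 - \alpha)}^h\, n \geq 1
+    \quad\Longleftrightarrow\quad
+    h \leq \log_{\frac{1}{1 - \alpha}} n.
+\end{align*}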
+
+On update we simply insert at the leaves, and update the weights upward in the tree,
+assuming all internal nodes store the weights of their children explicitly.
+When a node becomes unbalanced, we simply rebuild the entire subtree from scratch.
+While this might seem slow, we can use an amortization scheme where we charge the rebuild time of a subtree to the $\Theta(k)$
+updates made in it since it was last rebuilt, since many changes are needed in a subtree before its root becomes unbalanced.
+The details are a little messy, but the bottom line is that we get $O(\log n)$ amortized updates.
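+
+A sketch of the insert-and-rebuild scheme, with the arbitrary choice $\alpha = 1/4$ and a weight of $size + 1$ to sidestep empty subtrees
+(the names are illustrative; the amortized analysis is the one sketched above):
+
+\begin{verbatim}
+ALPHA = 0.25
+
+class Node:
+    def __init__(self, key):
+        self.key, self.left, self.right, self.size = key, None, None, 1
+
+def size(t):
+    return t.size if t else 0
+
+def flatten(t, out):                 # in-order keys of a subtree
+    if t:
+        flatten(t.left, out)
+        out.append(t.key)
+        flatten(t.right, out)
+
+def build(keys, lo, hi):             # perfectly balanced rebuild
+    if lo >= hi:
+        return None
+    mid = (lo + hi) // 2
+    t = Node(keys[mid])
+    t.left, t.right = build(keys, lo, mid), build(keys, mid + 1, hi)
+    t.size = 1 + size(t.left) + size(t.right)
+    return t
+
+def balanced(t):                     # the BB[alpha] invariant
+    return (size(t.left) + 1 >= ALPHA * (t.size + 1) and
+            size(t.right) + 1 >= ALPHA * (t.size + 1))
+
+def insert(t, key):
+    if t is None:
+        return Node(key)
+    if key < t.key:
+        t.left = insert(t.left, key)
+    else:
+        t.right = insert(t.right, key)
+    t.size += 1
+    if not balanced(t):              # rebuild the whole subtree from scratch
+        keys = []
+        flatten(t, keys)
+        t = build(keys, 0, len(keys))
+    return t
+
+root = None
+for k in [5, 3, 8, 1, 4, 7, 9, 2, 6]:
+    root = insert(root, k)
+\end{verbatim}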
+
+We can apply this to the range tree from Section~\ref{sec:range-tree} to get $O(\log^d n)$ amortized updates.
+
+We would also like to use the layered approach from Section~\ref{sec:layered-range-tree} to shave off a $\log$ factor,
+but it turns out that array rebuilding is problematic.
+However, we only need something array-like in the root node of the tree, since we
+only need a binary search there, and we never use random access for the arrays in the internal nodes, as we only follow pointers.
+We can replace the root array with a BBST, and the internal arrays with linked lists.
+We end up with the same query time $O(\log^{d-1} n + k)$, since the procedure is exactly the same, but also get $O(\log^d n)$ updates.
+
+

\chapter{Connectivity in Dynamic Graphs}
+\topics{Dynamic connectivity on trees, Euler-tour trees.}
+
+Before starting, we point out that this chapter contains fewer proofs and more stated results.
+
+We would like to solve the problem of connectivity queries.
+We maintain a graph which is subject to updates (edge insertions and deletions), and we answer queries of the form ``are $u$ and $v$ connected?''.
+As in previous sections we split the problem into two variants: fully dynamic and partially dynamic.
+
+\begin{definition}{Fully Dynamic}
+    Connectivity queries in which the graph is subject to both edge insertions and edge deletions.
+\end{definition}
+
+\begin{definition}{Partially Dynamic}
+    Connectivity queries in which the graph updates can be either edge insertions \emph{or} edge deletions, but not both.
+    Insertion only is called \emph{incremental}, and deletion only is called \emph{decremental}.
+\end{definition}
+
+Unless otherwise specified, we consider fully dynamic connectivity.
+
+\section{General Results}
+
+\subsubsection{Trees}
+We can handle connectivity queries for trees in $O(\log n)$ time, by using Link-Cut trees or Euler-Tour Trees (Section~\ref{sec:euler-tour}).
+If we limit ourselves to decremental connectivity, constant time is possible.
+
+\subsubsection{Plane Graphs}
+A plane graph is a planar graph with a fixed embedding; that is, edges know which faces they divide, and updates specify the face of the inserted element.
+As with trees, $O(\log n)$ per operation is also possible.
+
+\subsubsection{General Graphs}
+Is $O(\log n)$ per operation possible?
+This is an open problem, but we know how to get $O(\log^2 n)$ (amortized) update, and $O(\frac{\log n}{\log\log n})$ query.
+If we are willing to get slower updates for faster queries, $O(\sqrt{n})$ update and $O(1)$ query is possible.
+
+For the incremental case, we can get $\Theta(\alpha(m, n))$ per operation (amortized), where $\alpha$ is the inverse Ackermann function, by using \algo{Union-Find}.
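+
+For reference, a minimal union-find with path halving and union by rank (the standard structure, nothing specific to these notes);
+\texttt{union} is edge insertion and \texttt{connected} is the query:
+
+\begin{verbatim}
+parent, rank = {}, {}
+
+def find(v):
+    if v not in parent:
+        parent[v], rank[v] = v, 0
+    while parent[v] != v:
+        parent[v] = parent[parent[v]]   # path halving
+        v = parent[v]
+    return v
+
+def union(u, v):                        # edge insertion
+    ru, rv = find(u), find(v)
+    if ru == rv:
+        return
+    if rank[ru] < rank[rv]:
+        ru, rv = rv, ru
+    parent[rv] = ru
+    if rank[ru] == rank[rv]:
+        rank[ru] += 1
+
+def connected(u, v):                    # connectivity query
+    return find(u) == find(v)
+
+union(1, 2); union(3, 4)
+print(connected(1, 2), connected(2, 3)) # True False
+\end{verbatim}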
+
+Decremental connectivity is possible in $O(m \log n + n \text{ polylog } n + \text{\# queries})$ total time, where $m$ is the number of edges and $n$ is the number of vertices.
+
+There is also a known fundamental lower bound: either update or query has to be $\Omega(\log n)$.
+
+\section{Euler-Tour Trees}%
+\label{sec:euler-tour}
+We now look at the specifics regarding the result on tree connectivity, namely that $O(\log n)$ per operation is possible.
+We have already seen Euler Tour trees, in Section~\ref{sec:first-euler-tour}.
+The general idea is to traverse the tree and write down each node every time we visit it.
+Then we build a BBST over the written-down visits, ordered by their position in the tour.
+Each node in the tree stores pointers to its first and last visit in the BBST\@.
+The Euler-Tour tree supports the following operations:
+
+\begin{enumerate}
+    \item \algo{Make-Tree}: Make a new isolated tree
+    \item \code{Find-Root}{v}: find the root of $v$'s tree
+    \item \code{Link}{u, v}: Attach $u$ as a child of $v$
+    \item \code{Cut}{v}: remove $v$ from its parent
+\end{enumerate}
+
+We look at how each operation is implemented.
+Before we proceed, we remind ourselves of two operations that a BBST supports:
+\code{Split}{x}: turn the tree into two trees, one with the keys $< x$ and the other with the keys $> x$;
+\code{Concat}{x, y}: turn the two trees $x$ and $y$, where $\forall x_i, y_i\ x_i < y_i$, into one tree with the keys $x \cup y$.
+Both operations can be done in $O(\log n)$ time.
+
+\subsection{Make-Tree}
+This is trivial: the tree for a singleton is the singleton itself.
+
+\subsection{Find-Root}
+Note that the root of the tree is not the root of the BBST\@.
+We start at the first tour visit of $v$, walk up to the root of the BBST, and then down to the rightmost node.
+The rightmost node is the last visit of the tour, which is a visit of the root of the \emph{actual} tree, i.e., the root we want to find.
+This takes $O(\log n)$ time.
+
+\subsection{Link}
+We find the last occurrence of $v$ in the BBST, and insert the tour of $u$'s tree there.
+We also need to make sure that $u$ and $v$ themselves occur as they should after concatenating in $u$'s tour.
+This amounts to a single split and two concats.
+
+Note that $u$ has to be the root of its tree. What do we do if it is not?
+We can \emph{reroot} the tree: pick up the node we want to be the new root, so that the rest of the tree ``falls down''.
+This is a cyclic shift of the Euler tour, and can be done in one split and one concat,
+by splitting just before the first occurrence of the new root in the tour, and concatenating the first part onto the end.
+
+\subsection{Cut}
+We find the first and last occurrence of $v$ in the tour, and cut at those two places, since $v$'s subtree
+is a contiguous interval in the Euler tour.
+
+Then we concatenate the first and last parts together, and remove one of the two adjacent $parent(v)$ occurrences, so that there are not two in a row.
+Two splits and one concat.
+
+\subsubsection{Running Time}
+Since all operations consist of walking up or down, splitting, or concatenating, all of which take $O(\log n)$ time, we get $O(\log n)$ for every operation.
+Connectivity queries can be done by comparing the roots of the nodes we are querying.
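+
+The following sketch stores each tour in a plain Python list, so every operation here takes linear time; a real implementation keeps
+the tour in a BBST and uses the split and concat operations above to get $O(\log n)$. The tour convention, the class, and its
+method names are all illustrative.
+
+\begin{verbatim}
+class EulerTourForest:
+    def __init__(self):
+        self.tours = {}    # root vertex -> Euler tour (list of vertices)
+        self.root_of = {}  # vertex -> root of its tree
+
+    def _adopt(self, tour):
+        # (Re-)register a tour under its root, tour[0].
+        self.tours[tour[0]] = tour
+        for x in set(tour):
+            self.root_of[x] = tour[0]
+
+    def make_tree(self, v):
+        self._adopt([v])
+
+    def find_root(self, v):
+        return self.root_of[v]
+
+    def reroot(self, v):
+        # Cyclic shift of the tour so that it starts and ends at v.
+        tour = self.tours.pop(self.root_of[v])
+        i = tour.index(v)
+        if i != 0:
+            tour = tour[i:] + tour[1:i] + [v]
+        self._adopt(tour)
+
+    def link(self, u, v):
+        # Attach u's tree as a child of v; u is rerooted first.
+        self.reroot(u)
+        tu = self.tours.pop(self.root_of[u])
+        tv = self.tours.pop(self.root_of[v])
+        j = len(tv) - 1 - tv[::-1].index(v)    # last occurrence of v
+        self._adopt(tv[:j + 1] + tu + tv[j:])
+
+    def cut(self, v):
+        # Detach v's subtree from its parent (v must not be a root).
+        tour = self.tours.pop(self.root_of[v])
+        i = tour.index(v)
+        j = len(tour) - 1 - tour[::-1].index(v)
+        self._adopt(tour[i:j + 1])             # v's subtree
+        self._adopt(tour[:i] + tour[j + 2:])   # the rest, dropping one
+                                               # duplicated parent visit
+
+    def connected(self, u, v):
+        return self.root_of[u] == self.root_of[v]
+
+f = EulerTourForest()
+for v in range(5):
+    f.make_tree(v)
+f.link(1, 0); f.link(2, 1); f.link(3, 0)
+print(f.connected(2, 3), f.connected(2, 4))    # True False
+f.cut(1)
+print(f.connected(2, 0))                       # False
+\end{verbatim}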
+
+\section{Fully Dynamic Graphs}
+We look at how to obtain $O(\log^2 n)$ amortized updates for fully dynamic graphs.
+We maintain a spanning forest of the graph, using Euler-Tour trees.
+Now edge insertion corresponds to \algo{Link}.
+Edge deletion has two cases: if the deleted edge is not in the spanning forest we maintain, nothing has changed;
+if it \emph{is}, we run into trouble, since deleting the edge does not necessarily disconnect the graph:
+there might be another edge that we did not use in the spanning tree, because the two components were already connected by
+the edge that we are now deleting.
+If we know that no such edge exists, we can simply \algo{Cut} out the tree, and we are done.
+
+The way we do this is to assign \emph{levels} to edges, and store $O(\log n)$ levels of spanning forests, where some edges
+may get lost when going a level down.
+All edges start at level $\log n$, and the level of an edge is monotonically decreasing, and at least 1.
+
+Let $G_i$ be the subgraph of edges with level $\leq i$.
+Note that $G_{\log n} = G$.
+
+Let $F_i$ be the spanning forest of $G_i$, stored using Euler-Tour trees.
+Note that $F_{\log n}$ answers the connectivity queries in $G$, since the forest spans the entire graph,
+and supports connectivity queries in $O(\log n)$ time.
+
+We maintain the following invariants:
+
+\subsubsection{Invariant 1}
+Every connected component of $G_i$ has $\leq 2^i$ vertices.
+
+\subsubsection{Invariant 2}
+The forests nest: $F_0 \subseteq F_1 \subseteq \cdots \subseteq F_{\log n}$,
+and are given by $F_i = F_{\log n} \cap G_i$.
+There is only one forest, and $F_i$ is just the part of the forest consisting of edges with level at most $i$.
+This also means that $F_{\log n}$ is a minimal spanning forest with respect to edge levels.
+
+\subsubsection{Insertion}
+On insertion of $e=(u, v)$ we set $e.level = \log n$, and add $e$ to $u$'s and $v$'s incidence lists.
+If $u$ and $v$ are not connected, we add $e$ to $F_{\log n}$.
+This makes for $O(\log n)$ insertion.
+
+\subsubsection{Removal}
+This is the hard part.
+We start by removing $e$ from the incidence lists of the vertices it connects.
+This can be done in constant time if the edge itself stores pointers to where it is in those lists.
+Then we check if $e \in F_{\log n}$; if not, we are done (if $e$ is in any forest it is in $F_{\log n}$, since they nest).
+Otherwise, we have to delete $e$ from $F_{e.level}, \dots, F_{\log n}$, which are exactly the trees that $e$ lives in.
+All of these are Euler-Tour trees, and there are at most $O(\log n)$ of them, which makes a total cost of $O(\log^2 n)$.
+
+Now we have to look for a replacement edge. We know by invariant 2 that there are no edges with a lower level,
+since then that edge would be in the tree instead of $e$.
+So if there is a replacement edge, it has level $\geq e.level$.
+We search upwards from $e.level$ to $\log n$.
+For each level $i$ we let $T_u$ and $T_v$ be the trees of $F_i$ containing $u$ and $v$.
+Without loss of generality, let $|T_u| \leq |T_v|$.
+By invariant 1, we know that the sizes of these components are limited: $|T_u| + |T_v| \leq 2^i$, since they were connected before deleting $e$.
+This means that $|T_u| \leq 2^{i-1}$, so we \emph{can} push down all edges in $T_u$ to level $i-1$ without destroying invariant 1.
+We will use this as the charging scheme to get the amortized running time we want.
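+
+Spelled out, with $|T_u| \leq |T_v|$ and invariant 1:
+\begin{align*}
+    |T_u| \leq \tfrac{1}{2}\left(|T_u| + |T_v|\right) \leq \tfrac{1}{2} \cdot 2^i = 2^{i-1},
+\end{align*}
+so every component created at level $i-1$ by the pushed-down edges stays within $T_u$, and hence within the $2^{i-1}$ bound of invariant 1.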
+
+We look at all edges $e'=(x,y),\ x \in T_u$ at level $i$.
+The edge is either internal to $T_u$, or it goes to $T_v$, like $e$ does.
+Why can it not go to another component $T_w$?
+Assume there is an edge $f=(x, w),\ w \in T_w$ of level $i$.
+Since $f.level = i$ we know that $f \in G_i$, so $x$ and $w$ are connected in $G_i$.
+Because $F_{\log n}$ is a minimal spanning forest with respect to edge levels, $F_i$ spans the connected components of $G_i$,
+so $x$ and $w$ are also connected in $F_i$; but then $w$ would lie in $T_u$, the tree of $F_i$ containing $x$.
+This contradicts the assumption that $f$ goes to a different component, so no such edge $f$ can exist.
+
+If $e'$ is internal to $T_u$ it does not help us, so we set $e'.level = i - 1$, which we can afford.
+If $e'$ goes to $T_v$ we are done, since it is a replacement edge; insert it into $F_i, \dots, F_{\log n}$.
+
+Overall we pay $O(\log^2 n + \log n \cdot \text{\# level decreases})$,
+but the number of level decreases is bounded by the number of inserts times $\log n$,
+since each edge's level only decreases and stays between $\log n$ and 1.
+We can charge each insert for the $\log n$ level decreases of its edge, making the amortized cost of delete $O(\log^2 n)$, which is what we wanted.
+
+The last complication is that we need to augment the tree with subtree sizes at every node, in order to make the comparison $|T_u| \leq |T_v|$ in constant time,
+and that we somehow must find all edges on a certain level.
+To handle this we store in every internal node of the Euler-Tour trees a bit per level, signaling whether the nodes in its subtree
+have any level-$i$ edges adjacent to them.
+In addition, the adjacency lists of the nodes store one list for each level, instead of having one list for all edges.
+This makes the search for the next level-$i$ edge take $O(\log n)$ time.
+
+\section{Other Results}
+We list some other related results in the field of connectivity.
+
+\subsection{k-connectivity}
+2-edge connectivity is maintainable in $O(\log^4 n)$ time, and 2-vertex connectivity in $O(\log^5 n)$ time.
+
+\subsection{Minimum Spanning Forest}
+The general problem of maintaining a minimum spanning forest can be solved dynamically in $O(\log^4 n)$ time.
+
+
+
+
+
+

-Dynamic connectivity on trees, euler tour trees.

\chapter{Lower Bounds}