%-*- LaTeX -*- \NeedsTeXFormat{LaTeX2e} \documentclass[leqno]{assign} \usepackage{newcent} \usepackage{amsmath,amsthm,amssymb,enumitem} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Start reading here %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % % The four commands below are used to define the ordinal number of the % lecture (can be found on the lecture syllabus), the title of the lecture % (formulate the best description you can come up with; you can always % resort to the tentative title on the lecture syllabus if you have % difficulty), the date of the lecture (in this format: DD Month YYYY), % and the name of the scribe (that's you). % % Please supply correct arguments for the four commands below: % \lecturenumber{} \lecturetitle{Midterm } \lecturedate{Oct 20, 2008} \duedate{Oct 27, 2008} \lecturescribe{} % % In the space indicated below, please type out your lecture notes. You % should use all the standard LaTeX typesetting conventions, plus a couple % of additional commands described below. If you are unfamiliar with using % LaTeX to typeset technical documents, please read the very useful summary % entitled ``The Not So Short Introduction to LaTeX2e'' which is available % on BlackBoard. You can also examine the source files for the scribed % lecture notes from lectures 1, 2, and 3 for some formatting ideas. % % Use the standard commands \section{}, \subsection{}, and \subsubsection{} % to format your document logically into sections, subsections, and % subsubsections (as needed). % % If you reference a URL anywhere, use the provided command \urlclick{} % like this: \urlclick{http://www.web.com/foo.html}. This will create a % clickable hyperlink in the resulting .pdf file. % % To add a figure to your notes, use the provided command \figureadd{} % like this: \figureadd{basename}{label}{caption}. In this command, % basename is the base name of your figure, which must exist in both .eps % and .jpg formats. For example, if you have created a figure named both % foo.jpg and foo.eps, then the basename is simply foo. The label field % creates a label which you can reference elsewhere using the standard % command \ref{}. Please use the caption field to provide a descriptive % caption for any figure you create. When you submit your scribed notes, % please also submit all your figure files in both .eps and .jpg formats. % % For a theorem, corollary, lemma, claim, fact, definition, assumption, % observation, or example, consider using these provided commands to set it % apart from other text: % \theorem{...} % \corollary{...} % \lemma{...} % \claim{...} % etc. % % For proofs, or sketches of proofs, use these provided environments: % \begin{proof} ... \end{proof} % \begin{proof_sketch} ... \end{proof_sketch} % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \newcommand{\reals}{{\mathbb R}} \newcommand{\integers}{{\mathbb Z}} \newcommand{\ch}{\ensuremath\mathcal{CH}} % \newtheorem{theorem}{Theorem}[section] % \newtheorem{lemma}{Lemma}[section] % \newtheorem{corollary}{Corollary}[section] % \newtheorem{claim}{Claim}[section] % \newtheorem{prop}{Proposition}[section] % \newtheorem{fact}{Fact}[section] \newtheorem*{problem}{Problem} \newtheorem*{defn}{Definition} % \numberwithin{equation}{section} \renewcommand{\thesection}{\arabic{section}} \begin{document} %\label{begin} \textbf{Instructions:} \begin{itemize} \item The exam MUST be submitted no later than 5pm Monday Oct 27, whether electronically or by hand. \item The exam is open-book: you may use any reference material from the textbook, or from the Cormen/Leiserson/Rivest/Stein book. You MAY NOT use answers found on the web. \item If you need a desired result (running time of Kruskal's algorithm, etc) that is not directly related to the problem you're solving, you may cite it. For example, you can say, ``We can compute an MST in $O(m\log n)$ time using Kruskal's algorithm (Kleinberg/Tardos, Chapter 4.5). Citations must come from one of the two textbooks indicated above. \item Unless clearly specified, all statements require proofs. \end{itemize} \newpage \section{Recurrences} \label{sec:recurrences} Give the tightest possible bounds for functions defined by the following recurrences. In each case, you may assume a base case of the form $f(1) = c$. Show your analysis. \numberwithin{equation}{section} \begin{equation} A(n) = A(n-1) + 1/n\label{eq:1} \end{equation} \begin{equation} B(n) = B(n/2) + A(n)\label{eq:2} \end{equation} (i.e $A(n)$ from the previous recurrence \begin{equation} T(n) = 4T(\lceil \frac{n-8}{2}\rceil + \lfloor \frac{n}{\log_\pi n} \rfloor) + 4\binom{n-2}{3} - 13 n\log^4 n + \frac{\log \log n + 1}{\log n\log\log\log n}\label{eq:3} \end{equation} \begin{equation} \label{eq:4} H(n) = 2H(n-1) - H(n-2) + n \end{equation} \newpage \section{Range Searching} \label{sec:range-searchingn} The \emph{range counting} problem in two dimensions is defined thus: we are given a collection of $n$ points $P$. A \emph{query} consists of an (infinite) line $\ell$, and the desired output is the set of all points from $P$ that lie to one side of $\ell$. For concreteness, imagine a car driving along the line from left to right: we wish to count all points the driver sees on their left side. \begin{center} \includegraphics[width=3in]{points.png} \end{center} It's fairly easy to do this in time linear in $n$: for each point we can test in constant time which side of the line its on. The goal is to get a sublinear query time. Here's a divide-and-conquer procedure that achieves this. From $P$ compute two lines $\ell_1$ and $\ell_2$. These two lines divide the plane into four parts, and have the special property that each part contains exactly $n/4$ points. Create a tree with a root that represents all of $P$, and four children, each representing the points in one of the four parts. In each node, store the number of points that it represents. Recurse. Now a query $\ell$ can be processed as follows: \begin{enumerate} \item $\ell$ intersects at most three of the four children of the root. Let these three children be $v_1, v_2, v_3$. \item Suppose one of the $v_i$ (say $v_1$) is completely contained on one side of $\ell$. Merely add the count of points represented by $v_1$ to the overall count, and discard it. \item Else, recurse in each node. \end{enumerate} How long does this procedure take ? More precisely, given $n$ points, and a data structure built as described above, how long does it take to return the desired count ? \newpage \section{Typesetting} \label{sec:typesetting} You're designing a word processor, and one of the things you have to do is justify text. In other words, you need to convert something that looks like this: \begin{verbatim} Call me Ishmael Some years ago, never mind how long precisely, having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world \end{verbatim} to this: \begin{verbatim} Call me Ishamel. Some years ago, never mind how long precisely, having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world \end{verbatim} Notice how the process smooths out the ragged right boundary (whether this is a good or not I'll leave as a judgement call!). How might we do this automatically ? Let's formalize the problem. Suppose we're given a sequence of words $w_1 \ldots w_n$, each $w_i$ consisting of $c_i$ characters. Further, assume we're working with a page width of $L$ characters (which means that no line in the output can be longer than $L$, including spaces). When we typeset the words, we'll partition the sequence into lines, and in between each pair of adjacent words on a line is a space. Thus, in order to assign words $w_i \ldots w_j$ to a single line, they must satisfy the constraint \[ c_j + \sum_{k=i}^{j-1} (c_k + 1) \le L \] We'll define the \emph{slack} of a line as the degree to which this inequality is loose: i.e right hand side of the above inequality minus the left hand side. Notice that the slack is always positive. Ideally, we'd like the slack to be as small as possible, and one way of doing this is to define the cost of a typesetting as the \emph{sum of squares} of slacks for the lines. Design an algorithm that produces a typesetting with minimum cost. \newpage \section{Simulating A Flow} \label{sec:simulating-flow} Consider the following network (numbers on edges are capacities): \begin{figure}[ht] \centering \includegraphics[width=2in]{graph} \end{figure} \begin{enumerate}[label=\alph*)] \item Simulate the Ford-Fulkerson algorithm and find the max flow and min cut. Show the working of the algorithm by showing each augmenting path. \item Draw the residual graph obtained after the algorithm is complete. Mark all vertices reachable from S, and those that can reach T. You may omit edges with zero capacity. \item An edge in the network is called a \emph{bottleneck} edge if increasing its capacity will increase the max flow. Mark all bottleneck edges in the network. \item Give an example of a network with no bottleneck edges (!). \item Give an efficient algorithm to find all bottleneck edges in a network (Hint: compute a max flow, and then look at the residual graph) \end{enumerate} \newpage \section{More Matchings} \label{sec:maximal-matchings} In class, we saw a max-flow algorithm for computing the maximum matching in a bipartite graph $G = (L \cup R, E)$. But do we need such a complicated method ? Suppose we just throw edges into our matching ``bag'' as long as we aren't stuck (i.e as long as there's a new edge not incident on any of the vertices in the bag). The resulting matching is called \emph{maximal}, because it cannot be grown any further. \paragraph*{1)} \label{sec:1} Provide an example where a maximal matching is not maximum. But all is not lost: a maximal matching can be ``close'' to a maximum matching in cardinality. In order to see this, it helps to understand what an augmenting path looks like when running the max-flow algorithm. Let's ignore the edges from $s$ and to $t$. What we have left is a path that alternates between edges not in the current matching and edges that are. The importance of such a path is this: it always has odd length (say $2k+1$), and of these edges, $k+1$ are not in the matching, and $k$ are. This means that given the current matching $M$ and such an augmenting path $P$, the new set of edges $M' = M \Delta P$ (where $A \Delta B = (A - B) \cup (B-A)$ is the symmetric difference between sets $A$ and $B$) is of size $|M'| = |M|+1$. Thus, repeatedly finding such augmenting paths and computing the resulting symmetric difference allows us to continually increase the size of the matching till we reach the maximum. \paragraph*{2)} Let $M$ and $M'$ be two matchings, where $|M| \le |M'|$. Show that in the graph induced by the edges of $M \Delta M'$, each vertex has degree at most two. Prove that this implies that this graph consists of cycles and paths, and show that it contains at least $|M'| - |M|$ vertex-disjoint augmenting paths with respect to $M$. We can construct the residual network corresponding to a maximal matching: assign a unit of flow to each edge of the maximal matching, and corresponding units of flow to the edges from $s$ and to $t$ that are incident on the vertices of the matching. \textbf{Fact:} Any augmenting path in this residual network has length at least $3$ (not counting edges adjacent to $s$ and $t$) \paragraph*{3)} \label{sec:3} Use this fact, together with the result from 2) above, to show that if $M^*$ is a \emph{maximum} matching, and $M$ is a maximal matching, then \[ |M| \ge (1/2)|M^*| \] \textbf{Hint:} Think of $M^*$ as the matching $M'$ in 2). \end{document} %%% Local Variables: %%% mode: latex %%% TeX-master: "midterm" %%% End: