Report drafting, second chapter finished
frblazquez committed Jan 11, 2021
1 parent 1e87ddf commit 8be8dc5
Showing 4 changed files with 68 additions and 55 deletions.
Binary file modified Report/Graver_Basis.pdf
Binary file not shown.
11 changes: 6 additions & 5 deletions Report/Graver_Basis_tex/Bibliography.bib
Original file line number Diff line number Diff line change
Expand Up @@ -84,12 +84,13 @@ @inproceedings{ONN:2007
location = {Israel Institute of Technology, Haifa, Israel},
}
@inproceedings{NAKAB:1986,
author = {N. Alon and K. A. Berman},
title = {Regular Hypergraphs, Gordon’s lemma, Steinitz’ Lemma and Invariant theory},
year = {1986},
location = {Massachusetts Institute of Technology, Cambridge, Massachusetts},
@inproceedings{ONN:2010,
author = {Shmuel Onn},
title = {Nonlinear discrete optimization},
year = {2010},
location = {Israel Institute of Technology, Haifa, Israel},
}
4 changes: 2 additions & 2 deletions Report/Graver_Basis_tex/Thesis.tex
Original file line number Diff line number Diff line change
Expand Up @@ -176,7 +176,7 @@
%% ----------------------------------------------------------------
%% DEDICATION
%% ----------------------------------------------------------------
%\begin{comment}
\begin{comment}
% Insert empty page / blank page
%\thispagestyle{empty}
%\null
Expand All @@ -194,7 +194,7 @@
}

\addtocontents{toc}{\vspace{2em}} % Add a gap in the Contents, for aesthetics
%\end{comment}
\end{comment}

%% ----------------------------------------------------------------
%% ACKNOWLEDGEMENTS
Expand Down
108 changes: 60 additions & 48 deletions Report/Graver_Basis_tex/src/2.GraverBasis.tex
Original file line number Diff line number Diff line change
Expand Up @@ -2,116 +2,128 @@ \chapter{Graver bases} \label{literature}

\lhead{\emph{Graver bases}} % Set the left side page header

%\begin{definition}
%Two vectors $u,v \in \mathbb{R}^n$ are said to be \textbf{sign compatible} if $u_i \cdot v_i \geq 0$ for all $i \in \{1,...,n\}$, i.e. they have the same sign componentwise.
%\end{definition}

Before introducing the concept of Graver basis of a matrix, we define a partial order $\sqsubseteq$ in $\mathbb{R}^n$ by $u \sqsubseteq v$ if $u_i \cdot v_i \geq 0$ and $|u_i| \leq |v_i|$ for all i. Note that the condition $u_i \cdot v_i \geq 0$ means that $\sqsubseteq$ can only compare vectors with the same sign componentwise. The Graver basis of a matrix is the set of minimal elements (for this order $\sqsubseteq$) in its integral kernel excluding zero. Formally:
Before introducing the concept of Graver basis of a matrix, we define a partial order $\sqsubseteq$ on $\mathbb{R}^n$ by $u \sqsubseteq v$ if $u_i \cdot v_i \geq 0$ and $|u_i| \leq |v_i|$ for all $i$. Note that the condition $u_i \cdot v_i \geq 0$ means that $\sqsubseteq$ can only compare \textit{sign compatible} vectors, i.e., vectors with the same sign componentwise. The Graver basis of a matrix is the set of minimal elements (for this order $\sqsubseteq$) of its integral kernel excluding zero. Formally:

\begin{definition}[\textbf{Graver basis}]
The Graver basis $\mathcal{G}(A)$ of a given matrix $A \in \mathbb{Z}^{m\times n}$ is defined as the set of $\sqsubseteq$-minimal elements in $\{z \in \mathbb{Z}^n : Az = 0,\ z \neq 0\}$.
\end{definition}

Graver bases were initially defined as \textit{universal integral test set} in \cite{GRAVER:1975} by Jack. E. Graver, in 1975. They often appear also defined in an equivalent way as the nonzero indecomposable elements in $ker(A)$. Indecomposable in the sense that they can not be expressed as the sum of two vectors with the same sign componentwise. It's easy to see the equivalence of both definitions.
\vspace{-5pt}
Graver bases were initially defined as a \textit{universal integral test set} by Jack E. Graver in 1975 \cite{GRAVER:1975}. They also often appear defined in an equivalent way, as the nonzero indecomposable elements of $ker(A)$: indecomposable in the sense that they cannot be expressed as the sum of two sign compatible nonzero vectors. It is easy to see that both definitions are equivalent.
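To make the definition concrete, the $\sqsubseteq$-minimal nonzero integer kernel elements of a tiny matrix can be enumerated by brute force. The following sketch is our own illustration (not part of the thesis code); it only finds the Graver elements whose entries fit inside the search box `box`:

```python
from itertools import product

def conforms(u, v):
    # u ⊑ v: sign compatible (u_i * v_i >= 0) and |u_i| <= |v_i| componentwise
    return all(a * b >= 0 and abs(a) <= abs(b) for a, b in zip(u, v))

def graver_brute_force(A, box=3):
    """⊑-minimal nonzero integer kernel elements with entries in [-box, box]."""
    n = len(A[0])
    kernel = [z for z in product(range(-box, box + 1), repeat=n)
              if any(z)
              and all(sum(a * x for a, x in zip(row, z)) == 0 for row in A)]
    return {z for z in kernel
            if not any(u != z and conforms(u, z) for u in kernel)}

# The kernel of A = (1 -1) is spanned by (1, 1); its Graver basis is ±(1, 1).
print(graver_brute_force([[1, -1]]))  # the set {(1, 1), (-1, -1)}
```

For serious instances one would use a dedicated tool such as 4ti2 instead of enumeration.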

%\begin{definition}
%A vector $u \in ker(A)$ is \textbf{indecomposable} if it is not the sum of two sign compatible and non zero elements in $ker(A)$.
%\end{definition}

%\begin{definition}[\textbf{Graver basis}]
%The Graver Basis of a given matrix $A \in \mathbb{Z}^{mxn}$ is defined as the set of integral indecomposable elements in the kernel of A.\\
%(Initially defined as \textit{universal integral test set} in [Graver 1975])
%\end{definition}

%The equivalence of both definitions is clear since decomposing a vector as the sum of two sign compatible nonzero elements $v_1, v_2$ implies that it's not minimal as $v_1, v_2$ are lower.

%\section{Graver Basis properties}
Now that Graver bases are formally defined, we present their main properties in the form of propositions, which will serve as the theoretical basis for the algorithms presented in the next sections.

\begin{proposition}
For every matrix $A$, $\mathcal{G}(A)$ is a finite set.
\end{proposition}
\vspace{-20pt}
\begin{proof}
%Gordan's lemma implies that every subset of $\mathbb{Z}^n$ has a finite number of $\sqsubseteq$-minimal elements. This result can be seen in \cite{NAKAB:1986}.
Dickson's lemma states that every subset of $\mathbb{N}^n$ has a finite number of minimal elements (for the componentwise order $\leq$). It is easy to see that this implies that the integral kernel of $A$ (excluding zero) has a finite number of $\sqsubseteq$-minimal elements in every orthant. As elements in different orthants are not comparable, $\mathcal{G}(A)$ is the union of $2^n$ finite sets, concluding the proof.
\end{proof}

% TODO: Not really true!!
% Dickson's lemma better? (Shmuel Onn - Convex integer programming - 2007)
Unfortunately, the cardinality of $\mathcal{G}(A)$ may be exponential in $n$, the number of columns of $A$. This limits the explicit computation and usage of the Graver basis to certain cases, but it does not limit its theoretical properties. The most important of these is expressed in the following proposition:

\begin{proposition}
Every integral element in ker(A) can be expressed as positive integral linear combination of sign compatible elements in $Gr(A)$.
Every integral element of $ker(A)$ can be expressed as a positive integral linear combination of sign compatible elements of $\mathcal{G}(A)$.
\end{proposition}
% TODO: Is it necessary to require them to be sign compatible??
% TODO: Say that this makes Graver basis be a universal integral test set!
\vspace{-20pt}
% TODO: Review!
% TODO: Compare with proof of lemma 4.2 [Onn-Convex_IP-2007]!
\begin{proof}
The proof is by induction on the $\ell_1$ norm. For the base case, given $u \in \mathbb{Z}^n\cap ker(A)$ with $||u||_1 = 1$, $u$ itself belongs to $\mathcal{G}(A)$ and the result holds.

For the induction step, suppose the result holds for all vectors of norm smaller than $k$ and take $u \in \mathbb{Z}^n\cap ker(A)$ with $||u||_1 = k$. Again, if $u$ is minimal in $\mathbb{Z}^n\cap ker(A) \setminus \{0\}$ the result is clear, so suppose this is not the case. Then there exists a nonzero $u_1$ such that $u_1 \sqsubseteq u$, $u_1 \neq u$. We take $u_2 = u - u_1$. Note that, by the definition of $\sqsubseteq$, the vectors $u$, $u_1$ and $u_2$ are sign compatible and, since $0 \neq u_1 \neq u$, necessarily $||u_1||_1, ||u_2||_1 < k$. With these observations, the proof concludes after applying the induction hypothesis:\\
\vspace{-30pt}
\begin{center}
$u = u_1 + u_2 = \sum \alpha_{1i}g_{1i} + \sum \alpha_{2i}g_{2i} = \sum \alpha_{j}g_{j}$
\end{center}
\end{proof}

This proposition is the reason why Graver bases were introduced as a \textit{universal integral test set}: it ensures that, given any feasible point, the whole feasible region can be expressed in terms of elements of $\mathcal{G}(A)$. Note that, by requiring positive coefficients and sign compatible elements, we avoid cancellations in every component.
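The inductive argument above suggests a greedy decomposition: repeatedly subtract any Graver element $g \sqsubseteq u$ from $u$ until it is exhausted. A minimal sketch (our own names; the Graver basis is assumed given):

```python
def conforms(u, v):
    # u ⊑ v: sign compatible and componentwise |u_i| <= |v_i|
    return all(a * b >= 0 and abs(a) <= abs(b) for a, b in zip(u, v))

def decompose(u, graver):
    """Write u ∈ ker(A) as a sum of sign compatible Graver elements.

    Each step subtracts some g ⊑ u (one exists by the proposition above);
    the ℓ1 norm strictly decreases, so the loop terminates.
    """
    parts = []
    u = tuple(u)
    while any(u):
        g = next(g for g in graver if conforms(g, u))
        parts.append(g)
        u = tuple(a - b for a, b in zip(u, g))
    return parts

# Graver basis of A = (1 -1) is {±(1, 1)}; decompose u = (3, 3).
print(decompose((3, 3), [(1, 1), (-1, -1)]))  # [(1, 1), (1, 1), (1, 1)]
```

Because every subtracted $g$ conforms to the remainder, no component ever changes sign, matching the no-cancellation remark above.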

In the next proposition we see how, thanks to this property, we can do an optimality test for any feasible point using only elements in the Graver basis.

\begin{proposition}
Given z in the feasible region of an IP, z is not optimum if and only if there exists $g \in Gr(A)$ s.t. $c^tg > 0$ and $l \leq z + g \leq u$
Given a feasible point $z$ of the IP, $z$ is not optimal if and only if there exists $g \in \mathcal{G}(A)$ such that $c^tg > 0$ and $l \leq z + g \leq u$.
\end{proposition}

\vspace{-20pt}
\begin{proof}
Lets suppose first that a feasible point $z$ is not an optimum, then $z^* - z$ belongs to $ker(A)\setminus\{0\}$. Thanks to the previous proposition we have $g_i \in G(A)$, $\alpha_i \geq 0$ s. t. $0 < c^t(z^* - z) = \sum \alpha_i c^t g_i$ and it's then clear that exists at least one $g_i \in G(A)$ verifying $c^tg_i > 0$ and respecting the bounds.
% TODO: Clarify the respecting the bounds part
If there exists $g \in \mathcal{G}(A)$ (therefore $g \in ker(A)$) such that $c^tg > 0$ and $l \leq z + g \leq u$, it is clear that $z + g$ is a feasible point which strictly improves the objective function, so $z$ is not optimal.

For the other implication is clear that $z + g$ is a feasible point which improves the objective function so $z$ is not an optimum.
For the other implication, if $z$ is not optimal we can take a feasible point $y$ improving $z$. Then $y - z$ verifies the hypothesis of the previous proposition, so there exist $g_i \in \mathcal{G}(A)$, $\alpha_i \geq 0$ such that $0 < c^t(y - z) = \sum \alpha_i c^t g_i$, and it is then clear that there exists at least one $g_i \in \mathcal{G}(A)$ verifying $c^tg_i > 0$. Finally, since $\alpha_i \geq 0$ and each $g_i$ is sign compatible with $y - z$, every component of $z + g_i$ lies between the corresponding components of $z$ and $y$, so $l \leq z + g_i \leq u$.
\end{proof}
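The optimality test of the proposition translates directly into a scan over $\mathcal{G}(A)$. A sketch under our own naming (returns an improving Graver element, or None when $z$ is optimal):

```python
def augmenting_step(z, c, l, u, graver):
    """z is optimal iff no g in G(A) has c·g > 0 with l <= z + g <= u."""
    best, best_gain = None, 0
    for g in graver:
        gain = sum(ci * gi for ci, gi in zip(c, g))
        feasible = all(li <= zi + gi <= ui
                       for li, zi, gi, ui in zip(l, z, g, u))
        if feasible and gain > best_gain:
            best, best_gain = g, gain
    return best

# maximize x1 + x2 with x1 - x2 = 0, 0 <= x <= 5, at z = (0, 0):
print(augmenting_step((0, 0), (1, 1), (0, 0), (5, 5), [(1, 1), (-1, -1)]))
# -> (1, 1)
```

Returning the gain-maximizing $g$ (rather than the first improving one) is exactly the subproblem solved in the greedy algorithm of the next section.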

\section{Graver Basis greedy algorithm}
\section{Graver Basis greedy augmentation algorithm}

We now consider how to solve the general IP with the help of Graver bases. Note that proposition 2.4 not only gives us an optimality test but also provides an improvement direction when the feasible point is not optimal. We can follow that improvement direction to get a better feasible point and then repeat the process. That is the idea of the following procedure (introduced in \cite{GRAVER:1975}):

% TODO: Take references from De Loera et al. 2006!
\textbf{General IP algorithm using Graver basis}
\vspace{-8pt}
\begin{enumerate}
\item Start from a feasible solution $z_i$.
\item Find $g^*$ optimum for the sub-problem: \vspace{4pt}\\
$max\{c^tg : g \in Gr(A), l \leq z_i + g \leq u \}$ \vspace{4pt}
$max\{c^tg : g \in \mathcal{G}(A), l \leq z_i + g \leq u \}$ \vspace{4pt}
\begin{itemize}
\item $c^tg^* \leq 0 \implies z_i$ optimal solution.
\item $c^tg^* > 0 \implies$ $g^*$ improvement direction, loop back to 1 with $z_{i+1} = z_i + \lambda \cdot g^*$ with the biggest $\lambda$ respecting the bounds.
\item $c^tg^* > 0 \implies$ $g^*$ improvement direction, loop back to 1 with:\\ $z_{i+1} = z_i + \lambda \cdot g^*$ with the biggest $\lambda$ respecting the bounds.
\end{itemize}
\end{enumerate}
\hspace{15pt} [References??]
%\hspace{15pt} [References??]
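The loop above can be sketched as follows (our own Python sketch with hypothetical names; the Graver basis is assumed given as input):

```python
def best_step(z, c, l, u, graver):
    # g* maximizing c·g over g in G(A) with l <= z + g <= u (None if no gain)
    best, gain = None, 0
    for g in graver:
        v = sum(ci * gi for ci, gi in zip(c, g))
        ok = all(li <= zi + gi <= ui for li, zi, gi, ui in zip(l, z, g, u))
        if ok and v > gain:
            best, gain = g, v
    return best

def step_length(z, g, l, u):
    # largest integer λ >= 1 with l <= z + λ·g <= u
    lam = 1
    while all(li <= zi + (lam + 1) * gi <= ui
              for li, zi, gi, ui in zip(l, z, g, u)):
        lam += 1
    return lam

def graver_augment(z, c, l, u, graver):
    """Greedy augmentation: step along a best g* until no g improves c·z."""
    z = tuple(z)
    while (g := best_step(z, c, l, u, graver)) is not None:
        lam = step_length(z, g, l, u)
        z = tuple(zi + lam * gi for zi, gi in zip(z, g))
    return z

# max x1 + x2 s.t. x1 - x2 = 0, 0 <= x <= 5, starting from z = (0, 0):
print(graver_augment((0, 0), (1, 1), (0, 0), (5, 5), [(1, 1), (-1, -1)]))
# -> (5, 5)
```

Taking the largest feasible $\lambda$ at each step is what makes the number of augmentation steps polynomially bounded, as discussed below.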

% TODO: Problem, requires Graver Basis computation!
% TODO: Explain that the complexity is polynomial!
We can affirm that this is an algorithm, i.e., that it finishes in a finite number of steps, thanks to the lower and upper bounds $l$ and $u$. We can assume they are finite and, this way, the objective function is also bounded; since each iteration strictly increases the objective function, no infinite loop is possible. As is well known, it is always possible to add suitable polynomial upper and lower bounds without excluding all optimal solutions (if any exist), so assuming $l$ and $u$ to be finite is no loss of generality.

The question that arises now is which is the complexity of this algorithm. \cite{HOW:2009} (Theorem 2.b) states that the number of augmentation steps is polynomial and, since the cost of each augmentation step is in the order of $|G(A)|\times n$, we have that this algorithm is polynomial. Also \cite{LHOW:2006} states this (but the proof is more complicated).
% TODO: href style!
% TODO: Maybe proposition with the complexity ??
% TODO: Improve \cite{...} (Theorem x.y) ??
The question that arises now is the complexity of this algorithm. It was analyzed in \cite{LHOW:2006} (Theorem 3.3), which shows that it is polynomial. This of course doesn't mean we have a polynomial algorithm for the general IP; it means that, given an IP along with its Graver basis, we have an algorithm which is polynomial in this input size. The hardness of the problem remains in computing the Graver basis which, as we announced, may be of exponential size. This makes the algorithm non-viable except for small matrices. In Appendix A we go further, analyzing how to compute the Graver basis of a given matrix, and we introduce the tool \href{https://4ti2.github.io/}{4ti2}.

% TODO: Putting these together (instead of it's clear then)??
% TODO: Textbf for the complexity (last sentence) ??
Another way to estimate the complexity of the algorithm is via \cite{HOW:2009} (Theorem 2.b), which states that the number of augmentation steps is polynomial. Since, once the Graver basis is obtained, the cost of each augmentation step is of order $n \cdot |\mathcal{G}(A)|$ (a search over $\mathcal{G}(A)$), it is clear that the algorithm is polynomial in $n \cdot |\mathcal{G}(A)|$.

This of course doesn't mean we have a polynomial algorithm for the general IP because the trick is that the Graver Basis is given as part of the input. The problem is of course computing it and, in most of the cases, its size is exponential in the dimension.


\section{Graver Basis norm bounds}

Up to this point we have seen how Graver bases allow a straightforward algorithm for the general IP, whose main drawback is that it requires the explicit computation of the Graver basis. In this section we show how to avoid computing it, thanks to bounds on the $\ell_1$-norm of the Graver basis elements.

\begin{proposition}[\textbf{Graver basis bounds}]
Given $A \in \mathbb{Z}^{mxn}$ and $\Delta$ an upper bound for the absolute value of each component of $A$, for every $g \in Gr(A)$:
Given $A \in \mathbb{Z}^{m\times n}$ and $\Delta$ an upper bound for the absolute value of each component of $A$, for every $g \in \mathcal{G}(A)$:
\vspace{-10pt}
\begin{itemize}
\item $||g||_1 \leq m^{m/2}\Delta^m\cdot(n - m)$ \hspace{10pt}\cite{ONN:2010}
\item $||g||_1 \leq (2m \Delta + 1)^m$ \hspace{41pt}\cite{EISENBRAND:2018}
\end{itemize}
\end{proposition}

Unfortunately this bounds are both exponential. The second one has the advantage of being n-independent. In certain cases we can get a much tighter bound for the Graver Basis elements and this can help us to get a faster algorithm. The key ideas are the following points.
We refer to \cite{ONN:2010} and \cite{EISENBRAND:2018} for the proofs. Note that both bounds are exponential in the number of rows of $A$, but the second has the advantage of being independent of the number of columns.
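To get a feeling for how the two bounds compare, one can simply evaluate both formulas for small values ($m$, $n$, $\Delta$ below are our own illustrative numbers):

```python
def bound_onn(m, n, delta):
    # m^(m/2) * Δ^m * (n - m)
    return m ** (m / 2) * delta ** m * (n - m)

def bound_ehk(m, delta):
    # (2mΔ + 1)^m, independent of n
    return (2 * m * delta + 1) ** m

for m, n, delta in [(2, 6, 1), (3, 10, 2)]:
    print(m, n, delta, bound_onn(m, n, delta), bound_ehk(m, delta))
```

For small $n$ the first bound can be tighter, while for wide matrices (large $n$, fixed $m$) the second eventually wins.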

\textbf{Bases of augmentation algorithm}
\begin{itemize}
\item If not optimal, an element in Graver basis is an improvement direction.
\item If Graver basis bounded, we can restrict our improvement direction search.
\end{itemize}
Why should bounds on the Graver basis help? Because, thanks to Proposition 2.4, the search for an improvement direction can be restricted to the elements of the Graver basis and, thanks to the bounds, we can restrict our search space without excluding any element of the Graver basis. This is the idea of the following algorithm (see \cite{HEMMECKE:2013}).

%\textbf{Bases of augmentation algorithm}
%\begin{itemize}
% \item If not optimal, an element in Graver basis is an improvement direction.
% \item If Graver basis bounded, we can restrict our improvement direction search.
%\end{itemize}

\textbf{General IP algorithm using Graver basis norm bound}
\vspace{-8pt}
\begin{enumerate}
\item Start from a feasible solution $z_i$.
\item Find $g^*$ optimum for the sub-problem: \vspace{4pt}\\
$max\{c^tg : Ag = 0, l-z_i \leq g \leq u-z_i, g \in \mathbb{Z}^n, ||g||_1 \leq ||Gr(A)|| \}$ \vspace{4pt}
$max\{c^tg : Ag = 0, l-z_i \leq g \leq u-z_i, g \in \mathbb{Z}^n, ||g||_1 \leq ||\mathcal{G}(A)|| \}$ \vspace{4pt}
\begin{itemize}
\item $g^* = 0 \implies z_i$ optimal solution.
\item $g^* \neq 0 \implies$ $g^*$ improvement direction, loop back to 1 with $z_{i+1} = z_i + \lambda \cdot g^*$ with the biggest $\lambda$ respecting the bounds.
\item $g^* \neq 0 \implies$ $g^*$ improvement direction, loop back to 1 with:\\
$z_{i+1} = z_i + \lambda \cdot g^*$ with the biggest $\lambda$ respecting the bounds.
\end{itemize}
\end{enumerate}
\hspace{15pt} [Hemmecke, Onn, Romanchuk 2013]
%\hspace{15pt} [Hemmecke, Onn, Romanchuk 2013]
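A brute-force sketch of the bounded subproblem (our own illustration; in practice this inner search is itself solved as an IP, and `norm_bound` stands for the norm bound $||\mathcal{G}(A)||$):

```python
from itertools import product

def bounded_best_step(A, z, c, l, u, norm_bound):
    """Best g with Ag = 0, l <= z + g <= u, ||g||_1 <= norm_bound.

    No Graver basis is needed: the norm bound replaces the search over G(A).
    """
    n = len(z)
    best, gain = None, 0
    for g in product(range(-norm_bound, norm_bound + 1), repeat=n):
        if sum(abs(x) for x in g) > norm_bound:
            continue  # outside the ℓ1 ball
        if any(sum(aij * gj for aij, gj in zip(row, g)) != 0 for row in A):
            continue  # not in ker(A)
        if not all(li <= zi + gi <= ui for li, zi, gi, ui in zip(l, z, g, u)):
            continue  # violates the box bounds
        v = sum(ci * gi for ci, gi in zip(c, g))
        if v > gain:
            best, gain = g, v
    return best  # None means z is optimal

# A = (1 -1), z = (0, 0), maximize x1 + x2 within 0 <= x <= 5, ||g||_1 <= 2:
print(bounded_best_step([[1, -1]], (0, 0), (1, 1), (0, 0), (5, 5), 2))
# -> (1, 1)
```

The enumeration makes the exponential dependence on the norm bound explicit, which is exactly the drawback discussed next.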

% TODO: Solution, doesn't require Graver Basis computation
% TODO: Problem, not good bounds for the general case, no improvement

The main advantage of this algorithm is that it doesn't require the explicit computation of the Graver Basis. However, the main drawback is that in general the bound for the graver bassis elements also increases exponentially with the dimension so this additional restriction to the problem won't be a help.
As we anticipated, the main advantage of this algorithm is that it doesn't require the explicit computation of the Graver basis. However, its complexity depends entirely on the added restriction $||g||_1 \leq ||\mathcal{G}(A)||$, and the only bounds we have for the general case are exponential. In that case, the additional restriction doesn't improve on the lower and upper bounds and the complexity remains exponential.

In certain cases we can get a much tighter bound on the Graver basis elements, and this can help us obtain a faster algorithm. The N-fold IP is an iconic example.
