Add extra notebook with Douglas-Rachford solver
Holger Kohr committed Dec 4, 2017 (1 parent 5efa27e, commit 5dfa0a3)
Showing 1 changed file with 287 additions and 0 deletions.
@@ -0,0 +1,287 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Using a different solver\n", | ||
"\n", | ||
"The steps taken in the [first notebook](TV_denoising_with_PDHG.ipynb) are largely the same for all solvers. However, the specific form\n", | ||
"\n", | ||
"$$\n", | ||
" \\min_{x \\in X} \\left[ f(L x) + g(x) \\right]\n", | ||
"$$\n", | ||
"\n", | ||
"required us to rewrite the problem using product spaces.\n", | ||
"\n", | ||
"Other optimization methods, such as the (primal-dual variants of the) [Forward-Backward](https://github.com/odlgroup/odl/blob/master/odl/solvers/nonsmooth/forward_backward.py) [BC2015] and [Douglas-Rachford](https://github.com/odlgroup/odl/blob/master/odl/solvers/nonsmooth/douglas_rachford.py) [BH2013] methods, provide alternatives to PDHG that can be easier to set up.\n", | ||
"\n", | ||
"## The Douglas-Rachford solver\n", | ||
"\n", | ||
"We will take a closer look at the Douglas-Rachford splitting based method. Without going into the mathematical details of why and how, we observe that this method can solve problems of the form\n", | ||
"\n", | ||
"$$\n", | ||
" \\min_{x \\in X} \\left[ f(x) + \\sum_{i=1}^n g_i (L_i x) \\right]\n", | ||
"$$\n", | ||
"\n", | ||
"with convex functions $f$, $g_1, \\dots, g_n$ and linear operators $L_1, \\dots, L_n$. So in contrast to the PDHG method, we may have *any number of functions composed with linear operators*, which makes it much easier to set up the problem.\n", | ||
"\n", | ||
"## Setting up the problem\n", | ||
"\n", | ||
"We consider again the TV denoising problem\n", | ||
"\n", | ||
"$$\n", | ||
" \\min_{x \\geq 0} \\left[\\| x - y \\|_2^2 + \\alpha \\| \\nabla x \\|_1 \\right],\n", | ||
"$$\n", | ||
"\n", | ||
"or, rewriting the positivity constraint,\n", | ||
"\n", | ||
"$$\n", | ||
" \\min_x \\left[ \\| x - y \\|_2^2 + \\alpha \\| \\nabla x \\|_1 + \\iota_0(x) \\right].\n", | ||
"$$\n", | ||
"\n", | ||
"This form is already in the right shape for our solver, since we can choose the functions\n", | ||
"\n", | ||
"$$\n", | ||
" f = \\iota_0,\\quad g_1 = \\|\\cdot - y\\|_2^2, \\quad g_2 = \\|\\cdot\\|_1,\n", | ||
"$$\n", | ||
"\n", | ||
"and the linear operators\n", | ||
"\n", | ||
"$$\n", | ||
" L_1 = I, \\quad L_2 = \\nabla.\n", | ||
"$$\n", | ||
"\n", | ||
"## Parameter selection\n", | ||
"\n", | ||
"In exchange for the easy problem setup, we somewhat trade the simplicity of finding good optimization parameters. The Douglas-Rachford method is guaranteed to converge if $\\tau$ and $\\sigma_1, \\dots, \\sigma_n$ are chosen such that\n", | ||
"\n", | ||
"$$\n", | ||
" \\tau \\sum_{i=1}^n \\sigma_i \\|L_i\\|^2 < 4.\n", | ||
"$$\n", | ||
"\n", | ||
"In our case, we have three parameters $\\tau, \\sigma_1, \\sigma_2$, which makes it a bit more complicated to pick good values compared to PDHG with two parameters. As a rule of thumb, it often makes sense to have the $\\sigma_i$ parameters balance the operator norms, i.e., to choose\n", | ||
"\n", | ||
"$$\n", | ||
" \\sigma_i = c \\|L_i\\|^{-2},\n", | ||
"$$\n", | ||
"\n", | ||
"and to select $\\tau$ such that the sum converges, i.e.,\n", | ||
"\n", | ||
"$$\n", | ||
" \\tau < \\frac{4}{nc}.\n", | ||
"$$\n", | ||
"\n", | ||
"It also makes sense to push the keep the sum in the order of 1 since otherwise, the step sizes are too small. Overall selecting good parameters is a bit of a trial-and-error procedure." | ||
] | ||
}, | ||
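The step size rule above can be sanity-checked with plain arithmetic. The sketch below uses made-up operator norms (the identity has norm 1; the gradient norm is a placeholder, not computed from the actual problem):

```python
# Sanity check of the Douglas-Rachford step size rule
# tau * sum_i sigma_i * ||L_i||^2 < 4, using hypothetical operator norms.
opnorms = [1.0, 3.3]  # placeholder norms: identity, gradient (assumed)
n = len(opnorms)

c = 1.5  # free scaling constant for the sigmas
sigmas = [c / nrm ** 2 for nrm in opnorms]  # balance the operator norms
tau = 3.9 / (n * c)  # just below the bound 4 / (n * c)

# Each term sigma_i * ||L_i||^2 equals c, so the sum is n * c and the
# check value is tau * n * c = 3.9 < 4.
check = tau * sum(s * nrm ** 2 for s, nrm in zip(sigmas, opnorms))
print(round(check, 6))
```

With the balanced choice of $\sigma_i$, the check value depends only on $\tau$, $n$ and $c$, which is what makes this parametrization convenient to tune.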
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Implementation\n", | ||
"\n", | ||
"We consider the same scenario as in the [first notebook](TV_denoising_with_PDHG.ipynb). As before, we define a reconstruction space $X$ using uniform sampling on the rectangle $[0, N]$" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%matplotlib inline\n", | ||
"import numpy as np\n", | ||
"import scipy.misc\n", | ||
"\n", | ||
"import odl\n", | ||
"\n", | ||
"# Generate test image\n", | ||
"image = scipy.misc.ascent().astype('float32')\n", | ||
"\n", | ||
"# Create reconstruction space\n", | ||
"shape = image.T.shape\n", | ||
"X = odl.uniform_discr(min_pt=[0, 0], max_pt=shape, shape=shape)\n", | ||
"\n", | ||
"# Wrap image as space element, generate noisy variant and display\n", | ||
"image /= image.max()\n", | ||
"x_true = X.element(np.rot90(image, -1))\n", | ||
"# To get predictable randomness, we explicitly seed the random number generator\n", | ||
"with odl.util.NumpyRandomSeed(123):\n", | ||
" y = x_true + 0.1 * odl.phantom.white_noise(X)\n", | ||
" \n", | ||
"_ = x_true.show(title='Original image (x_true)')\n", | ||
"_ = y.show(title='Noisy image (y)')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now we set up the functions and the linear operators:\n", | ||
"\n", | ||
"$$\n", | ||
" f = \\iota_0,\\quad g_1 = \\|\\cdot - y\\|_2^2, \\quad g_2 = \\|\\cdot\\|_1, \\\\\n", | ||
" L_1 = I, \\quad L_2 = \\nabla.\n", | ||
"$$" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"ident = odl.IdentityOperator(X)\n", | ||
"grad = odl.Gradient(X) # need this here for L1Norm below\n", | ||
"\n", | ||
"# Function without linear operator\n", | ||
"f = odl.solvers.IndicatorNonnegativity(X)\n", | ||
"\n", | ||
"# Functions to be composed with linear operators. L[i] applies to g[i].\n", | ||
"alpha = 0.15\n", | ||
"g = [odl.solvers.L2NormSquared(X).translated(y),\n", | ||
" alpha * odl.solvers.L1Norm(grad.range)]\n", | ||
"L = [ident, grad]\n", | ||
"\n", | ||
"# We check if everything makes sense by evaluating the total functional at 0\n", | ||
"x = X.zero()\n", | ||
"print(f(x) + sum(g[i](L[i](x)) for i in range(len(g))))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now we set the $\\tau$ and $\\sigma_i$ parameters for the optimization. We use a small function to make it easier to check if the parameters work. To make the parameter search space smaller, we choose $\\tau$ (primal step size) and select\n", | ||
"\n", | ||
"$$\n", | ||
" \\sigma_i = \\frac{c}{\\|L_i\\|^2}, \\quad c = \\frac{3}{n \\tau}\n", | ||
"$$\n", | ||
"\n", | ||
"with $n = 2$ in our case." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"grad_norm = 1.1 * odl.power_method_opnorm(grad, xstart=y, maxiter=20)\n", | ||
"opnorms = [1, grad_norm] # identity has norm 1\n", | ||
"\n", | ||
"\n", | ||
"def check_params(tau, sigmas):\n", | ||
" sum_part = sum(sigma * opnorm ** 2\n", | ||
" for sigma, opnorm in zip(sigmas, opnorms))\n", | ||
" print('Sum evaluates to', sum_part)\n", | ||
" check_value = tau * sum_part\n", | ||
" \n", | ||
" assert check_value < 4, 'value must be < 4, got {}'.format(check_value)\n", | ||
" print('Values ok, check evaluates to {}, must be < 4'.format(check_value))\n", | ||
" \n", | ||
"\n", | ||
"tau = 1.5\n", | ||
"c = 3.0 / (len(opnorms) * tau)\n", | ||
"sigmas = [c / opnorm ** 2 for opnorm in opnorms]\n", | ||
"check_params(tau, sigmas)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now we're ready to run the solver!" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Starting point\n", | ||
"x = X.zero()\n", | ||
"\n", | ||
"# Run PDHG method. The vector `x` is updated in-place.\n", | ||
"odl.solvers.douglas_rachford_pd(x, f, g, L, tau, sigmas, niter=200)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We again visually inspect the results and evaluate some quality metrics from `odl.contrib.fom`:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"_ = x_true.show('True image')\n", | ||
"_ = y.show('Noisy image')\n", | ||
"_ = x.show('Denoised image')\n", | ||
"_ = (x_true - x).show('Difference true - denoised')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from odl.contrib import fom\n", | ||
"\n", | ||
"print('Noisy')\n", | ||
"print('-----')\n", | ||
"print('Mean squared error:', fom.mean_squared_error(y, x_true))\n", | ||
"print('PSNR:', fom.psnr(y, x_true))\n", | ||
"print('SSIM:', fom.ssim(y, x_true))\n", | ||
"print('')\n", | ||
"\n", | ||
"print('Denoised')\n", | ||
"print('--------')\n", | ||
"print('Mean squared error:', fom.mean_squared_error(x, x_true))\n", | ||
"print('PSNR:', fom.psnr(x, x_true))\n", | ||
"print('SSIM:', fom.ssim(x, x_true))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## References\n", | ||
"\n", | ||
"[BC2015] Bot, R I, and Csetnek, E R. *On the convergence rate of a forward-backward type primal-dual splitting algorithm for convex optimization problems*. Optimization, 64.1 (2015), pp 5--23.\n", | ||
"\n", | ||
"[BH2013] Bot, R I, and Hendrich, C. *A Douglas-Rachford type primal-dual method for solving inclusions with mixtures of composite and parallel-sum type monotone operators*. SIAM Journal on Optimization, 23.4 (2013), pp 2541--2565." | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.6.2" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |