From 5dfa0a3b2e575ec45fc6f782a5c611ce10c54cbb Mon Sep 17 00:00:00 2001
From: Holger Kohr
Date: Mon, 4 Dec 2017 10:52:16 +0100
Subject: [PATCH] Add extra notebook with Douglas-Rachford solver

---
 code/Using_a_different_solver.ipynb | 302 ++++++++++++++++++++++++++++
 1 file changed, 302 insertions(+)
 create mode 100644 code/Using_a_different_solver.ipynb

diff --git a/code/Using_a_different_solver.ipynb b/code/Using_a_different_solver.ipynb
new file mode 100644
index 0000000..71bf933
--- /dev/null
+++ b/code/Using_a_different_solver.ipynb
@@ -0,0 +1,302 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Using a different solver\n",
+    "\n",
+    "The steps taken in the [first notebook](TV_denoising_with_PDHG.ipynb) are largely the same for all solvers. However, the specific form\n",
+    "\n",
+    "$$\n",
+    " \\min_{x \\in X} \\left[ f(L x) + g(x) \\right]\n",
+    "$$\n",
+    "\n",
+    "required us to rewrite the problem using product spaces.\n",
+    "\n",
+    "Other optimization methods, such as the (primal-dual variants of the) [Forward-Backward](https://github.com/odlgroup/odl/blob/master/odl/solvers/nonsmooth/forward_backward.py) [BC2015] and [Douglas-Rachford](https://github.com/odlgroup/odl/blob/master/odl/solvers/nonsmooth/douglas_rachford.py) [BH2013] methods, provide alternatives to PDHG that can be easier to set up.\n",
+    "\n",
+    "## The Douglas-Rachford solver\n",
+    "\n",
+    "We will take a closer look at the method based on Douglas-Rachford splitting. Without going into the mathematical details of why and how, we observe that this method can solve problems of the form\n",
+    "\n",
+    "$$\n",
+    " \\min_{x \\in X} \\left[ f(x) + \\sum_{i=1}^n g_i (L_i x) \\right]\n",
+    "$$\n",
+    "\n",
+    "with convex functions $f$, $g_1, \\dots, g_n$ and linear operators $L_1, \\dots, L_n$. In contrast to the PDHG method, we may thus have *any number of functions composed with linear operators*, which makes it much easier to set up the problem.\n",
+    "\n",
+    "## Setting up the problem\n",
+    "\n",
+    "We consider again the TV denoising problem\n",
+    "\n",
+    "$$\n",
+    " \\min_{x \\geq 0} \\left[\\| x - y \\|_2^2 + \\alpha \\| \\nabla x \\|_1 \\right],\n",
+    "$$\n",
+    "\n",
+    "or, rewriting the positivity constraint in terms of the indicator function $\\iota_0$ of the set $\\{x : x \\geq 0\\}$,\n",
+    "\n",
+    "$$\n",
+    " \\min_x \\left[ \\| x - y \\|_2^2 + \\alpha \\| \\nabla x \\|_1 + \\iota_0(x) \\right].\n",
+    "$$\n",
+    "\n",
+    "This form is already in the right shape for our solver, since we can choose the functions\n",
+    "\n",
+    "$$\n",
+    " f = \\iota_0,\\quad g_1 = \\|\\cdot - y\\|_2^2, \\quad g_2 = \\alpha \\|\\cdot\\|_1,\n",
+    "$$\n",
+    "\n",
+    "and the linear operators\n",
+    "\n",
+    "$$\n",
+    " L_1 = I, \\quad L_2 = \\nabla.\n",
+    "$$\n",
+    "\n",
+    "## Parameter selection\n",
+    "\n",
+    "In exchange for the easier problem setup, we give up some of the simplicity of finding good optimization parameters. The Douglas-Rachford method is guaranteed to converge if $\\tau$ and $\\sigma_1, \\dots, \\sigma_n$ are chosen such that\n",
+    "\n",
+    "$$\n",
+    " \\tau \\sum_{i=1}^n \\sigma_i \\|L_i\\|^2 < 4.\n",
+    "$$\n",
+    "\n",
+    "In our case, we have three parameters $\\tau, \\sigma_1, \\sigma_2$, which makes it a bit more complicated to pick good values than for PDHG with its two parameters.\n",
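+    "\n",
+    "To make the condition concrete for our problem, we can spell it out for the operators chosen above; since $\\|I\\| = 1$, it reads\n",
+    "\n",
+    "$$\n",
+    " \\tau \\left( \\sigma_1 + \\sigma_2 \\|\\nabla\\|^2 \\right) < 4.\n",
+    "$$\n",
+    "\n",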
+    "As a rule of thumb, it often makes sense to have the $\\sigma_i$ parameters balance the operator norms, i.e., to choose\n",
+    "\n",
+    "$$\n",
+    " \\sigma_i = c \\|L_i\\|^{-2},\n",
+    "$$\n",
+    "\n",
+    "and to select $\\tau$ such that the convergence condition is satisfied, i.e.,\n",
+    "\n",
+    "$$\n",
+    " \\tau < \\frac{4}{nc}.\n",
+    "$$\n",
+    "\n",
+    "It also makes sense to keep the sum on the order of 1, since otherwise the step sizes become too small. Overall, selecting good parameters is a bit of a trial-and-error procedure."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Implementation\n",
+    "\n",
+    "We consider the same scenario as in the [first notebook](TV_denoising_with_PDHG.ipynb). As before, we define a reconstruction space $X$ using uniform sampling on the rectangle $[0, N_1] \\times [0, N_2]$:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "import numpy as np\n",
+    "import scipy.misc\n",
+    "\n",
+    "import odl\n",
+    "\n",
+    "# Generate test image\n",
+    "image = scipy.misc.ascent().astype('float32')\n",
+    "\n",
+    "# Create reconstruction space\n",
+    "shape = image.T.shape\n",
+    "X = odl.uniform_discr(min_pt=[0, 0], max_pt=shape, shape=shape)\n",
+    "\n",
+    "# Wrap image as space element, generate noisy variant and display\n",
+    "image /= image.max()\n",
+    "x_true = X.element(np.rot90(image, -1))\n",
+    "# To get predictable randomness, we explicitly seed the random number generator\n",
+    "with odl.util.NumpyRandomSeed(123):\n",
+    "    y = x_true + 0.1 * odl.phantom.white_noise(X)\n",
+    "\n",
+    "_ = x_true.show(title='Original image (x_true)')\n",
+    "_ = y.show(title='Noisy image (y)')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we set up the functions and the linear operators:\n",
+    "\n",
+    "$$\n",
+    " f = \\iota_0,\\quad g_1 = \\|\\cdot - y\\|_2^2, \\quad g_2 = \\alpha \\|\\cdot\\|_1, \\\\\n",
+    " L_1 = I, \\quad L_2 = \\nabla.\n",
+    "$$"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ident = odl.IdentityOperator(X)\n",
+    "grad = odl.Gradient(X)  # needed here for the L1Norm below\n",
+    "\n",
+    "# Function without linear operator\n",
+    "f = odl.solvers.IndicatorNonnegativity(X)\n",
+    "\n",
+    "# Functions to be composed with linear operators. L[i] applies to g[i].\n",
+    "alpha = 0.15\n",
+    "g = [odl.solvers.L2NormSquared(X).translated(y),\n",
+    "     alpha * odl.solvers.L1Norm(grad.range)]\n",
+    "L = [ident, grad]\n",
+    "\n",
+    "# We check that everything makes sense by evaluating the total functional at 0\n",
+    "x = X.zero()\n",
+    "print(f(x) + sum(gi(Li(x)) for gi, Li in zip(g, L)))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we set the $\\tau$ and $\\sigma_i$ parameters for the optimization. We use a small function to make it easier to check whether the parameters work. To make the parameter search space smaller, we choose $\\tau$ (the primal step size) and select\n",
+    "\n",
+    "$$\n",
+    " \\sigma_i = \\frac{c}{\\|L_i\\|^2}, \\quad c = \\frac{3}{n \\tau}\n",
+    "$$\n",
+    "\n",
+    "with $n = 2$ in our case.\n",
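+    "\n",
+    "Plugging this choice into the convergence condition shows that it holds by construction:\n",
+    "\n",
+    "$$\n",
+    " \\tau \\sum_{i=1}^n \\sigma_i \\|L_i\\|^2 = \\tau \\sum_{i=1}^n \\frac{c}{\\|L_i\\|^2} \\|L_i\\|^2 = \\tau n c = 3 < 4.\n",
+    "$$"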
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Estimate the gradient operator norm; the factor 1.1 adds a safety margin\n",
+    "grad_norm = 1.1 * odl.power_method_opnorm(grad, xstart=y, maxiter=20)\n",
+    "opnorms = [1, grad_norm]  # identity has norm 1\n",
+    "\n",
+    "\n",
+    "def check_params(tau, sigmas):\n",
+    "    sum_part = sum(sigma * opnorm ** 2\n",
+    "                   for sigma, opnorm in zip(sigmas, opnorms))\n",
+    "    print('Sum evaluates to', sum_part)\n",
+    "    check_value = tau * sum_part\n",
+    "\n",
+    "    assert check_value < 4, 'value must be < 4, got {}'.format(check_value)\n",
+    "    print('Values ok, check evaluates to {}, must be < 4'.format(check_value))\n",
+    "\n",
+    "\n",
+    "tau = 1.5\n",
+    "c = 3.0 / (len(opnorms) * tau)\n",
+    "sigmas = [c / opnorm ** 2 for opnorm in opnorms]\n",
+    "check_params(tau, sigmas)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we're ready to run the solver!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "# Starting point\n",
+    "x = X.zero()\n",
+    "\n",
+    "# Run the Douglas-Rachford solver. The vector `x` is updated in-place.\n",
+    "odl.solvers.douglas_rachford_pd(x, f, g, L, tau, sigmas, niter=200)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We again visually inspect the results and evaluate some quality metrics from `odl.contrib.fom`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "_ = x_true.show('True image')\n",
+    "_ = y.show('Noisy image')\n",
+    "_ = x.show('Denoised image')\n",
+    "_ = (x_true - x).show('Difference true - denoised')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from odl.contrib import fom\n",
+    "\n",
+    "print('Noisy')\n",
+    "print('-----')\n",
+    "print('Mean squared error:', fom.mean_squared_error(y, x_true))\n",
+    "print('PSNR:', fom.psnr(y, x_true))\n",
+    "print('SSIM:', fom.ssim(y, x_true))\n",
+    "print('')\n",
+    "\n",
+    "print('Denoised')\n",
+    "print('--------')\n",
+    "print('Mean squared error:', fom.mean_squared_error(x, x_true))\n",
+    "print('PSNR:', fom.psnr(x, x_true))\n",
+    "print('SSIM:', fom.ssim(x, x_true))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## References\n",
+    "\n",
+    "[BC2015] Bot, R. I., and Csetnek, E. R. *On the convergence rate of a forward-backward type primal-dual splitting algorithm for convex optimization problems*. Optimization, 64.1 (2015), pp. 5--23.\n",
+    "\n",
+    "[BH2013] Bot, R. I., and Hendrich, C. *A Douglas-Rachford type primal-dual method for solving inclusions with mixtures of composite and parallel-sum type monotone operators*. SIAM Journal on Optimization, 23.4 (2013), pp. 2541--2565."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.2"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}