{ "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 4, "cells": [ { "cell_type": "markdown", "metadata": { "lang": "en" }, "source": [ "This lesson has 3 virtues:\n", "\n", "* manipulate bases of subspaces\n", "* mix partial derivatives and vectors\n", "* find an algorithm more efficient than the gradient method to solve $A \\, {\\bf x} = {\\bf b}$" ] }, { "cell_type": "markdown", "metadata": { "lang": "en" }, "source": [ "# Conjugate gradient method\n", "\n", "We have seen that it is possible to compute the optimal µ of the gradiant method to\n", "go directly to the minimum of the $J$ functional in along the line\n", "${\\bf x}^k \\; + µ \\, \\nabla J({\\bf x}^k)$ with µ varying from $- \\infty$\n", "at $+ \\infty$ (see ma33).\n", "\n", "We have also seen that if we have the optimal µ, then the next steepest slope,\n", "$\\nabla J ({\\bf x}^{k+1})$, will be orthogonal to the slope which defines the line on which we seek µ. So we have\n", "\n", "$$\\nabla J ({\\bf x}^{k+1})^T \\, . \\, \\nabla J ({\\bf x}^k) = 0$$\n", "\n", "This is good because it means that the next minimum, ${\\bf x}^{k+2}$, will be the minimum of the space generated by $\\nabla J ({\\bf x}^{k+1})$ and $\\nabla J ({\\bf x}^k)$\n", "(which is better than just being the minimum along a line).\n", "\n", "Unfortunately nothing tells me that ${\\bf x}^{k+3}$ will not be calculated along our\n", "first direction, $\\nabla J({\\bf x}^k)$. We have seen in the case\n", "of the gradient without searching for the optimal µ that we can oscillate between 2 values. 
That\n", "is not possible with the optimal µ, but we can still go around in circles among a few values.\n", "\n", "Even if we do not go around in circles and we do converge, it is likely that we will\n", "again be searching for a minimum in a part of the space $ℝ^n$ that we have already explored.\n", "\n", "An optimal search for the minimum of a convex function in the space $ℝ^n$\n", "should not take more than $n$ iterations if we are still able to compute\n", "the optimal µ in the chosen direction. To be convinced of this, it suffices to choose\n", "as directions the vectors of a basis of our space $ℝ^n$. Once we\n", "have found the minimum in all directions of the basis, we have the global minimum.\n", "There is no longer any possible search direction that is not a linear combination of the vectors\n", "already used." ] }, { "cell_type": "markdown", "metadata": { "lang": "en" }, "source": [ "#### 2D case\n",
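"\n", "To make the 2D discussion concrete, here is a small numerical sketch (the NumPy usage and the $2 \times 2$ matrix values are illustrative assumptions, not part of the lesson): for the quadratic functional $J({\bf x}) = \frac{1}{2} \, {\bf x}^T A \, {\bf x} - {\bf b}^T {\bf x}$ associated with $A \, {\bf x} = {\bf b}$, one step with the optimal µ makes the next gradient orthogonal to the previous one, as stated above.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Illustrative symmetric positive definite matrix and right-hand side (assumed values)\n",
"A = np.array([[3.0, 1.0], [1.0, 2.0]])\n",
"b = np.array([1.0, 1.0])\n",
"\n",
"x = np.zeros(2)\n",
"g0 = A @ x - b                      # gradient of J at x\n",
"mu = -(g0 @ g0) / (g0 @ (A @ g0))  # optimal µ along the line x + µ g0\n",
"x = x + mu * g0\n",
"g1 = A @ x - b                      # gradient at the new point\n",
"print(g1 @ g0)                      # ≈ 0: consecutive gradients are orthogonal\n",
"```\n",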
"\n", "*(figure: two steps of the gradient method with the optimal µ, shown on the level curves of $J$)*\n", "\n", "If we look for the point ${\bf x} = (x_1, x_2)$ that minimizes the convex functional $J({\bf x})$,\n", "and we know how to compute the optimal µ of our gradient method,\n", "then we find the global minimum in at most 2 iterations.\n", "\n", "But if we\n", "take a µ that is too large and makes us overshoot the global minimum,\n", "then at the next iteration the steepest direction will be exactly\n", "the opposite of the one we just used, and we will oscillate.\n", "\n", "On the drawing we can see that we find the global minimum in 2 iterations,\n", "but we may suspect that it is a stroke of luck; besides, is it really\n", "2 iterations, since the label says ${\bf x}^k$? That is the limit\n", "of a drawing that tries to explain what happens in $ℝ^n$ to humans\n", "accustomed to $ℝ^3$ (see this video on the 4th dimension, outside\n", "class)."