{
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 4,
"cells": [
{
"cell_type": "markdown",
"metadata": {
"lang": "en"
},
"source": [
"## The gradient method to solve A x = b\n",
"\n",
"The goal of this lab is to let you work on your own. Review the previous course and program the gradient method\n",
"to solve the matrix system $A {\\bf x} = {\\bf b}$, with $A$ symmetric and diagonally dominant\n",
"($a_{ii} > \\sum_{k \\ne i} |a_{ik}|$).\n",
"\n",
"* Start in 2D with a 2x2 matrix, check that the result is correct and plot the convergence curve\n",
"* Generalize to n x n (we will check that it works with a 9x9 matrix)\n",
"\n",
"It may be useful to normalize the matrix A to keep the computations from blowing up."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"def solve_with_gradiant(A, b, mu=0.01, n_iter=2000):\n",
"    # fixed-step gradient descent on J(x) = 1/2 x.T A x - b.T x\n",
"    x = np.zeros(len(b))\n",
"    for _ in range(n_iter):\n",
"        x = x - mu * (A @ x - b)   # gradient of J is A x - b (A symmetric)\n",
"    return x"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[49, 13],\n",
" [13, 13]])"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"A = np.random.randint(10, size=(2,2))\n",
"A = A + A.T                  # make A symmetric\n",
"A += np.diag(A.sum(axis=1))  # add the row sums to the diagonal: A becomes diagonally dominant\n",
"A"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([62, 26])"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b = A.sum(axis=1)   # right-hand side chosen so that the exact solution is the all-ones vector\n",
"b"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"solve_with_gradiant(A,b)"
]
},
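{
"cell_type": "markdown",
"metadata": {
"lang": "en"
},
"source": [
"*A possible way (not the only one) to obtain the convergence curve and to check the n x n case: track the\n",
"residual norm $\\|A{\\bf x}-{\\bf b}\\|$ at each iteration and plot it on a log scale. The helper below and the\n",
"step size are illustrative and may need tuning.*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"def gradient_history(A, b, mu=0.1, n_iter=5000):\n",
"    # same fixed-step iteration, but keeping the residual norm at each step\n",
"    x = np.zeros(len(b))\n",
"    res = []\n",
"    for _ in range(n_iter):\n",
"        g = A @ x - b\n",
"        x = x - mu * g\n",
"        res.append(np.linalg.norm(g))\n",
"    return x, res\n",
"\n",
"# 9x9 system built like the 2x2 one, then normalized as suggested above\n",
"n = 9\n",
"M = np.random.randint(10, size=(n, n))\n",
"M = M + M.T\n",
"M = M + np.diag(M.sum(axis=1))\n",
"M = M / np.abs(M).max()      # normalization keeps mu * lambda_max small\n",
"c = M.sum(axis=1)            # exact solution is the all-ones vector\n",
"\n",
"x, res = gradient_history(M, c)\n",
"plt.semilogy(res)\n",
"plt.xlabel(\"iteration\")\n",
"plt.ylabel(\"residual norm\")\n",
"plt.show()\n",
"x"
]
},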
{
"cell_type": "markdown",
"metadata": {
"lang": "en"
},
"source": [
"## Introduce inertia\n",
"\n",
"Introduce inertia (a momentum term) into the gradient method. What do we observe?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
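{
"cell_type": "markdown",
"metadata": {
"lang": "en"
},
"source": [
"*A minimal sketch of one way to add inertia, assuming the update\n",
"${\\bf x}^{k+1} = {\\bf x}^k - µ \\nabla J({\\bf x}^k) + β ({\\bf x}^k - {\\bf x}^{k-1})$;\n",
"the values of µ and β are illustrative:*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def solve_with_inertia(A, b, mu=0.01, beta=0.9, n_iter=2000):\n",
"    # gradient step plus a fraction beta of the previous displacement\n",
"    x = np.zeros(len(b))\n",
"    x_prev = x.copy()\n",
"    for _ in range(n_iter):\n",
"        g = A @ x - b\n",
"        x, x_prev = x - mu * g + beta * (x - x_prev), x\n",
"    return x\n",
"\n",
"solve_with_inertia(A, b)"
]
},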
{
"cell_type": "markdown",
"metadata": {
"lang": "en"
},
"source": [
"## Optimal value of µ\n",
"\n",
"Note that two successive descent directions are orthogonal if the next point is the minimum along the\n",
"current direction ($-\\nabla J ({\\bf x}^k)$).\n",
"\n",
"$$\\nabla J ({\\bf x}^{k+1})^T \\; \\nabla J ({\\bf x}^k) = 0$$\n",
"\n",
"#### Demonstration\n",
"\n",
"\n",
"<table><tr>\n",
" <th>\n",
"We want to adjust µ so as to reach the minimum of J when advancing in the direction $- \\nabla J({\\bf x}^{k})$.\n",
"This means that the partial derivative of $J({\\bf x}^{k+1})$ with respect to µ must be\n",
"zero, or, making µ appear explicitly:\n",
"\n",
"$$\n",
"\\frac{\\partial J ({\\bf x}^k - µ \\; \\nabla J ({\\bf x}^k))}{\\partial µ} = 0\n",
"$$\n",
"\n",
"By expanding we have\n",
"\n",
"$$\n",
"\\begin{align}\n",
"\\frac{\\partial J ({\\bf x}^{k+1})}{\\partial {\\bf x}^{k+1}} \\; \n",
"\\frac{\\partial {\\bf x}^{k+1}}{\\partial µ} & = 0 \\\\\n",
"J'({\\bf x}^{k+1}) \\, . \\, (- \\nabla J ({\\bf x}^k)) & = 0 \\\\\n",
"(A\\, {\\bf x}^{k+1} - b) \\, . \\, \\nabla J ({\\bf x}^k) & = 0 \\quad \\textrm{since A is symmetric}\\\\\n",
"\\nabla J ({\\bf x}^{k+1}) \\, . \\, \\nabla J ({\\bf x}^k) & = 0 \\quad \\textrm{QED}\n",
"\\end{align}\n",
"$$\n",
"\n",
"\n",
"</th><th>\n",
"<img src=\"images/gradient.png\" width = \"400px\"/>\n",
"</th></tr></table>\n",
"\n",
"Using this property, find the optimal value of µ to reach the minimum in the direction\n",
"$-\\nabla J ({\\bf x}^k)$.\n",
"\n",
"Write the gradient method with the calculation of the optimal µ at each iteration to solve $A {\\bf x} = {\\bf b}$."
]
},
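{
"cell_type": "markdown",
"metadata": {
"lang": "en"
},
"source": [
"*Writing ${\\bf g}_k = A{\\bf x}^k - {\\bf b}$, the orthogonality condition\n",
"$({\\bf g}_k - µ A {\\bf g}_k) \\cdot {\\bf g}_k = 0$ leads to\n",
"$µ_k = \\dfrac{{\\bf g}_k^T {\\bf g}_k}{{\\bf g}_k^T A \\, {\\bf g}_k}$.\n",
"A possible sketch (the helper name is illustrative):*"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def solve_optimal_mu(A, b, n_iter=100, tol=1e-12):\n",
"    # steepest descent: at each step use the mu that minimizes J along -g\n",
"    x = np.zeros(len(b))\n",
"    for _ in range(n_iter):\n",
"        g = A @ x - b\n",
"        if g @ g < tol:          # residual numerically zero: done\n",
"            break\n",
"        x = x - (g @ g) / (g @ A @ g) * g\n",
"    return x\n",
"\n",
"solve_optimal_mu(A, b)"
]
},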
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
]
}