- Using the cost function and gradient descent, we derive the algorithm for linear regression, i.e., a linear function that fits the data.
Gradient descent algorithm
$Repeat\;until\;convergence\;\{$
$\theta_{j} \leftarrow \theta_{j} - \alpha\frac{\partial}{\partial\theta_{j}}J(\theta)\;\;\;\;(j\,=\,0, 1, ..., n)$
$\}$
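The loop above simply steps every parameter against its partial derivative with step size $\alpha$, always using the parameter values from before the update. A minimal NumPy sketch of this generic simultaneous update (the gradient callable `grad`, the tolerance `tol`, and the iteration cap are illustrative choices, not part of the post):

```python
import numpy as np

def gradient_descent(grad, theta, alpha=0.01, tol=1e-7, max_iters=10_000):
    """Generic gradient descent: theta <- theta - alpha * grad(theta)."""
    for _ in range(max_iters):
        step = alpha * grad(theta)      # gradient evaluated at the current theta
        theta = theta - step            # every theta_j updated from the same old values
        if np.linalg.norm(step) < tol:  # treat a negligible update as "convergence"
            break
    return theta
```

For instance, `gradient_descent(lambda t: 2 * t, np.array([5.0]))` walks toward $\theta = 0$, the minimizer of $\theta^{2}$.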
Linear regression model
$h_{\theta}(x) = \theta_{0} + \theta_{1}x_{1} + \cdots + \theta_{n}x_{n}$
$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\{h_{\theta}(x^{(i)}) - y^{(i)}\}^2$
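For concreteness, the hypothesis and cost above can be evaluated in vectorized form by prepending a column of ones to the inputs so that $x_{0} = 1$. A small NumPy sketch under that assumption (the names `X`, `y`, `theta` are my own):

```python
import numpy as np

def hypothesis(X, theta):
    """h_theta(x) for every row of X; X is (m, n+1) with a leading column of ones (x_0 = 1)."""
    return X @ theta

def cost(X, y, theta):
    """J(theta) = (1 / 2m) * sum of squared prediction errors."""
    m = len(y)
    errors = hypothesis(X, theta) - y
    return (errors @ errors) / (2 * m)
```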
$\frac{\partial}{\partial\theta_{j}}J(\theta)$
$= \frac{\partial}{\partial\theta_{j}}\frac{1}{2m}\sum_{i=1}^{m}\{h_{\theta}(x^{(i)}) - y^{(i)}\}^2$
$= \frac{\partial}{\partial\theta_{j}}\frac{1}{2m}\sum_{i=1}^{m}\{\theta_{0} + \theta_{1}x_{1}^{(i)} + \cdots + \theta_{n}x_{n}^{(i)} - y^{(i)}\}^2$
$j = 0: \frac{\partial}{\partial\theta_{0}}J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\{h_{\theta}(x^{(i)}) - y^{(i)}\}$
$j = k: \frac{\partial}{\partial\theta_{k}}J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\{h_{\theta}(x^{(i)}) - y^{(i)}\}x_{k}^{(i)}\;\;\;\;(1 \leq k \leq n)$
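For completeness, both cases follow from a single chain-rule computation, using the convention $x_{0}^{(i)} = 1$ (implicit in the $j = 0$ case above):

$\frac{\partial}{\partial\theta_{j}}J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}2\{h_{\theta}(x^{(i)}) - y^{(i)}\}\frac{\partial}{\partial\theta_{j}}h_{\theta}(x^{(i)}) = \frac{1}{m}\sum_{i=1}^{m}\{h_{\theta}(x^{(i)}) - y^{(i)}\}x_{j}^{(i)}$

since $\frac{\partial}{\partial\theta_{j}}h_{\theta}(x^{(i)}) = x_{j}^{(i)}$.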
Gradient descent algorithm
$Repeat\;until\;convergence\;\{$
$\theta_{0} \leftarrow \theta_{0} - \alpha\frac{1}{m}\sum_{i=1}^{m}\{h_{\theta}(x^{(i)}) - y^{(i)}\}$
$\theta_{k} \leftarrow \theta_{k} - \alpha\frac{1}{m}\sum_{i=1}^{m}\{h_{\theta}(x^{(i)}) - y^{(i)}\}x_{k}^{(i)}\;\;\;\;(k\,=\,1, ..., n)$
$\}$
$(Update\; \theta_{0}\; and\; \theta_{k}\; simultaneously)$
* Notation
$m$: the number of training examples
$n$: the number of features
$x^{(i)}, y^{(i)}$: the $i$-th training example
$\alpha$: the learning rate
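Putting everything together, here is a minimal NumPy sketch of the update rule derived above (the array names, learning rate, and fixed iteration count are illustrative; convergence is approximated by simply running `num_iters` steps):

```python
import numpy as np

def gradient_descent_linreg(X, y, alpha=0.01, num_iters=10_000):
    """Batch gradient descent for linear regression.

    X: (m, n) feature matrix, y: (m,) target vector.
    Returns theta of shape (n + 1,), where theta[0] is the intercept.
    """
    m = len(y)
    Xb = np.hstack([np.ones((m, 1)), X])   # prepend x_0 = 1 for the intercept term
    theta = np.zeros(Xb.shape[1])
    for _ in range(num_iters):
        errors = Xb @ theta - y            # h_theta(x^(i)) - y^(i) for every example
        gradient = (Xb.T @ errors) / m     # (1/m) * sum_i errors_i * x_j^(i), for all j at once
        theta = theta - alpha * gradient   # simultaneous update of every theta_j
    return theta
```

For example, `gradient_descent_linreg(np.array([[1.0], [2.0], [3.0]]), np.array([2.0, 4.0, 6.0]))` should approach $\theta_{0} \approx 0$, $\theta_{1} \approx 2$.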