
Use gradient descent to implement linear regression (Matlab)


# What is supervised learning

We usually want to use the data we already have to predict the part we don't know, and we would like a mathematical function that does the predicting. This is the setting that supervised learning addresses.

[Figure: training set → learning algorithm → hypothesis h, which maps an input x to a predicted y]

 

As shown above, the "training set" is the data we have, and the "learning algorithm" is the procedure that produces the mathematical function. The lowercase "h" stands for "hypothesis" (for historical reasons), and it is the function we use to predict.

Let's give a more formal definition. Supervised learning: given a training set, learn a function h: X -> Y so that h(x) is a "good" predictor for the corresponding value of y.

# Regression problems and classification problems

Staying with the structure chart above, the difference between a regression problem and a classification problem is whether the predicted y is continuous.

If the value we are trying to predict is continuous, we call the learning problem a regression problem. When y can take on only a small number of discrete values, we call it a classification problem.

 

# Linear Regression 

Let's start with a simple example. Our training set is three points: (1,1), (2,2), (3,2). We want a line that helps us predict y from x. Clearly these three points cannot lie on one line, so how do we get a "good" predictor?

First of all, let's say we decide to approximate y as a linear function of x: hΘ(x) = Θ0 + Θ1x.

Here, the Θi's are the parameters (also called weights) parameterizing the space of linear functions mapping from X to Y.

Now, given a training set, how do we pick, or learn the parameters Θ?

One reasonable approach is to make h(x) close to y. To formalize this, we define the cost function:

J(Θ) = (1/2) * Σ_{i=1}^{m} (hΘ(x(i)) − y(i))^2

// Notation:  m: the number of training examples

// x: "input" variables/features

// y: "output" variables/targets

// (x(i), y(i)): the i-th training example
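
To make the cost concrete, here is a minimal Matlab check (the guess Θ0 = Θ1 = 1 is just the starting point used later, not a fitted value) that evaluates J(Θ) on the three training points:

% evaluate the cost J for the toy training set at the guess theta0 = theta1 = 1
x = [1; 2; 3];
y = [1; 2; 2];
theta0 = 1;
theta1 = 1;
h = theta0 + theta1*x;          % hypothesis h(x) on every example: [2; 3; 4]
J = sum((h - y).^2)/2;          % squared errors 1, 1, 4  ->  J = 3
fprintf('J = %.2f\n', J)

A smaller J(Θ) means the line fits the points better; the goal of the next section is to drive J(Θ) down by adjusting Θ0 and Θ1.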

 

# Gradient descent

Our problem now becomes minimizing J(Θ).

Before doing that, let's look at the gradient descent algorithm:

Repeat until convergence: Θj := Θj − α * ∂J(Θ)/∂Θj   (simultaneously for j = 0 and j = 1; α is the learning rate)

 

// It starts with some initial Θ and repeatedly performs the update.

// As for why this reaches a minimum of J(Θ): each update moves Θ a small step in the direction of steepest descent of J, a standard result from calculus.

Now let's simplify the partial derivative.

∂J(Θ)/∂Θj = ∂/∂Θj [ (1/2)(hΘ(x) − y)^2 ] = (hΘ(x) − y) * ∂/∂Θj (Θ0 + Θ1x − y) = (hΘ(x) − y) * xj   (for a single training example, with x0 = 1 and x1 = x)
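
As a quick numerical check, take the starting values used in the code below (Θ0 = Θ1 = 1, α = 0.01) and the first training example (x, y) = (1, 1): hΘ(1) = 1 + 1·1 = 2, so the error hΘ(x) − y = 1, and the update gives Θ0 := 1 − 0.01·1·1 = 0.99 and Θ1 := 1 − 0.01·1·1 = 0.99, nudging the line down toward the point.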

 

Therefore, we get the following update rules:

Θj := Θj − α * (hΘ(x) − y) * xj    // for a single training example

Repeat until convergence: { Θj := Θj − α * Σ_{i=1}^{m} (hΘ(x(i)) − y(i)) * xj(i), for every j }    // batch gradient descent: every update uses all m examples

Loop over i = 1..m: { Θj := Θj − α * (hΘ(x(i)) − y(i)) * xj(i), for every j }    // stochastic gradient descent: every update uses a single example
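
As a sketch of how the stochastic version looks in code (the loop structure and variable names here are my own, not from the original post; the batch version is the one implemented below), the same three points can be processed one at a time:

% stochastic gradient descent sketch: one training example per update
x = [1; 2; 3];
y = [1; 2; 2];
theta0 = 1;
theta1 = 1;
alpha = 0.01;
for epoch = 1:5000
    for i = 1:length(x)
        h = theta0 + theta1*x(i);                  % hypothesis on example i only
        theta0 = theta0 - alpha*(h - y(i));        % step along dJ_i/dtheta0
        theta1 = theta1 - alpha*(h - y(i))*x(i);   % step along dJ_i/dtheta1
    end
end
fprintf('theta0 = %.4f, theta1 = %.4f\n', theta0, theta1)

With only three points the difference from the batch version is negligible, but on a large training set the stochastic version starts improving Θ after a single example instead of after a full pass over the data.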

 

The following Matlab code uses batch gradient descent to solve the simple problem above.

clear; clc

% training set: the three points (1,1), (2,2), (3,2)
x = [1; 2; 3];
y = [1; 2; 2];

% initial hypothesis h(x) = theta0 + theta1*x
theta0 = 1;
theta1 = 1;

% batch gradient descent
alpha = 0.01;                          % learning rate
for k = 1:5000
    hypothesis = theta0 + theta1*x;    % h(x) on every training example
    % cost function J (uncomment to monitor convergence)
    % cost = sum((hypothesis - y).^2)/2;
    r0 = sum(hypothesis - y);          % dJ/dtheta0
    r1 = sum((hypothesis - y).*x);     % dJ/dtheta1
    theta0 = theta0 - alpha*r0;        % simultaneous update: both gradients
    theta1 = theta1 - alpha*r1;        % use the hypothesis computed above
end

fprintf('theta0 = %.4f, theta1 = %.4f\n', theta0, theta1)
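
One way to check the result (this check is my addition, not part of the original post) is to compare it with the closed-form least-squares solution, which Matlab's backslash operator computes directly; for the points (1,1), (2,2), (3,2) it is Θ0 = 2/3 and Θ1 = 1/2, so the loop above should print values close to 0.6667 and 0.5000.

% closed-form least-squares check: solve [1 x] * [theta0; theta1] ~= y
X = [ones(3,1), [1; 2; 3]];
y = [1; 2; 2];
theta = X \ y;                  % least-squares solution of the overdetermined system
fprintf('closed form: theta0 = %.4f, theta1 = %.4f\n', theta(1), theta(2))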

 


Original post: https://www.cnblogs.com/Samuel514/p/11420298.html
