Linear Regression

Posted: 2020-04-13 13:56:19

Original article; please credit the source when reposting: https://www.cnblogs.com/agilestyle/p/12690755.html

Preparing the data

import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=500, n_features=1, noise=20, random_state=0)

# (500, 1)
X.shape

# (500,)
y.shape

plt.scatter(X, y)


Building and training the model

lr = LinearRegression()
y = y.reshape(-1, 1)

# LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
lr.fit(X, y)

Evaluating the model

# Model intercept: array([-1.53821792])
lr.intercept_

# Model slope (coefficient): array([[45.2879203]])
lr.coef_

# 0.8458565184565707
lr.score(X, y)

lr_mse = mean_squared_error(y, lr.predict(X))
# Mean squared error: 372.3837648686677
lr_mse

plt.scatter(X, y)
plt.plot(X, lr.predict(X), 'r')
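
To make the two metrics above less opaque, here is a minimal sketch that recomputes them by hand with NumPy and compares the results against scikit-learn. It assumes the same synthetic data as above; the names `mse_manual`, `r2_manual`, `ss_res`, and `ss_tot` are introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Same synthetic data as in the article
X, y = make_regression(n_samples=500, n_features=1, noise=20, random_state=0)
lr = LinearRegression().fit(X, y)
y_pred = lr.predict(X)

# Mean squared error: the average of the squared residuals
mse_manual = np.mean((y - y_pred) ** 2)

# R^2: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

# Each pair should agree up to floating-point error
print(mse_manual, mean_squared_error(y, y_pred))
print(r2_manual, lr.score(X, y))
```

For a regressor, `score` returns R^2, so the value reported by `lr.score(X, y)` is the share of the variance in `y` that the fitted line explains.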


Solving for the parameters with stochastic gradient descent

sgd = SGDRegressor(eta0=0.01, max_iter=100)  # eta0: initial learning rate; max_iter: maximum number of passes over the training data
sgd.fit(X, y.ravel())
sgd.score(X, y)
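
One way to sanity-check the SGD fit is to compare its parameters with those of the least-squares solution. The sketch below assumes the same data as above and adds `random_state=0` (not in the original call) only so that the run is repeatable. SGD is an iterative, approximate solver, and `SGDRegressor` applies a small L2 penalty by default, so the two sets of parameters should be close rather than identical.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, SGDRegressor

X, y = make_regression(n_samples=500, n_features=1, noise=20, random_state=0)

# Least-squares fit, used here as the reference solution
lr = LinearRegression().fit(X, y)

# SGD fit; random_state is an addition for reproducibility
sgd = SGDRegressor(eta0=0.01, max_iter=100, random_state=0)
sgd.fit(X, y)

print("OLS slope / intercept:", lr.coef_[0], lr.intercept_)
print("SGD slope / intercept:", sgd.coef_[0], sgd.intercept_[0])
print("OLS R^2:", lr.score(X, y))
print("SGD R^2:", sgd.score(X, y))
```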

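Note:

Two important factors in gradient descent

The learning rate sets the size of each step. If it is too small, each update moves only slightly, and many iterations are needed to reach the minimum (or a local minimum). If it is too large, each update moves a long way and can easily overshoot the optimum.
The partial derivatives set the direction of each step: every update moves in the direction of steepest descent, that is, along the negative gradient. The derivative of a function at a point describes the function's rate of change near that point.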
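
Both factors can be seen in a hand-written loop. The following is a minimal sketch of plain batch gradient descent on the mean squared error of y ≈ w * x + b. It is not what `SGDRegressor` does internally (that class updates on one sample at a time and uses a decaying learning-rate schedule). The function `gradient_descent`, the `results` dictionary, and the three learning rates are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=500, n_features=1, noise=20, random_state=0)
x = X.ravel()

def gradient_descent(x, y, eta, n_steps):
    """Minimize the MSE of y ≈ w * x + b with batch gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(n_steps):
        error = w * x + b - y
        # Partial derivatives of the MSE with respect to w and b
        grad_w = 2 * np.mean(error * x)
        grad_b = 2 * np.mean(error)
        # Step against the gradient; eta scales the step size
        w -= eta * grad_w
        b -= eta * grad_b
    mse = np.mean((w * x + b - y) ** 2)
    return w, b, mse

# Hypothetical learning rates: too small, reasonable, too large
results = {}
for eta in (0.001, 0.1, 1.5):
    results[eta] = gradient_descent(x, y, eta, n_steps=100)
    w, b, mse = results[eta]
    print(f"eta={eta}: w={w:.3f}, b={b:.3f}, mse={mse:.3e}")
```

On this data, the smallest rate should still be far from the minimum after 100 steps, the middle rate should settle close to the least-squares slope and intercept, and the largest rate should overshoot on every step so that the error grows instead of shrinking.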

Reference

scikit-learn Cookbook Second Edition

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDRegressor.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.ravel.html

https://developers.google.com/machine-learning/crash-course/reducing-loss/gradient-descent

https://developers.google.com/machine-learning/crash-course/reducing-loss/learning-rate
