首页 > 系统服务 > 详细

线性回归的Spark实现 [Linear Regression / Machine Learning / Spark]

时间:2015-05-13 18:45:49      阅读:756      评论:0      收藏:0      [点我收藏+]

1- 问题提出

 


2- 线性回归

 


3- 理论推导

 


4- Python/Spark实现

 1 # -*- coding: utf-8 -*-
 2 from pyspark import SparkContext
 3 
 4 
 5 theta = [0, 0]
 6 alpha = 0.001
 7 
 8 sc = SparkContext(local)
 9 
10 def func_theta_x(x):
11     return sum([i * j for i, j in zip(theta, x)])
12 
13 def cost(x):
14     thx = func_theta_x(x)
15     return thx - x[-1]
16 
17 def partial_theta(x):
18     dif = cost(x)
19     return [dif * i for i in x[:-1]]
20 
21 rdd = sc.textFile(/home/freyr/linearRegression.txt)22         .map(lambda line: map(float, line.strip().split(\t)))
23 
24 maxiter = 400
25 iter = 0
26 while True:
27     parTheta = rdd.map(partial_theta)28                   .reduce(lambda x, y: [i + j for i, j in zip(x, y)])
29 
30     for i in range(2):
31         theta[i] = theta[i] - alpha * parTheta[i]
32 
33     iter += 1
34 
35     if iter <= maxiter:
36         if sum(map(abs, parTheta)) <= 0.01:
37             print I get it!!!
38             print Iter = %s % iter
39             print Theta = %s % theta
40             break
41     else:
42         print Failed...
43         break

PS: 1. linearRegression.txt


 

线性回归的Spark实现 [Linear Regression / Machine Learning / Spark]

原文:http://www.cnblogs.com/freyr/p/4501100.html

(0)
(0)
   
举报
评论 一句话评论(0
关于我们 - 联系我们 - 留言反馈 - 联系我们:wmxa8@hotmail.com
© 2014 bubuko.com 版权所有
打开技术之扣,分享程序人生!