Data Cleaning 3

Date: 2016-10-23 12:08:53

1. Find the correlation between each pair of columns using corr(), then look at the column for sat_score.

  correlations = combined.corr(method = "pearson")
  print(correlations["sat_score"])

Note: The correlation coefficient ranges from -1 to 1. A value close to 1 means the two columns are strongly positively correlated, a value close to -1 means they are strongly negatively correlated, and a value close to 0 means there is little or no linear correlation.
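To make that range concrete, here is a minimal sketch of the Pearson coefficient computed by hand. The helper name pearson and the sample numbers are made up for illustration and are not taken from the combined data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]                       # invented sample values
r_up = pearson(x, [2, 4, 6, 8, 10])       # y rises with x
r_down = pearson(x, [10, 8, 6, 4, 2])     # y falls as x rises
r_none = pearson(x, [1, -1, 0, -1, 1])    # no linear trend
print(r_up, r_down, r_none)
```

The first pair moves together, the second moves in opposite directions, and the third has no linear relationship, which matches the three cases in the note.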

2. Then we can plot the data using the plot() function.

  %matplotlib inline

  import matplotlib.pyplot as plt

  combined.plot('total_enrollment', 'sat_score', kind="scatter")  # plot(x, y, kind)

3. Then we can filter the data to dig out the information we need.
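The post gives no code for this step, so the following is only a sketch of one common way to filter with a boolean mask. The small DataFrame stands in for combined, and the school names, values, and thresholds are all invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the `combined` DataFrame; all values are invented.
combined = pd.DataFrame({
    "school_name": ["A", "B", "C", "D"],
    "total_enrollment": [350, 2100, 800, 4200],
    "sat_score": [980, 1450, 1010, 1600],
})

# Boolean mask: True for rows that satisfy both conditions.
# The thresholds 1000 and 1100 are arbitrary example values.
mask = (combined["total_enrollment"] < 1000) & (combined["sat_score"] < 1100)
low_both = combined[mask]

print(low_both["school_name"].tolist())
```

Each condition needs its own parentheses because & binds more tightly than the comparison operators.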

4. We map out the schools we need in a certain area.

  from mpl_toolkits.basemap import Basemap

  # llcrnrlat / llcrnrlon = lower-left corner latitude / longitude
  # urcrnrlat / urcrnrlon = upper-right corner latitude / longitude
  m = Basemap(projection="merc", llcrnrlat=40.496044, urcrnrlat=40.915256, llcrnrlon=-74.255735, urcrnrlon=-73.700272, resolution="i")
  m.drawmapboundary(fill_color='#85A6D9')
  m.drawcoastlines(color='#6D5F47', linewidth=.4)
  m.drawrivers(color='#6D5F47', linewidth=.4)

  latitudes = combined["lat"].tolist()
  longitudes = combined["lon"].tolist()

  # Pass plain lists (hence tolist() above); latlon=True means the inputs are longitudes and latitudes in degrees.
  m.scatter(longitudes, latitudes, s=20, zorder=2, latlon=True)

5. We can change the parameters of scatter() to change the plot, then display it.
  plt.show()

Original: http://www.cnblogs.com/kingoscar/p/5989106.html
