(Please credit the source when reposting: http://blog.csdn.net/buptgshengod)
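group=array([[9,400],[200,5],[100,77],[40,300]])
shape: returns the dimensions as (rows, columns). Example: shape(group)=(4,2)
zeros: creates a matrix of the given shape filled with zeros. Example: zeros(shape(group))=([[0,0],[0,0],[0,0],[0,0]])
tile: a function located in the Python module numpy.lib.shape_base (available directly as numpy.tile) that repeats an array. For example, tile(A,n) repeats array A n times to build a new array; passing a tuple, as in tile(A,(m,1)), stacks m copies of A as rows.
sum(axis=1): adds up the elements of each row vector of the matrix.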
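The helpers described above can be tried out on the sample array. This is only a minimal sketch, assuming NumPy is installed; it imports the library as np instead of using the star import, and the names point, tiled and row_sums are illustrative and do not appear in the original post.

```python
import numpy as np

group = np.array([[9, 400], [200, 5], [100, 77], [40, 300]])

# shape: (rows, columns) of the array
dims = np.shape(group)

# zeros: a matrix of the given shape filled with zeros
empty = np.zeros(dims)

# tile: stack 4 copies of a 1-D array as rows, giving a 4x2 matrix
point = np.array([1, 2])          # illustrative value
tiled = np.tile(point, (4, 1))

# sum(axis=1): add up the elements of each row
row_sums = group.sum(axis=1)

print(dims)       # (4, 2)
print(tiled)
print(row_sums)   # [409 205 177 340]
```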
createDataset
from __future__ import division
from numpy import *
import operator

def createDataset():
    # Four sample points (one per row, two features each) and one class label per point
    group = array([[9,400],[200,5],[100,77],[40,300]])
    labels = ['1','2','3','1']
    return group, labels

autoNorm
def autoNorm(dataSet):
    # Min-max normalization: rescale every feature (column) to the range [0, 1]
    minVals = dataSet.min(0)    # per-column minimum
    maxVals = dataSet.max(0)    # per-column maximum
    ranges = maxVals - minVals
    normDataSet = zeros(shape(dataSet))    # placeholder, overwritten below
    m = dataSet.shape[0]
    normDataSet = dataSet - tile(minVals, (m,1))
    normDataSet = normDataSet/tile(ranges, (m,1))    # element-wise divide
    return normDataSet, ranges, minVals

classify
def classify(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # Euclidean distance from inX to every row of dataSet
    diffMat = tile(inX, (dataSetSize,1)) - dataSet
    sqDiffMat = diffMat**2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances**0.5
    # Indices of the training points, nearest first
    sortedDistIndicies = distances.argsort()
    # Majority vote among the k nearest neighbors
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel,0) + 1
    # items() works on both Python 2 and Python 3 (iteritems() is Python 2 only)
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]
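To see the three steps work together, here is a compact end-to-end sketch. It is not the original author's code: it assumes Python 3 with NumPy, replaces tile with NumPy broadcasting and the dictionary vote with collections.Counter, and uses hypothetical names (auto_norm, knn_classify) and a made-up query point [30, 350]. The query is normalized with the same minimum values and ranges as the training data before it is classified.

```python
from collections import Counter

import numpy as np


def auto_norm(data):
    """Min-max normalize each column to [0, 1] (broadcasting instead of tile)."""
    min_vals = data.min(axis=0)
    ranges = data.max(axis=0) - min_vals
    return (data - min_vals) / ranges, ranges, min_vals


def knn_classify(query, data, labels, k):
    """Return the majority label among the k rows of data closest to query."""
    distances = np.sqrt(((data - query) ** 2).sum(axis=1))  # Euclidean distance
    nearest = distances.argsort()[:k]                       # indices, nearest first
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Same sample data as createDataset in the post
group = np.array([[9, 400], [200, 5], [100, 77], [40, 300]], dtype=float)
labels = ['1', '2', '3', '1']

norm_group, ranges, min_vals = auto_norm(group)

# Hypothetical query point, scaled with the training set's min and range
query = np.array([30, 350], dtype=float)
norm_query = (query - min_vals) / ranges

predicted = knn_classify(norm_query, norm_group, labels, k=3)
print(predicted)  # 1
```

With k=3 the nearest neighbors of this query are the two points labeled '1' and the point labeled '3', so the vote returns '1'.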
[Machine Learning Algorithms in Python] KNN: Implementing the k-Nearest Neighbors Algorithm (source code included)
Original: http://blog.csdn.net/buptgshengod/article/details/24313239