python3聚类函数 python聚类算法( 六 )


for col in range(numFeatures):
features = [record[col] for record in dataset] ### 提取每一列的特征向量 如此处col= 0 ,则features = [1,1,0,0]
uniqueFeat = set(features)
curInfoGain = 0 ### 根据每一列进行拆分 , 所获得的信息增益
for featVal in uniqueFeat:
subDataset = splitDataset(dataset,col,featVal) ### 根据col列的featVal特征值来对数据集进行划分
prob = 1.0 * len(subDataset)/numFeatures ### 计算子特征数据集所占比例
curInfoGain += prob * calcEntropy(subDataset) ### 计算col列的特征值featVal所产生的信息增益
# print "col : " ,col , " featVal : " , featVal , " curInfoGain :" ,curInfoGain ," baseInfoGain : " ,baseInfoGain
print "col : " ,col , " curInfoGain :" ,curInfoGain ," baseInfoGain : " ,baseInfoGain
if curInfoGainbaseInfoGain:
baseInfoGain = curInfoGain
bestFeature = col
return baseInfoGain,bestFeature ### 输出最大的信息增益,以获得该增益的列
dataset,label = initDataSet()
infogain , bestFeature = findBestFeature(dataset)
print "bestInfoGain :" , infogain, " bestFeature:",bestFeature
【python3聚类函数 python聚类算法】关于python3聚类函数和python聚类算法的介绍到此就结束了,不知道你从中找到你需要的信息了吗 ?如果你还想了解更多这方面的信息 , 记得收藏关注本站 。