
Classroom Exercise (Word Frequency Count)

Time: 2017-09-25 15:43:02
What I hope Mr. Zeng will cover

No particular requests. I would like to hear about job prospects in big data and the salaries those jobs pay.

Word frequency count for a novel


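import jieba
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Raw string so the backslash in the Windows path is not read as an escape.
book = r"F:\最强升级系统.txt"
with open(book, "r", encoding="GBK") as f:
    txt = f.read()

# Frequent words to leave out of the ranking.
ex = {'神仙', '系统', '狂暴', '玩家', '提示', '龙飞'}

# Segment the text into a list of words.
words = jieba.lcut(txt)

# Count every word longer than one character.
counts = {}
for word in words:
    if len(word) == 1:
        continue
    counts[word] = counts.get(word, 0) + 1

# pop() with a default avoids a KeyError when an excluded word never occurs.
for word in ex:
    counts.pop(word, None)

# Sort by count, descending, and print the top 10 (fewer if the text is short).
items = sorted(counts.items(), key=lambda x: x[1], reverse=True)
for word, count in items[:10]:
    print("{:<10}{:>5}".format(word, count))

# Save the full segmented word list; the explicit encoding is a choice made here.
with open('lk.txt', 'w', encoding='utf-8') as lk:
    lk.write(str(words))

# Draw a word cloud from the raw text.
wzhz = WordCloud().generate(txt)
plt.imshow(wzhz)
plt.show()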
================ RESTART: C:/Users/Administrator/Desktop/1.py ================
Building prefix dict from the default dictionary ...
Dumping model to file cache C:\Users\ADMINI~1\AppData\Local\Temp\jieba.cache
Loading model cost 0.814 seconds.
Prefix dict has been built succesfully.
没有           41
乔乔           31
恭喜           26
战士           23
李三           22
修炼           21
一个           20
废物           18
蛤蟆功          18
妖兽           17
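
The counting, filtering and top-10 steps above can also be sketched with collections.Counter from the standard library. This is only a minimal illustration, not the script that produced the output above: the function name top_words and the tiny sample_tokens list are hypothetical, and the list stands in for the result of jieba.lcut(txt) so that the sketch runs without jieba or the novel file.

```python
from collections import Counter


def top_words(tokens, excluded=frozenset(), n=10):
    """Return the n most frequent tokens as (word, count) pairs.

    Single-character tokens and anything in `excluded` are skipped
    before counting, so no deletion step (and no KeyError) is needed.
    """
    counts = Counter(
        tok for tok in tokens
        if len(tok) > 1 and tok not in excluded
    )
    # most_common() sorts by count, descending; ties keep first-seen order.
    return counts.most_common(n)


# Hypothetical pre-segmented input standing in for jieba.lcut(txt).
sample_tokens = ["修炼", "的", "系统", "修炼", "妖兽", "系统", "修炼", "妖兽", "了"]

for word, count in top_words(sample_tokens, excluded={"系统"}, n=2):
    print("{:<10}{:>5}".format(word, count))
```

Filtering while counting means an excluded word that never appears in the text causes no error, and most_common(n) simply returns fewer pairs when the text has fewer than n distinct words.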

  

  


Original post: http://www.cnblogs.com/55lsk/p/7591974.html
