
[ES] Basic Usage

Posted: 2017-05-10 20:17:36
from os import path

import jieba
from elasticsearch import Elasticsearch

# With no arguments, older clients connect to localhost:9200.
es = Elasticsearch()

filePath = path.dirname(__file__)


# index1: wordcount
# stopwords: one word per line, UTF-8 encoded
stopWordFile = 'stopwords.txt'
with open(path.join(filePath, stopWordFile), encoding='utf-8') as f:
    stopWordList = [line.strip() for line in f]
stopWordList.extend(['腾讯', '视频', ''])
stopWordList = set(stopWordList)

# information words: segment the text with jieba and count non-stopwords
# (words.txt is assumed to be UTF-8 as well)
new = 'words.txt'
with open(path.join(filePath, new), encoding='utf-8') as f:
    text = f.read().strip('\r')
wordDict = {}
for w in jieba.cut(text):
    if w not in stopWordList:
        wordDict.setdefault(w, 0)
        wordDict[w] += 1

# Index one document per word. doc_type matches the 2017-era API;
# Elasticsearch 7+ deprecates mapping types, so drop it there.
for key in wordDict.keys():
    data = {'word': key, 'count': wordDict[key]}
    es.index(index='wordcount', doc_type='test', body=data)
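
The listing sends one request per word. As a rough sketch of an alternative (not part of the original post), the counting step can be written with collections.Counter and the documents prepared as a batch for elasticsearch.helpers.bulk. The function names count_words and to_bulk_actions and the sample tokens are hypothetical; the token list stands in for the output of jieba.cut so that the snippet runs without a cluster. On older clusters that still use mapping types, each action would presumably also need a "_type" key.

```python
from collections import Counter


def count_words(tokens, stop_words):
    """Count tokens that are not stop words (same result as the setdefault loop)."""
    return Counter(t for t in tokens if t not in stop_words)


def to_bulk_actions(counts, index_name):
    """Yield one action dict per word, in the shape elasticsearch.helpers.bulk accepts."""
    for word, count in counts.items():
        yield {"_index": index_name, "_source": {"word": word, "count": count}}


# Hypothetical, already-tokenized input standing in for jieba.cut(text).
tokens = ["搜索", "引擎", "腾讯", "搜索", "视频", "搜索"]
stop_words = {"腾讯", "视频"}

counts = count_words(tokens, stop_words)
actions = list(to_bulk_actions(counts, "wordcount"))
print(actions)

# With a reachable cluster and the `es` client from the listing,
# the whole batch could then be sent in one call:
#   from elasticsearch import helpers
#   helpers.bulk(es, actions)
```

Batching mainly saves round trips when the vocabulary is large; for a handful of words the per-document es.index loop is just as serviceable.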

 


Original: http://www.cnblogs.com/colipso/p/6837845.html
