Finding the Most Frequently Occurring Items in a Sequence

Date: 2019-08-04 22:49:12

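The collections module provides a class called Counter that is designed specifically for counting how many times each element occurs in a sequence.

Counter is a subclass of dict. On top of the normal dictionary interface it adds methods aimed at counting, most notably most_common(), which returns the most frequent items together with their counts.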

from collections import Counter

words = [
    'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
    'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
    'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
    'my', 'eyes', "you're", 'under'
]

# Count how many times each word occurs
word_counts = Counter(words)

# most_common(3) returns the three most frequent words with their counts
top_three = word_counts.most_common(3)
print(top_three)
# Outputs [('eyes', 8), ('the', 5), ('look', 4)]
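
Because Counter is a dict subclass, counts can be read and updated much like ordinary dictionary entries. The following is a minimal sketch of that behaviour; the short word list is made up for illustration and is not the data used above.

```python
from collections import Counter

# Hypothetical sample data, chosen only to keep the example short
sample = ['look', 'into', 'my', 'eyes', 'look', 'eyes']
counts = Counter(sample)

# Counter is a dict subclass, so normal key access works
print(isinstance(counts, dict))  # True
print(counts['eyes'])            # 2

# A missing key yields 0 instead of raising KeyError
print(counts['missing'])         # 0

# update() adds the counts from another iterable in place
counts.update(['eyes', 'under'])
print(counts['eyes'])            # 3
print(counts.most_common(1))     # [('eyes', 3)]
```

Note that looking up a missing key does not insert it, so the counter only ever contains items that were actually counted.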

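A little-known feature of Counter objects is that they support mathematical operations, so two counters can be added or subtracted directly: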

>>> morewords = ['why', 'are', 'you', 'not', 'looking', 'in', 'my', 'eyes']
>>> a = Counter(words)
>>> b = Counter(morewords)
>>> a
Counter({'eyes': 8, 'the': 5, 'look': 4, 'into': 3, 'my': 3, 'around': 2,
             "you're": 1, "don't": 1, 'under': 1, 'not': 1})
>>> b
Counter({'eyes': 1, 'looking': 1, 'are': 1, 'in': 1, 'not': 1, 'you': 1,
             'my': 1, 'why': 1})
>>> # Combine counts
>>> c = a + b
>>> c
Counter({'eyes': 9, 'the': 5, 'look': 4, 'my': 4, 'into': 3, 'not': 2,
             'around': 2, "you're": 1, "don't": 1, 'in': 1, 'why': 1,
             'looking': 1, 'are': 1, 'under': 1, 'you': 1})
>>> # Subtract counts
>>> d = a - b
>>> d
Counter({'eyes': 7, 'the': 5, 'look': 4, 'into': 3, 'my': 2, 'around': 2,
             "you're": 1, "don't": 1, 'under': 1})
>>>
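
One detail worth seeing in isolation is why 'not' disappears from the result of a - b: the - operator keeps only items whose resulting count is positive. The sketch below uses small made-up counters (an assumption for illustration, not the data above) and contrasts the operator with the subtract() method, which works in place and keeps zero and negative counts.

```python
from collections import Counter

# Hypothetical counters, kept small so the arithmetic is easy to follow
a = Counter({'eyes': 8, 'not': 1, 'my': 3})
b = Counter({'eyes': 1, 'not': 1, 'why': 1})

# The - operator drops every item whose result is zero or negative
diff = a - b
print(dict(diff))                    # {'eyes': 7, 'my': 3}

# subtract() modifies the counter in place and keeps all results
signed = Counter(a)                  # copy, so a stays unchanged
signed.subtract(b)
print(signed['not'], signed['why'])  # 0 -1
```

Use the operator when only the remaining positive counts matter, and subtract() when deficits need to stay visible.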

Counter objects are a very useful tool for almost any problem that involves tabulating and counting data. Prefer them over hand-written solutions built on plain dictionaries.
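
To make the comparison concrete, here is a rough sketch of the same "top two" computation done by hand with a plain dict and then with Counter. The word list is a made-up sample, and the manual version is only one of several ways it could be written.

```python
from collections import Counter

# Hypothetical sample data for the comparison
sample = ['the', 'eyes', 'the', 'look', 'eyes', 'the']

# Manual approach: tally into a plain dict, then sort by count
manual = {}
for word in sample:
    manual[word] = manual.get(word, 0) + 1
top_manual = sorted(manual.items(), key=lambda kv: kv[1], reverse=True)[:2]

# Counter approach: counting and ranking in one line
top_counter = Counter(sample).most_common(2)

print(top_manual)   # [('the', 3), ('eyes', 2)]
print(top_counter)  # [('the', 3), ('eyes', 2)]
```

Both give the same answer here, but the Counter version is shorter and leaves less room for mistakes in the tallying loop.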

Original: https://www.cnblogs.com/jeffrey-yang/p/11300200.html
