Some commonly used Pandas methods

Posted: 2017-09-06 23:46:09

The Series (column) method describe() returns different summary statistics depending on the type of data in the column (see http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.describe.html).

>>> s = pd.Series([1, 2, 3])
>>> s.describe()
count    3.0
mean     2.0
std      1.0
min      1.0
25%      1.5
50%      2.0
75%      2.5
max      3.0
dtype: float64

>>> s = pd.Series(['a', 'a', 'b', 'c'])
>>> s.describe()
count     4
unique    3
top       a
freq      2
dtype: object
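
The difference between the two summaries can also be checked programmatically. The following is a minimal sketch that reuses the two Series from the examples above; the variable names are only illustrative.

```python
import pandas as pd

# The same two Series as in the examples above.
numeric = pd.Series([1, 2, 3])
text = pd.Series(["a", "a", "b", "c"])

numeric_summary = numeric.describe()
text_summary = text.describe()

# Numeric data: count, mean, std, min, quartiles, max.
print(list(numeric_summary.index))
# Object (string) data: count, unique, top, freq.
print(list(text_summary.index))

# Individual statistics can be read by label.
print(numeric_summary["mean"])  # 2.0
print(text_summary["top"])      # a
```

The result of describe() is itself a Series, so each statistic can be looked up by its label rather than parsed from the printed output.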

The column method Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)

Returns the count of each distinct value. With normalize=True it returns the relative frequency (proportion) of each value instead.
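
A short sketch of the difference between counts and relative frequencies, along with the effect of dropna. The sample data here is made up for illustration and is not from the original post.

```python
import pandas as pd

s = pd.Series(["a", "a", "b", "c"])

# Absolute counts per distinct value (the default).
counts = s.value_counts()
# Relative frequencies; they sum to 1.
freqs = s.value_counts(normalize=True)

print(counts["a"])  # 2
print(freqs["a"])   # 0.5

# Missing values are dropped by default; dropna=False counts them too.
with_missing = pd.Series(["a", "a", "b", None])
print(len(with_missing.value_counts()))              # 2
print(len(with_missing.value_counts(dropna=False)))  # 3
```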

The crosstab function pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, dropna=True, normalize=False)

Purpose: Compute a simple cross-tabulation of two (or more) factors. By default it computes a frequency table of the factors, unless an array of values and an aggregation function are passed.

Example

>>> a
array(['foo', 'foo', 'foo', 'foo', 'bar', 'bar',
       'bar', 'bar', 'foo', 'foo', 'foo'], dtype=object)
>>> b
array(['one', 'one', 'one', 'two', 'one', 'one',
       'one', 'two', 'two', 'two', 'one'], dtype=object)
>>> c
array(['dull', 'dull', 'shiny', 'dull', 'dull', 'shiny',
       'shiny', 'dull', 'shiny', 'shiny', 'shiny'], dtype=object)
>>> pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])
b    one          two
c    dull  shiny  dull  shiny
a
bar  1     2      1     0
foo  2     2      1     2

>>> foo = pd.Categorical(['a', 'b'], categories=['a', 'b', 'c'])
>>> bar = pd.Categorical(['d', 'e'], categories=['d', 'e', 'f'])
>>> pd.crosstab(foo, bar, dropna=False)  # 'c' and 'f' are not represented in the data;
                                         # recent pandas versions keep them only with dropna=False
col_0  d  e  f
row_0
a      1  0  0
b      0  1  0
c      0  0  0
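
The description above mentions that crosstab can aggregate an array of values instead of counting occurrences. The following is a minimal sketch of that case, together with the normalize option from the signature. The small arrays and the `amounts` values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, smaller than the arrays in the example above.
a = np.array(["foo", "foo", "bar", "bar"], dtype=object)
b = np.array(["one", "two", "one", "one"], dtype=object)
amounts = np.array([10, 20, 30, 40])

# Default behaviour: a frequency table of the two factors.
freq = pd.crosstab(a, b, rownames=["a"], colnames=["b"])

# With values and aggfunc: sum the amounts in each (a, b) cell.
# Cells with no observations come out as NaN.
totals = pd.crosstab(a, b, values=amounts, aggfunc="sum",
                     rownames=["a"], colnames=["b"])

# normalize="index" turns each row into proportions that sum to 1.
shares = pd.crosstab(a, b, rownames=["a"], colnames=["b"],
                     normalize="index")

print(freq)
print(totals)
print(shares)
```

Note that values and aggfunc must be passed together; supplying only one of them raises an error.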

Source: http://www.cnblogs.com/imageSet/p/7487375.html