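Data representation -> data cleaning -> data statistics -> data visualization -> data mining -> artificial intelligence
Artificial intelligence: in-depth analysis and decision-making over data, language, images, vision, and similar domains
Python libraries for machine learning
NumPy: the foundational library for N-dimensional arrays, http://www.numpy.org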
import numpy as np
def np_sum():
    a = np.array([0, 1, 2, 3, 4])
    b = np.array([9, 8, 7, 6, 5])
    c = a**2 + b**3
    return c
print(np_sum())
[729 513 347 225 141]
def py_sum():
    a = [0, 1, 2, 3, 4]
    b = [9, 8, 7, 6, 5]
    c = []
    for i in range(len(a)):
        c.append(a[i]**2 + b[i]**3)
    return c
print(py_sum())
[729, 513, 347, 225, 141]
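The two functions above compute the same values, once with whole-array operations and once with an explicit loop. A minimal sketch of checking that equivalence on larger inputs follows; the helper names `vectorized` and `looped` and the input ranges are made up for illustration.

```python
import numpy as np

def vectorized(a, b):
    # Element-wise arithmetic on whole arrays, no explicit loop
    return a**2 + b**3

def looped(a, b):
    # The same computation written as a plain Python loop
    return [x**2 + y**3 for x, y in zip(a, b)]

# Illustrative inputs: 0..99 and 100..1
a_list = list(range(100))
b_list = list(range(100, 0, -1))

vec = vectorized(np.array(a_list), np.array(b_list))
loop = looped(a_list, b_list)

# Compare the two results element by element
same = np.array_equal(vec, np.array(loop))
print(same, vec.shape)
```

The NumPy version expresses the computation as one array expression, which is usually both shorter and faster than the loop for large inputs.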
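Pandas: a high-level data analysis library for Python, http://pandas.pydata.org
Reads and writes SQL, JSON, pickle, CSV, Excel, and other formats (pandas has no built-in INI reader; INI files need a separate parser such as the standard library's configparser)
DataFrame = row and column indexes + two-dimensional data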
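The "indexes plus two-dimensional data" description can be sketched as follows. The names and scores are hypothetical sample data, and the CSV round trip goes through an in-memory buffer rather than a real file.

```python
import io
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame(
    [[90, 85], [72, 64], [88, 93]],
    index=["alice", "bob", "carol"],   # row index
    columns=["math", "physics"],       # column index
)

row_labels = list(df.index)
col_labels = list(df.columns)
values = df.to_numpy()                 # the underlying 2-D data
bob_physics = df.loc["bob", "physics"] # label-based lookup

# Round trip through CSV text held in memory
csv_text = df.to_csv()
df2 = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(row_labels, col_labels, values.shape, bob_physics)
print(df2.loc["carol", "math"])
```

The other formats listed above follow the same pattern of paired `read_*` functions and `to_*` methods, for example `read_json` and `to_json`.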
SciPy: a library of mathematical, scientific, and engineering computing functions, http://www.scipy.org
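As a small illustration of the kind of routine SciPy provides, the sketch below integrates sin(x) over [0, pi] and minimizes a simple quadratic. Both functions were chosen here only as examples and do not come from the original notes.

```python
import math
from scipy import integrate, optimize

# Numerical integration: the exact value of this integral is 2
area, abs_err = integrate.quad(math.sin, 0.0, math.pi)

# One-dimensional minimization: the minimum of (x - 2)^2 + 1 is at x = 2
res = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

print(area, abs_err)
print(res.x, res.fun)
```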
Matplotlib: a high-quality two-dimensional data visualization library, http://matplotlib.org
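A minimal plotting sketch, assuming a headless environment: it selects the non-interactive Agg backend and renders the figure to an in-memory PNG instead of opening a window. The sine curve is an arbitrary example.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()

# Render to PNG bytes in memory; pass a file name to savefig to write to disk instead
buf = io.BytesIO()
fig.savefig(buf, format="png")
png_bytes = buf.getvalue()
plt.close(fig)

print(len(png_bytes), "bytes of PNG data")
```

In an interactive session the backend line can be dropped and `plt.show()` used in place of `savefig`.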
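Seaborn: a statistical data visualization library, http://seaborn.pydata.org/
Mayavi: a three-dimensional scientific data visualization library, http://docs.enthought.com/mayavi/mayavi/
PyPDF2: a toolkit for working with PDF files, http://mstamy2.github.io/PyPDF2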
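from PyPDF2 import PdfFileMerger  # PyPDF2 1.x/2.x name; renamed to PdfMerger in PyPDF2 3.0
merger = PdfFileMerger()
input1 = open("document1.pdf", "rb")
input2 = open("document2.pdf", "rb")
# Append the first three pages (0, 1, 2) of document1
merger.append(fileobj=input1, pages=(0, 3))
# Insert the first page of document2 at position 2 of the output
merger.merge(position=2, fileobj=input2, pages=(0, 1))
output = open("document-output.pdf", "wb")
merger.write(output)
# Release the merger and close every file handle
merger.close()
input1.close()
input2.close()
output.close()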
NLTK: a third-party library for natural-language text processing, http://www.nltk.org/
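# Requires the treebank sample corpus: run nltk.download('treebank') once beforehand
from nltk.corpus import treebank
t = treebank.parsed_sents('wsj_0001.mrg')[0]
# Opens a window showing the parse tree of the first sentence
t.draw()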
Python-docx: a third-party library for creating or updating Microsoft Word files, http://python-docx.readthedocs.io/en/latest/index.html
from docx import Document
document = Document()
document.add_heading('Document Title', 0)
p = document.add_paragraph('A plain paragraph having some ')
document.add_page_break()
document.save('demo.docx')
Scikit-learn: a toolkit of machine learning methods, a third-party library for data processing, http://scikit-learn.org/
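A typical scikit-learn workflow (load data, split, fit, score) might look like the sketch below. The choice of the bundled iris dataset and a k-nearest-neighbors classifier is an assumption made for illustration, not something the original notes specify.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Bundled sample dataset: 150 flowers, 4 features, 3 classes
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a k-nearest-neighbors classifier and score it on the held-out data
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

print(f"test accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators share this `fit` / `predict` / `score` interface, so another classifier can be swapped in by changing only the line that constructs `clf`.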
TensorFlow: the machine learning computing framework behind AlphaGo, https://www.tensorflow.org/
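import tensorflow as tf  # TensorFlow 1.x API; under 2.x use tensorflow.compat.v1
# Placeholder graph for illustration only: the original snippet does not define `result`
x = tf.Variable(3)
y = tf.Variable(4)
result = x * y
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
res = sess.run(result)
print('result:', res)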
MXNet: a deep learning computing framework based on neural networks, https://mxnet.incubator.apache.org/
Original post: https://www.cnblogs.com/nickchen121/p/11219403.html