1.numpy.random.normal
Draw random samples from a normal (Gaussian) distribution.
The probability density function of the normal distribution, first derived by De Moivre and 200 years later by both Gauss and Laplace independently [R250], is often called the bell curve because of its characteristic shape (see the example below).
The normal distribution occurs often in nature. For example, it describes the commonly occurring distribution of samples influenced by a large number of tiny, random disturbances, each with its own unique distribution [R250].
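Parameters:
loc : float
    Mean ("centre") of the distribution.
scale : float
    Standard deviation (spread or "width") of the distribution.
size : int or tuple of ints, optional
    Output shape.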
Notes
The probability density for the Gaussian distribution is
$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}},$$
where $\mu$ is the mean and $\sigma$ the standard deviation. The square of the standard deviation, $\sigma^2$, is called the variance.
The function has its peak at the mean, and its “spread” increases with the standard deviation (the function reaches 0.607 times its maximum at $\mu + \sigma$ and $\mu - \sigma$ [R250]). This implies that numpy.random.normal is more likely to return samples lying close to the mean, rather than those far away.
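The 0.607 figure can be checked numerically. The following is a minimal sketch, assuming the standard Gaussian density formula; the helper name normal_pdf and the values of mu and sigma are hypothetical and chosen only for illustration.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma (hypothetical helper)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

mu, sigma = 1.0, 2.0  # arbitrary illustrative values

peak = normal_pdf(mu, mu, sigma)                         # maximum, reached at the mean
ratio_right = normal_pdf(mu + sigma, mu, sigma) / peak   # one standard deviation above
ratio_left = normal_pdf(mu - sigma, mu, sigma) / peak    # one standard deviation below

print(ratio_right, ratio_left)  # both equal exp(-1/2), about 0.607
```

The ratio does not depend on the chosen mu and sigma, because the exponent at one standard deviation from the mean is always -1/2.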
References
[R249] Wikipedia, “Normal distribution”, http://en.wikipedia.org/wiki/Normal_distribution
[R250] P. R. Peebles Jr., “Central Limit Theorem” in “Probability, Random Variables and Random Signal Principles”, 4th ed., 2001, pp. 51, 51, 125.
Examples
Draw samples from the distribution:
Verify the mean and the variance:
Display the histogram of the samples, along with the probability density function:
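The three steps above can be sketched as follows. The values mu = 0 and sigma = 0.1, the sample count, and the bin count are assumptions chosen for illustration, and the seed is there only to make the run repeatable. np.histogram stands in for the plotted histogram, so no plotting library is needed.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed so the run is repeatable

mu, sigma = 0.0, 0.1                   # assumed mean and standard deviation
s = np.random.normal(mu, sigma, 1000)  # draw 1000 samples

# Verify the mean and the variance: both sample estimates should sit near the true values
mean_error = abs(mu - np.mean(s))
std_error = abs(sigma - np.std(s, ddof=1))
print(mean_error, std_error)

# Histogram normalised to a density, plus the theoretical density at the bin centres
density, edges = np.histogram(s, bins=30, density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
pdf = np.exp(-(centres - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
```

Plotting density and pdf against centres would show the bell shape described earlier.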
2.numpy.random.randn
import numpy as np
np.random.randn(2,3)
array([[ 0.59941534,  1.0991949 ,  1.36316028],
       [-0.01979197,  1.30783162, -0.69808199]])
This draws random samples from the standard normal distribution.
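To connect this with numpy.random.normal from section 1: randn corresponds to normal with loc=0 and scale=1, and a general normal sample can be obtained by shifting and scaling a standard one. A small sketch, assuming the legacy global seeding call np.random.seed and the illustrative values loc=5, scale=2:

```python
import numpy as np

np.random.seed(0)
a = np.random.randn(2, 3)                     # standard normal, shape (2, 3)

np.random.seed(0)
b = np.random.normal(0.0, 1.0, size=(2, 3))   # same distribution via normal()

np.random.seed(0)
c = np.random.normal(5.0, 2.0, size=(2, 3))   # assumed loc=5, scale=2

same_standard = np.allclose(a, b)              # randn vs normal(0, 1)
same_shifted = np.allclose(c, 5.0 + 2.0 * a)   # normal(5, 2) vs 5 + 2 * randn
print(same_standard, same_shifted)
```

Note that randn takes the dimensions as separate arguments, while normal takes the shape as a single size tuple.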
3.scipy.optimize.leastsq
Least squares
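import numpy as np
from scipy.optimize import leastsq

# Function to fit: x is the variable, p holds the parameters
def fun(x, p):
    a, b = p
    return a*x + b

# Error between the fitted data and the real data: p holds the parameters being fitted,
# x and y are the corresponding real data
def residuals(p, x, y):
    return fun(x, p) - y

# A set of real data, generated with a=2, b=1
x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y1 = np.array([3, 5, 7, 9, 11, 13], dtype=float)

# Call the fitting routine: the first argument is the residual function to minimise,
# the second is the initial guess, the third (args) holds the extra arguments passed to that function
r = leastsq(residuals, [1, 1], args=(x1, y1))

# Print the result: r[0] holds the fitted parameters, r[1] is an integer status flag
print(r[0])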
Running it gives the fitted result:
[2. 1.]
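In actual use, however, the function I needed to fit was not this simple. One difficulty was that it was a piecewise function: the value of the independent variable has to be checked and a different equation applied accordingly. For example, take the piecewise function y = ax + b when x > 3 and y = ax - b when x <= 3. In Python: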
def fun(x, p):
    a, b = p
    if (x > 3):
        return a*x + b
    else:
        return a*x - b
If we keep using the original residual function for the fit, we get this error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
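The reason is simple: fun can now only handle a single value, so passing in an array raises the error. What to do? I was stuck as well, so I asked for help on the scipy mailing list, where people were very helpful and quickly pointed out the problem. I had misunderstood the residual function: the function passed to leastsq has to return an array. So the residual function can be changed like this: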
def residuals(p, x, y):
    # Evaluate fun one element at a time, since it only accepts scalars
    temp = np.zeros(len(x), dtype=float)
    for i in range(len(x)):
        temp[i] = fun(x[i], p)
    return temp - y
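An alternative to looping is to make fun itself accept arrays, so the original one-line residual function works unchanged. The following is a sketch of that approach using np.where, which is a swapped-in technique and not what the mailing list suggested; the data are illustrative values generated with a=2, b=1.

```python
import numpy as np
from scipy.optimize import leastsq

def fun(x, p):
    a, b = p
    # np.where evaluates the condition element-wise, so x may be an array
    return np.where(x > 3, a * x + b, a * x - b)

def residuals(p, x, y):
    return fun(x, p) - y

# Illustrative data generated with a=2, b=1 (assumed values)
x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y1 = np.array([1, 3, 5, 9, 11, 13], dtype=float)

# Without full_output, leastsq returns the fitted parameters and an integer status flag
params, ier = leastsq(residuals, [1, 1], args=(x1, y1))
print(params)
```

np.where computes both branches for every element and then selects between them, which is fine here because both expressions are valid for all x.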
4.
import numpy as np  # convention
import scipy as sp  # convention
from scipy.optimize import leastsq  # the least-squares function used here
import pylab as pl

m = 9  # number of polynomial coefficients (degree m - 1)

def real_func(x):
    return np.sin(2*np.pi*x)  # sin(2 pi x)

def fake_func(p, x):
    f = np.poly1d(p)  # polynomial with coefficients p
    return f(x)

# Residual function
def residuals(p, y, x):
    return y - fake_func(p, x)

# 9 evenly spaced points, used as x
x = np.linspace(0, 1, 9)
# Many points, so the plotted curves look continuous
x_show = np.linspace(0, 1, 1000)
y0 = real_func(x)
# y with normally distributed noise added
y1 = [np.random.normal(0, 0.1) + y for y in y0]
# Start from randomly generated polynomial coefficients
p0 = np.random.randn(m)
plsq = leastsq(residuals, p0, args=(y1, x))
print('Fitting Parameters :', plsq[0])  # print the fitted parameters
pl.plot(x_show, real_func(x_show), label='real')
pl.plot(x_show, fake_func(plsq[0], x_show), label='fitted curve')
pl.plot(x, y1, 'bo', label='with noise')
pl.legend()
pl.show()
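Because a polynomial is linear in its coefficients, the same fit can be cross-checked against np.polyfit, which solves the linear least-squares problem directly. The following is a sketch under stated assumptions: degree 3 is chosen for illustration (lower than in the code above), the starting guess is all ones, and a fixed seed is added so the run is repeatable.

```python
import numpy as np
from scipy.optimize import leastsq

np.random.seed(0)  # assumption: fixed seed for repeatability

x = np.linspace(0, 1, 9)
y = np.sin(2 * np.pi * x) + np.random.normal(0, 0.1, size=x.shape)

def residuals(p, y, x):
    # Difference between the noisy data and the polynomial with coefficients p
    return y - np.poly1d(p)(x)

degree = 3  # assumed for illustration
p_leastsq, ier = leastsq(residuals, np.ones(degree + 1), args=(y, x))
p_polyfit = np.polyfit(x, y, degree)  # direct linear least-squares solution

print(p_leastsq)
print(p_polyfit)
```

The two coefficient vectors should agree closely. leastsq becomes the better tool when the model is nonlinear in its parameters, where np.polyfit does not apply.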
Original: http://blog.csdn.net/liangzuojiayi/article/details/51247489