`statistics` - Mathematical statistics functions

Added in version 3.4.

This module provides functions for calculating mathematical statistics of numeric (Real-valued) data.

The module is not intended to be a competitor to third-party libraries such as NumPy, SciPy, or proprietary full-featured statistics packages aimed at professional statisticians such as Minitab, SAS and Matlab. It is aimed at the level of graphing and scientific calculators.

Unless explicitly noted, these functions support int, float, Decimal and Fraction. Behaviour with other types (whether in the numeric tower or not) is currently unsupported. Collections with a mix of types are also undefined and implementation-dependent. If your input data consists of mixed types, you may be able to use map() to ensure a consistent result, for example: map(float, input_data).
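
The map() suggestion above can be sketched as follows. The `input_data` list is a made-up example that mixes all four supported types; converting to float is only one possible choice and assumes that float precision is acceptable for your data.

```python
from decimal import Decimal
from fractions import Fraction
from statistics import mean

# Hypothetical input mixing int, float, Decimal and Fraction values.
input_data = [1, 2.5, Decimal("3.5"), Fraction(1, 2)]

# Coerce every value to float first so the input has a single type.
as_floats = list(map(float, input_data))

result = mean(as_floats)  # 1.875
```

Converting to Decimal or Fraction instead would work the same way if exact arithmetic matters more than speed.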

Some datasets use NaN (not a number) values to represent missing data. Since NaNs have unusual comparison semantics, they cause surprising or undefined behaviors in the statistics functions that sort data or that count occurrences. The functions affected are median(), median_low(), median_high(), median_grouped(), mode(), multimode(), and quantiles(). NaN values should be stripped before calling these functions:

>>> from statistics import median
>>> from math import isnan
>>> from itertools import filterfalse
>>> data = [20.7, float('NaN'), 19.2, 18.3, float('NaN'), 14.4]
>>> sorted(data)  # This has surprising behavior
[20.7, nan, 14.4, 18.3, 19.2, nan]
>>> median(data)  # This result is unexpected
16.35
>>> sum(map(isnan, data))    # Number of missing values
2
>>> clean = list(filterfalse(isnan, data))  # Strip NaN values
>>> clean
[20.7, 19.2, 18.3, 14.4]
>>> sorted(clean)  # Sorting now works as expected
[14.4, 18.3, 19.2, 20.7]
>>> median(clean)       # This result is now well defined
18.75

Averages and measures of central location

`mean()`	Arithmetic mean (“average”) of data.
`fmean()`	Fast, floating-point arithmetic mean, with optional weighting.
`geometric_mean()`	Geometric mean of data.
`harmonic_mean()`	Harmonic mean of data.
`kde()`	Estimate the probability density distribution of the data.
`kde_random()`	Random sampling from the PDF generated by kde().
`median()`	Median (middle value) of data.
`median_low()`	Low median of data.
`median_high()`	High median of data.
`median_grouped()`	Median (50th percentile) of grouped data.
`mode()`	Single mode (most common value) of discrete or nominal data.
`multimode()`	List of modes (most common values) of discrete or nominal data.
`quantiles()`	Divide data into intervals with equal probability.
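
As a rough illustration of how several of the functions listed above differ, here is a minimal sketch. The datasets are invented for the example, and it assumes Python 3.8 or later, where fmean(), multimode() and quantiles() are available.

```python
from statistics import (
    fmean, mean, median, median_high, median_low,
    mode, multimode, quantiles,
)

# Small illustrative datasets (made up for this sketch).
scores = [1, 2, 3, 4, 4]
even_sized = [1, 3, 5, 7]

average = mean(scores)                   # 2.8
fast_average = fmean([3.5, 4.0, 5.25])   # 4.25, always a float

# With an even number of points the three medians differ.
middle = median(even_sized)       # 4.0, mean of the two middle values
low = median_low(even_sized)      # 3, always a member of the data
high = median_high(even_sized)    # 5, always a member of the data

# mode() returns one value; multimode() returns every tied value.
most_common = mode([1, 1, 2, 3, 3, 3, 3, 4])     # 3
all_modes = multimode("aabbbbccddddeeffffgg")    # ['b', 'd', 'f']

# Three cut points split the data into four equal-probability groups.
quartiles = quantiles([1, 2, 3, 4, 5, 6, 7], n=4)  # [2.0, 4.0, 6.0]
```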

Measures of spread

`pstdev()`	Population standard deviation of data.
`pvariance()`	Population variance of data.
`stdev()`	Sample standard deviation of data.
`variance()`	Sample variance of data.
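
The population and sample variants above differ only in the divisor: n for the population functions, n - 1 for the sample functions. A small sketch on an invented dataset whose mean is 5 and whose squared deviations sum to 32:

```python
from statistics import pstdev, pvariance, stdev, variance

# Illustrative data: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population measures divide by n = 8.
pop_var = pvariance(data)   # 32 / 8 = 4
pop_sd = pstdev(data)       # square root of 4 = 2.0

# Sample measures divide by n - 1 = 7, so they come out slightly larger.
sample_var = variance(data)  # 32 / 7, about 4.571
sample_sd = stdev(data)      # square root of 32 / 7, about 2.138
```

Which pair is appropriate depends on whether the data is the whole population or a sample drawn from it.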

Statistics for relations between two inputs

`covariance()`	Sample covariance for two variables.
`correlation()`	Pearson and Spearman’s correlation coefficients.
`linear_regression()`	Slope and intercept for simple linear regression.

Averages and measures of central location

Measures of spread

Statistics for relations between two inputs

Function details

Exceptions

`NormalDist` objects

Examples and recipes

Classic probability problems

Monte Carlo inputs for simulations

Approximating binomial distributions

Naive Bayesian classifier
