The statistics module in Python provides functions for calculating mathematical statistics of numeric data, including the mean, median, mode, variance, and standard deviation.
Table of Contents
- Introduction
- Measures of Central Tendency
  - mean
  - fmean
  - geometric_mean
  - harmonic_mean
  - median
  - median_low
  - median_high
  - median_grouped
  - mode
  - multimode
- Measures of Dispersion
  - pstdev
  - pvariance
  - stdev
  - variance
- Other Statistical Functions
  - quantiles
- Examples
  - Calculating Mean, Median, and Mode
  - Calculating Standard Deviation and Variance
  - Using Quantiles
- Real-World Use Case
- Conclusion
- References
Introduction
The statistics module provides functions for performing statistical calculations on numeric data. These functions accept sequences and other iterables of numbers, such as lists, tuples, ranges, and generators.
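As a quick illustration of the iterable inputs mentioned above, the sketch below passes the same values to mean in several container types. It also shows that empty input raises statistics.StatisticsError. The variable names are only illustrative.

```python
import statistics

# The same call works on a list, a tuple, a range, or a generator.
as_list = statistics.mean([1, 2, 3, 4, 5])
as_tuple = statistics.mean((1, 2, 3, 4, 5))
as_range = statistics.mean(range(1, 6))
as_generator = statistics.mean(x for x in [1, 2, 3, 4, 5])
print(as_list, as_tuple, as_range, as_generator)  # 3 3 3 3

# Empty input is reported as an error instead of returning 0 or NaN.
try:
    statistics.mean([])
    empty_raised = False
except statistics.StatisticsError:
    empty_raised = True
print(empty_raised)  # True
```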
Measures of Central Tendency
mean
Returns the arithmetic mean (average) of data.
import statistics
data = [1, 2, 3, 4, 5]
print(statistics.mean(data)) # 3
Output:
3
fmean
Converts data to floats and returns the arithmetic mean. It runs faster than mean and always returns a float.
import statistics
data = [1, 2, 3, 4, 5]
print(statistics.fmean(data)) # 3.0
Output:
3.0
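The difference between mean and fmean is easiest to see with non-float input. The following sketch, which assumes Python 3.8 or later (where fmean was added), uses Fraction values chosen only for illustration: mean keeps the input type, while fmean returns a float.

```python
import statistics
from fractions import Fraction

data = [Fraction(1, 2), Fraction(3, 2)]

exact = statistics.mean(data)   # stays a Fraction
fast = statistics.fmean(data)   # always a float

print(type(exact).__name__, exact)  # Fraction 1
print(type(fast).__name__, fast)    # float 1.0
```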
geometric_mean
Returns the geometric mean of data.
import statistics
data = [1, 2, 3, 4, 5]
print(statistics.geometric_mean(data)) # approximately 2.6052
Output:
2.6051710846973517
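For readers who want to see where this number comes from, here is a minimal cross-check, assuming Python 3.8 or later for math.prod. The geometric mean of n values is the n-th root of their product, or equivalently the exponential of the mean of their logarithms. The names by_root and by_logs are illustrative.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]
n = len(data)

by_root = math.prod(data) ** (1 / n)                    # n-th root of the product
by_logs = math.exp(sum(math.log(x) for x in data) / n)  # exp of the mean log
library = statistics.geometric_mean(data)

# Compare with a tolerance because floating-point results can differ in the last digits.
print(math.isclose(by_root, library), math.isclose(by_logs, library))
```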
harmonic_mean
Returns the harmonic mean of data.
import statistics
data = [1, 2, 3, 4, 5]
print(statistics.harmonic_mean(data)) # 2.18978102189781
Output:
2.18978102189781
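The harmonic mean is the appropriate average for rates. As a sketch with made-up numbers, suppose a car covers the first half of a distance at 40 km/h and the second half at 60 km/h. The arithmetic mean overstates the average speed, while the harmonic mean gives the true value.

```python
import statistics

# Hypothetical trip: equal distances driven at 40 km/h and 60 km/h.
speeds = [40, 60]

arithmetic = statistics.mean(speeds)
harmonic = statistics.harmonic_mean(speeds)

print(arithmetic)  # 50
print(harmonic)    # about 48, the actual average speed over the whole distance
```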
median
Returns the median (middle value) of data.
import statistics
data = [1, 2, 3, 4, 5]
print(statistics.median(data)) # 3
Output:
3
median_low
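Returns the low median of data. When the number of data points is even, this is the smaller of the two middle values, so the result is always a member of the data set.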
import statistics
data = [1, 2, 3, 4, 5, 6]
print(statistics.median_low(data)) # 3
Output:
3
median_high
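Returns the high median of data. When the number of data points is even, this is the larger of the two middle values, so the result is always a member of the data set.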
import statistics
data = [1, 2, 3, 4, 5, 6]
print(statistics.median_high(data)) # 4
Output:
4
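To compare the three median functions side by side, the short sketch below runs them on the same even-length data set. Only median averages the two middle values.

```python
import statistics

data = [1, 2, 3, 4, 5, 6]

middle = statistics.median(data)      # average of the two middle values
low = statistics.median_low(data)     # smaller middle value
high = statistics.median_high(data)   # larger middle value

print(middle, low, high)  # 3.5 3 4
```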
median_grouped
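Returns the median (50th percentile) of grouped data. Each value is treated as the midpoint of a class of width interval (1 by default), and the result is interpolated within the class that contains the median.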
import statistics
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
print(statistics.median_grouped(data)) # 3.1666666666666665
Output:
3.1666666666666665
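The result can be reproduced by hand with the usual grouped-median formula, L + interval * (n/2 - cf) / f, where L is the lower boundary of the median class, cf is the number of values below that class, and f is the number of values inside it. The following is a sketch of that arithmetic with illustrative variable names, not the module's actual source code.

```python
import math
import statistics

data = sorted([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])
interval = 1
n = len(data)

x = data[n // 2]                     # value of the median class: 3
lower = x - interval / 2             # lower class boundary: 2.5
cf = sum(1 for v in data if v < x)   # values below the class: 3
f = data.count(x)                    # values inside the class: 3

by_hand = lower + interval * (n / 2 - cf) / f   # 2.5 + (5 - 3) / 3
library = statistics.median_grouped(data, interval)

print(math.isclose(by_hand, library))  # True
```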
mode
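Returns the single mode (most common value) of discrete or nominal data. If several values are tied for most common, the first one encountered in the data is returned (Python 3.8 and later).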
import statistics
data = [1, 2, 2, 3, 4]
print(statistics.mode(data)) # 2
Output:
2
multimode
Returns a list of modes (most common values) of discrete or nominal data.
import statistics
data = [1, 1, 2, 3, 3, 4]
print(statistics.multimode(data)) # [1, 3]
Output:
[1, 3]
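Both functions also work on non-numeric (nominal) data. The sketch below, assuming Python 3.8 or later and an invented list of colors, shows how the two differ when values are tied and what multimode returns for empty input.

```python
import statistics

# Illustrative nominal data: "red" and "blue" both appear twice.
colors = ["red", "blue", "blue", "red", "green"]

first_mode = statistics.mode(colors)       # first of the tied values
all_modes = statistics.multimode(colors)   # every tied value, in order of first appearance
no_modes = statistics.multimode([])        # empty input gives an empty list

print(first_mode)  # red
print(all_modes)   # ['red', 'blue']
print(no_modes)    # []
```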
Measures of Dispersion
pstdev
Returns the population standard deviation of data (the square root of the population variance). Use it when data is the entire population.
import statistics
data = [1, 2, 3, 4, 5]
print(statistics.pstdev(data)) # 1.4142135623730951
Output:
1.4142135623730951
pvariance
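Returns the population variance of data, which divides the sum of squared deviations from the mean by n. Use it when data is the entire population.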
import statistics
data = [1, 2, 3, 4, 5]
print(statistics.pvariance(data)) # 2
Output:
2
stdev
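Returns the sample standard deviation of data (the square root of the sample variance). Use it when data is a sample drawn from a larger population.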
import statistics
data = [1, 2, 3, 4, 5]
print(statistics.stdev(data)) # 1.5811388300841898
Output:
1.5811388300841898
variance
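Returns the sample variance of data, which divides the sum of squared deviations from the mean by n - 1. Use it when data is a sample drawn from a larger population.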
import statistics
data = [1, 2, 3, 4, 5]
print(statistics.variance(data)) # 2.5
Output:
2.5
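The four dispersion functions are closely related. As a minimal sketch, the code below recomputes both variances from the sum of squared deviations and checks them against the library. The only difference is the divisor: n for the population and n - 1 for a sample. Variable names are illustrative.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]
n = len(data)
mu = statistics.mean(data)

squared_deviations = sum((x - mu) ** 2 for x in data)   # 4 + 1 + 0 + 1 + 4 = 10

population_variance = squared_deviations / n        # 10 / 5 = 2.0
sample_variance = squared_deviations / (n - 1)      # 10 / 4 = 2.5

print(math.isclose(population_variance, statistics.pvariance(data)))       # True
print(math.isclose(sample_variance, statistics.variance(data)))            # True
print(math.isclose(math.sqrt(sample_variance), statistics.stdev(data)))    # True
```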
Other Statistical Functions
quantiles
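Divides data into n continuous intervals with equal probability and returns a list of the n - 1 cut points that separate them. The default is n=4, which gives quartiles.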
import statistics
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(statistics.quantiles(data, n=4)) # [2.75, 5.5, 8.25]
Output:
[2.75, 5.5, 8.25]
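quantiles also takes a method argument. The default, "exclusive", assumes the data is a sample that may not contain the population's extreme values. The alternative, "inclusive", treats the smallest and largest values as the true minimum and maximum. The sketch below compares the two on the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

exclusive = statistics.quantiles(data, n=4)                      # default method
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print(exclusive)  # [2.75, 5.5, 8.25]
print(inclusive)  # [3.25, 5.5, 7.75]
```

The inclusive cut points sit closer to the center because that method does not leave room for values beyond the observed minimum and maximum.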
Examples
Calculating Mean, Median, and Mode
import statistics
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)  # every value is unique, so the first one is returned (Python 3.8+)
print(f"Mean: {mean}, Median: {median}, Mode: {mode}")
Output:
Mean: 5.5, Median: 5.5, Mode: 1
Calculating Standard Deviation and Variance
import statistics
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pstdev = statistics.pstdev(data)
pvariance = statistics.pvariance(data)
stdev = statistics.stdev(data)
variance = statistics.variance(data)
print(f"Population Standard Deviation: {pstdev}")
print(f"Population Variance: {pvariance}")
print(f"Sample Standard Deviation: {stdev}")
print(f"Sample Variance: {variance}")
Output:
Population Standard Deviation: 2.8722813232690143
Population Variance: 8.25
Sample Standard Deviation: 3.0276503540974917
Sample Variance: 9.166666666666666
Using Quantiles
import statistics
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
quartiles = statistics.quantiles(data, n=4)
deciles = statistics.quantiles(data, n=10)
print(f"Quartiles: {quartiles}")
print(f"Deciles: {deciles}")
Output:
Quartiles: [2.75, 5.5, 8.25]
Deciles: [1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9]
Real-World Use Case
Analyzing Student Grades
Suppose you have a list of student grades and want to summarize the class's performance with a few statistical measures.
import statistics
grades = [88, 92, 79, 93, 85, 91, 76, 95, 89, 84]
mean = statistics.mean(grades)
median = statistics.median(grades)
mode = statistics.mode(grades)  # all grades are unique, so this is simply the first grade
stdev = statistics.stdev(grades)
variance = statistics.variance(grades)
quartiles = statistics.quantiles(grades, n=4)
print(f"Mean: {mean}")
print(f"Median: {median}")
print(f"Mode: {mode}")
print(f"Standard Deviation: {stdev}")
print(f"Variance: {variance}")
print(f"Quartiles: {quartiles}")
Output:
Mean: 87.2
Median: 88.5
Mode: 88
Standard Deviation: 6.178816859057871
Variance: 38.17777777777778
Quartiles: [82.75, 88.5, 92.25]
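One way to put the quartiles to work, sketched here as an optional extension of the grade analysis, is to compute the interquartile range and list the grades that fall in the bottom and top quarters. The names iqr, bottom_quarter, and top_quarter are illustrative.

```python
import statistics

grades = [88, 92, 79, 93, 85, 91, 76, 95, 89, 84]

# Unpack the three quartile cut points.
q1, q2, q3 = statistics.quantiles(grades, n=4)

iqr = q3 - q1                                    # spread of the middle half of the grades
bottom_quarter = [g for g in grades if g < q1]   # grades below the first quartile
top_quarter = [g for g in grades if g > q3]      # grades above the third quartile

print(f"Interquartile range: {iqr}")      # 9.5
print(f"Bottom quarter: {bottom_quarter}")  # [79, 76]
print(f"Top quarter: {top_quarter}")        # [93, 95]
```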
Conclusion
The statistics module in Python provides functions for common statistical calculations on numeric data: averages, measures of spread, and quantiles. These functions are useful for data analysis, scientific research, financial modeling, and more.