Python statistics Module

The statistics module in Python's standard library provides functions for calculating mathematical statistics of numeric data, including the mean, median, mode, variance, and standard deviation.

Table of Contents

  1. Introduction
  2. Measures of Central Tendency
    • mean
    • fmean
    • geometric_mean
    • harmonic_mean
    • median
    • median_low
    • median_high
    • median_grouped
    • mode
    • multimode
  3. Measures of Dispersion
    • pstdev
    • pvariance
    • stdev
    • variance
  4. Other Statistical Functions
    • quantiles
  5. Examples
    • Calculating Mean, Median, and Mode
    • Calculating Standard Deviation and Variance
    • Using Quantiles
  6. Real-World Use Case
  7. Conclusion

Introduction

The statistics module provides functions for common statistical calculations on real-valued numeric data. The functions accept sequences and other iterables, such as lists and tuples, whose values are of type int, float, Decimal, or Fraction.
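As a minimal sketch of that flexibility, the snippet below passes a list, a tuple, and values of type Fraction and Decimal to statistics.mean. The variable names are illustrative only.

```python
import statistics
from decimal import Decimal
from fractions import Fraction

# The same function accepts different container types.
list_mean = statistics.mean([2, 4, 6])          # list of ints
tuple_mean = statistics.mean((2.0, 4.0, 6.0))   # tuple of floats

# Exact numeric types are preserved in the result.
fraction_mean = statistics.mean([Fraction(1, 2), Fraction(3, 2)])
decimal_mean = statistics.mean([Decimal("0.1"), Decimal("0.3")])

print(list_mean, tuple_mean, fraction_mean, decimal_mean)
```

Mixing Decimal with other numeric types in a single data set is best avoided, since the result type then depends on how the values are coerced.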

Measures of Central Tendency

mean

Returns the arithmetic mean (average) of data: the sum of the values divided by the number of values.

import statistics

data = [1, 2, 3, 4, 5]
print(statistics.mean(data))  # 3

Output:

3
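One detail worth knowing is that mean raises statistics.StatisticsError when it receives no data. The sketch below wraps the call in a small helper; safe_mean and its default parameter are hypothetical names introduced for illustration, not part of the module.

```python
import statistics


def safe_mean(values, default=None):
    """Return the mean of values, or default when values is empty."""
    try:
        return statistics.mean(values)
    except statistics.StatisticsError:
        # Raised by statistics.mean when it receives no data points.
        return default


print(safe_mean([1, 2, 3, 4, 5]))  # 3
print(safe_mean([]))               # None
```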

fmean

Converts data to floats and returns the arithmetic mean as a float. It is typically faster than mean. Added in Python 3.8.

import statistics

data = [1, 2, 3, 4, 5]
print(statistics.fmean(data))  # 3.0

Output:

3.0
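The difference from mean is easiest to see with an exact numeric type. In this small sketch, mean keeps the Fraction type while fmean converts everything to float; the variable names are illustrative.

```python
import statistics
from fractions import Fraction

data = [Fraction(1, 3), Fraction(2, 3)]

exact = statistics.mean(data)    # result stays a Fraction
approx = statistics.fmean(data)  # result is always a float

print(type(exact).__name__, exact)
print(type(approx).__name__, approx)
```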

geometric_mean

Returns the geometric mean of data: the n-th root of the product of the n values, returned as a float. Added in Python 3.8.

import statistics

data = [1, 2, 3, 4, 5]
print(statistics.geometric_mean(data))  # approximately 2.605

Output:

2.6051710846973517
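The geometric mean is commonly used to average multiplicative factors such as growth rates. The following sketch uses made-up yearly growth factors and cross-checks the result against the definition (the n-th root of the product) with math.prod.

```python
import math
import statistics

# Hypothetical yearly growth factors: +10%, +50%, -20%.
growth_factors = [1.10, 1.50, 0.80]

average_factor = statistics.geometric_mean(growth_factors)

# Cross-check against the definition: the n-th root of the product.
by_definition = math.prod(growth_factors) ** (1 / len(growth_factors))

print(average_factor)
print(math.isclose(average_factor, by_definition))
```

Applying average_factor three times in a row gives the same overall growth as the three original factors.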

harmonic_mean

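Returns the harmonic mean of data: the reciprocal of the arithmetic mean of the reciprocals. It is the appropriate average for rates and ratios, such as speeds.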

import statistics

data = [1, 2, 3, 4, 5]
print(statistics.harmonic_mean(data))  # 2.18978102189781

Output:

2.18978102189781
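A typical use is averaging speeds over equal distances. In this sketch, a hypothetical trip covers the same distance once at 40 km/h and once at 60 km/h; the harmonic mean gives the true average speed, while the arithmetic mean overstates it.

```python
import statistics

# Hypothetical trip: the same distance driven at 40 km/h and then at 60 km/h.
speeds = [40, 60]

average_speed = statistics.harmonic_mean(speeds)
naive_average = statistics.mean(speeds)

print(average_speed)  # 48.0
print(naive_average)  # 50
```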

median

Returns the median (middle value) of data. When the number of data points is even, the median is the average of the two middle values.

import statistics

data = [1, 2, 3, 4, 5]
print(statistics.median(data))  # 3

Output:

3

median_low

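Returns the low median of data. When the number of data points is even, this is the smaller of the two middle values, so the result is always a member of the data set.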

import statistics

data = [1, 2, 3, 4, 5, 6]
print(statistics.median_low(data))  # 3

Output:

3

median_high

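Returns the high median of data. When the number of data points is even, this is the larger of the two middle values, so the result is always a member of the data set.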

import statistics

data = [1, 2, 3, 4, 5, 6]
print(statistics.median_high(data))  # 4

Output:

4
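To compare the three median functions side by side, the sketch below runs them on the same even-length list. Only median interpolates; median_low and median_high always return a value that occurs in the data.

```python
import statistics

data = [1, 3, 5, 7]

middle = statistics.median(data)     # average of 3 and 5
low = statistics.median_low(data)    # smaller of the two middle values
high = statistics.median_high(data)  # larger of the two middle values

print(middle, low, high)  # 4.0 3 5
print(middle in data, low in data, high in data)  # False True True
```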

median_grouped

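Returns the median (50th percentile) of grouped data, using interpolation. Each value is treated as the midpoint of a class interval whose width is set by the optional interval argument (default 1).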

import statistics

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
print(statistics.median_grouped(data))  # 3.1666666666666665

Output:

3.1666666666666665
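To see where that value comes from, the sketch below applies the usual interpolation formula for grouped data by hand and compares it with the library result. It assumes the default class width of 1, and the variable names are illustrative rather than taken from the module.

```python
import statistics

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
interval = 1  # width of each class interval (the function's default)

n = len(data)
x = sorted(data)[n // 2]               # value whose class contains the median
lower = x - interval / 2               # lower limit of that class
below = sum(1 for v in data if v < x)  # cumulative frequency below the class
freq = data.count(x)                   # frequency of the median class

# Interpolate within the median class.
manual = lower + interval * (n / 2 - below) / freq
builtin = statistics.median_grouped(data, interval)

print(manual, builtin)
```

Here the median class is centered on 3 and spans 2.5 to 3.5; three values lie below it and three fall inside it, so the median sits two thirds of the way into the class.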

mode

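Returns the single mode (most common value) of discrete or nominal data. If several values are tied for most common, the first one encountered is returned (Python 3.8 and later).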

import statistics

data = [1, 2, 2, 3, 4]
print(statistics.mode(data))  # 2

Output:

2

multimode

Returns a list of the modes (most common values) of discrete or nominal data, in the order they were first encountered. Added in Python 3.8.

import statistics

data = [1, 1, 2, 3, 3, 4]
print(statistics.multimode(data))  # [1, 3]

Output:

[1, 3]
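Both functions also work on non-numeric data. The sketch below uses a made-up list of votes with a tie to contrast mode, which returns only the first tied value, with multimode, which returns all of them.

```python
import statistics

# Hypothetical votes; "red" and "blue" are tied with two votes each.
votes = ["red", "blue", "blue", "red", "green"]

first_mode = statistics.mode(votes)      # first of the tied values
all_modes = statistics.multimode(votes)  # every tied value, in first-seen order
no_modes = statistics.multimode([])      # empty input gives an empty list

print(first_mode)  # red
print(all_modes)   # ['red', 'blue']
print(no_modes)    # []
```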

Measures of Dispersion

pstdev

Returns the population standard deviation of data (the square root of the population variance). Use it when data represents the entire population.

import statistics

data = [1, 2, 3, 4, 5]
print(statistics.pstdev(data))  # 1.4142135623730951

Output:

1.4142135623730951

pvariance

Returns the population variance of data: the sum of squared deviations from the mean, divided by the number of values n.

import statistics

data = [1, 2, 3, 4, 5]
print(statistics.pvariance(data))  # 2

Output:

2

stdev

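Returns the sample standard deviation of data (the square root of the sample variance). Use it when data is a sample drawn from a larger population.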

import statistics

data = [1, 2, 3, 4, 5]
print(statistics.stdev(data))  # 1.5811388300841898

Output:

1.5811388300841898

variance

Returns the sample variance of data: the sum of squared deviations from the mean, divided by n - 1 rather than n.

import statistics

data = [1, 2, 3, 4, 5]
print(statistics.variance(data))  # 2.5

Output:

2.5
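The four dispersion functions differ only in the divisor and in whether a square root is taken. The sketch below recomputes them by hand from the sum of squared deviations and checks the results against the module; the variable names are illustrative.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]
n = len(data)
mean = statistics.mean(data)

# Sum of squared deviations from the mean.
squared_deviations = sum((x - mean) ** 2 for x in data)

population_variance = squared_deviations / n    # divide by n
sample_variance = squared_deviations / (n - 1)  # divide by n - 1

print(math.isclose(population_variance, statistics.pvariance(data)))
print(math.isclose(sample_variance, statistics.variance(data)))
print(math.isclose(math.sqrt(population_variance), statistics.pstdev(data)))
print(math.isclose(math.sqrt(sample_variance), statistics.stdev(data)))
```

All four comparisons print True. The n - 1 divisor makes the sample variance slightly larger, which compensates for estimating the mean from the same sample.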

Other Statistical Functions

quantiles

Divides data into n continuous intervals with equal probability and returns a list of the n - 1 cut points that separate them. Added in Python 3.8.

import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(statistics.quantiles(data, n=4))  # [2.75, 5.5, 8.25]

Output:

[2.75, 5.5, 8.25]
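quantiles also takes a method argument that changes how the cut points are interpolated. The sketch below compares the default "exclusive" method with "inclusive" on the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Default: data is treated as a sample from a larger population.
exclusive = statistics.quantiles(data, n=4, method="exclusive")

# Alternative: data is treated as the whole population, so the minimum and
# maximum are taken as the 0th and 100th percentiles.
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print(exclusive)  # [2.75, 5.5, 8.25]
print(inclusive)  # [3.25, 5.5, 7.75]
```

The inclusive cut points lie closer to the middle because that method assumes no values exist beyond the observed minimum and maximum.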

Examples

Calculating Mean, Median, and Mode

import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

print(f"Mean: {mean}, Median: {median}, Mode: {mode}")

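Output (every value occurs once, so mode returns the first value in the list; versions before Python 3.8 raise StatisticsError instead):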

Mean: 5.5, Median: 5.5, Mode: 1

Calculating Standard Deviation and Variance

import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

pstdev = statistics.pstdev(data)
pvariance = statistics.pvariance(data)
stdev = statistics.stdev(data)
variance = statistics.variance(data)

print(f"Population Standard Deviation: {pstdev}")
print(f"Population Variance: {pvariance}")
print(f"Sample Standard Deviation: {stdev}")
print(f"Sample Variance: {variance}")

Output:

Population Standard Deviation: 2.8722813232690143
Population Variance: 8.25
Sample Standard Deviation: 3.0276503540974917
Sample Variance: 9.166666666666666

Using Quantiles

import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
quartiles = statistics.quantiles(data, n=4)
deciles = statistics.quantiles(data, n=10)

print(f"Quartiles: {quartiles}")
print(f"Deciles: {deciles}")

Output:

Quartiles: [2.75, 5.5, 8.25]
Deciles: [1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9]

Real-World Use Case

Analyzing Student Grades

Suppose you have a list of student grades and want to summarize the class's performance with several statistical measures.

import statistics

grades = [88, 92, 79, 93, 85, 91, 76, 95, 89, 84]

mean = statistics.mean(grades)
median = statistics.median(grades)
mode = statistics.mode(grades)
stdev = statistics.stdev(grades)
variance = statistics.variance(grades)
quartiles = statistics.quantiles(grades, n=4)

print(f"Mean: {mean}")
print(f"Median: {median}")
print(f"Mode: {mode}")
print(f"Standard Deviation: {stdev}")
print(f"Variance: {variance}")
print(f"Quartiles: {quartiles}")

Output (each grade occurs only once, so the reported mode is simply the first grade in the list):

Mean: 87.2
Median: 88.5
Mode: 88
Standard Deviation: 6.178816859057871
Variance: 38.17777777777778
Quartiles: [82.75, 88.5, 92.25]
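As one possible next step, the same measures can be used to flag grades that fall well below the class average. The sketch below uses an illustrative threshold of one sample standard deviation below the mean; that cutoff is an assumption made for the example, not a rule from the module.

```python
import statistics

grades = [88, 92, 79, 93, 85, 91, 76, 95, 89, 84]

mean = statistics.mean(grades)
stdev = statistics.stdev(grades)

# Illustrative threshold: one sample standard deviation below the mean.
threshold = mean - stdev

low_grades = [g for g in grades if g < threshold]

print(f"Threshold: {threshold:.2f}")
print(f"Grades below threshold: {low_grades}")
```

With these grades the threshold is about 81, so only 79 and 76 are flagged.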

Conclusion

The statistics module in Python provides a compact set of functions for descriptive statistics on numeric data, covering averages, measures of spread, and quantiles. Because it is part of the standard library, it is well suited to quick data analysis in scientific, financial, and other work without third-party dependencies.
