Dec 03, 2024 | 667 words | 7 min read
3.1.1. Materials#
MS Excel - Descriptive Statistics#
Descriptive statistics is a way to describe data using measures of central tendencies and variation (how spread out the data is). These allow us to make evidence-based decisions.
Basic Descriptors#
Function |
Description |
Formula |
---|---|---|
Count |
The total number of data points in the set |
|
Minimum |
Smallest value in the data set |
|
Maximum |
Largest value in the data set |
|
Range |
Differenence between the maximum and minimum values |
|
Note
The \(Range\) function is helpful in identifying outliers
Measures of Central Tendency#
Function |
Description |
Formula |
---|---|---|
Mean |
Sum of all the data points divided by the total number of data points |
|
Meadian |
Middle number of the data set* |
|
Mode |
Most frequently occurring number |
|
If you expect one mode |
|
|
If you expect multiple modes |
|
Note
*- when data is listed from smallest to largest
The \(Mean\) function is sensitive to outliers. An outlier is a data point that is far away from the rest of the data
Data Spread#
Function |
Description |
Formula |
---|---|---|
Standard Deviation |
The measure of the amount of variation in a data set |
|
square root of the variance |
||
Variance |
Average of the squared difference of each data point from the mean |
|
standard deviation squared |
Note
A low standard deviation suggests that the data points are close to the mean.
A high standard deviation suggests that the data points are more spread out.
Equations#
Average
Standard Deviation of the sample
Variance of the sample
where,
\(\bar{x}\) is the average (mean) value of the sample.
\(x_i\) represents each individual data point.
\(n\) is the total number of data points.
\(\sigma\) represents the standard deviation of the sample.
\(\sigma^2\) represents the variance of the sample.
MS Excel - Histograms#
Installing the Data Analysis ToolPak#
Follow the steps below to install the Data Analysis ToolPak as an Add-in on your system.
From the File menu select Options (bottom left).
In the Options window, select Add-ins and then click on Go.
In the Add-ins menu, select Analysis ToolPak and then click OK.
Now you can find the Analysis ToolPak under the Data tab in the Analysis section of the ribbon.
From the Tools menu select Excel Add-ins.
In the Add-ins window, check the box next to Analysis ToolPak and then click on OK.
Now you can launch the Analysis ToolPak by navigating to the Data tab and clicking on Data Analysis.
Creating a histogram using the Data Analysis ToolPak#
Here is the example worksheet used in the above video - HistogramExample_Landfillwaste.xlsx
Note
A work through for both options above is provided here Excel_HistogramExample_FabricIgnition.xlsx
.
Note
If a histogram displays continuous data, there should be no gap between your bins.
Basic Flowcharts#
Click on this link to learn how to create a basic flowchart in MS Visio.
Click on this link to watch a video on how to create a basic flowchart in MS Visio.
Skewness#
Skewness is a measure of a dataset’s symmetry (or lack of symmetry).
MS Excel function is SKEW()
A perfectly symmetrical data set will have a skewness of 0 and is often known as a normal distribution. A skew less than zero (negative) has a long tail to the left. A skew greater than zero (positive) has a long tail to the right.
Note
Please note that the relationship between mean and median only serves as a rule of thumb for the skewness. If possible, you should always compute the skewness itself instead of relying on a rule of thumb. The skewness can be computed in MS Excel via the SKEW()
function (which computes the adjusted Fisher-Pearson standardized moment coefficient). Read more about this here.
Kurtosis#
Kurtosis is a measure of the peakedness and flatness of a distribution.
MS Excel function is KURT()
.
Distributions with negative kurtosis exhibit tail data exceeding the tails of normal distribution. Distributions with positive kurtosis exhibit tail data that is generally less extreme than the tails of the normal distribution.
Note
Skewness differentiates extreme values in one versus the other tail, kurtosis measures extreme values in either tail.