\[ \begin{align}\begin{aligned}\newcommand\blank{~\underline{\hspace{1.2cm}}~}\\% Bold symbols (vectors) \newcommand\bs[1]{\mathbf{#1}}\\% Differential \newcommand\dd[2][]{\mathrm{d}^{#1}{#2}} % use as \dd, \dd{x}, or \dd[2]{x}\\% Poor man's siunitx \newcommand\unit[1]{\mathrm{#1}} \newcommand\num[1]{#1} \newcommand\qty[2]{#1~\unit{#2}}\\\newcommand\per{/} \newcommand\squared{{}^2} \newcommand\cubed{{}^3} % % Scale \newcommand\milli{\unit{m}} \newcommand\centi{\unit{c}} \newcommand\kilo{\unit{k}} \newcommand\mega{\unit{M}} % % Percent \newcommand\percent{\unit{{\kern-4mu}\%}} % % Angle \newcommand\radian{\unit{rad}} \newcommand\degree{\unit{{\kern-4mu}^\circ}} % % Time \newcommand\second{\unit{s}} \newcommand\s{\second} \newcommand\minute{\unit{min}} \newcommand\hour{\unit{h}} % % Distance \newcommand\meter{\unit{m}} \newcommand\m{\meter} \newcommand\inch{\unit{in}} \newcommand\foot{\unit{ft}} % % Force \newcommand\newton{\unit{N}} \newcommand\kip{\unit{kip}} % kilopound in "freedom" units - edit made by Sri % % Mass \newcommand\gram{\unit{g}} \newcommand\g{\gram} \newcommand\kilogram{\unit{kg}} \newcommand\kg{\kilogram} \newcommand\grain{\unit{grain}} \newcommand\ounce{\unit{oz}} % % Temperature \newcommand\kelvin{\unit{K}} \newcommand\K{\kelvin} \newcommand\celsius{\unit{{}^\circ C}} \newcommand\C{\celsius} \newcommand\fahrenheit{\unit{{}^\circ F}} \newcommand\F{\fahrenheit} % % Area \newcommand\sqft{\unit{sq\,\foot}} % square foot % % Volume \newcommand\liter{\unit{L}} \newcommand\gallon{\unit{gal}} % % Frequency \newcommand\hertz{\unit{Hz}} \newcommand\rpm{\unit{rpm}} % % Voltage \newcommand\volt{\unit{V}} \newcommand\V{\volt} \newcommand\millivolt{\milli\volt} \newcommand\mV{\milli\volt} \newcommand\kilovolt{\kilo\volt} \newcommand\kV{\kilo\volt} % % Current \newcommand\ampere{\unit{A}} \newcommand\A{\ampere} \newcommand\milliampereA{\milli\ampere} \newcommand\mA{\milli\ampere} \newcommand\kiloampereA{\kilo\ampere} \newcommand\kA{\kilo\ampere} % % Resistance \newcommand\ohm{\Omega} \newcommand\milliohm{\milli\ohm} \newcommand\kiloohm{\kilo\ohm} % correct SI spelling \newcommand\kilohm{\kilo\ohm} % "American" spelling used in siunitx \newcommand\megaohm{\mega\ohm} % correct SI spelling \newcommand\megohm{\mega\ohm} % "American" spelling used in siunitx % % Capacitance \newcommand\farad{\unit{F}} \newcommand\F{\farad} \newcommand\microfarad{\micro\farad} \newcommand\muF{\micro\farad} % % Inductance \newcommand\henry{\unit{H}} \newcommand\H{\henry} \newcommand\millihenry{\milli\henry} \newcommand\mH{\milli\henry} % % Power \newcommand\watt{\unit{W}} \newcommand\W{\watt} \newcommand\milliwatt{\milli\watt} \newcommand\mW{\milli\watt} \newcommand\kilowatt{\kilo\watt} \newcommand\kW{\kilo\watt} % % Energy \newcommand\joule{\unit{J}} \newcommand\J{\joule} % % Composite units % % Torque \newcommand\ozin{\unit{\ounce}\,\unit{in}} \newcommand\newtonmeter{\unit{\newton\,\meter}} % % Pressure \newcommand\psf{\unit{psf}} % pounds per square foot \newcommand\pcf{\unit{pcf}} % pounds per cubic foot \newcommand\pascal{\unit{Pa}} \newcommand\Pa{\pascal} \newcommand\ksi{\unit{ksi}} % kilopound per square inch \newcommand\bar{\unit{bar}} \end{aligned}\end{align} \]

Dec 04, 2025 | 764 words | 8 min read

3.1.1. Materials#

Descriptive Statistics#

Descriptive statistics describe data using measures of central tendencies and measures of variability (how spread out the data is). These allow us to make evidence-based decisions.

Basic Descriptors#

Table 3.1 Basic descriptors.#

Function

Description

MS Excel Formula

Count

The total number of data points in the set.

=COUNT()

Minimum

The smallest value in the data set.

=MIN()

Maximum

The largest value in the data set.

=MAX()

Measures of Central Tendency#

Table 3.2 Measures of central tendency.#

Function

Description

MS Excel Formula

Mean

Sum of all the data points divided by the total number of data points.

=AVERAGE()

Median

Middle number of the data set when data is listed from smallest to largest.

=MEDIAN()

Mode

Most frequently occurring number.

=MODE()

If you expect one mode

=MODE.SNGL()

If you expect multiple modes

=MODE.MULT()

Note

The mean is sensitive to outliers. An outlier is a data point that is far away from the rest of the data.

Measures of Variability#

Table 3.3 Measures of variability#

Function

Description

MS Excel Formula

Range

Difference between the maximum and minimum values.

=MAX() - MIN()

Variance

Average of the squared difference of each data point from the mean. Is equal to the standard deviation squared.

=VAR.S()

Standard Deviation

A measure of the amount of variation in a data set. Is equal to the square root of the variance.

=STDEV.S()

Note

A low standard deviation suggests that the data points are close to the mean while a high standard deviation suggests that the data points are more spread out.

Key Equations#

Average

(3.1)#\[\overline{x} = \frac{1}{n} \sum_{i=1}^{n} x_i\]

Variance of a Sample

(3.2)#\[\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n-1}\]

Standard Deviation of a Sample

(3.3)#\[\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \overline{x})^2}{n-1}}\]

where, \(n\) is the total number of data points, \(x_i\) represents each individual data point, \(\overline{x}\) is the average (mean) value of the sample, \(\sigma^2\) represents the variance of the sample, and \(\sigma\) represents the standard deviation of the sample.

Note

A population is defined as the entire group that you want to draw conclusions about. A sample is a subset of a population. Equations for the variance (\(\sigma^2\)) and standard deviation (\(\sigma\)) of a population are slightly different than those for a sample. When calculating these statistics for a population, we divide by \(n\) (the total number of data points) instead of \(n-1\).

(3.4)#\[\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}\]
(3.5)#\[\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}\]

where, \(\mu\) is the average (mean) value of the population.

Distributions#

Histograms#

Histograms can be used to visualize the distribution of a dataset. We will use the Data Analysis ToolPak in MS Excel to create histograms.

Installing the Data Analysis ToolPak#

Follow the steps below to install the Data Analysis ToolPak as an Add-in on your system.

  1. From the File menu select Options (bottom left).

  2. In the Options window, select Add-ins and then click on Go.

    ../../_images/windows_options_menu.png

    Fig. 3.1 The MS Excel options menu.#

  3. In the Add-ins menu, select Analysis ToolPak and then click OK.

    ../../_images/windows_add-ins_menu.png

    Fig. 3.2 The MS Excel Add-ins menu.#

  4. Now you can find the Analysis ToolPak under the Data tab in the Analysis section of the ribbon.

    ../../_images/windows_ribbon.png

    Fig. 3.3 The MS Excel ribbon with Analysis ToolPak installed.#

  1. From the Tools menu select Excel Add-ins.

    ../../_images/mac_options_menu.png

    Fig. 3.4 The MS Excel options menu.#

  2. In the Add-ins window, check the box next to Analysis ToolPak and then click on OK.

    ../../_images/mac_add-ins_menu.png

    Fig. 3.5 The MS Excel Add-ins menu.#

  3. Now you can launch the Analysis ToolPak by navigating to the Data tab and clicking on Data Analysis.

    ../../_images/mac_ribbon.png

    Fig. 3.6 The MS Excel ribbon with Analysis ToolPak installed.#

Creating a histogram using the Data Analysis ToolPak#

The example worksheet used in the above video can be found in HistogramExample_Landfillwaste.xlsx. A work through for making histograms with and without the Data Analysis ToolPak is provided in Excel_HistogramExample_FabricIgnition.xlsx.

Note

If a histogram displays continuous data, there should be no gap between your bins.

Skewness#

Skewness is a measure of a dataset’s symmetry (or lack of symmetry). MS Excel function is SKEW()

A perfectly symmetrical data set will have a skewness of 0 and is often known as a normal distribution. A skew less than zero (negative) has a long tail to the left. A skew greater than zero (positive) has a long tail to the right.

../../_images/mean_med_mod.png

Fig. 3.7 Relationship between mean, median, mode with the skewness.#

Note

Please note that the relationship between mean and median only serves as a guide for the skewness. If possible, you should always compute the skewness itself instead of relying on the relationship between mean and median. The skewness can be computed in MS Excel via the SKEW() function (which computes the adjusted Fisher-Pearson standardized moment coefficient). Read more about this here.

Kurtosis#

Kurtosis is a measure of the peakedness and flatness of a distribution. MS Excel function is KURT().

Distributions with negative kurtosis exhibit tail data exceeding the tails of normal distribution. Distributions with positive kurtosis exhibit tail data that is generally less extreme than the tails of the normal distribution.

../../_images/kurtosis.jpg

Fig. 3.8 Kurtosis Distribution.#

Note

Skewness differentiates extreme values in one versus the other tail, kurtosis measures extreme values in either tail.

Flowcharts#

Flowcharts are a visual representation of a process or algorithm, showing the steps as boxes of various kinds, and their order by connecting these boxes with arrows.

Click here to watch a video on how to create a basic flowchart in Miro.