Know Well About Central Tendency

Dhaval Raval
4 min read · May 25, 2023
Photo by Matthew Hamilton on Unsplash

A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set. Measures of central tendency are also classed as summary statistics.

Once you receive your data, your first objective is to figure out where its center lies.

The mean, median and mode are all valid measures of central tendency, but under different conditions, some measures of central tendency become more appropriate to use than others. Let’s discuss the measures below.


Mean:- The mean is equal to the sum of all the values in the dataset divided by the number of values in the dataset.

So, if we have n values in a dataset and they have values x1, x2, …, xn, the sample mean, usually denoted by ‘x-bar’ (x̄), is:

x̄ = (x1 + x2 + … + xn) ÷ n

Example 1: From one class, we randomly pick 10 students with ranks 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, i.e., the first student got 1st rank, the second got 2nd rank, and so on.

So, mean = (Sum of all data points) ÷ (Number of data points) = 55 ÷ 10 = 5.5

Example 2: In a second class, we again have 10 students, but every student got a rank of 5.5, i.e., 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5.

So, calculating the average:- mean = 55 ÷ 10 = 5.5.

Question:- Which of the two means is more reliable or trustworthy?

If you were going to join one of these classes, which would you join: the class in Example 1 or the class in Example 2?

Solution:- In Example 1, you have an equal probability of getting any rank from 1 to 10, so the mean of 5.5 says little about where you will actually land.

In Example 2, there is proper consistency: every student gets a rank of 5.5, so the mean describes each student exactly.
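The comparison between the two classes can be checked with a short Python sketch. It uses `statistics.mean` from the standard library; the helper `value_range` is a hypothetical name introduced here only to show how spread out each class is, and it is not part of the original examples.

```python
from statistics import mean

# Ranks taken from Example 1 and Example 2 in the text.
class_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
class_2 = [5.5] * 10


def value_range(values):
    """Illustrative helper: largest value minus smallest value."""
    return max(values) - min(values)


for name, ranks in (("Class 1", class_1), ("Class 2", class_2)):
    # Both classes share the same mean, but their ranges differ.
    print(name, "mean:", mean(ranks), "range:", value_range(ranks))
```

Both classes have a mean of 5.5, yet the range is 9 for the first class and 0 for the second, which is why the mean on its own cannot tell them apart.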

Median:- The median is the middle score for a set of data that has been arranged in order of magnitude. The median is less affected by outliers and skewed data than the mean.

For example, suppose we have the data below:

65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92.

We first need to rearrange that data into order of magnitude.

14, 35, 45, 55, 55, 56, 56, 65, 87, 89, 92.

Our median mark is the middle mark, i.e., 56. It is the middle mark because there are 5 scores before it and 5 scores after it. This works fine when you have an odd number of scores.

But what happens when you have an even number of scores, say 10?

Well, you simply have to take the middle two scores and average the result.

For example, suppose we have the data below:

65, 55, 89, 56, 35, 14, 56, 55, 87, 45.

We first need to rearrange that data into order of magnitude.

14, 35, 45, 55, 55, 56, 56, 65, 87, 89.

Now we take the fifth and sixth scores in our ordered dataset and average them to get the median value.

i.e., (55 + 56) ÷ 2 = 111 ÷ 2 = 55.5.

So, the median is 55.5.

Note:- If the sequence has an odd number of values (example 1), the median is the center value of the ordered sequence.

If the sequence has an even number of values (example 2), the median is the average of the two center values.
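The odd and even cases can be sketched in Python as follows. This is a minimal illustration of the rule in the note, using the two datasets from the examples; the function and variable names are chosen here for illustration. In practice, the standard library's `statistics.median` does the same job.

```python
def median(values):
    """Return the middle value of the sorted data, or the average of
    the two middle values when the number of values is even."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd count: exactly one value sits in the center.
        return ordered[middle]
    # Even count: average the two values around the center.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
even_scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]

print(median(odd_scores))   # 56
print(median(even_scores))  # 55.5
```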

Mode:- The mode is the most frequent score in our dataset. On a histogram, it is represented by the highest bar.

For example, consider the data ‘red’, ‘green’, ‘red’, ‘orange’.

So, the mode will be ‘red’, because that is the most frequent value in the data.

[Figure: graphical example of the mode. Image source: Google Images]

However, one of the problems with the mode is that it is not always unique, which leaves us with a problem when two or more values share the highest frequency.
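A small Python sketch shows both situations. It uses `statistics.multimode` (available in the standard library since Python 3.8), which returns every value that shares the highest frequency. The second list is a hypothetical variation of the example data, with one extra ‘green’ added purely to create a tie.

```python
from statistics import multimode

# The example data from the text: 'red' appears most often.
colors = ["red", "green", "red", "orange"]
print(multimode(colors))  # ['red']

# Hypothetical variation: one more 'green' makes two values tie.
tied_colors = colors + ["green"]
print(multimode(tied_colors))  # ['red', 'green']
```

When the result contains more than one value, the data has no single mode, and you have to decide how to report it.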

Summary of when to use the mean, median and mode:-

Nominal -> Mode

Ordinal -> Median

Interval/Ratio (not skewed) -> Mean

Interval/Ratio (skewed) -> Median
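To see why the median is preferred for skewed interval/ratio data, here is a small Python sketch. It reuses the 11 scores from the median example and adds one extreme value; that outlier of 1000 is a hypothetical number chosen only for illustration.

```python
from statistics import mean, median

# The 11 scores from the median example.
scores = [14, 35, 45, 55, 55, 56, 56, 65, 87, 89, 92]

# Hypothetical outlier added to skew the data.
skewed = scores + [1000]

# Without the outlier: mean = 59, median = 56.
print(mean(scores), median(scores))

# With the outlier: the mean jumps to about 137.4,
# while the median stays at 56.
print(mean(skewed), median(skewed))
```

A single extreme value more than doubles the mean but leaves the median unchanged, so the median remains a better description of the typical score.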

Conclusion:-

Measures of central tendency are important statistical tools used to understand the typical or central value of a dataset.

By understanding and utilizing measures of central tendency, researchers can gain insights into the central value of the data and make informed decisions based on the characteristics of the dataset.

However, it’s important to also consider measures of dispersion and the overall shape of the distribution for a more comprehensive understanding of the data.
