Once we have collected data about a business process, the next step is to describe it properly so that it can be communicated effectively. In statistics, we often use descriptive statistics to perform this task: central tendency, dispersion and shape.
In this article we will look specifically at one of these descriptive statistics: central tendency.
What is central tendency?
Central tendency is a single-value measure that represents the central position of a probability distribution. Imagine we have collected data about the heights of 5th grade students in the U.S.A., which may consist of thousands of data points. We would like a single representative number for the height of an average 5th grader - that number is the central tendency. In statistics, there are three common ways to measure the central tendency of a distribution: mean, median and mode.
The mean, also referred to as the arithmetic mean, is calculated by adding all data points and dividing by the number of data points. For example, the mean of 4, 3, 6, 7 and 5 is (4+3+6+7+5)/5 = 25/5 = 5. The mean is sensitive to outliers - data points that lie far from the rest of the data.
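As a minimal sketch, the calculation above can be reproduced with Python's standard statistics module. The extra value 100 is a hypothetical outlier added only to illustrate the sensitivity; it is not part of the original example.

```python
from statistics import mean

# Data points from the example above.
values = [4, 3, 6, 7, 5]
base_mean = mean(values)
print(base_mean)  # 5

# Hypothetical outlier (assumption, for illustration only).
with_outlier = values + [100]
outlier_mean = mean(with_outlier)
print(outlier_mean)  # 125/6, about 20.83
```

A single distant value moves the mean from 5 to roughly 20.83, even though five of the six data points are still between 3 and 7.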
The median is found by ordering all data points and picking out the one in the middle. If there are two middle numbers, the median is the mean of those two numbers. For example, the median of 222, 111, and 444 is 222 because when we order these numbers, 222 is in the middle. The median is preferred over the mean for data with outliers because it is less sensitive to them.
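A short sketch of the median using Python's standard statistics module follows. The four-value list and the list containing 100 are hypothetical data sets, assumed here only to show the even-count case and the behaviour with an outlier.

```python
from statistics import median

# Odd number of data points: the middle value after sorting.
odd_median = median([222, 111, 444])
print(odd_median)  # 222

# Even number of data points (hypothetical data): mean of the two middle values.
even_median = median([111, 222, 333, 444])
print(even_median)  # 277.5

# Hypothetical data with one outlier (100): the median barely moves.
outlier_median = median([4, 3, 6, 7, 5, 100])
print(outlier_median)  # 5.5
```

For the last list the median is 5.5, which stays close to the bulk of the data even though one value is far away.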
The mode is the most frequent value in a set of data points. For example, the mode of 444, 222, 444, 333, 222, 222 is 222 because 222 occurs three times, which is more than any other number's frequency. A data set with two modes is called bimodal and one with three modes is called trimodal; larger data sets can have even more modes. Although the mode is also not sensitive to outliers, it is not used as frequently as the median.
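The mode can be sketched in the same way with Python's standard statistics module (multimode requires Python 3.8 or later). The second list is a hypothetical bimodal data set, assumed only for illustration.

```python
from statistics import mode, multimode

# Data points from the example above: 222 occurs three times.
data = [444, 222, 444, 333, 222, 222]
single_mode = mode(data)
print(single_mode)  # 222

# Hypothetical bimodal data: 1 and 2 each occur twice.
bimodal = [1, 1, 2, 2, 3]
all_modes = multimode(bimodal)
print(all_modes)  # [1, 2]
```

multimode returns every value tied for the highest frequency, so the length of its result tells us whether the data is unimodal, bimodal, or has more modes.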