6 CPI Drawbacks Data Scientists Must Know




Consumer price index (CPI) measures the changes in the composite price of a specified “basket” of goods and services during the given period as compared with base period. The so-called “basket” would comprise various commodities consumed and services received in the base period or the given period, grouped under the main headings:

  • Food and beverages
  • Clothing and footwear
  • Fuel and lighting
  • Housing
  • Services
  • Miscellaneous

It is customary to exclude the durable goods and non-consumption monetary transactions such as the contribution to Provident Fund, Savings Certificates, etc. The quantities consumed or the expenditures incurred on various groups are used as weights for the average retail prices prevailing in the locality concerned during the base and given periods.

In computational finance and retail industry, data scientists frequently use CPI data in statistical and machine learning models. Therefore, understanding drawbacks of CPI are important for data scientists, business analysts and machine learning experts. Following are some of key drawbacks of CPI numbers:

  1. It is practically difficult to clearly demarcate one category of people from another.

  2. As the construction of consumer price indices involves the sampling of goods and services, the sampling errors and biases may affect indices and render them to suspect. Moreover, the frames used for household consumption inquiry may be incomplete and outdated.

  3. In case of certain goods, it is difficult to collect prices actually needed. For example, the prices for clothing usually relate to cloth and not to tailored clothes.

  4. It is also difficult to eliminate the effect of changes in quality and grade of goods and services purchased by households.

  5. During the course of household budget inquiry, the price of goods and services and their demand may change as, some commodities may change in quality, other may disappear and some new goods may enter the market.

  6. The consumer price indices cannot be used for comparing the price changes in consumption goods and services in two localities or in two households in the same locality as no two households can be homogeneous, i.e. they can neither have precisely the same pattern of consumptions nor precisely the same basket of goods and services.

Got A Data Science Question?

Ask our experts anything about machine learning, analytics or statistics.