Extraction of the Essential Constituents of the S&P 500 Index

Ray R. Hashemi, Georgia Southern University
Omid Ardakani, Georgia Southern University
Azita Bahrami, IT Consultation Company
Jeffery A. Young

© Copyright 2020 IEEE - All rights reserved.

Abstract

The S&P 500 index, a leading indicator of the stock market and U.S. equities, is highly influenced by its essential constituents. Traditionally, such constituents are identified by the market capitalization weighting scheme. However, the literature rejects the efficiency of this weighting method. As alternatives, we introduce two data mining approaches, entropy and rough sets, as separate methods for extracting the essential S&P 500 constituents. The legitimacy of the findings, in comparison with the S&P 500 weighting scheme, has been investigated using discrete-time Markov Chain Models (MCM) and Hidden Markov Chain Models (HMCM), which lend themselves easily to the nature of time-series data. The investigation uses data for the full sample and for pre- and post-crisis subsamples, collected over a period of 16 years. We find that the entropy method provides the highest forecasting accuracy for the full sample and the post-crisis subsample.
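To make the discrete-time Markov chain idea concrete, the following minimal sketch estimates a first-order transition matrix from a short series of returns. It is an illustration only and not the authors' procedure: the two-state up/down discretization, the function names `discretize` and `transition_matrix`, and the sample returns are all assumptions introduced here.

```python
from collections import Counter


def discretize(returns):
    """Map each return to a state: 'U' (up) if positive, otherwise 'D' (down).

    The two-state coding is an assumption made for this illustration.
    """
    return ["U" if r > 0 else "D" for r in returns]


def transition_matrix(states):
    """Estimate first-order transition probabilities by counting transitions.

    Each row is the count of observed moves out of a state, normalized so
    that the row sums to 1 (the maximum-likelihood estimate).
    """
    labels = sorted(set(states))
    counts = Counter(zip(states, states[1:]))  # consecutive state pairs
    matrix = {}
    for a in labels:
        row_total = sum(counts[(a, b)] for b in labels)
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0)
            for b in labels
        }
    return matrix


# Made-up daily returns (percent), purely illustrative.
returns = [0.4, -0.2, 0.1, 0.3, -0.5, -0.1, 0.2]
states = discretize(returns)
P = transition_matrix(states)

for state, row in P.items():
    print(state, row)
```

Each row of `P` gives the estimated probability of the next state given the current one, which is the basic quantity a Markov chain model would use to forecast the next movement.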