
Ekonomia i Prawo. Economics and Law

Does pre-processing affect the correlation indicator between Twitter message volume and stock market trading volume?

Authors

  • Joanna Michalak Nicolaus Copernicus University in Toruń https://orcid.org/0000-0002-1061-401X

DOI:

https://doi.org/10.12775/EiP.2020.048

Keywords

Twitter sentiment analysis, behavioral economy, data mining

Abstract

Motivation: A growing number of authors empirically verify the relationship between the volume of tweets and stock market indicators. The patterns extracted from Twitter most often take the form of time series that represent users' activity at different levels of granularity (moods, emotions, relevant topics, or query-related messages). Sentiment analysis is a technique used to transform text data into information on mood and related behavioral categories. Supervised machine learning is the most commonly used approach to sentiment analysis. Thus, the results of an empirical analysis of the relationship between social media and the stock market depend on the quality of the classification results. The quality of the features used to train the classifier plays a key role. The feature space is modified using various data pre-processing scenarios that aim to increase classification accuracy. The impact of data pre-processing on classification quality is often discussed in studies, but very few authors discuss its impact on the correlation indicator between Twitter and the stock market.
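
As a rough illustration of what a tweet pre-processing scenario might involve, the sketch below lowercases the text, strips URLs and user mentions, and optionally marks negated words. The function name `preprocess_tweet`, the `handle_negation` flag, and the specific cleaning steps are assumptions made for this example; the abstract does not specify the scenarios actually used in the study.

```python
import re

# Hypothetical cleaning rules; not taken from the article.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NEGATION_RE = re.compile(r"\b(not|no|never)\s+(\w+)")


def preprocess_tweet(text, handle_negation=False):
    """Return a cleaned, lowercased version of a tweet.

    With handle_negation=True, a negation word and the word that
    follows it are fused into one token (e.g. "not like" -> "not_like"),
    so a bag-of-words classifier can tell negated terms apart.
    """
    text = text.lower()
    text = URL_RE.sub(" ", text)       # drop links
    text = MENTION_RE.sub(" ", text)   # drop @user mentions
    if handle_negation:
        text = NEGATION_RE.sub(r"not_\2", text)
    # Keep only lowercase letters, underscores, and whitespace.
    text = re.sub(r"[^a-z_\s]", " ", text)
    # Collapse repeated whitespace.
    return " ".join(text.split())
```

Running the same tweets through variants of such a function (for example with and without negation handling) yields different feature spaces, which is the kind of variation whose downstream effect the study examines.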

Aim: To analyze the impact of tweet pre-processing on the Pearson correlation indicator between the mood of Twitter users and stock market trading volume.
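
For readers unfamiliar with the indicator, the following is a minimal sketch of the Pearson correlation coefficient computed for two equal-length series, such as a daily tweet count and a daily trading volume. The function name `pearson_r` and its input layout are assumptions for illustration; the abstract does not describe how the study computed the coefficient.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations.
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)
```

Calling `pearson_r` once per pre-processing scenario, with the tweet series produced under that scenario, would give one coefficient per scenario to compare.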

Results: The correlation between the volume of stock market trading and the volume of tweets has been empirically confirmed. An effect of pre-processing on the correlation index was noted for the variables ‘all_tweets’ and ‘negative_tweets’. This is because the training set contains a significant number of tweets with negation. However, the results are not conclusive: the differences between the Pearson correlation index calculated for scenario one and scenario four are not significant. Nevertheless, they indicate that noisy data may reduce the quality and precision of conclusions, especially when a certain category of noise is repeated frequently.

Published

2020-12-31

How to Cite

MICHALAK, Joanna. Does pre-processing affect the correlation indicator between Twitter message volume and stock market trading volume? Ekonomia i Prawo. Economics and Law. Online. 31 December 2020. Vol. 19, no. 4, pp. 739-755. [Accessed 1 July 2025]. DOI 10.12775/EiP.2020.048.
Issue

Vol. 19 No. 4 (2020)

Section

Articles

Contact

Principal Contact
Piotr Wiśniewski
psw@umk.pl
Support Contact
Grzegorz Kopcewicz
Phone (56) 611 26 93
greg@umk.pl
