Stacking of Machine Learning Classifiers for Bot Detection using Account Level Data

Jwala Sharma, Samarjeet Borah

Abstract


This research addresses the challenge of identifying social media bots (SMBs), which can rapidly disseminate information or misinformation on platforms such as Twitter. It contributes to the field by reviewing the literature to define bot behaviours and by exploring machine learning classifiers for bot detection using account-level data. The study used Spearman's rank correlation coefficient to select relevant features for SMB classification, then trained six machine learning models, including Decision Tree, Random Forest, Logistic Regression, Support Vector Machine, and K-Nearest Neighbour. To further improve accuracy, a classifier stacking technique was applied. Individual classifiers performed variably, with Random Forest leading at 89% accuracy, while the stacked classifier outperformed every single-classifier method with 90% accuracy. The results underscore the potential of combining multiple classifiers to improve social media bot detection.
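The feature-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`followers_count`, `statuses_count`, `default_profile`), the toy values, and the 0.3 cut-off on the absolute correlation are all assumptions made for illustration, since the abstract does not state which features or threshold were used.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no rank information
    return cov / (var_x * var_y) ** 0.5


def select_features(columns, labels, threshold=0.3):
    """Keep features whose |rho| with the label meets the assumed threshold."""
    scores = {name: spearman(col, labels) for name, col in columns.items()}
    kept = [name for name, rho in scores.items() if abs(rho) >= threshold]
    return kept, scores


# Toy, made-up account-level data (1 = bot, 0 = human); not from the paper.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "followers_count": [900, 750, 820, 600, 40, 15, 60, 5],
    "statuses_count": [120, 300, 80, 500, 9000, 7000, 12000, 6500],
    "default_profile": [0, 1, 0, 1, 0, 1, 0, 1],
}
selected, scores = select_features(columns, labels)
print(selected)
```

On this toy data the two count features are kept and `default_profile` is dropped, because its ranks are uncorrelated with the label. In a fuller pipeline the retained columns would then feed the base classifiers and a stacking meta-classifier (for example, scikit-learn's `StackingClassifier`), which this sketch does not cover.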

Keywords


Social Media Bots, Feature Selection, Machine Learning, Classification, Stacking Classifier

DOI: http://doi.org/10.11591/ijict.v15i2.pp477-487


Copyright (c) 2026 Institute of Advanced Engineering and Science

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The International Journal of Informatics and Communication Technology (IJ-ICT)
p-ISSN 2252-8776, e-ISSN 2722-2616
This journal is published by Intelektual Pustaka Media Utama (IPMU).
