Roberta-LightGBM: A hybrid model of deep fake detection with pre-trained and binary classification

Rajkumar V
Priyadharshini G


In May 2023, a fake image of an explosion near the Pentagon gained widespread traction on social media. It briefly dragged down US markets, perhaps marking the first time an artificial intelligence (AI)-generated image has affected the market. The fictitious image originally surfaced on Facebook and showed a large column of smoke that a Facebook user said was close to the US military headquarters in Virginia. In this research, we propose the Roberta-LightGBM framework, which combines the RoBERTa language model with LightGBM. The aim is to reduce tampered and fake content in media with good accuracy and fast processing by pairing a natural language model with a machine learning algorithm. RoBERTa, a pre-trained NLP model, lets us train on large datasets in minimal time compared to traditional techniques such as BERT, which requires ten times larger datasets to be trained for a wide range of applications. LightGBM, a machine learning algorithm built on decision trees, performs binary classification to predict whether the retrieved data is real or fake. It trains faster on large datasets and uses less memory while maintaining high accuracy. In our analysis, the proposed framework achieves the goal of this research when compared with alternative techniques such as XGBoost: the Roberta-LightGBM technique reaches 95.36% accuracy with an overall computational time of 4.4 seconds, and the RoBERTa implementation reaches 92.17% efficiency, as shown experimentally in this paper.

How to Cite
V, R., & G, Priyadharshini. (2024). Roberta-LightGBM: A hybrid model of deep fake detection with pre-trained and binary classification. INFOCOMP Journal of Computer Science, 23(1). Retrieved from
Machine Learning and Computational Intelligence

