Enhancing classification accuracy on code-mixed and imbalanced data using an adaptive deep autoencoder and XGBoost

Published

29-07-2024

DOI:

https://doi.org/10.58414/SCIENTIFICTEMPER.2024.15.3.27

Keywords:

Sentiment analysis, Deep learning, Code-mixing, Autoencoder, Imbalance classification.

Section

SECTION C: ARTIFICIAL INTELLIGENCE, ENGINEERING, TECHNOLOGY

Authors

  • Ayesha Shakith, Department of Computer Science, St. Joseph’s College (Autonomous), Affiliated to Bharathidasan University, Trichy, India.
  • L. Arockiam, Department of Computer Science, St. Joseph’s College (Autonomous), Affiliated to Bharathidasan University, Trichy, India.

Abstract

This study proposes an approach for improving classification accuracy on code-mixed and imbalanced data by integrating an adaptive deep autoencoder with dynamic sampling techniques. To address the challenges of sentiment analysis on such datasets, the method uses an enhanced XGBoost classifier optimized to exploit the features extracted by the autoencoder. Experimental evaluation across several datasets, predominantly Tamil-English code-mixed texts, shows improved performance: accuracy of 84.2%, precision of 74.8%, recall of 78.4%, and an F1-score of 76.6%. This is an improvement of 0.5% to 1.5% over existing methods, indicating that the model handles linguistic diversity and class imbalance effectively. The novelty of this research lies in integrating dynamic sampling within the autoencoder's training loop, which improves the adaptability and effectiveness of the model in real-world applications.
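
As a quick consistency check on the reported metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. The short sketch below is illustrative only and is not taken from the paper; the function name `f1_from_precision_recall` is hypothetical, and it assumes the reported precision and recall refer to the same averaging scheme.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (fractions in [0, 1])."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, expressed as fractions.
reported_precision = 0.748
reported_recall = 0.784

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 = {f1:.1%}")  # prints "F1 = 76.6%"
```

The recomputed value agrees with the reported F1-score of 76.6% to one decimal place.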

How to Cite

Ayesha Shakith, & L. Arockiam. (2024). Enhancing classification accuracy on code-mixed and imbalanced data using an adaptive deep autoencoder and XGBoost. The Scientific Temper, 15(03), 2598–2608. https://doi.org/10.58414/SCIENTIFICTEMPER.2024.15.3.27
