A general stochastic model to handle deduplication challenges using hidden Markov model in big data analytics

Published

27-12-2023

DOI:

https://doi.org/10.58414/SCIENTIFICTEMPER.2023.14.4.50

Keywords:

Hidden Markov model, Markov chain transition, Likelihood estimation, Poisson distribution.

Issue

Section

SECTION C: ARTIFICIAL INTELLIGENCE, ENGINEERING, TECHNOLOGY

Authors

  • Sahaya Jenitha A Department of Computer Science, Cauvery College for Women, Bharathidasan University, Tiruchirappalli, Tamil Nadu, India.
  • Sinthu J. Prakash Department of Computer Science, Cauvery College for Women, Bharathidasan University, Tiruchirappalli, Tamil Nadu, India.

Abstract

Background: Growing consumer interest has made cloud computing necessary for storing and accessing data conveniently, and cloud platforms now offer many services over the internet. Data duplication is one of the main challenges in big data analytics because it increases both storage requirements and processing time. A data deduplication process is therefore needed to eliminate redundant copies of data and reduce storage space. To preserve accurate data without duplication, a joint probability distribution is used to compute the likelihood of two events occurring together, which allows redundant data to be removed before the data is sent to the cloud server.
Methods: This paper presents a general stochastic model (GSM) algorithm that uses a hidden Markov model, likelihood estimation, Markov chain transition, and a Poisson distribution model.
Findings: The joint probability distribution computes the likelihood of two events occurring together, which allows redundant data to be removed before the data is sent to the cloud server.
Novelty and applications: This paper proposes the general stochastic model (GSM), which handles redundant data through a multi-level process using a hidden Markov model (HMM), likelihood estimation, transition probability, and a Poisson distribution model (PDM).
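
Illustrative sketch: the abstract names its building blocks but does not give the GSM algorithm itself, so the following is only a generic, textbook-style illustration of two of them, not the authors' method. It shows the HMM forward algorithm, which computes the likelihood of an observation sequence from start, transition, and emission probabilities, and the Poisson probability mass function. The state names (`unique`, `duplicate`), the observation symbols (`new_hash`, `seen_hash`), and all probability values are hypothetical choices made for this example.

```python
import math


def poisson_pmf(k, lam):
    """P(K = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def forward_likelihood(obs, start, trans, emit):
    """Likelihood of an observation sequence under an HMM (forward algorithm)."""
    states = list(start)
    # Initialisation: probability of starting in each state and emitting obs[0].
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    # Recursion: propagate through the Markov chain transition probabilities.
    for o in obs[1:]:
        alpha = {
            s: emit[s][o] * sum(alpha[p] * trans[p][s] for p in states)
            for s in states
        }
    # Termination: sum over all possible final hidden states.
    return sum(alpha.values())


# Hypothetical parameters, chosen only for illustration.
start = {"unique": 0.7, "duplicate": 0.3}
trans = {
    "unique": {"unique": 0.8, "duplicate": 0.2},
    "duplicate": {"unique": 0.4, "duplicate": 0.6},
}
emit = {
    "unique": {"new_hash": 0.9, "seen_hash": 0.1},
    "duplicate": {"new_hash": 0.2, "seen_hash": 0.8},
}

# Likelihood of seeing one new fingerprint followed by two repeated ones.
likelihood = forward_likelihood(
    ["new_hash", "seen_hash", "seen_hash"], start, trans, emit
)

# Probability of exactly 2 duplicate chunks in a batch,
# assuming (hypothetically) an average of 1.5 duplicates per batch.
p_two_duplicates = poisson_pmf(2, 1.5)

print(likelihood, p_two_duplicates)
```

How these quantities are combined across the levels of the GSM, and how the parameters are estimated, is described in the full paper and is not reproduced here.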

How to Cite

A, S. J., & Prakash, S. J. (2023). A general stochastic model to handle deduplication challenges using hidden Markov model in big data analytics. The Scientific Temper, 14(04), 1398–1403. https://doi.org/10.58414/SCIENTIFICTEMPER.2023.14.4.50
