A pattern-driven Huffman encoding and positional encoding for DNA compression
Keywords:
Compression Ratio, Deoxyribonucleic Acid, Huffman Coding, Positional Encoding Technique
License
Copyright (c) 2025 The Scientific Temper

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Abstract
Researchers from bioinformatics, biology, biotechnology, and medical sciences who are engaged in genetic data analysis face significant challenges in the manipulation and storage of large datasets. Compression algorithms are essential for increasing storage capacity and reducing the number of bits required to represent nucleotide bases. The Pattern-driven Huffman Encoding and Positional Encoding for DNA Compression (P2DNAComp) algorithm is designed to compress both non-repetitive and repetitive pattern bases within DNA sequences. This demonstrates the algorithm’s adaptability across various pattern types in genomic data. P2DNAComp employs a systematic approach to efficiently compress DNA sequences. It reads the sequences and constructs a symbol table to maintain the positional values of repeated patterns. Using Huffman coding, the algorithm determines the optimal bit representation for each repeated pattern to maximize storage efficiency. For non-repetitive patterns, a coded table is created to store positional values. Subsequently, a positional encoding technique is applied to minimize the number of bits needed for efficient representation. The maximum positional value is set as the upper limit, and the minimum number of bits required is computed using a binary logarithm function. The final compressed sequence is generated by encoding both repetitive and non-repetitive patterns. Using standard datasets from the GenBank database, the performance of the P2DNAComp algorithm was evaluated based on compression ratio, compression/decompression time, and compression gain. The algorithm achieved an average compression ratio of 1.09 bits per base (bpb), an average compression gain of 86.279%, and average compression and decompression times of 0.547 and 0.563 seconds, respectively.
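The abstract names two building blocks without giving implementation details: Huffman codes for repeated patterns, and a fixed bit width for positional values derived from the maximum position via a binary logarithm. The Python sketch below illustrates those two ideas in isolation. It is not the P2DNAComp implementation; the function names, the fixed 4-base pattern length, and the sample sequence are all assumptions made for illustration.

```python
import heapq
from collections import Counter
from itertools import count


def position_bit_width(max_position: int) -> int:
    """Fewest bits that can store every value in 0..max_position.

    Equivalent to ceil(log2(max_position + 1)), with a floor of 1 bit.
    int.bit_length() is used to avoid floating-point rounding.
    """
    if max_position < 0:
        raise ValueError("max_position must be non-negative")
    return max(1, max_position.bit_length())


def huffman_codes(frequencies):
    """Build a prefix-free code {symbol: bitstring} from symbol frequencies."""
    if not frequencies:
        return {}
    if len(frequencies) == 1:
        # A single symbol still needs one bit to be representable.
        return {next(iter(frequencies)): "0"}
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(freq, next(tiebreak), {sym: ""}) for sym, freq in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


if __name__ == "__main__":
    # Hypothetical input: non-overlapping 4-base chunks stand in for "patterns".
    sequence = "ACGTACGTACGTTTGA"
    patterns = Counter(sequence[i:i + 4] for i in range(0, len(sequence), 4))
    print(huffman_codes(patterns))
    print(position_bit_width(len(sequence) - 1))
```

More frequent patterns receive shorter codes, while every stored position costs the same number of bits, which is why the abstract sets the maximum positional value as the upper limit before computing the width.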

