Computer Science: Faculty Publications
COGRAM: A Computational Pipeline for Genome Assembly and Reconstruction via Optimized K-mer Sampling and De Bruijn Graph Networks
Document Type
Article
Publication Date
1-26-2026
Publication Title
Social Networks Analysis and Mining - 17th International Conference, ASONAM 2025, Proceedings
DOI
10.1007/978-3-032-13513-1_31
ISBN
9783032135124
Abstract
The accuracy of genome assembly and annotation depends fundamentally on well-chosen parameters and robust computational approaches. Here we introduce COGRAM (Coggins-Ramasamy Genomic Assembly Method), a novel bioinformatics pipeline that enhances genome assembly and reconstruction by optimizing k-mer parameters, leveraging graph theory, and incorporating machine learning techniques. COGRAM first identifies the optimal k-mer length using methods inspired by KMERGENIE and grid search techniques, then randomly samples the genome at that optimal resolution. It then analyzes the k-mer frequency and GC-content distributions across the sampled genome windows. Subsequently, the pipeline constructs a detailed de Bruijn framework graph from the parsed genomic data. Using this graph, COGRAM trains a network to model genomic structures effectively, enhancing accuracy and scalability. Genome reconstruction is accomplished through rigorous cross-validation with a greedy algorithm designed to iteratively refine the quality of the assembly. We demonstrate the effectiveness of COGRAM through benchmark tests on the E. coli genome. This pipeline represents a powerful tool for genomic projects, with potential for expansion to other projects.
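As a rough illustration of three building blocks the abstract names (k-mer frequency counting, GC-content per window, and de Bruijn graph construction), the following is a minimal Python sketch. It is not the authors' implementation: the function names `kmer_counts`, `gc_content`, and `de_bruijn_edges` are hypothetical, the input is assumed to be a plain uppercase DNA string, and k-mer length optimization, network training, and greedy reconstruction are not shown.

```python
from collections import Counter, defaultdict


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq (hypothetical helper)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def gc_content(window):
    """Return the fraction of G and C bases in a window (0.0 if empty)."""
    if not window:
        return 0.0
    return sum(base in "GC" for base in window) / len(window)


def de_bruijn_edges(seq, k):
    """Build a de Bruijn adjacency map from the k-mers of seq.

    Nodes are (k-1)-mers; each k-mer contributes one edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix. Repeated k-mers yield
    repeated edges, so edge multiplicity is preserved.
    """
    graph = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    read = "ATGGCGT"  # toy sequence, assumed for illustration only
    print(kmer_counts(read, 3))
    print(gc_content(read))
    print(de_bruijn_edges(read, 3))
```

Storing edges as lists rather than sets keeps k-mer multiplicity, which is one common choice when a later traversal step needs coverage information; whether COGRAM does this is not stated in the abstract.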
Recommended Citation
Coggins, William, Vijayalakshmi Ramasamy.
2026.
"COGRAM: A Computational Pipeline for Genome Assembly and Reconstruction via Optimized K-mer Sampling and De Bruijn Graph Networks."
Social Networks Analysis and Mining - 17th International Conference, ASONAM 2025, Proceedings: 391-412.
doi: 10.1007/978-3-032-13513-1_31 source: https://link.springer.com/chapter/10.1007/978-3-032-13513-1_31 isbn: 9783032135124
https://digitalcommons.georgiasouthern.edu/compsci-facpubs/323
Copyright
This work is archived and distributed under the repository's Standard Copyright and Reuse License. End users may copy, store, and distribute this work without restriction. For all other uses, permission must be obtained from the copyright owners or their authorized agents.