Segmentation of DNA into coding and noncoding regions based on recursive entropic segmentation and stop-codon statistics (Q1773696)
scientific article; zbMATH DE number 2163808
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Segmentation of DNA into coding and noncoding regions based on recursive entropic segmentation and stop-codon statistics | scientific article; zbMATH DE number 2163808 | |
Statements
Segmentation of DNA into coding and noncoding regions based on recursive entropic segmentation and stop-codon statistics (English)
3 May 2005
Summary: Heterogeneous DNA sequences can be partitioned into homogeneous domains that are composed of the four nucleotides A, C, G, and T and the stop codons. We recursively apply a new entropic segmentation method to DNA sequences, using Jensen-Shannon and Jensen-Rényi divergences to find the borders between coding and noncoding DNA regions. We have chosen 12- and 18-symbol alphabets that capture (i) the differential nucleotide composition in codons, and (ii) the differential stop-codon composition along all three phases in both strands of the DNA. The new segmentation method is based on the Jensen-Rényi divergence measure, nucleotide statistics, and stop-codon statistics in both DNA strands. The recursive segmentation process requires no prior training on known datasets. Consequently, for three entire bacterial genomes, we find that the use of nucleotide composition, stop-codon composition, and Jensen-Rényi divergence improves the accuracy of finding the borders between coding and noncoding regions in DNA sequences.
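The recursive idea in the summary can be illustrated with a minimal sketch. It is not the article's implementation: it uses the plain 4-letter nucleotide alphabet rather than the 12- or 18-symbol alphabets, the Jensen-Shannon divergence rather than the Jensen-Rényi divergence, and a fixed divergence threshold as the stopping rule. The names `best_split` and `segment` and the parameters `threshold` and `min_len` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(counts, total):
    """Shannon entropy in bits of a table of symbol counts."""
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log2(p)
    return h


def best_split(seq, min_len=10):
    """Return (position, divergence) of the cut with the largest weighted
    Jensen-Shannon divergence between the left and right compositions.
    Returns (None, 0.0) if no cut has positive divergence."""
    n = len(seq)
    total = Counter(seq)
    h_all = shannon_entropy(total, n)
    left = Counter()
    best_pos, best_js = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        # Assumption: both sides must keep at least min_len symbols.
        if i < min_len or n - i < min_len:
            continue
        right = total - left  # Counter subtraction keeps positive counts only
        js = (h_all
              - (i / n) * shannon_entropy(left, i)
              - ((n - i) / n) * shannon_entropy(right, n - i))
        if js > best_js:
            best_pos, best_js = i, js
    return best_pos, best_js


def segment(seq, threshold=0.1, min_len=10, offset=0):
    """Recursively cut seq wherever the best divergence reaches threshold.
    The fixed threshold is an illustrative stopping rule, not the
    criterion used in the article. Returns sorted border positions."""
    pos, js = best_split(seq, min_len)
    if pos is None or js < threshold:
        return []
    return (segment(seq[:pos], threshold, min_len, offset)
            + [offset + pos]
            + segment(seq[pos:], threshold, min_len, offset + pos))
```

For example, `segment("A" * 50 + "G" * 50)` places a single border at position 50, while a compositionally uniform sequence such as `"ACGT" * 25` is left unsplit under the default threshold.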
information divergence measures
Bayesian information criterion
0.7351032495498657