A fast algorithm to "de novo" genome wide tandem repeats discovery.

Tandem repeats (TRs) are sequences in which the same pattern repeats consecutively. They have been used as genomic markers (microsatellites and minisatellites) since the beginning of the genomic era. Recently, new studies have associated TRs with important regulatory processes, which has substantially increased interest in them. The exponential reduction in sequencing cost brought about by new technologies has resulted in a proliferation of genome projects, particularly for novel model organisms. Very often, the first sequence analysis is the identification of genetic markers such as SNPs and TRs. As the former are a by-product of the assembly phase, the real challenge resides in the latter, since TR identification must be done de novo. This scenario requires faster and more efficient algorithms for de novo TR discovery. In this paper, we propose a new strategy to address this problem. Our algorithm is able to handle large genomes in reduced computational time (on average 30% to 50% faster than the other approaches). Furthermore, our algorithm finds all TRs in a genome, whereas some popular algorithms do not, as will be shown. Consequently, because our algorithm is faster and finds all TRs, it can be applied both to new genomes and to previously analyzed genomes to discover TRs that may have been missed.

Bibliographic Details
Main Authors: NARCISO, M., YAMAGISHI, M.
Other Authors: MARCELO GONCALVES NARCISO, CNPAF; MICHEL EDUARDO BELEZA YAMAGISHI, CNPTIA.
Format: Conference proceedings (Anais e Proceedings de eventos)
Language: English
Published: 2011-12-06T11:11:11Z
Subjects: Algoritmo, Marcadores genéticos, Polimorfismo de nucleotídeo único, Repetições em Tandem, Algorithms, Microsatellite repeats, Genetic markers, Tandem repeat sequences, Single nucleotide polymorphism
Online Access: http://www.alice.cnptia.embrapa.br/alice/handle/doc/908712