In silico and in vivo analysis of nucleotide diversity in Coffea spp. : [Preprint]

Single nucleotide polymorphisms (SNPs) are the most abundant polymorphisms in the genomes analyzed to date. Owing to the large amount of sequence data available, they are becoming the main choice of molecular markers for breeding, genotyping, and diagnostic purposes. Identifying these polymorphisms will provide useful markers for genetic mapping, population genetics, and association studies. It will also provide criteria for inferring the evolutionary history of the analyzed genes, which can help select the best candidate genes to test in future association studies. The objectives of this work were therefore to: 1) identify and validate, both in silico and in vivo, the SNPs and indels present in EST resources; and 2) analyze nucleotide diversity in Coffea spp. and in selected C. arabica cultivars. A pipeline for identifying SNPs and indels was built using sequences from the Brazilian ESTs Coffee Genome Project as well as other Coffea sequences available in GenBank. The pipeline used a haplotype-based strategy to detect reliable SNPs in the 23,019 assembled contigs. A total of 23,062 SNPs and 2,165 indels were identified in 5,184 contigs assembled from more than four ESTs. The haplotype-based strategy made it possible to define the probable ancestor of C. arabica transcripts. The majority of ESTs from C. arabica came from only two different alleles, providing molecular evidence about C. arabica speciation. According to our analysis, approximately 55% of C. arabica sequences were derived from C. eugenioides and 45% were considered to come from C. canephora. Interestingly, C. eugenioides contributes mostly genes related to basal and secondary metabolism, while C. canephora genes are involved in signal transduction and regulation of gene expression.
The in vivo analyses are being performed by sequencing PCR fragments of several genes in 24 Coffea genotypes: 12 C. arabica, 9 C. canephora, and 3 Coffea spp. The C. arabica and C. canephora genotypes were chosen to represent the largest possible diversity. Sequencing results for the Sucrose Phosphate Synthase gene in these genotypes reveal SNPs mainly in interspecific comparisons. A higher number of intraspecific SNPs was also observed for C. canephora. The intraspecific SNPs found in C. arabica were the same as those observed in the two ancestral genomes, C. canephora and C. eugenioides.
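
The abstract does not give the algorithmic details of the in silico pipeline, so the following is only a minimal sketch of how variant detection on an EST contig could look. It is not the authors' method: it swaps in a simple per-column minimum-support filter and then groups reads by the alleles they carry at the called columns. The input format (equal-length aligned reads, `-` for a gap, `N` for an unknown base), the function names, and the `MIN_SUPPORT` threshold are assumptions made for illustration; only the "more than four ESTs" cutoff comes from the abstract.

```python
from collections import Counter

MIN_ESTS = 5     # "more than four ESTs" per contig, as stated in the abstract
MIN_SUPPORT = 2  # assumed: an allele must appear in at least 2 reads


def call_variants(reads, min_ests=MIN_ESTS, min_support=MIN_SUPPORT):
    """Return [(column, kind, alleles)] for one contig's aligned reads.

    reads: equal-length strings over A/C/G/T, with '-' for a gap and
    'N' for an unknown base (hypothetical input format).
    """
    if len(reads) < min_ests:
        return []  # contig too shallow to call variants reliably
    variants = []
    for col in range(len(reads[0])):
        counts = Counter(r[col] for r in reads if r[col] != "N")
        # Keep only alleles seen often enough to rule out single-read errors.
        alleles = sorted(a for a, n in counts.items() if n >= min_support)
        if len(alleles) >= 2:
            kind = "INDEL" if "-" in alleles else "SNP"
            variants.append((col, kind, alleles))
    return variants


def haplotypes(reads, variants):
    """Count reads per haplotype, i.e. per allele string at variant columns."""
    cols = [col for col, _, _ in variants]
    return Counter("".join(r[c] for c in cols) for r in reads)
```

For example, `call_variants(["ACGT", "ACGT", "ACGT", "ATGT", "ATGT"])` reports one SNP at column 1 with alleles `C` and `T`, and `haplotypes` then splits the five reads into two groups. A real pipeline would also need to distinguish alignment gaps from read ends and to account for base quality, which this sketch ignores.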

Bibliographic Details
Main Authors: Vidal, Ramon, Yanagui, Karina, Ferreira, Lucia Pires, Lannes, Sérgio Dias, Vieira, Luiz Gonzaga Esteves, Mondego, Jorge M.C., Carazzolle, Marcelo Falsarella, Pereira, Gonçalo Amarante Guimarães, Pot, David, Pereira, Luiz Filipe P.
Format: conference_item biblioteca
Language:por
Published: s.n.
Subjects: F30 - Plant genetics and breeding (Génétique et amélioration des plantes), Coffea, http://aims.fao.org/aos/agrovoc/c_1720, http://aims.fao.org/aos/agrovoc/c_1070
Online Access:http://agritrop.cirad.fr/551305/
http://agritrop.cirad.fr/551305/1/document_551305.pdf