Sting_RDB: a relational database of structural parameters for protein analysis with support for data warehousing and data mining.

Abstract. An effective strategy for managing protein databases is to provide mechanisms that transform raw data into consistent, accurate and reliable information. Such mechanisms will greatly reduce operational inefficiencies and improve one's ability to handle scientific objectives and interpret research results. To achieve this challenging goal for the STING project, we introduce Sting_RDB, a relational database of structural parameters for protein analysis with support for data warehousing and data mining. In this article, we highlight the main features of Sting_RDB and show how a user can explore it with efficient and biologically relevant queries. Considering its importance for molecular biologists, effort has been made to advance Sting_RDB toward data quality assessment. To the best of our knowledge, Sting_RDB is one of the most comprehensive data repositories for protein analysis, now also capable of providing its users with a data quality indicator. This paper differs from our previous study in several respects. First, we introduce Sting_RDB, a relational database with mechanisms for efficient and relevant queries using SQL. Sting_RDB evolved from the earlier text-based (flat-file) database, in which data consistency and integrity were not guaranteed. Second, we provide support for data warehousing and mining. Third, we introduce a data quality indicator. Finally, and probably most importantly, complex queries that could not be posed on a text-based database are now easily implemented.

Bibliographic Details
Main Authors: OLIVEIRA, S. R. de M., ALMEIDA, G. V., SOUZA, K. R. R., RODRIGUES, D. N., KUSER-FALCÃO, P. R., YAMAGISHI, M. E. B., SANTOS, E. H. dos, VIEIRA, F. D., JARDINE, J. G., NESHICH, G.
Other Authors: STANLEY ROBSON DE MEDEIROS OLIVEIRA, CNPTIA; PAULA REGINA KUSER FALCAO, CNPTIA; MICHEL EDUARDO BELEZA YAMAGISHI, CNPTIA; EDGARD HENRIQUE DOS SANTOS, CNPTIA; FABIO DANILO VIEIRA, CNPTIA; JOSE GILBERTO JARDINE, CNPTIA; GORAN NESHICH, CNPTIA.
Format: Journal article (library)
Language: English
Published: 2007-12-07
Subjects: Bioinformatics, Protein structure analysis, Data mining, Sting database, Data warehousing, Proteins, Databases
Online Access: http://www.alice.cnptia.embrapa.br/alice/handle/doc/814