Related Researcher

박종화

Bhak, Jong
KOrean GenomIcs Center

Full metadata record

DC Field Value Language
dc.citation.endPage 464 -
dc.citation.number 5 -
dc.citation.startPage 458 -
dc.citation.title BIOINFORMATICS -
dc.citation.volume 16 -
dc.contributor.author Bhak, Jong Hwa -
dc.contributor.author Holm, Liisa -
dc.contributor.author Heger, Andreas -
dc.contributor.author Chothia, Cyrus -
dc.date.accessioned 2023-12-22T12:07:47Z -
dc.date.available 2023-12-22T12:07:47Z -
dc.date.created 2015-08-03 -
dc.date.issued 2000-05 -
dc.description.abstract Motivation: Biological sequence databases are highly redundant for two main reasons: (1) various databanks keep redundant sequences, with many identical and nearly identical sequences; (2) natural sequences often have high sequence identities due to gene duplication. We wanted to know how many sequences can be removed before the databases start losing homology information. Can a database of sequences with mutual sequence identity of 50% or less provide us with the same amount of biological information as the original full database? Results: Comparisons of nine representative sequence databases (RSDB) derived from full protein databanks showed that the information content of a sequence database is not linearly proportional to its size. An RSDB reduced to mutual sequence identity of around 50% (RSDB50) was equivalent to the original full database in terms of the effectiveness of homology searching. It was a third of the full database size, which resulted in six times faster iterative profile searching. The RSDBs are produced at different granularity for efficient homology searching. Availability: All the RSDB files generated and the full analysis results are available through the internet: ftp://ftp.ebi.ac.uk/pub/contrib/jong/RSDB/ http://cyrah.ebi.ac.uk:1111/Proj/Bio/RSDB Contact: jong@biosophy.org -
dc.identifier.bibliographicCitation BIOINFORMATICS, v.16, no.5, pp.458 - 464 -
dc.identifier.doi 10.1093/bioinformatics/16.5.458 -
dc.identifier.issn 1367-4803 -
dc.identifier.scopusid 2-s2.0-0033940118 -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/13253 -
dc.identifier.url http://bioinformatics.oxfordjournals.org/content/16/5/458 -
dc.identifier.wosid 000088444600006 -
dc.language English -
dc.publisher OXFORD UNIV PRESS -
dc.title.alternative RSDB: representative protein sequence databases have high information content. -
dc.title RSDB: representative protein sequence databases have high information content -
dc.type Article -
dc.description.journalRegisteredClass scopus -

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.