Linguistic measures of chemical diversity and the "keywords" of molecular collections

Author(s)
Wozniak, Michal; Wolos, Agnieszka; Modrzyk, Urszula; Gorski, Rafal L.; Winkowski, Jan; Bajczyk, Michal; Szymkuc, Sara; Grzybowski, Bartosz A.; Eder, Maciej
Issued Date
2018-05
DOI
10.1038/s41598-018-25440-6
URI
https://scholarworks.unist.ac.kr/handle/201301/24194
Fulltext
https://www.nature.com/articles/s41598-018-25440-6
Citation
SCIENTIFIC REPORTS, v.8, pp.7598
Abstract
Computerized linguistic analyses have proven of immense value in comparing and searching through large text collections ("corpora"), including those deposited on the Internet; indeed, it would nowadays be hard to imagine browsing the Web without, for instance, search algorithms extracting the most appropriate keywords from documents. This paper describes how such corpus-linguistic concepts can be extended to chemistry based on characteristic "chemical words" that span more than traditional functional groups and, instead, look at the common structural fragments that molecules share. Using these words, it is possible to quantify the diversity of chemical collections/databases in new ways and to define molecular "keywords" by which such collections are best characterized and annotated.
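
As a rough illustration of what a linguistic diversity measure could look like when applied to molecules, the sketch below computes a type-token ratio (distinct "words" divided by total "words"), a standard corpus-linguistic measure that also appears among this record's keywords. The abstract does not say how the paper defines or extracts its chemical words, so the fragment strings, the two toy collections, and the function name are all hypothetical and stand in for whatever fragmentation scheme is actually used.

```python
def type_token_ratio(collection):
    """Return distinct fragments / total fragments for a molecular collection.

    `collection` is a list of molecules, each given as a list of fragment
    strings ("chemical words"). How fragments are obtained is assumed here,
    not taken from the paper.
    """
    tokens = [fragment for molecule in collection for fragment in molecule]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical toy collections; the strings are illustrative labels only.
repetitive = [
    ["c1ccccc1", "C(=O)O"],
    ["c1ccccc1", "N"],
    ["c1ccccc1", "C(=O)O"],
]
varied = [
    ["c1ccccc1", "C(=O)O"],
    ["C1CCNCC1", "S(=O)(=O)N"],
    ["c1ccncc1", "C#N"],
]

print(type_token_ratio(repetitive))  # 3 distinct fragments out of 6
print(type_token_ratio(varied))      # 6 distinct fragments out of 6
```

A collection that reuses the same few fragments scores lower than one whose molecules share little. Like its textual counterpart, this ratio depends on collection size, so it is only meaningful when comparing collections of similar size.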
Publisher
NATURE PUBLISHING GROUP
ISSN
2045-2322
Keyword
COMMON SUBSTRUCTURES; VOCABULARY RICHNESS; DRUG DISCOVERY; TOKEN RATIO; PERSPECTIVE; REACTIVITY; CHEMISTRY; QUALITY; DESIGN; GENRES

