Article,

Predicting novel substrates for enzymes with minimal experimental effort with active learning.

D. Pertusi, M. Moura, J. Jeffryes, S. Prabhu, B. Walters Biggs, and K. Tyo.
Metabolic engineering (November 2017)

Abstract

Enzymatic substrate promiscuity is more ubiquitous than previously thought, with significant consequences for understanding metabolism and its application to biocatalysis. This realization has given rise to the need for efficient characterization of enzyme promiscuity. Enzyme promiscuity is currently characterized with a limited number of human-selected compounds that may not be representative of the enzyme's versatility. While testing large numbers of compounds may be impractical, computational approaches can exploit existing data to determine the most informative substrates to test next, thereby more thoroughly exploring an enzyme's versatility. To demonstrate this, we used existing studies and tested compounds for four different enzymes, developed support vector machine (SVM) models using these datasets, and selected additional compounds for experiments using an active learning approach. SVMs trained on a chemically diverse set of compounds were discovered to achieve maximum accuracies of ~80% using ~33% fewer compounds than datasets based on all compounds tested in existing studies. Active learning-selected compounds for testing resolved apparent conflicts in the existing training data, while adding diversity to the dataset. The application of these algorithms to wide arrays of metabolic enzymes would result in a library of SVMs that can predict high-probability promiscuous enzymatic reactions and could prove a valuable resource for the design of novel metabolic pathways. Copyright © 2017 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
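
The abstract does not say which query strategy the authors used to pick "the most informative substrates to test next". As a rough illustration only, the sketch below shows one common choice for SVM-based active learning, margin-based uncertainty sampling: test the untested compounds whose SVM decision values lie closest to zero, i.e. nearest the decision boundary. The function name, the compound names, and the scores are all hypothetical; in practice the scores would come from an already trained SVM.

```python
from typing import Dict, List


def select_most_informative(decision_values: Dict[str, float], k: int = 1) -> List[str]:
    """Return the k untested compounds closest to the SVM decision boundary.

    decision_values maps a compound name to its signed distance from the
    separating hyperplane (assumed to be computed elsewhere by a trained SVM).
    A small absolute value means the model is least certain about that compound.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort by absolute decision value; break ties by name for determinism.
    ranked = sorted(decision_values, key=lambda name: (abs(decision_values[name]), name))
    return ranked[:k]


# Hypothetical decision values for four untested compounds.
scores = {
    "compound_A": 1.7,
    "compound_B": -0.1,
    "compound_C": 0.4,
    "compound_D": -2.3,
}

print(select_most_informative(scores, k=2))  # ['compound_B', 'compound_C']
```

In a full loop, the selected compounds would be assayed, their labels added to the training set, and the SVM retrained before the next round of selection.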

BibTeX key: Pertusi2017Predicting
entry type: article
year: 2017
month: nov
journal: Metabolic engineering
pages: 171--181
volume: 44
citeulike-article-id: 14510716
citeulike-linkout-1: http://www.hubmed.org/display.cgi?uids=29030274
pmid: 29030274
priority: 2
posted-at: 2018-01-01 09:55:20
issn: 1096-7184
citeulike-linkout-0: http://view.ncbi.nlm.nih.gov/pubmed/29030274
url: http://view.ncbi.nlm.nih.gov/pubmed/29030274

Cite this publication

%0 Journal Article
%1 Pertusi2017Predicting
%A Pertusi, Dante A.
%A Moura, Matthew E.
%A Jeffryes, James G.
%A Prabhu, Siddhant
%A Walters Biggs, Bradley
%A Tyo, Keith E. J.
%D 2017
%J Metabolic engineering
%K retrosynthesis underground-metabolism
%P 171--181
%T Predicting novel substrates for enzymes with minimal experimental effort with active learning.
%U http://view.ncbi.nlm.nih.gov/pubmed/29030274
%V 44
%X Enzymatic substrate promiscuity is more ubiquitous than previously thought, with significant consequences for understanding metabolism and its application to biocatalysis. This realization has given rise to the need for efficient characterization of enzyme promiscuity. Enzyme promiscuity is currently characterized with a limited number of human-selected compounds that may not be representative of the enzyme's versatility. While testing large numbers of compounds may be impractical, computational approaches can exploit existing data to determine the most informative substrates to test next, thereby more thoroughly exploring an enzyme's versatility. To demonstrate this, we used existing studies and tested compounds for four different enzymes, developed support vector machine (SVM) models using these datasets, and selected additional compounds for experiments using an active learning approach. SVMs trained on a chemically diverse set of compounds were discovered to achieve maximum accuracies of ~80% using ~33% fewer compounds than datasets based on all compounds tested in existing studies. Active learning-selected compounds for testing resolved apparent conflicts in the existing training data, while adding diversity to the dataset. The application of these algorithms to wide arrays of metabolic enzymes would result in a library of SVMs that can predict high-probability promiscuous enzymatic reactions and could prove a valuable resource for the design of novel metabolic pathways. Copyright © 2017 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.

@article{Pertusi2017Predicting,
  abstract = {Enzymatic substrate promiscuity is more ubiquitous than previously thought, with significant consequences for understanding metabolism and its application to biocatalysis. This realization has given rise to the need for efficient characterization of enzyme promiscuity. Enzyme promiscuity is currently characterized with a limited number of human-selected compounds that may not be representative of the enzyme's versatility. While testing large numbers of compounds may be impractical, computational approaches can exploit existing data to determine the most informative substrates to test next, thereby more thoroughly exploring an enzyme's versatility. To demonstrate this, we used existing studies and tested compounds for four different enzymes, developed support vector machine ({SVM}) models using these datasets, and selected additional compounds for experiments using an active learning approach. {SVMs} trained on a chemically diverse set of compounds were discovered to achieve maximum accuracies of \~{}80\% using \~{}33\% fewer compounds than datasets based on all compounds tested in existing studies. Active learning-selected compounds for testing resolved apparent conflicts in the existing training data, while adding diversity to the dataset. The application of these algorithms to wide arrays of metabolic enzymes would result in a library of {SVMs} that can predict high-probability promiscuous enzymatic reactions and could prove a valuable resource for the design of novel metabolic pathways. Copyright {\copyright} 2017 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.},
  added-at = {2018-12-02T16:09:07.000+0100},
  author = {Pertusi, Dante A. and Moura, Matthew E. and Jeffryes, James G. and Prabhu, Siddhant and Walters Biggs, Bradley and Tyo, Keith E. J.},
  biburl = {https://www.bibsonomy.org/bibtex/20eb703e0a53e3a929915603c48644a4f/karthikraman},
  citeulike-article-id = {14510716},
  citeulike-linkout-0 = {http://view.ncbi.nlm.nih.gov/pubmed/29030274},
  citeulike-linkout-1 = {http://www.hubmed.org/display.cgi?uids=29030274},
  interhash = {5216c886b1adfa42476dce320b7a51b6},
  intrahash = {0eb703e0a53e3a929915603c48644a4f},
  issn = {1096-7184},
  journal = {Metabolic engineering},
  keywords = {retrosynthesis underground-metabolism},
  month = nov,
  pages = {171--181},
  pmid = {29030274},
  posted-at = {2018-01-01 09:55:20},
  priority = {2},
  timestamp = {2018-12-07T10:33:07.000+0100},
  title = {Predicting novel substrates for enzymes with minimal experimental effort with active learning.},
  url = {http://view.ncbi.nlm.nih.gov/pubmed/29030274},
  volume = 44,
  year = 2017
}
