Convolution Neural Network-Based Prediction of Protein Thermostability.

Title: Convolution Neural Network-Based Prediction of Protein Thermostability.
Publication Type: Journal Article
Year of Publication: 2019
Authors: Fang X, Huang J, Zhang R, Wang F, Zhang Q, Li G, Yan J, Zhang H, Yan Y, Xu L
Journal: J Chem Inf Model
Date Published: 2019 Oct 28
ISSN: 1549-960X
Abstract

Most natural proteins exhibit poor thermostability, which limits their industrial application. Computer-aided rational design is an efficient purpose-oriented method that can improve protein thermostability. Numerous machine-learning-based methods have been designed to predict the changes in protein thermostability induced by mutations. However, all of these methods have certain limitations due to existing mutation coding methods that overlook protein sequence features. Here we propose a method to predict protein thermostability using convolutional neural networks based on an in-depth study of thermostability-related protein properties. This method comprises a three-dimensional coding algorithm, including protein mutation information and a strategy to extract neighboring features at protein mutation sites based on multiscale convolution. The accuracies on the S1615 and S388 data sets, which are widely used for protein thermostability predictions, reached 86.4% and 87%, respectively. The Matthews correlation coefficient was nearly double those produced using other methods. Furthermore, a model was constructed to predict the thermostability of lipase mutants based on the S3661 data set, a single amino acid mutation data set screened from the ProTherm protein thermodynamics database. Compared with the RIF strategy, which consists of three algorithms, i.e., Rosetta ddg_monomer, I-Mutant 3.0, and FoldX, the accuracy of the proposed method was higher (75.0 vs 66.7%), and the negative sample resolution was simultaneously enhanced. These results indicate that our prediction method more effectively assessed the protein thermostability and distinguished its features, making it a powerful tool to devise mutations that enhance the thermostability of proteins, particularly enzymes.
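
The abstract reports both accuracy and the Matthews correlation coefficient (MCC). As a minimal sketch of how the two metrics differ for a binary stabilizing/destabilizing classifier, the Python below computes both from confusion-matrix counts. The function names and the example counts are hypothetical illustrations and are not taken from the paper or its data sets.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention),
    i.e. when any row or column of the confusion matrix is empty.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical, imbalanced counts (NOT results from the paper):
    # few stabilizing mutations (positives), many destabilizing ones.
    tp, tn, fp, fn = 10, 80, 5, 5
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

Because MCC uses all four cells of the confusion matrix, it can remain moderate on an imbalanced data set even when accuracy is high, which is one reason a study might report both.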

DOI: 10.1021/acs.jcim.9b00220
Alternate Journal: J Chem Inf Model
PubMed ID: 31657922