Next-generation sequencing technologies and high-throughput experiments are generating large quantities of data on genes and proteins, many of which remain uncharacterized. Correct annotation of enzyme function is particularly critical for the reconstruction of enzymatic networks. In this work, we introduce ZYFCON (enZYme Function prediction COmbining Networks), a novel computational approach that predicts the enzymatic functions of protein-coding genes from reconstructed genome-scale metabolic networks incorporating several data types.
The method constructs an organismic enzymatic network and weights it by similarity measures between genes, computed separately for each data type, yielding one weighted network per data type. All weighted networks are then combined into a final integrated organismic enzymatic network, which stores the enzymatic activity predictions for protein-coding genes.
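The integration step described above can be illustrated with a minimal Python sketch. It is not the ZYFCON implementation: the combination rule is not specified here, so the sketch assumes an unweighted mean of edge weights across data types and treats an edge missing from a data type as having weight 0. The names `combine_networks` and `edge`, the gene labels, and the weights are all hypothetical.

```python
from collections import defaultdict


def edge(gene_a, gene_b):
    """Undirected edge key: the order of the two genes does not matter."""
    return frozenset((gene_a, gene_b))


def combine_networks(networks):
    """Combine per-data-type weighted networks into one integrated network.

    networks: dict mapping a data-type name to {edge: similarity weight}.
    Assumption (not from the source): the integrated weight is the plain
    mean over all data types, with absent edges contributing 0.
    """
    totals = defaultdict(float)
    for edges in networks.values():
        for key, weight in edges.items():
            totals[key] += weight
    n_types = len(networks)
    return {key: total / n_types for key, total in totals.items()}


# Hypothetical similarity weights for two data types.
networks = {
    "expression": {edge("geneA", "geneB"): 0.75, edge("geneB", "geneC"): 0.5},
    "pfam": {edge("geneA", "geneB"): 0.25},
}
integrated = combine_networks(networks)
```

In this toy input, the edge between `geneA` and `geneB` is supported by both data types, while the edge between `geneB` and `geneC` is supported by only one and is therefore down-weighted in the integrated network. Other combination rules (for example, weighting each data type by its reliability) would fit the same structure.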
We applied ZYFCON to datasets retrieved for Saccharomyces cerevisiae, Caenorhabditis elegans, and Arabidopsis thaliana (expression, GO biological process annotation, GO molecular function annotation, transcription factor regulatory interaction, Pfam family annotation, and protein-protein interaction data), assessed the quality of the constructed networks, and confirmed the inferred predictions through literature validation.
ZYFCON is available at https://github.com/andquintero/ZYFCON