ReconManager
Recon-manager procedure
This project develops a framework that systematically predicts gene-transcript-protein-reaction associations (GeTPRA) in human metabolism and updates a human genome-scale metabolic model (GEM) accordingly. Recon manager is the part of the project that collects the scripts used to generate Recon 2M.1 and to simulate Recon models.
Features
Recon manager contains scripts that independently implement the following tasks:
Convert GPR to TPR associations
Update metabolite information
Calculate model statistics
Evaluate functionality of metabolic model
Reconstruct personal GEMs using tINIT
Predict flux using transcript-level RNA-Seq data and GeTPRA
Installation
Major dependencies
- [gurobipy](http://www.gurobi.com/)
Procedure
Note: This source code was developed on Linux and has been tested on Ubuntu 14.04.5 LTS (i7-4770 CPU @ 3.40GHz).
Clone the repository
Create and activate virtual environment
$ virtualenv venv
$ source venv/bin/activate
Install packages at the root of the repository
$ pip install pip --upgrade
$ pip install -r requirements.txt
Install [gurobipy](http://www.gurobi.com/)
In our case, gurobipy was installed system-wide on the server, and a symbolic link to it was created in venv:
$ ln -s /usr/local/lib/python2.7/dist-packages/gurobipy/ $HOME/recon-manager/venv/lib/python2.7/site-packages/
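After creating the symbolic link, it can be useful to confirm that the module is visible from inside the virtual environment. The following is a minimal sketch, not part of the repository; the helper name `is_importable` is hypothetical.

```python
import importlib


def is_importable(module_name):
    """Return True if `module_name` can be imported in the active environment."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    # "gurobipy" is the module linked into venv in the step above.
    print("gurobipy importable: %s" % is_importable("gurobipy"))
```

Run it with the virtual environment activated; a result of `False` suggests that the link target or the Python version in the path does not match the local installation.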
Feature: Convert GPR to TPR associations
Input arguments and corresponding files
Working input files can be found in ./input_data/GPR_to_TPR_inputs. These files were used for the data presented in the manuscript.
-o : Output directory
-model : COBRA-compliant SBML file (generic human GEM)
File name in the source: Recon2M.1_Entrez_Gene.xml
-gene_transcript_information : A list of gene IDs and their matching transcript IDs
File name in the source: Ensembl88_GRCh38_all_transcript_information.txt
File format
NCBI gene ID  Gene stable ID   Transcript stable ID  RefSeq mRNA ID  UCSC Stable ID
2733          ENSG00000119392  ENST00000309971       NM_001003722    uc004bvj.4
2733          ENSG00000119392  ENST00000372770       NM_001499       uc004bvi.4
5690          ENSG00000126067  ENST00000373237       NM_002794       uc001bzf.4
5690          ENSG00000126067  ENST00000373237       NM_001199779    uc001bzf.4
5690          ENSG00000126067  ENST00000621781       NM_001199780    uc021olh.3
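To illustrate how a file in this format could be read, here is a minimal sketch that builds a gene-to-transcript mapping. It assumes the file is tab-delimited with the header row shown above. The function name `read_gene_transcripts` and the choice of the `UCSC Stable ID` column are assumptions for illustration; the repository script may read the file differently.

```python
def read_gene_transcripts(lines, gene_col="NCBI gene ID",
                          transcript_col="UCSC Stable ID"):
    """Map each gene ID to its unique transcript IDs, keeping file order.

    Assumes tab-delimited text whose first line is the header row.
    """
    rows = iter(lines)
    header = next(rows).rstrip("\n").split("\t")
    gene_idx = header.index(gene_col)
    tx_idx = header.index(transcript_col)
    mapping = {}
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(gene_idx, tx_idx):
            continue  # skip short or malformed rows
        gene, transcript = fields[gene_idx].strip(), fields[tx_idx].strip()
        if not gene or not transcript:
            continue  # skip rows without a usable ID pair
        transcripts = mapping.setdefault(gene, [])
        if transcript not in transcripts:
            transcripts.append(transcript)
    return mapping


# The example rows from the file format above, joined with tabs.
SAMPLE_ROWS = [
    "NCBI gene ID\tGene stable ID\tTranscript stable ID\tRefSeq mRNA ID\tUCSC Stable ID\n",
    "2733\tENSG00000119392\tENST00000309971\tNM_001003722\tuc004bvj.4\n",
    "2733\tENSG00000119392\tENST00000372770\tNM_001499\tuc004bvi.4\n",
    "5690\tENSG00000126067\tENST00000373237\tNM_002794\tuc001bzf.4\n",
    "5690\tENSG00000126067\tENST00000373237\tNM_001199779\tuc001bzf.4\n",
    "5690\tENSG00000126067\tENST00000621781\tNM_001199780\tuc021olh.3\n",
]

gene_to_transcripts = read_gene_transcripts(SAMPLE_ROWS)
```

Because one Ensembl transcript can match several RefSeq IDs, the same UCSC ID can appear on more than one row; the sketch keeps each transcript only once per gene.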
Download procedure
Go to [Ensembl BioMart](http://www.ensembl.org/biomart/martview)
Click Dataset on the left menu
Select Ensembl Genes 89 in the drop-down menu CHOOSE DATABASE
Select Human genes (GRCh38.p10) in the drop-down menu CHOOSE DATASET
Click Filters on the left menu
Click GENE: in the main menu (center)
Check Gene type and select protein_coding
Click Attributes on the left menu
Check Features in the center
Click both GENE: and EXTERNAL: in the main menu (center)
GENE: -> Ensembl -> Uncheck Gene stable ID and Transcript stable ID
Check the following items in order:
EXTERNAL: -> External References (max 3) -> NCBI gene ID
GENE: -> Ensembl -> Gene stable ID
GENE: -> Ensembl -> Transcript stable ID
EXTERNAL: -> External References (max 3) -> RefSeq mRNA ID
EXTERNAL: -> External References (max 3) -> UCSC Stable ID
Click the button Results on the top left
Click the button Go in the top center
Implementation
Note: Running this script takes about 4 minutes.
$ python model_GPR_to_TPR_converter.py \
-o ./results/GPR_to_TPR_results/ \
-gene_transcript_information ./input_data/GPR_to_TPR_inputs/Ensembl88_GRCh38_all_transcript_information.txt \
-model ./input_data/GPR_to_TPR_inputs/Recon2M.1_Entrez_Gene.xml
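As a rough illustration of what converting a GPR rule into a TPR rule can look like, the sketch below replaces every gene ID in a Boolean rule with the transcripts of that gene. Joining the transcripts of one gene with `or` is an assumption made here for illustration, and `gpr_to_tpr` is a hypothetical helper; this is not necessarily the logic implemented in `model_GPR_to_TPR_converter.py`.

```python
import re


def gpr_to_tpr(gpr_rule, gene_to_transcripts):
    """Replace gene IDs in a Boolean GPR rule with their transcript IDs.

    Transcripts of the same gene are joined with "or" (an assumption).
    Genes without a mapping are left unchanged.
    """
    def substitute(match):
        token = match.group(0)
        if token in ("and", "or"):
            return token  # keep Boolean operators as they are
        transcripts = gene_to_transcripts.get(token)
        if not transcripts:
            return token  # unmapped gene: leave the original ID
        if len(transcripts) == 1:
            return transcripts[0]
        return "(" + " or ".join(transcripts) + ")"

    # A token is any run of characters that is not whitespace or a parenthesis.
    return re.sub(r"[^\s()]+", substitute, gpr_rule)


# Mapping taken from the example rows of the gene-transcript file format.
example_mapping = {
    "2733": ["uc004bvj.4", "uc004bvi.4"],
    "5690": ["uc001bzf.4", "uc021olh.3"],
}
# The rule itself is made up for illustration; "9999" has no mapping.
example_tpr = gpr_to_tpr("(2733 and 5690) or 9999", example_mapping)
```

The substitution keeps the original parentheses, so the `and`/`or` structure of the rule is preserved around the inserted transcript groups.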
Feature: Update metabolite information
Input arguments and corresponding files
Working input files can be found in ./input_data/metabolite_information_update_inputs. These files were used for the data presented in the manuscript.
-o : Output directory
-model : COBRA-compliant SBML file (generic human GEM)
File name in the source: Recon2M.1_Entrez_Gene.xml
-mnx_xref : Info on chemical identifiers from [MetaNetX](http://www.metanetx.org/)
File name in the source: chem_xref.tsv
Click [chem_xref.tsv](http://www.metanetx.org/cgi-bin/mnxget/mnxref/chem_xref.tsv) to download
-mnx_prop : Info on chemical structures from [MetaNetX](http://www.metanetx.org/)
File name in the source: chem_prop.tsv
Click [chem_prop.tsv](http://www.metanetx.org/cgi-bin/mnxget/mnxref/chem_prop.tsv) to download
-bigg : Info on BiGG metabolites from [BiGG Models](http://bigg.ucsd.edu/)
File name in the source: bigg_models_metabolites.txt
Click [bigg_models_metabolites.txt](http://bigg.ucsd.edu/static/namespace/bigg_models_metabolites.txt) to download
-chebi : Info on ChEBI and InChI from [ChEBI](https://www.ebi.ac.uk/chebi/init.do)
File name in the source: chebiId_inchi.tsv
Click [chebiId_inchi.tsv](ftp://ftp.ebi.ac.uk/pub/databases/chebi/Flat_file_tab_delimited/chebiId_inchi.tsv) to download
Implementation
Note: Running this script takes about 5 seconds.
$ python model_update_metabolite_information.py \
-o ./results/metabolite_information_update_results/ \
-model ./input_data/metabolite_information_update_inputs/Recon2M.1_Entrez_Gene.xml \
-mnx_xref ./input_data/metabolite_information_update_inputs/chem_xref.tsv \
-mnx_prop ./input_data/metabolite_information_update_inputs/chem_prop.tsv \
-bigg ./input_data/metabolite_information_update_inputs/bigg_models_metabolites.txt \
-chebi ./input_data/metabolite_information_update_inputs/chebiId_inchi.tsv
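For readers who want to inspect the cross-reference file before running the script, the sketch below loads a tab-delimited table into a dictionary. It assumes that comment lines start with `#` and that the first two columns hold the external identifier and the MetaNetX identifier. Both the column layout and the helper name `read_xref_table` are assumptions, and the sample rows are made-up placeholders, not real identifiers.

```python
def read_xref_table(lines):
    """Map external identifiers (column 1) to MetaNetX identifiers (column 2).

    Assumes tab-delimited text in which lines starting with "#" are comments.
    """
    xref_to_mnx = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip malformed rows
        xref_to_mnx[fields[0]] = fields[1]
    return xref_to_mnx


# Placeholder rows for illustration only (not real database entries).
SAMPLE_XREF = [
    "#external_id\tmnx_id\n",
    "bigg:example_a\tMNXM_EXAMPLE_1\n",
    "chebi:example_b\tMNXM_EXAMPLE_1\n",
    "\n",
]

xref_to_mnx = read_xref_table(SAMPLE_XREF)
```

With a real file, the same function can be called on an open file handle, for example `read_xref_table(open("chem_xref.tsv"))`, provided the layout assumption holds for the downloaded version.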
Feature: Calculate model statistics
Input arguments and corresponding files
Working input files can be found in ./input_data/model_function_inputs. These files were used for the data presented in the manuscript.
-o : Output directory
-model : COBRA-compliant SBML file (generic human GEM)
File name in the source: Recon2M.1_Entrez_Gene.xml
-medium : A representative medium (RPMI-1640 medium)
File name in the source: RPMI1640_medium.txt
File format
EX_gly_LPAREN_e_RPAREN_    -0.05  1000
EX_arg_L_LPAREN_e_RPAREN_  -0.05  1000
EX_asn_L_LPAREN_e_RPAREN_  -0.05  1000
EX_asp_L_LPAREN_e_RPAREN_  -0.05  1000
EX_cys_L_LPAREN_e_RPAREN_  -0.05  1000
EX_glu_L_LPAREN_e_RPAREN_  -0.05  1000
EX_his_L_LPAREN_e_RPAREN_  -0.05  1000
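A medium file in this format can be read with a few lines of Python. The sketch below assumes that each row holds an exchange reaction ID followed by two numbers, interpreted here as the lower and upper flux bounds; that interpretation and the helper name `read_medium` are assumptions, not something stated by the repository.

```python
def read_medium(lines):
    """Parse rows of "reaction_id lower upper" into {reaction_id: (lower, upper)}.

    Columns may be separated by tabs or spaces; other rows are skipped.
    """
    bounds = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed rows
        reaction_id, lower, upper = fields
        bounds[reaction_id] = (float(lower), float(upper))
    return bounds


# First rows of the file format shown above.
SAMPLE_MEDIUM = [
    "EX_gly_LPAREN_e_RPAREN_\t-0.05\t1000\n",
    "EX_arg_L_LPAREN_e_RPAREN_\t-0.05\t1000\n",
    "EX_asn_L_LPAREN_e_RPAREN_\t-0.05\t1000\n",
]

medium_bounds = read_medium(SAMPLE_MEDIUM)
```

Under this reading, a negative lower bound such as -0.05 allows uptake of the compound, following the usual sign convention for exchange reactions in COBRA-style models.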
Implementation
Note: Running this script takes about 47 minutes.
$ python model_metabolic_model_statistics.py \
-o ./results/model_statistics_results/ \
-medium ./input_data/model_function_inputs/RPMI1640_medium.txt \
-model ./input_data/model_function_inputs/Recon2M.1_Entrez_Gene.xml
Feature: Evaluate functionality of metabolic model
Input arguments and corresponding files
Working input files can be found in ./input_data/model_function_inputs. These files were used for the data presented in the manuscript.
-o : Output directory
-model : COBRA-compliant SBML file (generic human GEM)
File name in the source: Recon2M.1_Entrez_Gene.xml
-medium : A representative medium (RPMI-1640 medium)
File name in the source: RPMI1640_medium.txt
-defined_medium : A defined minimal medium
File name in the source: Defined_medium.txt
-es_genes : A list of essential genes
File name in the source: Essential_genes_from_wang_et_al.txt
-ne_genes : A list of non-essential genes
File name in the source: Non_essential_genes_from_wang_et_al.txt
-c_source : A list of carbon sources
File name in the source: atp_carbon_source.txt
-biomass : Reaction ID for biomass generation equation
-oxygen : Reaction ID for oxygen uptake
-atp : Reaction ID for ATP production
Implementation
Note: Pass the reaction IDs for -biomass, -oxygen and -atp directly on the command line.
Note: Running this script takes about 7 minutes.
$ python model_metabolic_function.py \
-o ./results/model_function_results/ \
-model ./input_data/model_function_inputs/Recon2M.1_Entrez_Gene.xml \
-medium ./input_data/model_function_inputs/RPMI1640_medium.txt \
-defined_medium ./input_data/model_function_inputs/Defined_medium.txt \
-es_genes ./input_data/model_function_inputs/Essential_genes_from_wang_et_al.txt \
-ne_genes ./input_data/model_function_inputs/Non_essential_genes_from_wang_et_al.txt \
-c_source ./input_data/model_function_inputs/atp_carbon_source.txt \
-biomass biomass_reaction \
-oxygen EX_o2_LPAREN_e_RPAREN_ \
-atp DM_atp_c_
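The essential and non-essential gene lists make it possible to score gene essentiality predictions as a classification problem. The sketch below shows one common way to do that; the metric choice and the helper name `essentiality_confusion` are assumptions, and the repository script may report different quantities.

```python
def essentiality_confusion(predicted_essential, essential, non_essential):
    """Count agreement between predicted and reference gene essentiality.

    A gene counts as predicted essential if it appears in
    `predicted_essential`. Genes missing from both reference lists are ignored.
    """
    predicted = set(predicted_essential)
    essential = set(essential)
    non_essential = set(non_essential)
    tp = len(essential & predicted)       # essential, predicted essential
    fn = len(essential - predicted)       # essential, predicted non-essential
    fp = len(non_essential & predicted)   # non-essential, predicted essential
    tn = len(non_essential - predicted)   # non-essential, predicted non-essential
    total = tp + fn + fp + tn
    accuracy = float(tp + tn) / total if total else float("nan")
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn, "accuracy": accuracy}


# Made-up gene IDs for illustration only.
example_scores = essentiality_confusion(
    predicted_essential=["g1", "g2"],
    essential=["g1", "g3"],
    non_essential=["g2", "g4"],
)
```

In this toy example each of the four cells contains one gene, so the accuracy is 0.5.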
Feature: Reconstruct personal GEMs using [tINIT](http://msb.embopress.org/content/10/3/721.long)
Input arguments and corresponding files
Working input files can be found in ./input_data/tINIT_inputs. These files were used for the data presented in the manuscript.
-o : Output directory
-model : COBRA-compliant SBML file (generic human GEM)
File name in the source: Recon2M.1_Entrez_Gene.xml
-medium : A representative medium (RPMI-1640 medium)
File name in the source: RPMI1640_medium.txt
-task : A list of metabolic tasks
File name in the source: MetabolicTasks.csv
-present_reaction : A list of reactions that should be present in the model
File name in the source: essential_reactions.txt
-present_metabolite : A list of metabolites that should be present in the model
File name in the source: essential_metabolites.txt
-i : Omics data
File name in the source: BLCA_T_TTL.csv
-biomass : Reaction ID for biomass generation equation
Implementation
Note: Pass the reaction ID for -biomass directly on the command line.
Note: Running this script for a Recon model takes about 8 minutes.
$ python personal_GEM_tINIT.py \
-o ./results/tINIT_results/ \
-medium ./input_data/tINIT_inputs/RPMI1640_medium.txt \
-model ./input_data/tINIT_inputs/Recon2M.1_Entrez_Gene.xml \
-biomass biomass_reaction \
-task ./input_data/tINIT_inputs/MetabolicTasks.csv \
-present_reaction ./input_data/tINIT_inputs/essential_reactions.txt \
-present_metabolite ./input_data/tINIT_inputs/essential_metabolites.txt \
-i ./input_data/tINIT_inputs/BLCA_N_TTL.csv
Feature: Predict flux using transcript-level RNA-Seq data and GeTPRA
Input arguments and corresponding files
Working input files can be found in ./input_data/Flux_prediction. These files were used for the data presented in the manuscript.
-o : Output directory
-g_model : COBRA-compliant SBML file (generic human GEM)
File name in the source: Recon2M.2_BiGG_UCSC_Transcript.xml
-c_model : COBRA-compliant SBML file (context-specific human GEM)
File name in the source: LIHC_TCGA-BC-A10Q.xml
-i : Omics data
File name in the source: LIHC_TCGA-BC-A10Q.csv
-getpra : GeTPRA file
File name in the source: GeTPRA.txt
-use_getpra : Option for flux prediction using transcript-level data
Insert `yes` for flux prediction using transcript-level data
Insert `no` for flux prediction using gene-level data
Implementation
Note: Running this script takes about 7 minutes.
$ python flux_prediction.py \
-o ./results/Flux_prediction/ \
-i ./input_data/Flux_prediction/LIHC_TCGA-BC-A10Q.csv \
-getpra ./input_data/Flux_prediction/GeTPRA.txt \
-g_model ./input_data/Flux_prediction/Recon2M.2_BiGG_UCSC_Transcript.xml \
-c_model ./input_data/Flux_prediction/LIHC_TCGA-BC-A10Q.xml \
-use_getpra yes
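The single-dash options used in this command can be modeled with Python's `argparse`. The parser below is a hypothetical sketch that mirrors the flags of the command above, including the `yes`/`no` choice for `-use_getpra`; it is not taken from `flux_prediction.py`, and the defaults and required flags are assumptions.

```python
import argparse


def build_parser():
    """Build a hypothetical parser mirroring the flux prediction options."""
    parser = argparse.ArgumentParser(
        description="Sketch of the flux prediction command-line options")
    parser.add_argument("-o", dest="output_dir", required=True,
                        help="Output directory")
    parser.add_argument("-i", dest="omics_file", required=True,
                        help="Omics data file")
    parser.add_argument("-g_model", required=True,
                        help="Generic human GEM (SBML)")
    parser.add_argument("-c_model", required=True,
                        help="Context-specific human GEM (SBML)")
    parser.add_argument("-getpra", help="GeTPRA file")
    # Restricting the value to yes/no rejects typos such as "Yes" or "true".
    parser.add_argument("-use_getpra", choices=["yes", "no"], default="no",
                        help="yes: transcript-level data, no: gene-level data")
    return parser


def wants_transcript_level(args):
    """Return True when transcript-level flux prediction was requested."""
    return args.use_getpra == "yes"


example_args = build_parser().parse_args([
    "-o", "./results/Flux_prediction/",
    "-i", "./input_data/Flux_prediction/LIHC_TCGA-BC-A10Q.csv",
    "-getpra", "./input_data/Flux_prediction/GeTPRA.txt",
    "-g_model", "./input_data/Flux_prediction/Recon2M.2_BiGG_UCSC_Transcript.xml",
    "-c_model", "./input_data/Flux_prediction/LIHC_TCGA-BC-A10Q.xml",
    "-use_getpra", "yes",
])
```

Using `choices` keeps the two modes explicit, which matches the yes/no wording of the option description.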
Publication
Jae Yong Ryu, Hyun Uk Kim & Sang Yup Lee. Framework and resource for more than 11,000 gene-transcript-protein-reaction associations in human metabolism. Proc. Natl. Acad. Sci. U.S.A., 2017. http://www.pnas.org/content/early/2017/10/23/1713050114