These files are MySQL database dumps used to seed a new cBioPortal database instance. They include the data needed for a fully operational cBioPortal website, such as cancer types, genes, UniProt mappings, drug information, and network data.
The instructions for building and updating seedDBs can be found here.
Gene and gene alias tables in seedDB are updated every 6 months.
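Because the seed databases are distributed as gzip-compressed SQL dumps, they can be streamed into a MySQL client without first unpacking them to disk. The following is a minimal sketch of that idea, not the project's official loading procedure. The helper name `load_dump`, the database name, and the user name are hypothetical. The schema file with the create table statements would presumably need to be loaded before the seed data.

```python
import gzip
import shutil
import subprocess


def load_dump(gz_path, command):
    """Decompress gz_path on the fly and feed the SQL to `command` on stdin.

    Returns the exit code of the client process.
    """
    process = subprocess.Popen(command, stdin=subprocess.PIPE)
    try:
        with gzip.open(gz_path, "rb") as dump:
            # Copy in chunks so the full dump never has to fit in memory.
            shutil.copyfileobj(dump, process.stdin)
    finally:
        process.stdin.close()
    return process.wait()


# Hypothetical usage; "cbio_user" and "cbioportal" are placeholder names:
# load_dump(
#     "seed-cbioportal_hg19_hg38_v2.13.1.sql.gz",
#     ["mysql", "--user=cbio_user", "--password", "cbioportal"],
# )
```

A non-zero return value means the client reported an error, so it is worth checking before starting cBioPortal against the new database.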
This schema is required for cBioPortal release versions:
- 5.3.14 or higher
For release versions > 5.3.14, there might be a need to migrate to a new database schema. The migration process is described here.
Schema 2.13.1: SQL file with create table statements
Seed database: seed-cbioportal_hg19_hg38_v2.13.1.sql.gz
md5sum d8e328d43089c817dc26e144b2524e8a
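The md5sum listed for each seed database can be used to confirm that a download is complete and uncorrupted. A minimal sketch using only the Python standard library follows; the function names `md5_of_file` and `verify_seed` are illustrative and not part of any cBioPortal tooling.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_seed(path, expected_md5):
    """Return True if the file's MD5 matches the published checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Usage, assuming the file has been downloaded to the working directory:
# verify_seed("seed-cbioportal_hg19_hg38_v2.13.1.sql.gz",
#             "d8e328d43089c817dc26e144b2524e8a")
```

The checksum is computed on the compressed `.sql.gz` file as downloaded, not on the decompressed SQL.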
Updates to seed database:
- Entrez Gene IDs, gene symbols, and gene aliases have been updated based on the HGNC Oct 1, 2023 release. The detailed changes are listed here.
- Gene Sets have been updated from MSigDB v2023.2.Hs.
From this release onwards, we offer a combined seed database for both hg19 and hg38. To access seed databases from previous versions, please refer to the respective archive folders.
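The seed file names on this page encode the reference genome builds and the schema version, which makes it possible to tell a combined hg19/hg38 seed from an hg19-only one programmatically. The sketch below infers the pattern from the file names listed here; it is an assumption rather than a documented naming specification, and `parse_seed_name` is a hypothetical helper.

```python
import re

# Pattern inferred from names such as seed-cbioportal_hg19_hg38_v2.13.1.sql.gz
# and seed-cbioportal_hg19_v2.3.1_only-pdb.sql.gz (an assumption, not a spec).
SEED_NAME = re.compile(
    r"^seed-cbioportal_(?P<builds>hg\d+(?:_hg\d+)*)"
    r"_v(?P<schema>\d+\.\d+\.\d+)"
    r"(?P<pdb>_only-pdb)?\.sql\.gz$"
)


def parse_seed_name(filename):
    """Split a seed file name into genome builds, schema version, and PDB flag."""
    match = SEED_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized seed file name: {filename!r}")
    return {
        "builds": match.group("builds").split("_"),
        "schema": tuple(int(part) for part in match.group("schema").split(".")),
        "pdb_only": match.group("pdb") is not None,
    }
```

Returning the schema version as a tuple of integers keeps comparisons numeric, so that 2.13.0 sorts after 2.7.3.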
This schema is required for cBioPortal release versions:
- 5.3.0 or higher
For release versions > 5.3.0, there might be a need to migrate to a new database schema. The migration process is described here.
Schema 2.13.0: SQL file with create table statements
Seed database: seed-cbioportal_hg19_hg38_v2.13.0.sql.gz
md5sum b9e4035a9cc94dc01bbf6f5595842071
Updates to seed database:
- Entrez Gene IDs, gene symbols, and gene aliases have been updated based on the HGNC April 1, 2023 release. You can find the detailed changes listed here.
- Gene Sets have been updated from MSigDB v2023.1.Hs.
This schema is required for cBioPortal release versions:
- 5.0.0 or higher
For release versions > 5.0.0, there might be a need to migrate to a new database schema. The migration process is described here.
Schema 2.12.14: SQL file with create table statements
Seed database: seed-cbioportal_hg19_v2.12.14.sql.gz
md5sum 05481d66334b65512aef0364ce282fe6
Updates to seed database:
- Entrez Gene IDs, gene symbols, and gene aliases have been updated based on the HGNC Oct 1, 2022 release. The detailed changes are listed here.
- Gene Sets have been updated from MSigDB 7.5.1.
This schema is required for cBioPortal release versions:
- 3.6.0 or higher
For release versions > 3.6.0, there might be a need to migrate to a new database schema. The migration process is described here.
Schema 2.12.12: SQL file with create table statements
Seed database: seed-cbioportal_hg19_v2.12.12.sql.gz
md5sum 7d805d56aebcee85e2a8690e040310dd
Contents of seed database:
- Entrez Gene IDs, gene symbols, and gene aliases have been updated based on the HGNC Jan 1, 2022 release.
- Modifications (supplemental genes, miRNA, and phosphoprotein genes) are implemented using the script.
- Gene Sets have been updated from MSigDB 7.5.1.
- All data files in DATAHUB are updated to reflect the gene entry updates. The script/process is described here.
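The "or higher" entries listed so far can be read as a lookup from a cBioPortal release version to the newest seed schema whose minimum release it satisfies. The sketch below shows one way to do that lookup. The table holds only the four pairings stated above, the function names are hypothetical, and a newer release may still need the migration step after seeding.

```python
from bisect import bisect_right

# (minimum cBioPortal release, seed schema), copied from the entries above
# and kept sorted by minimum release.
MIN_RELEASE_TO_SCHEMA = [
    ((3, 6, 0), "2.12.12"),
    ((5, 0, 0), "2.12.14"),
    ((5, 3, 0), "2.13.0"),
    ((5, 3, 14), "2.13.1"),
]


def parse_version(text):
    """Turn '5.3.14' into (5, 3, 14) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def newest_seed_schema(release):
    """Return the newest listed schema whose minimum release is <= release."""
    minimums = [minimum for minimum, _ in MIN_RELEASE_TO_SCHEMA]
    index = bisect_right(minimums, parse_version(release))
    if index == 0:
        raise LookupError(f"no 'or higher' entry covers release {release}")
    return MIN_RELEASE_TO_SCHEMA[index - 1][1]
```

Comparing integer tuples rather than strings matters here: as strings, "5.3.2" would sort after "5.3.14".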
This schema is required for cBioPortal release versions:
- 3.6.0 or higher
For release versions > 3.6.0, there might be a need to migrate to a new database schema. The migration process is described here.
Schema 2.12.8: SQL file with create table statements
Seed database: seed-cbioportal_hg19_v2.12.8.sql.gz
md5sum f8d2c65f8d9db795da47ed5cf6f592a9
Contents of seed database:
- Entrez Gene IDs, gene symbols, and gene aliases have been updated based on the HGNC Feb 20, 2021 release with small modifications listed below.
- To minimize data loss, we have preserved certain gene entries that are not available in the current HGNC in a supplemental file - Complete lists here
- Updated outdated gene entries - Complete list here.
- Removed duplicate symbol <> entrez_ID mapping - Complete list here.
- 7 genes dropped from gene panels - Complete list here.
- All data files in DATAHUB are updated to reflect the gene entry updates. The script/process is described here.
- Gene Sets have been updated from MSigDB 6.1.
This schema is required for cBioPortal release versions:
- 2.0.0
For release versions > 2.0.0, a migration step to a new database schema might be required. The migration process is described here.
Schema 2.7.3: SQL file with create table statements
Seed database: seed-cbioportal_hg19_v2.7.3.sql.gz
md5sum 85444ce645104dbc00610fc1f15e8c7a
Contents of seed database:
- Entrez Gene IDs, gene symbols, and gene aliases updated in December 2018 from NCBI.
- Gene lengths retrieved from Gencode Release 29 (mapped to GRCh37).
- Pfam graphics fetched in August 2017.
- Gene Sets from MSigDB 6.1.
- Cancer Types from OncoTree (fetched December 2018 from http://oncotree.mskcc.org).
This schema is required for cBioPortal release versions:
- 1.18.0
For release versions > 1.18.0, a migration step to a new database schema might be required. The migration process is described here.
Schema 2.7.2: SQL file with create table statements
Seed database: seed-cbioportal_hg19_v2.7.2.sql.gz
md5sum b0a4e11b94d00a7291129c30ee4e0f70
Contents of seed database:
- Entrez Gene IDs, gene symbols, and gene aliases updated in April 2018 from NCBI.
- Gene lengths retrieved from Gencode Release 27 (mapped to GRCh37).
- Pfam graphics fetched in August 2017.
- Gene Sets from MSigDB 6.1.
- Cancer Types from OncoTree (fetched July 2018 from http://oncotree.mskcc.org).
This schema is required for cBioPortal release versions:
- 1.12.x
- 1.13.x
- 1.14.0
For release versions > 1.14.0, a migration step to a new database schema might be required. The migration process is described here.
Schema 2.6.0: SQL file with create table statements
Seed database: seed-cbioportal_hg19_v2.6.0.sql.gz
md5sum aafc9da7b72a29f3978ddca31004b8f5
Contents of seed database:
- Entrez Gene IDs, gene symbols, and gene aliases updated in April 2018 from NCBI.
- Gene lengths retrieved from Gencode Release 27 (mapped to GRCh37).
- Pfam graphics fetched in August 2017.
- Gene Sets from MSigDB 6.1.
- Cancer Types from OncoTree (fetched July 2018 from http://oncotree.mskcc.org).
This schema is required for cBioPortal release versions:
- 1.9.0
For release versions > 1.9.0, a migration step to a new database schema might be required. The migration process is described here.
Schema 2.4.0: SQL file with create table statements
Seed database: seed-cbioportal_hg19_v2.4.0.sql.gz
md5sum 1014ed1f9d72103f2b46e5615aacbc2f
cBioPortal 1.9.0 with database schema 2.4.0 removed PDB annotations from the database.
Contents of seed database:
- Entrez Gene IDs, gene symbols, and aliases updated in August 2017 from NCBI.
- Gene lengths retrieved from Gencode Release 26 (mapped to GRCh37).
- Pfam graphics fetched in August 2017.
This schema is required for cBioPortal release versions:
- 1.7.1
- 1.7.2
- 1.7.3
- 1.8.0
For release versions > 1.8.0, a migration step to a new database schema might be required. The migration process is described here.
Schema 2.3.1: SQL file with create table statements
Seed database part1 (no PDB tables): seed-cbioportal_hg19_v2.3.1.sql.gz
md5sum 324be3d975d22019ee0c82ce0542bcc3
Seed database part2 (optional, only PDB tables): seed-cbioportal_hg19_v2.3.1_only-pdb.sql.gz
md5sum 5774a7947cdf5ef78fd737f1bea688cc
Contents of seed database:
- Entrez Gene IDs, Hugo symbols, and aliases updated in August 2017 from NCBI.
- Gene lengths retrieved from Gencode Release 26 (mapped to GRCh37).
- Pfam graphics fetched in August 2017.
This schema is required for older cBioPortal release versions:
- 1.5.0
- 1.5.1
- 1.5.2
When using this older seed database with a release version > 1.5.2, a migration step to a new database schema is required. The migration process is described here.
Schema 2.1.0: SQL file with create table statements
Seed database part1 (no PDB tables): seed-cbioportal_hg19_v2.1.0.sql.gz
md5sum fe4e8502034f72f182733a72b50dbbc8
Seed database part2 (optional, only PDB tables): seed-cbioportal_hg19_v2.1.0_only-pdb.sql.gz
md5sum 5774a7947cdf5ef78fd737f1bea688cc
Contents of seed database:
- Entrez Gene IDs, Hugo symbols, and aliases updated in September 2016 from NCBI.
- Gene lengths retrieved from Gencode Release 25 (mapped to GRCh37).
- Pfam graphics fetched in September 2016.