The Cancer Genomic Data Server (CGDS) web service interface provides direct programmatic access to all genomic data stored within the server. This enables you to access data from your favorite programming language, such as Python, Java, Perl, R or MATLAB. The CGDS web service is REST-based, meaning that client applications create a query consisting of parameters appended to a URL, and receive back either text or an XML response. For CGDS, all responses are currently tab-delimited text. Clients of the CGDS web service can issue the following types of queries:
- What cancer studies are stored on the server?
- What genetic profile types are available for cancer study X? For example, does the server store mutation and copy number data for the TCGA Glioblastoma data?
- What case sets are available for cancer study X? For example, what case sets are available for TCGA Glioblastoma?
Additionally, clients can easily retrieve "slices" of genomic data. For example, a client can retrieve all mutation data from PTEN and EGFR in the TCGA Glioblastoma data.
Please note that the example queries below are accurate, but they are not guaranteed to return data, as our database is constantly being updated.
The CGDS R Package
If you are interested in accessing CGDS via R, please check out our CGDS-R library.
Basic Query Syntax
All web queries are available at: webservice.do. All calls to the Web interface are constructed by appending URL parameters. Within each call, you must specify:
- cmd = the command that you wish to execute. The command must be equal to one of the following: getTypesOfCancer, getNetwork, getCancerStudies, getGeneticProfiles, getProfileData, getCaseLists, getClinicalData, or getMutationData.
- optional additional parameters, depending on the command (see below).
For example, the following query will request all case lists for the TCGA GBM data:
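As a rough sketch of how such a query might be assembled in Python: the command names come from the list above, while the base URL and the study identifier `gbm_tcga` are placeholders assumed for illustration, since only the relative path webservice.do is given here.

```python
from urllib.parse import urlencode

# Hypothetical host: only the relative path "webservice.do" is documented here.
BASE_URL = "https://example.org/webservice.do"

# Commands listed in the Basic Query Syntax section.
VALID_COMMANDS = {
    "getTypesOfCancer", "getNetwork", "getCancerStudies", "getGeneticProfiles",
    "getProfileData", "getCaseLists", "getClinicalData", "getMutationData",
}


def build_query_url(cmd, **params):
    """Return a query URL with cmd first, followed by any extra parameters."""
    if cmd not in VALID_COMMANDS:
        raise ValueError(f"Unknown command: {cmd}")
    query = {"cmd": cmd}
    query.update(params)
    return f"{BASE_URL}?{urlencode(query)}"


# "gbm_tcga" is an assumed study identifier, used only as an example value.
url = build_query_url("getCaseLists", cancer_study_id="gbm_tcga")
print(url)
```

Checking the command name before sending the request catches typos locally instead of relying on a server-side error message.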
Response Header and Error Messages
The first line of each response begins with a hash mark (#), and will contain data regarding the server status. For example:
# CGDS Kernel: Data served up fresh at: Wed Oct 27 13:02:30 EDT 2010
If any errors have occurred in processing your query, they will appear directly after the status message. Error messages begin with the "Error:" tag. Warning messages begin with the "# Warning:" tag. Unrecoverable errors are reported as errors. For example:
# CGDS Kernel: Data served up fresh at: Wed Oct 27 13:02:30 EDT 2010
Error: No case lists available for cancer_study_id: gbs.
Recoverable errors, such as invalid gene symbols, are reported as warnings. Multiple warnings may also be returned. For example:
# CGDS Kernel: Data served up fresh at: Wed Oct 27 13:06:34 EDT 2010
# Warning: Unknown gene: EGFR11
# Warning: Unknown gene: EGFR12
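A client might separate the status line, warnings, and errors from the rest of a response along the following lines. This is a minimal sketch that assumes each message arrives on its own line; the function name and the returned dictionary layout are illustrative choices, not part of the service.

```python
def read_response_messages(text):
    """Split a response into status, warnings, errors, and remaining body lines."""
    result = {"status": None, "warnings": [], "errors": [], "body": []}
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("# Warning:"):
            result["warnings"].append(line[len("# Warning:"):].strip())
        elif line.startswith("Error:"):
            result["errors"].append(line[len("Error:"):].strip())
        elif line.startswith("#"):
            # The first hash line is taken as the status line. Any later hash
            # lines are skipped (an assumption about how comments should be handled).
            if result["status"] is None:
                result["status"] = line[1:].strip()
        else:
            result["body"].append(line)
    return result


sample = (
    "# CGDS Kernel: Data served up fresh at: Wed Oct 27 13:06:34 EDT 2010\n"
    "# Warning: Unknown gene: EGFR11\n"
    "# Warning: Unknown gene: EGFR12\n"
)
messages = read_response_messages(sample)
print(messages["warnings"])
```

Since warnings describe recoverable problems, a caller would typically log them and continue, but stop as soon as the errors list is non-empty.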
As of August, 2011:
In previous versions of the API, the getCancerStudies command was referred to as getCancerTypes. For backward compatibility, getCancerTypes still works, but is now considered deprecated.
In previous versions of the API, the cancer_study_id parameter was referred to as cancer_type_id. For backward compatibility, cancer_type_id still works, but is now considered deprecated.
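Older client code can be migrated by rewriting the two deprecated names before a query is sent. The helper below is a hypothetical sketch of that rewrite; only the old and new names themselves are taken from the notes above.

```python
# Deprecated name -> current name, as described in the August 2011 notes.
DEPRECATED_COMMANDS = {"getCancerTypes": "getCancerStudies"}
DEPRECATED_PARAMS = {"cancer_type_id": "cancer_study_id"}


def modernize_query(params):
    """Return a copy of params with deprecated command and parameter names replaced."""
    updated = {}
    for name, value in params.items():
        name = DEPRECATED_PARAMS.get(name, name)
        if name == "cmd":
            value = DEPRECATED_COMMANDS.get(value, value)
        updated[name] = value
    return updated


# "gbm" is a placeholder study identifier used only to show the renaming.
print(modernize_query({"cmd": "getCaseLists", "cancer_type_id": "gbm"}))
```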
|Get All Types of Cancer|
|Description||Retrieves a list of all the clinical types of cancer stored on the server.|
|Query Format||cmd=getTypesOfCancer (required)|
A tab-delimited file with two columns:
|Example||Get all Types of Cancer.|
|Get All Cancer Studies|
|Description||Retrieves meta-data regarding cancer studies stored on the server.|
|Query Format||cmd=getCancerStudies (required)|
A tab-delimited file with three columns:
|Example||Get all Cancer Studies.|
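Because responses are tab-delimited text, they can be read with a standard delimited-text parser. The sketch below assumes that the first line not starting with a hash mark is a header row naming the columns; the column names and values in the sample are invented placeholders, not actual server output.

```python
import csv
import io


def parse_table(text):
    """Parse a tab-delimited response into a list of dicts keyed by column name."""
    lines = []
    for line in text.splitlines():
        if line.startswith("Error:"):
            raise RuntimeError(line)
        # Skip blank lines and hash-prefixed status or warning lines.
        if line.strip() and not line.startswith("#"):
            lines.append(line)
    reader = csv.DictReader(
        io.StringIO("\n".join(lines)), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return list(reader)


# Placeholder response with three assumed columns.
sample = (
    "# status line\n"
    "cancer_study_id\tname\tdescription\n"
    "study_a\tStudy A\tFirst placeholder study\n"
)
rows = parse_table(sample)
print(rows[0]["name"])
```

Quoting is disabled so that any quotation marks inside free-text fields are kept as literal characters.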
|Get All Genetic Profiles for a Specific Cancer Study|
|Description||Retrieves meta-data regarding all genetic profiles, e.g. mutation or copy number profiles, stored about a specific cancer study.|
A tab-delimited file with six columns:
|Example||Get all Genetic Profiles for Glioblastoma (TCGA).|
|Get All Case Lists for a Specific Cancer Study|
|Description||Retrieves meta-data regarding all case lists stored about a specific cancer study. For example, within a particular study, only some cases may have sequence data, and another subset of cases may have been sequenced and treated with a specific therapeutic protocol. Multiple case lists may be associated with each cancer study, and this method enables you to retrieve meta-data regarding all of these case lists.|
A tab-delimited file with five columns:
|Example||Get all Case Lists for Glioblastoma (TCGA).|
|Get Profile Data|
|Description||Retrieves genomic profile data for one or more genes.|
You can either:
Response Format 1
When requesting one or multiple genes and a single genetic profile ID (see above), you will receive a tab-delimited matrix with the following columns:
Response Format 2
When requesting a single gene and multiple genetic profile IDs (see above), you will receive a tab-delimited matrix with the following columns:
|Example||See Query Format above.|
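The choice between the two response formats depends only on how many genes and genetic profile IDs are requested. A hypothetical helper that encodes this rule is sketched below; rejecting the combination of several genes with several profiles is an assumption, since that case is not described above.

```python
def expected_profile_format(genes, profile_ids):
    """Return 1 or 2 for the response format implied by the request shape."""
    if not genes or not profile_ids:
        raise ValueError("At least one gene and one genetic profile ID are needed")
    if len(profile_ids) == 1:
        # One or multiple genes with a single genetic profile ID.
        return 1
    if len(genes) == 1:
        # A single gene with multiple genetic profile IDs.
        return 2
    # Assumption: several genes combined with several profiles is not supported.
    raise ValueError("Use one profile with many genes, or one gene with many profiles")


# The profile IDs below are placeholders, not identifiers from the server.
print(expected_profile_format(["PTEN", "EGFR"], ["profile_a"]))
print(expected_profile_format(["EGFR"], ["profile_a", "profile_b"]))
```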
|Get Extended Mutation Data|
|Description||For data of type EXTENDED_MUTATION, you can request the full set of annotated extended mutation data. This enables you to, for example, determine which sequencing center sequenced the mutation, the amino acid change that results from the mutation, or gather links to predicted functional consequences of the mutation.|
A tab-delimited file with the following columns:
|Get Clinical Data|
|Description||Retrieves overall survival, disease-free survival and age at diagnosis for specified cases. Due to patient privacy restrictions, no other clinical data is available.|
A tab-delimited file with the following columns:
|Example||Get Clinical Data for All TCGA Ovarian Cases.|
|Get Protein/Phosphoprotein Antibody Information|
|Description||Retrieves information on antibodies used by reverse-phase protein arrays (RPPA) to measure protein/phosphoprotein levels.|
You will receive a tab-delimited matrix with the following 4 columns:
|Get RPPA-based Proteomics Data|
|Description||Retrieves protein and/or phosphoprotein levels measured by reverse-phase protein arrays (RPPA).|
Response Format 1
If the array_info parameter is not specified or is not 1, you will receive a tab-delimited matrix with the following columns:
Response Format 2
If the array_info parameter is 1, you will receive a tab-delimited matrix with the following columns:
Linking to Us
Once you have a cancer_study_id, it is very easy to create stable links from your web site to the cBio Portal. Stable links must point to ln, and can include the following parameters:
- q=[a query following Onco Query Language, e.g. a space-separated list of HUGO gene symbols] (required)
- cancer_study_id=[cancer study ID] (optional; if not specified, a cross-cancer query is performed)
- report=[report to display; can be one of: full (default), oncoprint_html]
For example, here is a link to the TCGA GBM data for EGFR and NF1:
And a link to TP53 mutations across all cancer studies:
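Stable links of this kind can also be generated programmatically. The following sketch uses the three parameters listed above; the host name and the study identifier `gbm_tcga` are placeholders assumed for illustration, and the function name is hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical host: only the relative path "ln" is documented here.
PORTAL_BASE = "https://example.org"

REPORT_TYPES = {"full", "oncoprint_html"}


def build_stable_link(q, cancer_study_id=None, report=None):
    """Return a stable link; omitting cancer_study_id requests a cross-cancer query."""
    if not q:
        raise ValueError("q is required")
    params = {"q": q}
    if cancer_study_id:
        params["cancer_study_id"] = cancer_study_id
    if report:
        if report not in REPORT_TYPES:
            raise ValueError(f"Unknown report type: {report}")
        params["report"] = report
    # urlencode encodes the space between gene symbols as "+".
    return f"{PORTAL_BASE}/ln?{urlencode(params)}"


# "gbm_tcga" is an assumed study identifier, used only as an example value.
study_link = build_stable_link("EGFR NF1", cancer_study_id="gbm_tcga")
cross_cancer_link = build_stable_link("TP53")
print(study_link)
print(cross_cancer_link)
```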