Vignette: Exploring focal gene hits in Progenetix and arrayMap
When exploring a candidate oncogene, one of the interesting questions is the frequency of copy number abnormalities involving the gene's locus in different cancer types. While Progenetix offers a powerful platform to detect cancers of interest, the specifics of those changes can be explored with the help of arrayMap.
Example: Focal gain/amplification involving the MYCN locus
1 Go to "Gene CNA Frequencies" in Progenetix
2 Start to type the gene's name and select the correct one
3 Options
- select "More Options" and change the region size to 5000 (kb)
- change the region type from "9" to "1" (only gains)
4 Receive the scores
- for the different subsets, the relative number (percentage) of samples with the hit is shown
- a "score" value weighs this by the overall genome complexity in the subset (i.e. higher complexity => reduced score)
(... to be continued)
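The percentage reported in step 4 can be read as the share of samples in a subset that carry at least one gain overlapping the region around the gene. Below is a minimal Python sketch of that counting step. It assumes segments are available as (chromosome, start, stop, value) tuples; the locus coordinates, sample names and segment values are made-up placeholders and not Progenetix data. The complexity-weighted score is not reproduced, since its formula is not given here.

```python
def has_gain_hit(segments, chrom, start, stop):
    """Return True if any gain segment (value > 0) overlaps chrom:start-stop."""
    return any(
        seg_chrom == chrom and value > 0 and seg_start <= stop and seg_stop >= start
        for seg_chrom, seg_start, seg_stop, value in segments
    )


def gain_frequency(samples, chrom, start, stop):
    """Percentage of samples with at least one gain overlapping the region."""
    if not samples:
        return 0.0
    hits = sum(has_gain_hit(segs, chrom, start, stop) for segs in samples.values())
    return 100.0 * hits / len(samples)


# Hypothetical region and samples, for illustration only
LOCUS = ("2", 15_000_000, 17_000_000)
samples = {
    "s1": [("2", 14_000_000, 18_000_000, 1)],   # gain spanning the region
    "s2": [("2", 14_000_000, 18_000_000, -1)],  # loss, not counted
    "s3": [("2", 30_000_000, 40_000_000, 1)],   # gain elsewhere
    "s4": [("2", 16_500_000, 20_000_000, 1)],   # gain with partial overlap
}
frequency = gain_frequency(samples, *LOCUS)
print(frequency)
```

Setting the region type to "only gains" corresponds to the `value > 0` condition in this sketch.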
Vignette: Prepare annotated files for upload and processing
This is a workflow for processing your own data, including annotation fields (e.g. diagnoses, clinical data) for group visualisation.
- process your samples (e.g. from segmentation file)
- click on "Download Files ..." to show the options, and select "PROGENETIX TAB FILE"
- open this in a spreadsheet software (e.g. OpenOffice or LibreOffice; or in a text editor and copy the content into a spreadsheet)
- fill in the missing data
- save as a tab-delimited text file (preferably Unix line feed endings, fields not quoted)
- reload your file and select the "tab delimited" format; use the correct aCGH or cCGH assignment
- process ...
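If the annotations are filled in by a script rather than in a spreadsheet, the "save as" step can be sketched in Python as follows. The column names and values are hypothetical placeholders (the real column set is the one in the downloaded PROGENETIX TAB FILE); the sketch only illustrates the output conventions named above: tab-delimited, Unix line feeds, no quoted fields.

```python
import os
import tempfile


def clean(value):
    """Replace tabs and line breaks so that a value cannot break the row layout."""
    return str(value).replace("\t", " ").replace("\r", " ").replace("\n", " ")


def write_tab_file(path, fieldnames, rows):
    """Write rows as unquoted, tab-delimited text with Unix line endings."""
    # newline="\n" disables platform-specific line ending translation
    with open(path, "w", newline="\n", encoding="utf-8") as handle:
        handle.write("\t".join(fieldnames) + "\n")
        for row in rows:
            handle.write("\t".join(clean(row.get(name, "")) for name in fieldnames) + "\n")


# Hypothetical column names and values, for illustration only
fieldnames = ["sample_id", "diagnosis", "age"]
rows = [
    {"sample_id": "case-1", "diagnosis": "neuroblastoma", "age": 3},
    {"sample_id": "case-2", "diagnosis": "breast\tcarcinoma"},  # stray tab, missing age
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "annotated_samples.tab")
    write_tab_file(path, fieldnames, rows)
    with open(path, "rb") as handle:
        raw = handle.read()
print(raw)
```

Missing values are written as empty fields, so every row keeps the same number of columns.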

2012 Publication - arrayMap oncogenomic database announcement
The PDF of our PLoS ONE publication announcing the arrayMap cancer genome database.
2012 Publication: Specific Genomic Regions Are Differentially Affected by Copy Number Alterations ...
Specific Genomic Regions Are Differentially Affected by Copy Number Alterations across Distinct Cancer Types, in Aggregated Cytogenetic Data
Nitin Kumar, Haoyang Cai, Christian von Mering, Michael Baudis
PLoS One (2012) 7: e43689. [PubMed]
30449 cancer genome profiles from 1001 publications now included in Progenetix
By adding some new data sets and annotating some more of the evaluated data from arrayMap, Progenetix now has more than 30000 cancer genome copy number profiles, from 1001 publications. The data consists of 20531 chromosomal CGH and 10024 genomic array profiles, and covers 364 diagnostic entities according to ICD-O 3.
@progenetix | arrayMap Changes
2013-05-22
- bug fix: fixing lack of clustering for CNA frequency profiles in the analysis section
- removed "Series Search" from the arrayMap side bar; kind of confusing - just search for the samples & select the series
2013-05-12
- introduced a method to combine sample annotations and segmentation files for user data processing (see "FAQ & GUIDE")
- fixed some array plot presentation and replotting problems
2013-05-05
- consolidation of script names - again, don't use deep links (besides for "api.cgi?...")
- moving of remaining sample selection options (random sample number, segments number, age range) to the sample selection page, leaving the pre-analysis page (now "prepare.cgi") for plotting/grouping options
- fixed the KM-style survival plots
2013-04-10
- re-factoring of the cytobands plotting for histograms and heatmaps; this also fixes missing histogram tiles
- analysis output page: the circular histogram/connections plot and group specific histograms are now all available as both SVG and PNG image files
2013-04-06
Some changes to the plotting options:
- the circular plot is now added as a default; and connections are drawn in for <= 30 samples (subject to change)
- one can now mark up multiple genes (or other loci of interest), for all plot types
2013-03-25
- added option to create custom analysis groups based on text match values
- rewritten circular plot code
2013-02-27
- copied data for PMIDs 17327916, 17311676, 18506749 and 18246049 from arrayMap to Progenetix
2013-02-24
- bug fix: gene selector was broken for about a week; fixed
2013-02-17
- In many places, images are now converted server side to PNG data streams and embedded into the web pages. This will substantially decrease web data traffic and page download times. Fully linked SVG images (including region links etc.) are still available through the analysis pipeline.
2013-02-13
- data fix: PMID 18160781 had missing loss values (due to irregular character encoding); fixed, thanks to Emanuela Felley-Bosco for the note!
2012-12-14
- moved the region filter from the analysis to the sample selection page
- added a "mark region" option to the analysis page: one now can highlight a genome region in histograms and matrix plots
2012-11-29
- added "select all" option to entity lists
- implemented first version of sample-to-entity match score
- added single sample annotation input field to "User File Processing"; i.e. one can now type in CNA data for a single case, and have this visualised and similar cases listed
- added per sample CNA visualisation to the samples details listings (currently if up to 100 samples)
- added direct access to sample details listing to the subsets pages
2012-11-09
- adding of abstract search to the publication search page
2012-10-25
- introduction of a matching function for similar cases by CNA profile, accessible through the sample details pages of both Progenetix and arrayMap
2012-10-22
- Introduction of SEER groups
2012-09-26
The database now contains the copy number status for different interval sizes (e.g. 1Mb). With this, users can now create their own data plots (histograms etc.) using more than 10000 cancer copy number profiles with a high resolution. The options here are still being tested and improved - comments welcome!
2012-09-18
- added a new export file format "ANNOTATED SEGMENTS FILE", which uses the first columns for standard segment annotation, followed by some diagnostic and clinical data; i.e., the information for a case is repeated for each segment:
GSM255090 22 25063244 25193559 1 NA C50 8500/3 breast Infiltrating duct carcinoma, NOS Carcinomas: breast ca. NA 1 51 0.58
GSM255090 22 25368299 48899534 -1 NA C50 8500/3 breast Infiltrating duct carcinoma, NOS Carcinomas: breast ca. NA 1 51 0.58
GSM255091 1 2224111 30146401 -1 NA C50 8500/3 breast Infiltrating duct carcinoma, NOS Carcinomas: breast ca. NA 0 72 0.54
GSM255091 1 35418712 37555461 1 NA C50 8500/3 breast Infiltrating duct carcinoma, NOS Carcinomas: breast ca. NA 0 72 0.54
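Because the case-level information is repeated on every row, such a file can be folded back into one record per sample. The following Python sketch assumes that the fields are tab-separated and that the first six columns are sample ID, chromosome, start, stop, segment value and probe number (as in the rows above); the remaining columns are kept as an uninterpreted list. The tab positions inside the annotation part of the example rows are an assumption.

```python
from collections import defaultdict


def parse_annotated_segments(lines):
    """Group annotated segment rows by sample ID."""
    per_sample = defaultdict(lambda: {"segments": [], "annotation": None})
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        fields = line.split("\t")
        sample_id, chrom, start, stop, value, _probes = fields[:6]
        entry = per_sample[sample_id]
        entry["segments"].append((chrom, int(start), int(stop), float(value)))
        # The annotation columns are identical for all rows of one sample
        entry["annotation"] = fields[6:]
    return dict(per_sample)


# Two of the rows shown above; the tab placement is assumed
annotation = ["C50", "8500/3", "breast", "Infiltrating duct carcinoma, NOS",
              "Carcinomas: breast ca.", "NA", "1", "51", "0.58"]
example_lines = [
    "\t".join(["GSM255090", "22", "25063244", "25193559", "1", "NA"] + annotation),
    "\t".join(["GSM255090", "22", "25368299", "48899534", "-1", "NA"] + annotation),
]
parsed = parse_annotated_segments(example_lines)
print(parsed["GSM255090"]["segments"])
```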
2012-09-13
- added gene selection for region specific replotting of array data
2012-08-22
- the gene database has been changed to the last version of the complete (HUGO names only) Ensembl gene list for HG18; previously, only a subset of "cancer related genes" was offered in the gene selection search fields
2012-07-04
- some interface and form elements have been streamlined (e.g. less commonly used selector fields, sample selection options)
- some common options are now displayed only if activated (e.g. "mouse over" to see all files available for download)
- icon quality has been enhanced for all but the details pages
2012-06-13
- New: All pre-generated histogram and ideogram plots are now produced based on a 1Mb matrix, with a 500Kb minimum size filter to remove CNV/platform dependent background from some high resolution array platforms. The unfiltered data can still be visualized through the standard analysis procedures.
- Bug fix: Interactive segment size filtering so far only worked for region specific queries, but not as a general filter (see above). This has been fixed; a minimum segment size in the visualization options now will remove all smaller segments.
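The combination of a minimum segment size filter and a fixed-interval matrix can be illustrated with a short Python sketch. This is not the code used on the server. It assumes segments on a single chromosome given as (start, stop, value) tuples, and it assumes that any overlap with an interval marks that interval as gained or lost; both are simplifications chosen for the example.

```python
MIN_SEGMENT_SIZE = 500_000   # 500Kb minimum size filter
BIN_SIZE = 1_000_000         # 1Mb intervals


def filter_segments(segments, min_size=MIN_SEGMENT_SIZE):
    """Drop segments shorter than min_size bases."""
    return [seg for seg in segments if seg[1] - seg[0] >= min_size]


def bin_status(segments, chrom_length, bin_size=BIN_SIZE):
    """Return (gains, losses): one 0/1 flag per interval of the chromosome."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    gains = [0] * n_bins
    losses = [0] * n_bins
    for start, stop, value in segments:
        first = start // bin_size
        last = min((stop - 1) // bin_size, n_bins - 1)
        target = gains if value > 0 else losses
        for index in range(first, last + 1):
            target[index] = 1
    return gains, losses


# Made-up segments on a 5 Mb toy chromosome
segments = [
    (0, 2_500_000, 1),           # 2.5 Mb gain, kept
    (3_000_000, 3_200_000, -1),  # 200 kb loss, removed by the filter
    (4_000_000, 5_000_000, -1),  # 1 Mb loss, kept
]
kept = filter_segments(segments)
gains, losses = bin_status(kept, chrom_length=5_000_000)
print(gains, losses)
```

Summing such per-sample flags over all samples of a subset gives the per-interval frequencies that a histogram displays.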
2012-06-01
- NEW: change log; that is what is shown here
- FEATURE: The interval selector now has options to include the p-arms of acrocentric chromosomes (though the data itself there may be incompletely annotated!). Feature requested by Melody Lam.
API: Progenetix/arrayMap CNA plots
2013-02-18
Change: Standard plot format is now PNG. SVG images can be called by adding "&imgFormat=svg" to the call.
Examples:
PNG (hepatocellular carcinoma from default=Progenetix database)
http://www.progenetix.org/cgi-bin/api.cgi?ICDO3=8170/3
SVG
http://www.progenetix.org/cgi-bin/api.cgi?ICDO3=8170/3&imgFormat=svg
SVG with regional links to the UCSC browser
http://www.progenetix.org/cgi-bin/api.cgi?ICDO3=8170/3&imgFormat=svg&plotLinks=1
SVG of a random 50 sample subset of ICD-O 3 8170/3
http://www.progenetix.org/cgi-bin/api.cgi?ICDO3=8170/3&imgFormat=svg&plotLinks=1&randSampleNo=50
2013-02-07
We now provide real-time copy number frequency plots, for both our Progenetix and arrayMap collections. At this time, the API calls will deliver SVG images only; they are the qualitatively best solution (scalable, clickable, embeddable ...), but may fail in ancient browsers - please use recent editions of Safari/Firefox/Chrome etc.
The link structure is shown below. We'll try to keep this stable; however, please let us know if you implement these links in production environments. And please follow our Twitter feed @progenetix.
Since the plots are generated in real-time and are rather complex (i.e. >1MB for a histoplot with 1Mb resolution), it may take some seconds until the image is returned & interpreted.
The base constructor starts with
http://www.arraymap.org/cgi-bin/api.cgi?
or
http://www.progenetix.org/cgi-bin/api.cgi?
... followed by one of the required base parameters
- ICDO3=nnnn/n
- PMID=nnnnnnnn
- SERIESID=xxxxxxxxxx
Please note that the keys (ICDO3 ...) are all caps, and that the values have to be full matches to existing parameters in Progenetix or arrayMap.
Scope: Data is queried in the scope of either the Progenetix or the arrayMap collection. Queries default to Progenetix, except for SERIESID queries, which default to arrayMap.
Correct minimal query examples would be:
- http://www.arraymap.org/cgi-bin/api.cgi?PMID=22824167&project=arraymap
- http://www.progenetix.org/cgi-bin/api.cgi?ICDO3=8170/3&project=progenetix
- http://www.arraymap.org/cgi-bin/api.cgi?SERIESID=GSE6109
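For scripted access, the link construction can be wrapped in a small helper. The Python sketch below is an illustration and not an official client: the function name `build_api_url` is made up, and it assumes that the host should match the selected project and that always sending an explicit "project" parameter is harmless.

```python
from urllib.parse import urlencode

BASE_URLS = {
    "progenetix": "http://www.progenetix.org/cgi-bin/api.cgi",
    "arraymap": "http://www.arraymap.org/cgi-bin/api.cgi",
}
BASE_KEYS = ("ICDO3", "PMID", "SERIESID")  # keys are all caps


def build_api_url(key, value, project=None, **options):
    """Build an API link from one required base parameter plus optional extras."""
    if key not in BASE_KEYS:
        raise ValueError(f"base parameter must be one of {BASE_KEYS}")
    if project is None:
        # Default scope: Progenetix, except for SERIESID queries
        project = "arraymap" if key == "SERIESID" else "progenetix"
    if project not in BASE_URLS:
        raise ValueError(f"unknown project: {project}")
    params = {key: value, "project": project}
    params.update(options)
    # Keep "/" and "," readable, as in the example links
    return BASE_URLS[project] + "?" + urlencode(params, safe="/,")


url_icdo = build_api_url("ICDO3", "8170/3")
url_pmid = build_api_url("PMID", "22824167", project="arraymap")
url_series = build_api_url("SERIESID", "GSE6109")
print(url_icdo)
print(url_pmid)
print(url_series)
```

The first two results reproduce the minimal query examples listed above.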
Plot options
The standard return will be a histogram of genomic gains/losses (chromosomes 1-22) in the selected dataset, in the format of an SVG vector plot. Other options can be chosen by adding a query parameter "plot", with one of the values:
- adding "&plot=ideogram" will produce CNA frequencies in a standard chromosomal ideogram arrangement
- adding "&plot=chr8" (with "8" being one of the chromosomes) will just deliver this chromosome in an upright gain/loss frequency plot - basically a cut-out from the histogram
- adding "&plotLinks=1" will produce an SVG in which each interval is linked to the UCSC genome browser; however, the image size will increase dramatically (for a histoplot from ~250 kB to 1.5 MB)
- adding "&chr2plot=8,11" to the histoplot (or without plot selection) will produce a histoplot of all the comma-separated chromosomes; if fewer than 3 are given, the image will default to the "linked" version
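The plot options listed above can be appended to a base link with a few lines of Python. The helper name `plot_options` is hypothetical, and limiting single-chromosome plots to chromosomes 1-22 is an assumption based on the chromosome range of the standard histogram.

```python
import re


def plot_options(plot=None, plot_links=False, chr2plot=None):
    """Return a query string suffix such as '&plot=chr8' for the given options."""
    parts = []
    if plot is not None:
        # Accept "ideogram" or a single chromosome "chr1" ... "chr22" (assumed range)
        if plot != "ideogram" and not re.fullmatch(r"chr([1-9]|1[0-9]|2[0-2])", plot):
            raise ValueError(f"unsupported plot value: {plot}")
        parts.append(f"plot={plot}")
    if plot_links:
        parts.append("plotLinks=1")
    if chr2plot:
        parts.append("chr2plot=" + ",".join(str(chrom) for chrom in chr2plot))
    return "".join("&" + part for part in parts)


base = "http://www.progenetix.org/cgi-bin/api.cgi?ICDO3=8170/3"
url_chr8 = base + plot_options(plot="chr8")
url_ideogram_linked = base + plot_options(plot="ideogram", plot_links=True)
url_two_chromosomes = base + plot_options(chr2plot=[8, 11])
print(url_chr8)
print(url_ideogram_linked)
print(url_two_chromosomes)
```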
Examples:
arrayMap and Progenetix interface update
The navigation icons of the arrayMap and Progenetix sites have been updated. This is mostly a cosmetic change, but some of the linking has been streamlined, too.
arrayMap feature update(s)
Over the last weeks, we have introduced a number of new search/ordering features to arrayMap. Some of those mimic functions previously implemented in Progenetix. Overall, the highlights are:
- ICD entity aggregation: all ICD-O entities with their corresponding samples
- ICD locus aggregation: all tumor loci with their corresponding samples
- Clinical group aggregation: clinical super-entities (e.g. "breast ca.": all carcinoma types with locus breast) with their samples
- Publication aggregation: all publications with samples in arrayMap
In contrast to Progenetix, we do not offer precomputed SCNA histograms. However, users can generate them on the fly, but should consider the specific challenges in doing so (e.g. noise background in frequency calculations).
arrayMap featured at the Journal of the National Cancer Institute
A news feature by Mike Martin discusses our arrayMap resource in a recent issue of the Journal of the National Cancer Institute (JNCI).
arrayMap manuscript accepted at PLoS ONE
Cai, H., N. Kumar, and M. Baudis. arrayMap: A Reference Resource for Genomic Copy Number Imbalances in Human Malignancies. PLoS ONE 2012: accepted.
The original version of the manuscript is available at
We will announce the final version as soon as it becomes available.
arrayPlotter feature update
The arrayPlotter module underwent some enhancements:
- BUG FIX: array segments without probe number (e.g. from GP annotated data) are not removed anymore when re-plotting
- NEW: Baseline correction. This is useful for re-plotting arrays which have a shift of the "normal" probe value away from 0. This correction is automatically applied to the thresholds, too (i.e. a BLC of 0.5 with GTH 0.15 and LTH -0.15 will call original values of -0.4 as gain and -0.6 as loss).
- NEW: More parameters are now shifted towards the "PLOT FACTORS" field, for free text editing. Be careful, though ...
Enjoy!
Browser Compatibility
Pages are created dynamically and mostly are being served as XML. Some browsers have problems with the XHTML/XML doctype. For older browsers, all pages are served as HTML, which on the other hand breaks SVG compatibility.
Working browsers for all features are (oldest compatible versions listed):
- Safari 3
- Safari iOS
- Firefox 3
- Google Chrome
- Internet Explorer 9
Most other recent browsers (Opera etc.) should be fine, too, but haven't been tested. The basic requirements for full display are:
- inline SVG (but possibly can be achieved with plug-in)
- HTML5 canvas support
Citation
Progenetix: For any use of the Progenetix data, e.g. as a reference for aberration frequencies in a certain locus, it is necessary to cite both the website and the original Bioinformatics publication:
- Baudis, M., & Cleary, M. L. (2001). Progenetix.net: an online repository for molecular cytogenetic aberration data. Bioinformatics, 17(12), 1228-1229.
- Progenetix oncogenomic online resource: www.progenetix.net. Baudis, M. (2012)
In case of citation restrictions, you may just use the Bioinformatics citation, and put the website in the text. A proper citation would look e.g. like:
... according to the Progenetix resource ([1]; www.progenetix.org), copy number ...
... and in the citations:
- Baudis, M., & Cleary, M. L. (2001). Progenetix.net: an online repository for molecular cytogenetic aberration data. Bioinformatics, 17(12), 1228-1229.
arrayMap: For arrayMap data, the same rules apply: Citation of the article and the website:
- Cai, H., Kumar, N., & Baudis, M. 2012. arrayMap: A Reference Resource for Genomic Copy Number Imbalances in Human Malignancies. PLoS One 7(5), e36944.
- arrayMap: Genomic arrays for copy number profiling in human cancer (www.arraymap.org). Baudis, M. (2012)
Google scholar publication search
Selected articles using Progenetix as reference or making use of the online tools
CNA sample profile similarity search
Cancer genome copy number profiles of samples from both Progenetix and arrayMap can now be queried for cases with similar CNA profiles. The function is currently accessible through the sample details pages of both Progenetix and arrayMap. Enjoy!
Data Download
Data files can be downloaded after having performed a database search or data analysis procedure. An example download box is shown below:
The corresponding file formats are:
PROGENETIX JSON FILE
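This is a standard JSON file structure with each line being a sample entry. You can read it, e.g., into a list in Perl:
    use strict;
    use warnings;
    use JSON;

    # relaxed(1) tolerates minor deviations such as trailing commas
    my $json = JSON->new->relaxed(1);
    my $file = "myPathTo/progenetix.json";

    # read all lines of the file; each line is one sample entry
    open(my $fh, '<', $file) or die "No file $file: $!";
    my @filecontent = <$fh>;
    close $fh;
    chomp @filecontent;

    # decode each line into a Perl data structure
    my @data;
    foreach my $line (@filecontent) {
        push(@data, $json->decode($line));
    }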
PROGENETIX TAB FILE
This is a tab-delimited text file containing most of the data fields. CNA segments are concatenated in one entry:
chr1:158800000-247249719:1::chr4:0-107899999:-1::chr6:29900000-45199999:-1
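The concatenated entry can be split back into individual segments. Below is a minimal Python sketch, assuming the layout seen in the example: segments separated by "::", each written as chromosome, colon, start-stop, colon, value. The function name is made up for illustration.

```python
def parse_cna_string(text):
    """Split a concatenated CNA entry into (chromosome, start, stop, value) tuples."""
    segments = []
    for chunk in text.split("::"):
        if not chunk:
            continue
        # The value follows the last colon; it may be negative (e.g. "-1")
        location, value = chunk.rsplit(":", 1)
        chrom, span = location.split(":", 1)
        start, stop = span.split("-", 1)
        segments.append((chrom, int(start), int(stop), float(value)))
    return segments


entry = "chr1:158800000-247249719:1::chr4:0-107899999:-1::chr6:29900000-45199999:-1"
cna_segments = parse_cna_string(entry)
for segment in cna_segments:
    print(segment)
```

Splitting off the value from the right keeps the minus sign of a loss value separate from the hyphen between start and stop.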
SEGMENTS LIST FILE
CNA segment information saved as a tab-delimited list, including the sample UID in the first column:
sampleID	chro	basestart	basestop	segvalue	probes
GIST-ass-15	1	0	124299999	-1	NA
GIST-ass-16	1	142400000	149599999	1	NA
...
Depending on the active page, the value may be the original log2 value from an array or more commonly the status marker. "Probes" will only display a value when plotting array specific data.
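A segments list can be converted into the concatenated per-sample notation used in the tab file. The Python sketch below assumes a tab-delimited input with the header names shown above and simply copies the segment value as text; the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict


def segments_to_cna_strings(handle):
    """Return {sampleID: 'chrN:start-stop:value::...'} from a segments list file."""
    reader = csv.DictReader(handle, delimiter="\t")
    per_sample = defaultdict(list)
    for row in reader:
        per_sample[row["sampleID"]].append(
            f"chr{row['chro']}:{row['basestart']}-{row['basestop']}:{row['segvalue']}"
        )
    return {sample: "::".join(parts) for sample, parts in per_sample.items()}


# The two example rows from above, with tab-separated fields
segments_file = io.StringIO(
    "sampleID\tchro\tbasestart\tbasestop\tsegvalue\tprobes\n"
    "GIST-ass-15\t1\t0\t124299999\t-1\tNA\n"
    "GIST-ass-16\t1\t142400000\t149599999\t1\tNA\n"
)
cna_strings = segments_to_cna_strings(segments_file)
print(cna_strings)
```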
DIPG data collaboration meeting in Zurich
On Jan 09 and 10, we welcomed the members of the DIPG data working group for a meeting at the University of Zurich.
First copy number profiling data from methylation arrays added
We have added the first series of copy number aberration data from methylation arrays (Sturm et al., PMID 23079654) to Progenetix and arrayMap. Among the 210 glioma samples overall, the dataset contains 69 paediatric/young adult DIPG/high grade gliomas, which are included in the DIPG project (http://dipg.progenetix.org).
We will use this as a pilot project, to work on a future general use of this type of molecular screening data. However, we deem it worthwhile to provide the data in its current state - and we are very excited about these developments.
Haoyang Cai presenting chromothripsis data at Cancer Network Zurich retreat
Haoyang will present his results from the analysis of chromothripsis-like genome patterns at this year's CNZ retreat in Grindelwald:
- Chromothripsis-like patterns are recurring but heterogeneously distributed features in a survey of 22,347 cancer genomes
Happy New Year
Progenetix and arrayMap are back to normal - happy cancer genome data mining! For the next year, we plan some nice data & tool updates - stay posted.
Interface enhancements
Some interface and form elements have been streamlined, including the less commonly used selector fields as well as the sample selection options. Some common options are now displayed only if activated (e.g. "mouse over" to see all files available for download).
Also, icon quality has been enhanced for all but the details pages (where larger icons just would take up too much space).
Enjoy!
Licensing
Access to the site and data downloads are free for academic users.
Any commercial use (e.g. using the data for target validation, including Progenetix or arrayMap data into analysis systems) is dependent on a license granted through Michael Baudis, and managed through the University of Zurich.
New interval options
The interval selector now has options to include the p-arms of acrocentric chromosomes (though the data itself there may be incompletely annotated!). This feature was requested by Melody Lam.
New Progenetix article published at PLoS ONE
Our publication
Nitin Kumar, Haoyang Cai, Christian von Mering and Michael Baudis: Specific genomic regions are differentially affected by copy number alterations across distinct cancer types in aggregated cytogenetic data
has just been published at PLoS ONE.
Enjoy!
Nitin Kumar - a new @progenetix PhD
Oncogenomic Pattern Detection in Cancer Copy Number Alteration Data for Pathway Description and Disease Classification: Nitin Kumar from our group at the University of Zurich has successfully passed his exam for a PhD. Congratulations from the members of the Baudis group!
Nitin has been instrumental in developing some of the analytical algorithms (to be) implemented in Progenetix and arrayMap, and also in the survey of available genomic array data, finally resulting in the arrayMap resource. More to come ...
Progenetix & arrayMap RSS feed
Progenetix and arrayMap news and guide are now available through RSS:
feed://www.progenetix.org/tmp/progenetix/rss.xml
You can either subscribe to this, or follow it on Twitter @progenetix. Enjoy!
Progenetix and arrayMap API for cancer genome copy number aberration frequency profiles
As of today, we have launched a way to access and integrate different versions of our gain/loss frequency plots into your resources. The information can be found in the guide (API).
Progenetix and arrayMap cancer genome database server errors
There are some database search errors when accessing sample specific data from both Progenetix and arrayMap. We're working on it - should be fixed before the year is over...
Progenetix and arrayMap status update
While the cancer genome profile array CGH data in Progenetix and the arrayMap data are fine, chromosomal CGH copy number profiles are still incorrect. Working on it ...
Registration
As of March 2012, registration is only necessary for
- any commercial use of the database
- maintaining private projects
- participating in collaborative studies containing unpublished material
However, we are happy about feedback and suggestions.
For any use of the site by for-profit entities, an individual license has to be obtained. Since 2008, licensing for new licensees has been handled through the University of Zurich.
Please contact Michael Baudis for further information.
Renal cell carcinoma paper published at BMC Cancer
The study
Beleut et al.: Integrative genome-wide expression profiling identifies three distinct molecular subgroups of renal cell carcinoma with different patient outcome
... in which we had participated, has just become available through BMC Cancer. Congratulations to all co-authors!
The processed copy number data of the study can be accessed through Progenetix and arrayMap.
Search Samples
Samples can be queried by specifying a number of parameters and/or keywords. The following example shows a text query for anything with "renal" in diagnosis text, ICD-O text or locus text, limited to platforms containing "affy" in the name, and having a minimum of 45000 probes. Also, there will be a limitation to samples having any change overlapping the CDKN2A locus - the selection is just under way:
After performing the search, the user is presented with selection lists containing parameters encountered in the current samples, for further exclusion options.
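The logic of the example query can be sketched in Python for users who filter downloaded sample data themselves. The record field names, the case-insensitive matching and the sample values are assumptions made for this illustration and do not describe the site's implementation; the region filter for the CDKN2A locus is left out.

```python
def matches(sample, text, platform_text, min_probes):
    """True if a sample record fulfils the text, platform and probe number criteria."""
    text = text.lower()
    text_fields = (sample["diagnosis"], sample["icdo_text"], sample["locus_text"])
    return (
        any(text in field.lower() for field in text_fields)
        and platform_text.lower() in sample["platform"].lower()
        and sample["probes"] >= min_probes
    )


# Made-up sample records with hypothetical field names
records = [
    {"id": "a", "diagnosis": "Renal cell carcinoma", "icdo_text": "carcinoma",
     "locus_text": "kidney", "platform": "Affymetrix 250K", "probes": 250000},
    {"id": "b", "diagnosis": "Renal cell carcinoma", "icdo_text": "carcinoma",
     "locus_text": "kidney", "platform": "BAC array", "probes": 3000},
    {"id": "c", "diagnosis": "Glioblastoma", "icdo_text": "glioma",
     "locus_text": "brain", "platform": "Affymetrix 250K", "probes": 250000},
]
selected = [r["id"] for r in records if matches(r, "renal", "affy", 45000)]
print(selected)
```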
SEER categories in Progenetix and arrayMap
To reflect common cancer incidence and information systems, we have introduced an additional classification scheme based on the categories used by the NIH's Surveillance, Epidemiology and End Results (SEER) resource. Compared to what we observe in SEER, we provide additional adjustment of outlier data.
Spring collaboration meeting in Timisoara
From 2013-03-21 to 2013-03-23, the Swiss-Romanian cutaneous NHL collaboration meets in Timisoara.



