README for CisGenome Browser: A flexible tool for genomic data visualization

Hui Jiang


Table of Contents

Overview

Download & Installation

Windows
Linux or Mac

Running Time Directory Structure

Supported File Formats

signal track
region track
gene track
conservation track
genome sequence track
motif session
cel session
data plot session

Control CisGenome Browser by a Third Party Program

Session File Format

License

Citation

Acknowledgements


Overview

CisGenome Browser is an open source, platform independent tool which can work together with any other data analysis program to serve as a flexible component for genomic data visualization. It can also work by itself as a standalone genome browser. By working as a light-weight web server, CisGenome Browser is a convenient tool for data sharing between labs. It has features that are specifically designed for ultra high throughput sequencing data visualization.

CisGenome Browser runs on Windows, Linux and Mac platforms. It is easy to use, and its interface is similar to that of other genome browsers, such as the UCSC Genome Browser. CisGenome Browser is suitable for visualizing microarray, sequencing and other biological data. Because it runs locally, there is no need to upload user data over the internet, as is required when adding custom tracks to the UCSC Genome Browser or other online browsers.

CisGenome Browser is designed to work with other programs as a visualization component. It is integrated into CisGenome and JETTA. Here is a nice tutorial on CisGenome Browser written by Hongkai Ji.

Download & Installation

For detailed instructions, see Getting Started.

1. Download the zipped files from its website.

2. Unzip the files to a location on the hard drive. If an older version was previously installed, unzip the files to the same directory to overwrite the program files. User data will not be touched and will be loaded automatically by the new version. To guard against accidental damage, back up the files first.

3. Click display_server_pi.exe (or display_server.exe if you are using the old Windows-only version) to run it. An icon will then appear in the system tray. A dialog might pop up asking whether to block the program; if so, click "Unblock".

4. Right-click the tray icon and choose "Browse" (double-clicking the icon does the same). A web page will open in your default browser.

5. In the web page, create new sessions or load old sessions, according to the manual.

6. Play with it.

The steps above are for the Windows version. For the Linux or Mac versions, follow the instructions in Getting Started.

Running Time Directory Structure

The default running time directory structure is as follows:

wwwroot
|
------sessions
|
------temp
|
------templates

The directory "wwwroot" contains the executable file display_server_pi.exe (or display_server.exe if you are using the old Windows-only version). All the session files (*.ini) are in the "sessions" directory. The directory "templates" contains HTML templates that the program needs. All the temporary files, mostly pictures, are in the "temp" directory.

Files in the "temp" directory can be deleted freely. Files in the "sessions" directory can be deleted if the corresponding sessions are no longer needed. Files in other directories should never be deleted.
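
The cleanup rule above can be sketched in Python. This is only an illustration, not part of CisGenome Browser: the function name clear_temp is made up here, and the path to "wwwroot" must be supplied by the caller. It removes plain files inside "temp" and never touches "sessions" or "templates".

```python
from pathlib import Path


def clear_temp(wwwroot):
    """Delete plain files directly inside <wwwroot>/temp.

    Returns the number of files removed. The function name and the
    wwwroot argument are illustrative; they are not defined by
    CisGenome Browser itself.
    """
    temp_dir = Path(wwwroot) / "temp"
    removed = 0
    if not temp_dir.is_dir():
        return removed
    for entry in temp_dir.iterdir():
        # Only plain files are removed; subdirectories are left alone,
        # and "sessions" and "templates" are never visited.
        if entry.is_file():
            entry.unlink()
            removed += 1
    return removed
```

It is probably safest to run such a cleanup while the browser is not serving pages, since the temporary pictures may still be referenced by an open web page.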

Supported File Formats

For signal tracks, the Affymetrix BAR file format (binary) and a tab-delimited text file format are supported. Text files should be in the following format:

Chromosome[tab]Position[tab]value

Sample file.
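
As a rough illustration of the text format, the following Python sketch writes (chromosome, position, value) records as tab-delimited lines. The function names are hypothetical, and details the README does not specify, such as whether positions must be sorted, are not handled here.

```python
def format_signal_lines(points):
    """Turn (chromosome, position, value) tuples into tab-delimited lines."""
    return [
        "{}\t{}\t{}".format(chrom, int(pos), value)
        for chrom, pos, value in points
    ]


def write_signal_file(path, points):
    """Write one Chromosome[tab]Position[tab]value line per data point."""
    with open(path, "w", newline="\n") as handle:
        for line in format_signal_lines(points):
            handle.write(line + "\n")
```

For example, write_signal_file("demo_signal.txt", [("chr1", 5001000, 1.5)]) would produce a one-line file; the file name is a placeholder.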

For region tracks, UCSC BED format is partly supported (text).

Sample file.

A region file may also contain only the first three columns, or the first three columns plus one additional column for signals.

For gene annotation tracks, the UCSC refFlat format, the UCSC BED format and the BAM format are supported. This track type can also be used to visualize sequencing reads, with one read per line.

Sample file in refFlat format. As another example, the refFlat format annotation file for hg19 can be downloaded at http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/refFlat.txt.gz

Sample file in BED format.

Note that other gene annotation files downloaded from UCSC, such as knownGene.txt, are not in exactly the same format as refFlat.txt, so they must be converted to the refFlat format manually before being imported into CisGenome Browser.
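
One way such a conversion might look is sketched below in Python. It rests on assumptions that should be checked against the UCSC table schemas: that a line of the UCSC knownGene table starts with the ten columns name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, and that refFlat consists of a geneName column followed by those same ten columns. No gene symbol lookup is done; the transcript name is simply reused as the gene name. The function name is hypothetical.

```python
def known_gene_to_refflat(line):
    """Convert one knownGene-style line to a refFlat-style line.

    Assumption: the first ten tab-separated fields are name, chrom,
    strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts,
    exonEnds. Any further columns are dropped.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected at least 10 tab-separated fields")
    core = fields[:10]
    # refFlat is assumed to put a gene name before the transcript name;
    # here the transcript name stands in for the gene name.
    gene_name = core[0]
    return "\t".join([gene_name] + core)
```

If real gene symbols are needed, the placeholder gene name would have to be replaced with a lookup from a separate annotation source, which is outside the scope of this sketch.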

For DNA sequences, first download .fa (FASTA) files from UCSC, then convert them to .sq files using the CisGenome command line tools before importing them into CisGenome Browser.

For conservation scores, the procedure is similar to that for DNA sequences: download the data from UCSC, then convert it using the CisGenome command line tools before importing it into CisGenome Browser. For details, see the User’s Manual of CisGenome.

For a motif session, the input should be a text file containing the PSSM of the motif. An example file is here. For a multiple-motif file, the browser currently accepts only files output by the motif sampler "flex_module" in CisGenome. An example file is here.

For a CEL session, Affymetrix CEL file formats from v1 to v5 are supported.

Control CisGenome Browser by a Third Party Program

CisGenome Browser can be easily controlled by a third party program, regardless of the programming language in which that program is written. The workflow for a third party program to control CisGenome Browser consists of the following six steps:

  1. The third party program prepares its data files in the formats that can be accepted by CisGenome browser.
  2. The third party program prepares a session file in a simple text format (the Windows INI format, see section "Session File Format" for details), which specifies the paths of the data files, the genomic region of interest and several other parameters for data visualization. The session file must be placed in the "sessions" directory with the extension ".ini".
  3. The third party program opens a web browser and then calls CisGenome Browser via an HTTP request "http://server_IP:port/session?name=session_name", where "session_name" is the name of the session file.
  4. Working as a local web server, CisGenome Browser listens on a specific port. When it receives a request, it parses the HTTP header and retrieves the name of the session file. It then opens the session file and reads in all the parameters, including the paths of the data files.
  5. CisGenome Browser loads the entire data files or part of them depending on the parameters specified in the session file.
  6. CisGenome Browser generates the web page according to the parameters and sends it back to the web browser, which then presents the web page to the user.
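
Step 3 of the workflow above can be sketched in Python as follows. The function names are hypothetical, and the host and port are left as required arguments because this README does not state which port the server listens on. Whether "session_name" should include the ".ini" extension is also not specified here, so the caller passes the name as-is.

```python
import webbrowser
from urllib.parse import quote


def session_url(session_name, host, port):
    """Build the request URL described in step 3.

    host and port are supplied by the caller; no default is assumed.
    """
    # Percent-encode the name so that spaces or other special characters
    # do not break the URL. How the server treats such names is untested.
    return "http://{}:{}/session?name={}".format(
        host, port, quote(session_name, safe="")
    )


def open_session(session_name, host, port):
    """Open the session page in the default web browser and return the URL."""
    url = session_url(session_name, host, port)
    webbrowser.open(url)
    return url
```

A third party program would call open_session only after it has written the session file (step 2), so that the server can find the file when the request arrives.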

Session File Format

The files in the "sessions" directory are in Windows INI format.

Each file contains several sections, separated by section name lines in the format "[section_name]".

Each section contains several parameters, each in a line in the format "name=value", where "name" is the name of the parameter and "value" is the value of the parameter.
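
As an illustration of this layout, the sketch below reads session text with Python's standard configparser module and returns a plain dictionary per section. The function name is hypothetical. Only "=" is treated as the delimiter so that values containing ":" (for example a genomic region or a Windows path) stay intact.

```python
import configparser


def parse_session(text):
    """Parse INI-style session text into {section: {name: value}}."""
    parser = configparser.ConfigParser(
        interpolation=None,   # keep "%" characters in values untouched
        delimiters=("=",),    # split only on "=", never on ":"
    )
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}
```

All values come back as strings; converting "0"/"1" flags or integers is left to the caller.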

The first section is "[session]", which has the following possible parameters:

parameter name | possible values | default values | note
type | genome, motif, cel, data | none | type of session. "genome" is for genome browser, "motif" is for motif logos, "cel" is for Affymetrix array image, and "data" is for general plot
seed | int | 0 | internal usage
refresh | 0 or 1 | false | refresh the page

Depending on the type of the session, the second section can be "[genome]", "[motif]", "[cel]" or "[data]".

The "[genome]" section has the following possible parameters:

parameter name | possible values | default values | note
num_tracks | int | 0 | number of tracks
num_hided_tracks | int | 0 | number of hidden tracks
region | genomic region | chr1:5001000-5002000 | genomic region
ucsc_genome_assembly | hg18, mm9, etc. | none | genome assembly
pic_width | int | 800 | picture width
pic_margin | int | 15 | picture margin
font_size | int | 15 | font size
left_axis | 0 or 1 | 1 | show left axis
right_axis | 0 or 1 | 1 | show right axis
ucsc_browser | 0 or 1 | 0 | show UCSC genome browser
grid | gray, color, gray_shaded or color_shaded | gray | grid type
fold | integers separated by "," | none | folded tracks

Depending on the number of tracks and hidden tracks, there will be several sections "[track1]", "[track2]", ..., "[hided_track1]", "[hided_track2]", etc. Each of these sections has the following possible parameters:

parameter name | possible values | default values | note
title | string | none | track title
top_axis | 0 or 1 | 0 | show top axis
bottom_axis | 0 or 1 | 0 | show bottom axis
fast_draw | 0 or 1 | 1 | smart draw
pic_filename | string | none | picture file name
type | gene, signal, conservation, region, nucleotide | none | track type
file_path | string | none | input file path, for track types "conservation" or "nucleotide"
pic_height | int | 100 | picture height, for track types "conservation", "signal" or "region"
plot_type | bar, heatmap, line or dot | 15 | type of plot, for track type "signal"; track types "region" and "conservation" support only "bar" or "heatmap"
range_low | int | 0 | range lower bound, for track types "signal" or "region"
range_high | int | 0 | range upper bound, for track types "signal" or "region"
signal_width | int | 0 | signal width, for track type "signal"
zero_line | 0 or 1 | 0 | draw line at zero, for track types "signal" or "region"
always_include_zero | 0 or 1 | 1 | always include 0 in the range, for track types "signal" or "region"
max_draw_line_distance | int | 0 | maximum distance between lines, for track type "signal"
count_window | int | 0 | count data points in a window, for track type "signal"
src_filename | file names separated by "," | none | data file names, for track types "signal" or "region"; track type "gene" takes only one file name
color | black, red, blue, green, purple, pink, brown, orange or a color code such as 0x004080, separated by "," | none | colors for data files, for track types "signal" or "region"
draw_exon_num | 0 or 1 | 0 | numbering exons, for track type "gene"
do_caching | 0 or 1 | 0 | cache the gene in memory, for track type "gene"
annotation | 0 or 1 | 1 | include track in search, for track type "gene"
max_height | int | 400 | maximum track height, for track type "gene"
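
Putting the "[session]", "[genome]" and "[trackN]" sections together, a third party program might generate a genome session roughly as sketched below. This is a minimal sketch under stated assumptions: the function name is hypothetical, only a few parameters are written, and it is assumed (not confirmed by this README) that omitted parameters fall back to the defaults listed in the tables.

```python
import configparser
import io


def build_genome_session(region, tracks):
    """Return INI text for a genome session.

    region: a string such as "chr1:5001000-5002000".
    tracks: a list of dicts mapping track parameter names to values.
    """
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser["session"] = {"type": "genome"}
    parser["genome"] = {"num_tracks": str(len(tracks)), "region": region}
    for index, track in enumerate(tracks, start=1):
        # Sections are named [track1], [track2], ... as described above.
        parser["track{}".format(index)] = {
            name: str(value) for name, value in track.items()
        }
    buffer = io.StringIO()
    # Write "name=value" without spaces around "=".
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()
```

For example, build_genome_session("chr1:5001000-5002000", [{"title": "demo", "type": "signal", "src_filename": "demo_signal.txt"}]) yields text that could be saved as an ".ini" file in the "sessions" directory; the track title and data file name are placeholders.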

The "[motif]" section has the following possible parameters:

parameter name | possible values | default values | note
multiple_motifs | 0 or 1 | 0 | draw multiple motifs
src_filename | string | none | data file name
pic_filename | string | none | picture file name

The "[cel]" section has the following possible parameters:

parameter name | possible values | default values | note
src_filename | string | none | data file name
pic_filename | string | none | picture file name

The "[data]" section has the following possible parameters:

parameter name | possible values | default values | note
src_filename | string | none | data file name
pic_filename | string | none | picture file name

License

Anyone can use the source code, documents or executable file of CisGenome Browser free of charge for non-commercial use. For commercial use, please contact the author.

Citation

Jiang, H., Wang, F., Dyer, N.P., Wong, W.H. (2010)
CisGenome Browser: A Flexible Tool For Genomic Data Visualization
Bioinformatics, in press. [online]

Acknowledgements

CisGenome Browser was developed and tested with the help of the members and several collaborators of the Wong lab.


(last modified on September 9, 2011 3:57 PM )