The fast DB user’s guide

Table of contents

 

THE FAST DB HOME PAGE
The fast DB search page
Keyword search
Blast search
Multiple queries
The fast DB advanced search page
List of housekeeping genes
List of the transcriptional coregulators from NURSA
List of the nuclear receptors from NURSA
Forum / Documentation
Communications
Data & statistics
Links
Contact us
Number of visits and number of visitors
Last update date
MAIN PAGE
Navigation bar
Graphical gene representation
Link to the UCSC genome browser
Number of exons
Legend of the gene graphical representation
Table of the transcription initiation & first exon(s)
Prediction of promoter
Transcription factor binding sites
5’ UTR analysis
Table of the transcription termination & last exon(s)
3’ UTR analysis on selected sequence
3’ UTR analysis on all the gene sequence
Table of alternative splicing
Splicing factor binding site prediction
Splice site strength scoring
Links to alternative splicing databases
Names / Symbols
PUBMED
TRANSCRIPTS VIEW
Gene graphical representation
Transcript graphical representation
Genbank accession number
Pubmed link
Tissue information
ORFs prediction
NMD prediction
microRNA/transcript interaction site prediction
Transcript sequences
Functional protein domain analysis
Transcript exon analysis
MicroRNA/transcript interaction sites
Frequency of inclusion/skipping of exons
Multi-alignment of translated ORF sequences
TISSUE-SPECIFICITY
Tissue distribution of all gene transcripts
Tissue-specificity of alternative events
IN SILICO PCR
Multi-alignment of transcript sequences
In silico PCR
PROBE ALIGNMENT
PDF
SEQUENCES
Genomic sequence
Transcript sequence
Custom sequence
CONTACT US
ANALYSIS WITH HUMAN ESTs
ANALYSIS WITH MOUSE mRNAs
FIGURES REFERENCE

 


THE FAST DB HOME PAGE

 

Entering the fast DB address (http://www.fast-db.com) in the address bar of a web browser opens the fast DB presentation page (Figure 1).

Figure 1: the fast DB presentation page

 

 

Clicking the “fast DB” link opens the fast DB home page (Figure 2). This page is divided into two parts: the left part contains a list of buttons corresponding to the available resources, and the right part displays the page of the selected resource.

Figure 2: the fast DB home page

 

 

The fast DB search page

 

The first resource available from the fast DB home page (Figure 2, item 1) is the fast DB search page (Figure 3). This page is displayed in the right part of the window by default.

Figure 3: the fast DB search page

 

 

Fast DB provides two kinds of search pages: the basic search page (referred to simply as the search page) and the advanced search page (Figure 5), which is accessed by clicking on the corresponding link (Figure 3, item 1).

The fast DB search page is divided into three parts, one for each of the three methods that can be used to retrieve a gene or a list of genes. The user can find a gene with a keyword search (Figure 3, item 2), with a blast search (Figure 3, item 3) or by uploading a list of gene IDs (Figure 3, item 5). A quick help is available for each of these search methods (Figure 3, items 6).

 

Keyword search

 

The fast DB keyword search engine accepts several types of keywords to retrieve a gene:

 

 

Once the “search” button is clicked, fast DB displays the search results. For example, Figure 4 shows the result of a keyword search with “protein arginine” as the keyword. Several genes can match the user’s request (Figure 4, item 1). All these genes are displayed in a list with their EnsEMBL definition, in which the keywords entered are shown in red (Figure 4, item 2), their chromosomal localization (Figure 4, item 3) and their number of exons (Figure 4, item 4).

Figure 4: result of a keyword search

 

Blast search

 

The user can also find a gene by pasting a sequence, which must be at least 20 nucleotides long. Once the “search” button is clicked, fast DB displays the search results. Several genes can match the user’s request. All these genes are displayed in a list with their EnsEMBL definition, their chromosomal localization, their number of exons and the E-value from blast. When the “Align your sequence with the graphical gene representation” box is checked on the search page (Figure 3, item 4), the position of the input sequence is displayed under the graphical gene representation. The input sequence is also multi-aligned with all transcripts of the corresponding gene on the “in silico PCR” page.

 

Multiple queries

 

A multiple-query interface is also available. The user can upload a file (rtf, doc, txt…) containing the EnsEMBL stable IDs of several genes, with only one EnsEMBL stable ID per line. The fast DB search engine returns the list of genes corresponding to the input EnsEMBL IDs. Each gene in the list can be clicked to be analysed. Alternatively, the query result can be saved as an HTML file for later analysis.
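
As an illustration, the content of such an input file can be prepared with a short script. The sketch below is not part of fast DB: the function name `format_id_list` and the identifier pattern (an `ENS…G` prefix followed by 11 digits, as used for Ensembl gene stable IDs) are assumptions made for the example, and the IDs shown are placeholders rather than real genes.

```python
import re

# Assumed pattern for Ensembl gene stable IDs (e.g. human "ENSG" + 11 digits).
ENSEMBL_GENE_ID = re.compile(r"^ENS[A-Z]*G\d{11}$")


def format_id_list(gene_ids):
    """Return file content with one validated stable ID per line."""
    cleaned = []
    for raw in gene_ids:
        gene_id = raw.strip()
        if not gene_id:
            continue  # ignore empty entries
        if not ENSEMBL_GENE_ID.match(gene_id):
            raise ValueError(f"not an Ensembl gene stable ID: {gene_id!r}")
        cleaned.append(gene_id)
    return "\n".join(cleaned) + "\n"


# Placeholder IDs, built only to show the expected layout.
example_ids = [f"ENSG{n:011d}" for n in (1, 2, 3)]
print(format_id_list(example_ids), end="")
```

The returned text can then be saved as a plain txt file and uploaded through the interface.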

 

 

The fast DB advanced search page

 

The fast DB advanced search page (Figure 5) is accessed by clicking on the corresponding link on the basic search page (Figure 3, item 1). It allows the user to retrieve a list of genes sharing common characteristics such as:

Figure 5: the fast DB advanced search page

 

 

N.B.: All these characteristics apply to genes defined by human cDNAs (not to genes defined by human ESTs or by mouse cDNAs).

 

 

List of housekeeping genes

 

Clicking the corresponding button (Figure 2, item 2) gives access to the set of 707 putative housekeeping genes compiled in fast DB. For each gene, the list shows its name (the official HUGO name or the EnsEMBL stable ID), its chromosomal localization and its number of exons.

 

 

List of the transcriptional coregulators from NURSA

 

Clicking the corresponding button (Figure 2, item 3) gives access to a list of 246 transcriptional coregulators gathered by NURSA. For each gene, the list shows its name (the official HUGO name or the EnsEMBL stable ID), its chromosomal localization, its number of exons and a direct link to NURSA for this gene.

 

 

List of the nuclear receptors from NURSA

 

Clicking the corresponding button (Figure 2, item 4) gives access to a list of 48 nuclear receptors gathered by NURSA. For each gene, the list shows its name (the official HUGO name or the EnsEMBL stable ID), its chromosomal localization, its number of exons and a direct link to NURSA for this gene.

 

Forum / Documentation

 

Clicking the corresponding button (Figure 2, item 5) gives access to different types of help (Figure 6).

Figure 6: the fast DB “Forum / Documentation” page

 

 

The fast DB documentation (Figure 6, item 2) and this user’s guide (Figure 6, item 3) are available in HTML and PDF formats (Figure 6, items 4 and 5 respectively). Fast DB also provides interactive help through its forum (Figure 6, item 1): by clicking on “forum”, the user can post a message.

 

 

Communications

 

Clicking the corresponding button (Figure 2, item 6) displays the publications, oral presentations and posters related to fast DB.

 

N.B.: if you use fast DB, please cite “de la Grange et al. NAR (2005)”.

 

 

Data & statistics

 

Clicking the corresponding button (Figure 2, item 7) displays the current data content of the database.

 

 

Links

 

A large number of web resources on transcription and splicing are provided by clicking on the corresponding link (Figure 2, item 8).

 

 

Contact us

 

By clicking on the corresponding button (Figure 2, item 9), the user can send an e-mail to Pierre de la Grange, the administrator of fast DB.

 

 

Number of visits and number of visitors

 

Item 10 of Figure 2 displays the number of visits and the number of distinct visitors. At this time, visitors come from about 50 distinct countries.

 

 

Last update date

 

Item 11 of Figure 2 displays the date of the last update. This date can correspond to a bug fix, a new feature or an update of the sequences.

 

 


MAIN PAGE

 

Clicking a gene in the search results displays its main page (Figure 7).

Figure 7: the fast DB main page for the PRMT2 gene

 

 

The main page provides the official name of the gene from HUGO if it is available, or the EnsEMBL stable ID if not (Figure 7, item 1). If the gene belongs to the list of transcriptional coregulators or nuclear receptors from NURSA, a link to the NURSA website for this gene is provided (Figure 7, item 2). A navigation bar makes it easy to move through the fast DB website (Figure 7, item 3). Under this navigation bar, the interactive graphical representation of the gene (Figure 7, item 4) displays the length and sequence of each exon or intron when it is clicked (Figure 7, item 12). This graphical representation also shows the alternative events, which are listed in tables as well (Figure 7, items 9, 10 and 11). The length of the gene (Figure 7, item 5), the number of exons (Figure 7, item 7), the legend of the graphical representation (Figure 7, item 8) and a link to the UCSC genome browser for this gene (Figure 7, item 6) are provided under the graphical gene representation.

 

 

Navigation bar

 

The user navigates through the fast DB website with the navigation bar. The bar is blue if the gene was analyzed with human mRNAs, orange if the analysis was made with human mRNAs and ESTs, and purple if the gene was analyzed with mouse mRNAs. It provides links to the different pages of the fast DB website, which depend on the type of analysis (Figure 8).

 

Figure 8: available fast DB pages for each type of analysis

 

 

Graphical gene representation

 

The graphical gene representation shows the exon/intron structure of the gene and the alternative splicing events that affect the gene products. The number of each exon is printed under it. The legend of this graphical representation is accessible by clicking on the corresponding link (Figure 7, item 8). A right click on the chart allows the user to zoom. Note that the exons of this chart do not correspond to the genomic exons (i.e. those defined by the most frequent exon splice sites) but to the longest exons. Moreover, some red lines can indicate alternative events that are not listed in the corresponding table (see “The fast DB documentation” for more details).

 

 

Link to the UCSC genome browser

 

Clicking the corresponding link (Figure 7, item 6) opens the UCSC genome browser at the chromosomal localization of the gene as defined by fast DB (Figure 9).

Figure 9: the UCSC genome browser

 

 

Number of exons

 

Clicking the corresponding link (Figure 7, item 7) displays the list of the lengths of all genomic exons and introns defined for the gene. Each intron and exon in this list is also a link that displays its sequence (Figure 10). This is a convenient way to display the sequence of an exon that is too small to be clicked on the graphical gene representation.

Figure 10: length of genomic exons and introns of the PRMT2 gene

 

 

Legend of the gene graphical representation

 

Clicking the corresponding link (Figure 7, item 8) displays the legend of the gene graphical representation (Figure 11).

Figure 11: legend of the gene graphical representation

 

 

Table of the transcription initiation & first exon(s)

 

Based on transcript analysis, the fast DB algorithm may define several “first exons” for a given gene (see “The fast DB documentation” for more information). These “first exons” are marked with a red arrow on the gene graphical representation and define putative upstream alternative promoters (see below). These alternative first exons are also listed in the corresponding table (Figure 7, item 9) with the number of transcripts that support each exon as a first exon. The Genbank accessions of these transcripts are displayed when the mouse cursor hovers over the number. The title of this table also provides links to analyze putative first exons and upstream sequences (Figure 12).

Figure 12: selection of transcription initiation and first exons analysis

 

Prediction of promoter

 

The user can use tools for promoter prediction. Figure 13 shows the selection of the exon or intron to be analyzed. Any exon or intron can be chosen; the overlined exons correspond to the alternative first exons defined by fast DB. Once an exon or intron is selected, the user chooses the number of nucleotides upstream of the selected exon to be analyzed (Figure 14).

Figure 13: exon/intron selection

Figure 14: upstream sequence selection
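
The selection made in Figures 13 and 14 comes down to taking a fixed number of nucleotides immediately upstream of the chosen exon or intron. A minimal sketch, assuming a forward-strand genomic sequence held as a string and a 0-based start coordinate (the function name and the coordinate convention are illustrative and are not taken from fast DB):

```python
def upstream_region(genomic, feature_start, n_upstream):
    """Return up to n_upstream nucleotides located just before feature_start.

    genomic       -- forward-strand genomic sequence (assumed representation)
    feature_start -- 0-based index of the first nucleotide of the exon/intron
    n_upstream    -- number of upstream nucleotides requested
    """
    if n_upstream < 0:
        raise ValueError("n_upstream must not be negative")
    # Clamp at the beginning of the sequence when fewer nucleotides exist.
    start = max(0, feature_start - n_upstream)
    return genomic[start:feature_start]


print(upstream_region("AAACCCGGGTTT", 6, 3))
```

The same slicing, mirrored to the other side of the feature, would give the downstream region used for the 3’ UTR analyses.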

 

Once the length of the upstream sequence is selected, the user chooses a website to run the analysis. For promoter prediction, these websites are:

 

 

Transcription factor binding sites

 

As for promoter prediction, the user chooses an exon or intron, the length of the upstream sequence and a website. For transcription factor binding site prediction, the websites are:

 

 

 

5’ UTR analysis

 

As for promoter prediction, the user chooses an exon or intron, the length of the upstream sequence and a website. For the 5’ UTR analysis, the websites are:

 

 

 

Table of the transcription termination & last exon(s)

 

Based on transcript analysis, the fast DB algorithm may define several “terminal exons” for a given gene (see “The fast DB documentation” for more information). These “terminal exons” are marked with a red “pA” on the gene graphical representation and define putative downstream alternative termination regions (see below). These alternative terminal exons are also listed in the corresponding table (Figure 7, item 10) with the number of transcripts that support each exon as a last exon. The Genbank accessions of these transcripts are displayed when the mouse cursor hovers over the number. The title of this table also provides links to analyze putative terminal exons and downstream sequences (Figure 15).

Figure 15: selection of transcription termination and last exons analysis

 

3’ UTR analysis on selected sequence

 

The user can use tools for 3’ UTR sequence analysis. The user first selects the exon or intron to be analyzed. Any exon or intron can be chosen; the overlined exons correspond to the alternative terminal exons defined by fast DB. Once an exon or intron is selected, the user chooses the number of nucleotides downstream of the selected exon to be analyzed. Finally, the user chooses a website to run the analysis. For 3’ UTR sequence analysis, these websites are:

 

 

3’ UTR analysis on all the gene sequence

 

The user only has to select a website to run the analysis: the ARE Database or the Polyadenylation Database.

 

 

Table of alternative splicing

 

Based on transcript analysis, the fast DB algorithm can define several alternative splicing events for a given gene (see “The fast DB documentation” for more information). These alternative splicing events are listed in the corresponding table (Figure 7, item 11) with the number of transcripts that support each event. The Genbank accessions of these transcripts are displayed when the mouse cursor hovers over the number. The title of this table also provides links to analyze splicing factor binding sites and splice sites, and links to other alternative splicing databases for the same gene (Figure 16).

Figure 16: selection of alternative splicing analysis

 

Splicing factor binding site prediction

 

The user can use tools for splicing factor binding site prediction. The user first selects the exon or intron to be analyzed, then chooses the number of nucleotides upstream and downstream of the selected exon to be analyzed. Finally, the user chooses a website to run the analysis. For splicing factor binding site prediction, these websites are:

 

 

Splice site strength scoring

 

The user can use tools for splice site strength scoring. The user first selects the exon/intron or intron/exon junction to be analyzed, then chooses a website to run the analysis. For splice site strength scoring, these websites are:

 

 

Links to alternative splicing databases

 

The user can display splicing information on the gene from other alternative splicing databases. Depending on the IDs and information gathered on a gene, these databases can be:

 

 


Names / Symbols

 

An example of the “Names / Symbols” page, corresponding to the human PRMT2 gene, is shown in Figure 17.

Figure 17: “Names / Symbols” page for the PRMT2 human gene

 

 

When available, the following information is provided on this page (the corresponding link is indicated in brackets):

 

 


PUBMED

 

In addition to the supplementary splicing events defined in the EST analysis and the mouse analysis, and to the splicing events described by other web resources, the user can obtain information from the literature on splicing events for the analyzed gene using Pubmed. Clicking on “Pubmed” opens a direct link to the Pubmed entries related to the gene and to the alternative splicing events defined by its products (Figure 18).

Figure 18: “Pubmed” page for the PRMT2 human gene

 

 


TRANSCRIPTS VIEW

 

Clicking the corresponding button displays the graphical representation of the exon content of each transcript, aligned under the gene graphical representation (Figure 19).

Figure 19: “Transcripts view” page for the PRMT2 human gene

 

 

Gene graphical representation

 

Item 11 of Figure 19 shows the graphical representation of the gene exon/intron structure. This chart makes it easy to visualize the alternative splicing events defined by the gene products (red V-shaped lines). The scale of this scheme is given in the top-left corner of the page (Figure 19, item 1). A legend is also available by clicking on the corresponding link (Figure 19, item 4). When available, fast DB displays the graphical representation of the known CDS from the CCDS database (Figure 19, item 12). Clicking on this CDS opens the corresponding page of the CCDS database (Figure 20).

Figure 20: CCDS database

 

 

Transcript graphical representation

 

The graphical representation of each transcript is aligned under the gene chart (Figure 19, item 13). Each “track” represents a transcript consisting of exons linked by V-shaped lines that represent splicing events. Users can therefore easily associate a splicing event on the gene with the corresponding transcript. Each transcript can be clicked on to be further analyzed (see below).

 

Genbank accession number

 

The Genbank accession number of each transcript is provided (Figure 19, item 5). This accession number is also a link to the corresponding Genbank file (Figure 21).

Figure 21: Genbank file

 

Pubmed link

 

If this Genbank accession number is associated with a publication, the corresponding link is provided (Figure 19, item 7).

 

Tissue information

 

Fast DB also provides the tissue from which the transcript was cloned (Figure 19, item 10), if this information is available (see “The fast DB documentation” for more details).

 

ORFs prediction

 

For each transcript, the fast DB algorithm has predicted one or two ORFs (see “The fast DB documentation” for more details). Graphical representations of these ORFs (Figure 19, item 14) are displayed over the corresponding transcript graphical representation. These charts are also interactive: the user can click on each transcript ORF.

 

NMD prediction

 

For each ORF, fast DB also predicts whether the corresponding transcript will give a translated product (symbol “ORF”, Figure 19, item 16) or will be targeted for degradation by nonsense-mediated mRNA decay (NMD) (symbol “NMD”, Figure 19, item 15). If at least one of the predicted ORFs of a given transcript is marked as “NMD”, an “NMD” symbol is also displayed under the transcript Genbank accession number (Figure 19, item 8).
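
This guide does not state the rule that fast DB applies for this prediction (see “The fast DB documentation”). For orientation only, the sketch below implements a heuristic widely used in the literature, the “50-nucleotide rule”: a transcript is treated as an NMD candidate when its stop codon ends more than about 50 nucleotides upstream of the last exon-exon junction. The function name, the coordinate convention and the threshold are assumptions made for the example and should not be read as fast DB's implementation.

```python
from typing import Sequence


def is_nmd_candidate(stop_codon_end: int,
                     exon_lengths: Sequence[int],
                     threshold: int = 50) -> bool:
    """Apply the 50-nucleotide rule to one ORF of a spliced transcript.

    stop_codon_end -- 1-based transcript position of the last nucleotide
                      of the stop codon (assumed convention)
    exon_lengths   -- lengths of the transcript exons, in 5'->3' order
    threshold      -- distance in nucleotides (50 in the usual rule)
    """
    if len(exon_lengths) < 2:
        return False  # no exon-exon junction, so the rule does not apply
    # Position of the last nucleotide before the final exon-exon junction.
    last_junction = sum(exon_lengths[:-1])
    return last_junction - stop_codon_end > threshold


# Three exons of 100 nt each: the last junction follows position 200.
print(is_nmd_candidate(120, [100, 100, 100]))
print(is_nmd_candidate(250, [100, 100, 100]))
```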

 

microRNA/transcript interaction site prediction

 

If at least one microRNA/transcript interaction site has been predicted on a transcript exon, this exon is marked by a red arrow (Figure 19, item 17) that can be clicked on for further information (see below). As for NMD, if at least one transcript exon is marked by a red arrow, the symbol “miRNA” is displayed under the transcript Genbank accession number (Figure 19, item 9).

 

Transcript sequences

 

One way to retrieve transcript sequences in fast DB is to check the chosen transcripts (Figure 19, item 6) and to click on the “selected” button or the “all” button at the bottom of the page (Figure 19, item 18). Fast DB then provides an RTF file with the selected transcript sequences in fasta format. The user can also display the translated ORF sequences of the selected transcripts.

 

 

Functional protein domain analysis

 

Fast DB helps to predict the functional consequences of alternative splicing events. Clicking on a given transcript ORF (Figure 19, item 14) opens a page that displays the translated sequence of one or more selected ORF exons and allows functional protein domain prediction on this sequence (Figure 22).

Figure 22: functional protein domain analysis

 

This page shows a diagram with the graphical representation of the selected ORF aligned under those of the corresponding transcript and gene (Figure 22, item 1). The user can select one exon or a set of exons; all ORF exons are selected by default (Figure 22, item 2). As the chart shows, ORF exons do not always correspond to transcript exons, just as transcript exons do not always correspond to genomic exons. Moreover, the alternative splicing events described in fast DB are given with respect to the genomic exons. For these reasons, a table gives the correspondence between ORF exons, transcript exons and genomic exons (Figure 22, item 3). The translated sequence and the length of the selected ORF exons are given in the bottom-right corner of the page (Figure 22, item 4). The user only has to select a website to analyze the selected ORF exons for functional protein domain prediction (Figure 22, item 5). These websites are:

 

 

Once a website is chosen, the user clicks on the “domain search” button to run the analysis.

 

 

Transcript exon analysis

 

Fast DB allows each transcript exon to be analyzed further. Clicking on a transcript exon (Figure 19, item 13) opens a page that displays the sequence of this exon and offers further analyses of it (Figure 23).

Figure 23: transcript exon analysis

 

This page shows a diagram with the graphical representation of the transcript containing the selected exon, aligned under that of the corresponding gene (Figure 23, item 1). The selected exon is represented as a green filled rectangle and its exon number is shown in red (Figure 23, item 2). Note that the user can select another exon of the same transcript simply by clicking on it in the scheme. The table under the scheme displays the strength of the corresponding splice sites (Figure 23, item 3). The “comment” column can indicate:

 

 

The length and sequence of the selected transcript exon are displayed under this table (Figure 23, item 4). The right part of the screen lists the available analyses. The user chooses a type of analysis (Figure 23, item 5) and the corresponding website to run it (Figure 23, item 6). All the available analyses are summarized in Figure 24.

Figure 24: available analysis on transcript exons

 

 

 

MicroRNA/transcript interaction sites

 

If at least one microRNA/transcript interaction site has been predicted on a transcript exon, this exon is marked by a red arrow (Figure 19, item 17) that can be clicked on for further information (Figure 25).

Figure 25: microRNA/transcript interaction sites

 

All microRNA/transcript interaction sites provided by fast DB have been predicted by miRBase Targets, using miRanda (see “The fast DB documentation” for more details). A link to miRBase Targets for the same gene is provided (Figure 25, item 1). A table gathers all the predicted microRNA/transcript interaction sites (Figure 25, item 2). It gives the name of the microRNA, the genomic positions and the length of the corresponding alignment, as well as a link to this alignment. Clicking on this link displays the alignment of the microRNA sequence with the genomic sequence (Figure 25, item 3). Fast DB also provides the alignment of all the transcript sequences corresponding to this genomic region (Figure 25, item 4). Finally, fast DB shows in red the sequence variations within transcript sequences, to help predict whether such a variation alters the microRNA/transcript interaction (Figure 25, item 5).
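
The variations highlighted in red correspond to positions where a transcript differs from the genomic sequence within the aligned region. A minimal sketch of that comparison, assuming two already aligned strings of equal length (the function name and the 0-based positions are choices made for the example, not fast DB's own code):

```python
def variation_positions(genomic, transcript):
    """Return the 0-based positions where two aligned sequences differ."""
    if len(genomic) != len(transcript):
        raise ValueError("aligned sequences must have the same length")
    return [
        index
        for index, (g, t) in enumerate(zip(genomic.upper(), transcript.upper()))
        if g != t
    ]


print(variation_positions("ACGTACGT", "ACGAACGT"))
```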

 

 

Frequency of inclusion/skipping of exons

 

Clicking the corresponding link (Figure 19, item 2) displays a table with the length of the genomic introns and exons as well as the frequency of skipping and inclusion of each exon (Figure 26).

Figure 26: frequency of inclusion/skipping of genomic exons
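
How fast DB computes these frequencies is described in its documentation rather than in this guide. As a rough illustration, the sketch below counts, for each genomic exon, the transcripts that include it and the transcripts that skip it, where "skip" is assumed to mean that the transcript contains exons on both sides of the missing one. The data layout and the function name are hypothetical.

```python
def inclusion_frequencies(transcripts):
    """Map each exon number to (inclusion frequency, skipping frequency).

    transcripts -- list of lists of genomic exon numbers, one per transcript
    """
    stats = {}
    all_exons = sorted({exon for exons in transcripts for exon in exons})
    for exon in all_exons:
        included = skipped = 0
        for exons in transcripts:
            if exon in exons:
                included += 1
            elif min(exons) < exon < max(exons):
                # The transcript spans this exon but does not contain it.
                skipped += 1
        total = included + skipped  # at least 1: the exon comes from a transcript
        stats[exon] = (included / total, skipped / total)
    return stats


example = [[1, 2, 3], [1, 3], [1, 2, 3], [2, 3]]
for exon, (inc, skip) in inclusion_frequencies(example).items():
    print(exon, inc, skip)
```

Transcripts that do not reach an exon (for example a 5’-truncated sequence) are left out of that exon's count under this assumption, so they neither support nor contradict its inclusion.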

 

 

Multi-alignment of translated ORF sequences

 

Clicking the corresponding link (Figure 19, item 3) displays the multi-alignment of all translated transcript ORF sequences, exon by exon (Figure 27). Before multi-aligning these sequences, fast DB clusters them: several groups of sequences can be defined, which leads to several multi-alignments. Green overlined sequences correspond to CDS from the CCDS database; amino acids highlighted in yellow indicate variations between sequences; a green bold “M” marks a methionine; a red bold “*” marks a STOP codon; red bold letters indicate a codon overlapping two exons, in which case the letter is written in the exon that provides two of the three nucleotides of the codon.

Figure 27: multi-alignment of translated ORF transcript sequence
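
The rule for codons that overlap two exons (the amino acid letter is written in the exon providing two of the three nucleotides) can be sketched as follows. The input is assumed to be the list of coding lengths of the ORF exons in 5’->3’ order; the function name and the returned layout are illustrative only.

```python
def codon_exon_assignment(orf_exon_lengths):
    """Return one (exon_number, spans_two_exons) pair per complete codon."""
    # Exon number (1-based) owning each nucleotide of the ORF.
    owner = [
        number
        for number, length in enumerate(orf_exon_lengths, start=1)
        for _ in range(length)
    ]
    assignment = []
    for pos in range(0, len(owner) - len(owner) % 3, 3):
        codon = owner[pos:pos + 3]
        # The exon contributing the most nucleotides (two of three when the
        # codon overlaps two exons) receives the amino acid letter.
        exon = max(set(codon), key=codon.count)
        assignment.append((exon, len(set(codon)) > 1))
    return assignment


# ORF exons of 4 and 5 coding nucleotides: the second codon overlaps both.
print(codon_exon_assignment([4, 5]))
```

A codon spread over three exons (possible only with a 1-nucleotide exon) has no majority exon and is not handled specifically by this sketch.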

 

TISSUE-SPECIFICITY

 

Tissue distribution of all gene transcripts

 

Clicking the “Tissue-specificity” button on the navigation bar displays the tissue distribution of all gene transcripts (Figure 28). Note that although this study is only available from the human cDNA analysis, all transcripts (cDNAs and ESTs) are compiled (see “The fast DB documentation” for more details).

Figure 28: tissue distribution of gene transcripts

 

This page displays the histogram of the tissue distribution of gene transcripts (Figure 28, item 2). The colour legend and the tissue abbreviations are given in a table on the right of the page (Figure 28, item 3). It is also possible to display the histogram of the tissue distribution of gene transcripts for a specific event. All alternative events, except the IED events, are listed in a table on the left of the page and can be clicked to display the corresponding histogram (Figure 28, item 1).
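
The values behind such a histogram are simply the number of transcripts per tissue. A minimal sketch, assuming a mapping from transcript accession to tissue name in which transcripts without tissue information carry `None` (the accessions and tissues below are made up for the example):

```python
from collections import Counter


def tissue_distribution(transcript_tissues):
    """Count transcripts per tissue, ignoring transcripts without a tissue."""
    return Counter(
        tissue for tissue in transcript_tissues.values() if tissue
    )


# Hypothetical accessions and tissues, for illustration only.
example = {
    "XX000001": "brain",
    "XX000002": "liver",
    "XX000003": "brain",
    "XX000004": None,
}
for tissue, count in tissue_distribution(example).most_common():
    print(tissue, count)
```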

 

 

Tissue-specificity of alternative events

 

Figure 29 displays the tissue distribution of transcripts for the skipping of exons 9 and 10 of the human PRMT2 gene.

Figure 29: transcript tissue distribution for skipping of exons 9-10 of the PRMT2 gene

 

The tissue distribution histogram of another alternative event can be displayed by clicking on the corresponding link (Figure 29, item 1). For each alternative event, fast DB provides the histogram of the tissue distribution of the transcripts that define the event or whose sequence surrounds the event (Figure 29, item 2; see “The fast DB documentation” for more details). The colour legend of the histogram is given on the right of the page (Figure 29, item 3). Next to each colour, a scheme represents the corresponding event: green filled rectangles represent exons, whose numbers correspond to the genomic exon numbers. The numbers displayed over or under the exons correspond to the fast DB genomic coordinates of the splice sites. The meaning of the tissue abbreviations is available by clicking on the corresponding link (Figure 29, item 4). In some cases, the number of transcripts per tissue is too large to distinguish the different groups on the histogram. For this reason, fast DB provides the table of values under the histogram (Figure 29, item 5).

 


IN SILICO PCR

 

Multi-alignment of transcript sequences

 

Clicking the “In silico PCR” button on the navigation bar displays the multi-alignment of all transcript sequences, exon by exon (Figure 30).

Figure 30: multi-alignment of the PRMT2 gene transcripts

 

All gene transcripts are multi-aligned, exon by exon (and intron, in the case of a retained intron). Exon numbers are displayed above the alignment (Figure 30, item 2). Genbank accession numbers are displayed on the left of the page (Figure 30, item 1) and also appear on the multi-alignment when the mouse cursor hovers over a specific transcript exon sequence (Figure 30, item 3). Transcript sequences that do not correspond to the genomic sequence are displayed in the “n/a” columns (Figure 30, item 4).

 

 

In silico PCR

 

This multi-alignment makes it possible to visualize the sequences that are common to several transcripts and those that are specific to one. It therefore becomes easy to design probes for downstream experimental applications, in particular PCR amplification. For this reason, we provide an in silico PCR tool (Figure 31).

Figure 31: in silico PCR

 

This window is divided into four panels:

 

 

The user can select primers directly on the multi-alignment (Figure 31, item 1) and paste them into the corresponding boxes (Figure 31, item 2). The user can also input the complement of the reverse primer (Figure 31, item 3). Once the “run in silico PCR” button is clicked, fast DB displays general information on the primers (Figure 31, item 4): the length, GC percentage, Tm, exon localization and 5’->3’ sequence of each input primer. Note that a primer sequence must show no variation relative to the genomic sequence for a result to be returned. Fast DB also provides the list of the expected PCR product lengths (Figure 31, item 5) and a link to display the corresponding sequence (Figure 31, item 6). This sequence is displayed at the bottom right of the page (Figure 31, item 7). Clicking on this sequence gives access to the list of restriction enzyme cut sites (provided by RestrictionMapper).
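
The primer information listed above can be approximated outside fast DB with a few lines of code. The sketch below is only an illustration: the formula fast DB uses for Tm is not given in this guide, so the simple Wallace rule (2 °C per A/T, 4 °C per G/C) is used as a stand-in, and all function names are hypothetical. Primers are searched as exact matches, in line with the requirement that they show no variation relative to the reference sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def product_length(template, forward, reverse):
    """Length of the expected product, or None without exact primer matches.

    Both primers are given 5'->3'; the reverse primer anneals to the
    opposite strand, so its reverse complement is searched in the template.
    Only the first occurrence of each primer site is considered.
    """
    template = template.upper()
    start = template.find(forward.upper())
    site = template.find(reverse_complement(reverse))
    if start < 0 or site < start:
        return None
    return site + len(reverse) - start


# Toy template: forward site, 10 spacer bases, reverse site.
template = "ACGTTGCA" + "A" * 10 + "GGATCCTA"
print(gc_percent("ACGTTGCA"), wallace_tm("ACGTTGCA"))
print(product_length(template, "ACGTTGCA", "TAGGATCC"))
```

The Wallace rule is only a rough estimate for short oligonucleotides; values reported by fast DB may differ.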

 


PROBE ALIGNMENT

 

Clicking the “probe align.” button on the navigation bar opens a tool that localizes any sequence, such as a probe used in microarrays, within the exon/intron structure of the gene (Figure 32).

Figure 32: probe alignment

 

This page accepts one or several sequences in FASTA format. The characters following the “>” on the first line are taken into account and are used to identify the corresponding input sequence. The user can paste sequences directly into the “textarea” (Figure 32, item 1) or upload a file (Figure 32, item 2).

Figure 33: alignment of the 78 Affymetrix exon-array probes for the PRMT2 gene

 

Figure 33 displays the alignment of the 78 Affymetrix exon-array probes for the PRMT2 gene. Each input sequence is aligned graphically with the graphical representation of the exon/intron gene structure (Figure 33, item 1). The currently selected sequence is represented by a red rectangle on the scheme (Figure 33, item 2). Any input sequence can be selected by clicking on it. Below this scheme, a table lists information about the input sequences (Figure 33, item 3); the information for the selected sequence has a red border (Figure 33, item 4). The name of each input sequence is displayed in the table (Figure 33, item 5) and can also be clicked to select that sequence. At the bottom of the page, fast DB provides the alignment of the selected sequence with the genomic sequence; the input sequence is shown under the genomic sequence (Figure 33, item 6).

 

N.B.: an input sequence must be at least 20 nucleotides long and must match the genomic sequence without discontinuity. For a sequence spanning an exon-exon junction, 20 nucleotides from each exon have to be provided.
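The input rules above (FASTA records named by their “>” line, a 20-nucleotide minimum, and two blocks of 20 nucleotides for a junction sequence) can be sketched in Python as follows. This is not the fast DB implementation: the function names are hypothetical, matching is exact, and the way a junction sequence is split into two blocks is an assumption made for the example.

```python
# Illustrative sketch only; names and matching strategy are assumptions.
MIN_BLOCK = 20  # minimum contiguous match length stated in the N.B. above


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text.

    The name is everything that follows ">" on the header line.
    """
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def locate_probe(probe, genomic):
    """Return 1-based (start, end) blocks of the probe on the genomic sequence.

    One block for a contiguous match, two blocks for a junction-spanning
    probe, and an empty list when no exact match is found.
    """
    probe, genomic = probe.upper(), genomic.upper()
    if len(probe) < MIN_BLOCK:
        raise ValueError("sequence must be at least 20 nucleotides long")
    pos = genomic.find(probe)
    if pos >= 0:
        return [(pos + 1, pos + len(probe))]
    # Junction case: try every split that leaves >= MIN_BLOCK on each side.
    for split in range(MIN_BLOCK, len(probe) - MIN_BLOCK + 1):
        left = genomic.find(probe[:split])
        if left < 0:
            continue
        right = genomic.find(probe[split:], left + split)
        if right >= 0:
            return [(left + 1, left + split),
                    (right + 1, right + len(probe) - split)]
    return []
```

With this logic, a 40-nucleotide junction probe made of the last 20 nucleotides of one exon and the first 20 of the next is reported as two blocks, whereas a junction probe with fewer than 20 nucleotides on one side is not localized.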

 

PDF

 

Clicking the “PDF” button on the navigation bar provides most of the fast DB information about the gene in PDF format (Figure 34).

Figure 34: PDF for the human PRMT2 gene

 

The first page of this PDF indicates the gene name and its EnsEMBL stable ID (Figure 34, item 1). The type of analysis and the fast DB gene_id are displayed in the top-right corner of this page (Figure 34, item 2). The user can also write comments at the bottom of the page (Figure 34, item 3). The second page displays the graphical representation of the exon/intron gene structure (Figure 34, item 4), information such as gene length, chromosomal localization and number of exons (Figure 34, item 5), other names, symbols or IDs (Figure 34, item 6), and all alternative events defined by the products of the gene. The third page displays the length, fast DB positions and chromosomal positions of each genomic exon and intron (Figure 34, items 7 and 8 respectively). The fourth and following pages, except the last one, correspond to the “transcripts view” of the fast DB web version (Figure 34, item 9). Finally, the last page shows the tissue distribution histogram of all gene transcripts (Figure 34, item 10).

 


SEQUENCES

 

By clicking on the “Sequences” button on the navigation bar, the user can download any sequence of the analysis (Figure 35).

Figure 35: sequence download for the human PRMT2 gene

 

This page is divided into three parts: genomic sequence (Figure 35, item 1), transcript sequence (Figure 35, item 4) and custom sequence (Figure 35, item 6).

 

 

Genomic sequence

 

The user can display the genomic sequence as defined by fast DB, i.e. beginning at the first nucleotide of the first genomic exon and ending at the last nucleotide of the last genomic exon. The user can also add supplemental upstream and/or downstream sequence using the corresponding boxes (Figure 35, item 2). Exon sequences are labelled in red (and underlined) on the genomic sequence, and the sequence is provided as an RTF file if the corresponding box is checked (Figure 35, item 3).

 

 

Transcript sequence

 

To display a specific transcript sequence, the user simply selects a transcript by its Genbank accession number and clicks on the “download” button. By default, even-numbered transcript exons are displayed in green and odd-numbered transcript exons in black, to distinguish the different exons defined by the transcript. Nucleotides corresponding to the ORF are in upper case, and sequence that does not align to the genomic sequence is displayed in red (Figure 36). To display the transcript sequence without these features, the user has to uncheck the corresponding box (Figure 35, item 5).

Figure 36: sequence of transcript BC000727
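The display conventions for a transcript sequence (alternating exon colours, ORF in upper case) can be mimicked with a few lines of Python. The sketch below is only an illustration: the function name and the (text, colour) output format are hypothetical, ORF positions are assumed to be 1-based and inclusive on the transcript, and the red marking of unaligned sequence is not covered.

```python
# Illustrative sketch only; the output format is an assumption.
def annotate_transcript(exons, orf_start, orf_end):
    """Return one (text, colour) pair per transcript exon.

    exons     -- exon sequences in transcript order
    orf_start -- first ORF nucleotide on the transcript (1-based)
    orf_end   -- last ORF nucleotide on the transcript (1-based, inclusive)
    """
    seq = "".join(exons).lower()
    # ORF nucleotides in upper case, everything else in lower case.
    cased = seq[:orf_start - 1] + seq[orf_start - 1:orf_end].upper() + seq[orf_end:]
    annotated, offset = [], 0
    for number, exon in enumerate(exons, start=1):
        # Even-numbered exons in green, odd-numbered exons in black.
        colour = "green" if number % 2 == 0 else "black"
        annotated.append((cased[offset:offset + len(exon)], colour))
        offset += len(exon)
    return annotated


if __name__ == "__main__":
    for text, colour in annotate_transcript(["GGCAT", "GAAAT", "AGCC"], 4, 12):
        print(colour, text)
```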

 

 

Custom sequence

 

The user can display any sequence of the gene by selecting the beginning and the end of this sequence. These coordinates can be expressed either as fast DB genomic positions or as chromosomal positions (Figure 35, item 7). Once the reference is chosen, the user enters the start and end positions of the sequence (Figure 35, items 8 and 9 respectively).
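The relation between the two coordinate systems can be illustrated with a short Python sketch. It rests on assumptions that this guide does not spell out: fast DB position 1 is taken to be the first nucleotide of the first genomic exon, all positions are 1-based and inclusive, and a gene on the reverse strand is numbered from its highest chromosomal coordinate. The class name, its fields and the coordinates used in the example are hypothetical.

```python
# Illustrative sketch only; class, fields and coordinates are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneFrame:
    chrom_start: int  # lowest chromosomal coordinate of the gene (1-based)
    chrom_end: int    # highest chromosomal coordinate of the gene (1-based)
    strand: int       # +1 for the forward strand, -1 for the reverse strand

    def to_chromosomal(self, gene_pos):
        """Convert a gene-relative position to a chromosomal position."""
        if self.strand == 1:
            return self.chrom_start + gene_pos - 1
        return self.chrom_end - gene_pos + 1

    def to_gene(self, chrom_pos):
        """Convert a chromosomal position to a gene-relative position."""
        if self.strand == 1:
            return chrom_pos - self.chrom_start + 1
        return self.chrom_end - chrom_pos + 1


def custom_sequence(genomic, start, end):
    """Return the gene-relative interval [start, end] (1-based, inclusive)."""
    return genomic[start - 1:end]


if __name__ == "__main__":
    frame = GeneFrame(chrom_start=1000, chrom_end=1999, strand=1)
    print(frame.to_chromosomal(1), frame.to_gene(1999))
```

Converting both ends of a request to gene-relative positions first means that the same slicing function serves either reference choice.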

 


CONTACT US

 

By clicking on the letter symbol on the navigation bar, the user can send an email to Pierre de la Grange, the fast DB author and administrator.

 

 


ANALYSIS WITH HUMAN ESTs

 

When the link is available (16,053 genes out of 18,018), the user can access the analysis of the same gene based on EST sequences (together with full-length and partial mRNA sequences) by clicking on the orange button on the navigation bar. The navigation bar turns orange (instead of blue) for an analysis with ESTs (Figure 37).

Figure 37: analysis of the human PRMT2 gene with EST sequences

 

 


ANALYSIS WITH MOUSE mRNAs

 

When the link is available (13,913 genes out of 18,018), the user can access the analysis of the mouse orthologous gene based on mRNA sequences by clicking on the pink button on the navigation bar. The navigation bar turns pink (instead of blue) for an analysis with mouse mRNAs (Figure 38).

Figure 38: analysis of the mouse PRMT2 gene with mRNA sequences

 

 


FIGURES REFERENCE

 

Figure 1: the fast DB presentation page

Figure 2: the fast DB home page

Figure 3: the fast DB search page

Figure 4: result of a keyword search

Figure 5: the fast DB advanced search page

Figure 6: the fast DB “Forum / Documentation” page

Figure 7: the fast DB main page for the PRMT2 gene

Figure 8: available fast DB pages for each type of analysis

Figure 9: the UCSC genome browser

Figure 10: length of genomic exons and introns of the PRMT2 gene

Figure 11: legend of the gene graphical representation

Figure 12: selection of transcription initiation and first exons analysis

Figure 13: exon/intron selection

Figure 14: upstream sequence selection

Figure 15: selection of transcription termination and last exons analysis

Figure 16: selection of alternative splicing analysis

Figure 17: “Names / Symbols” page for the PRMT2 human gene

Figure 18: “Pubmed” page for the PRMT2 human gene

Figure 19: “Transcripts view” page for the PRMT2 human gene

Figure 20: CCDS database

Figure 21: Genbank file

Figure 22: functional protein domain analysis

Figure 23: transcript exon analysis

Figure 24: available analysis on transcript exons

Figure 25: microRNA/transcript interaction sites

Figure 26: frequency of inclusion/skipping of genomic exons

Figure 27: multi-alignment of translated ORF transcript sequence

Figure 28: tissue distribution of gene transcripts

Figure 29: transcript tissue distribution for skipping of exons 9-10 of the PRMT2 gene

Figure 30: multi-alignment of the PRMT2 gene transcripts

Figure 31: in silico PCR

Figure 32: probe alignment

Figure 33: alignment of the 78 Affymetrix exon-array probes for the PRMT2 gene

Figure 34: PDF for the human PRMT2 gene

Figure 35: sequence download for the human PRMT2 gene

Figure 36: sequence of transcript BC000727

Figure 37: analysis of the human PRMT2 gene with EST sequences

Figure 38: analysis of the mouse PRMT2 gene with mRNA sequences