Motif ALTernative Exons Scanner Enrichment of RNA-Seq (aka MAltESERS) is a novel bioinformatic tool that predicts protein domain changes caused by alternative splicing. But what is alternative splicing?
While genes in prokaryotic organisms are typically single units of expression that each encode a single protein, eukaryotic life developed a more complex system for protein synthesis that relies on splicing. In eukaryotes, genes are composed of expressed coding regions (exons) interrupted by non-expressed regions (introns). Splicing selectively removes the introns from the pre-mRNA transcript and ligates the remaining exons into a mature mRNA, which is then translated into protein. Because the same gene can be spliced in more than one way (alternative splicing), it can produce functionally different proteins. This matters because approximately 90% of human genes are alternatively spliced. Mutations can disrupt this process and cause disease: intronic mutations that alter splice site recognition are implicated in familial dysautonomia, neurofibromatosis type 1, Frasier syndrome, and atypical cystic fibrosis; exonic mutations that disrupt splicing cause spinal muscular atrophy; and a disrupted isoform ratio underlies frontotemporal dementia with parkinsonism linked to chromosome 17.
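To make the idea concrete, here is a toy sketch of exon skipping. The gene, exon names, and sequences are invented for illustration, and real splicing is of course far more involved than joining strings.

```python
def splice(exons, skipped=()):
    """Join exons in order, leaving out any whose name is in `skipped`."""
    return "".join(seq for name, seq in exons if name not in skipped)

# Hypothetical three-exon gene; the sequences are made up.
exons = [("exon1", "ATGGCC"), ("exon2", "TCTGAAGAC"), ("exon3", "GGATAA")]

full_isoform = splice(exons)                        # every exon included
skipped_isoform = splice(exons, skipped={"exon2"})  # exon 2 skipped

print(full_isoform)     # ATGGCCTCTGAAGACGGATAA
print(skipped_isoform)  # ATGGCCGGATAA
```

Both isoforms come from the same gene, but the second one lacks whatever exon 2 encodes.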
Splicing changes often involve the inclusion or exclusion of a large domain, so it would be useful to know how a change in splicing affects a protein’s function. Currently, each splice event is treated as an individual case. After a splicing event is identified, the specific isoform has to be characterized manually:
- Check whether the isoform is found in the Ensembl database.
- Check whether the mRNA is translated into protein.
- Manually assess the domain changes that correspond to the isoform change.
- Search the literature for phenotypes associated with each isoform.
While the methods listed above can be productive, there are problems:
- A single experimental treatment often results in hundreds of alternative splice events.
- Researching each splice event individually becomes an impossible and monotonous task.
- Literature searches fail to provide functional results for novel splicing events.
Back to MAltESERS
MAltESERS uses PROSITE’s ps_scan tool to identify the domains present in the affected exons, as well as the domains in the rest of the gene. The splice event has to be detected beforehand by another tool, such as rMATS or DEXSeq. MAltESERS then generates a score for each functional domain found for each splice event.
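As a rough illustration of the domain-scanning step only (not ps_scan itself, and not the MAltESERS score), the sketch below turns two patterns written in PROSITE syntax into regular expressions and reports where they match in a peptide. The function names and the peptides are hypothetical, and the patterns should be checked against the PROSITE entries before being relied on.

```python
import re

# Two motifs in PROSITE pattern syntax (assumed here; verify against PROSITE).
PATTERNS = {
    "CK2_PHOSPHO_SITE": "[ST]-x(2)-[DE]",
    "PKC_PHOSPHO_SITE": "[ST]-x-[RK]",
}

def prosite_to_regex(pattern):
    """Translate basic PROSITE pattern syntax into a Python regex."""
    regex = pattern.replace("{", "[^").replace("}", "]")  # {P} = any residue but P
    regex = regex.replace("(", "{").replace(")", "}")     # x(2) = repeat twice
    regex = regex.replace("x", ".").replace("-", "")      # x = any residue
    return regex.replace("<", "^").replace(">", "$")      # sequence anchors

def scan(peptide, patterns=PATTERNS):
    """Return {motif name: [0-based start positions]} for one peptide."""
    hits = {}
    for name, pattern in patterns.items():
        # Wrapping the regex in a lookahead also reports overlapping matches.
        finder = re.compile("(?=" + prosite_to_regex(pattern) + ")")
        starts = [m.start() for m in finder.finditer(peptide)]
        if starts:
            hits[name] = starts
    return hits

# Made-up peptides standing in for an affected exon and the rest of the gene.
print(scan("AASAAEAA"))  # {'CK2_PHOSPHO_SITE': [2]}
print(scan("GGTAKGG"))   # {'PKC_PHOSPHO_SITE': [2]}
```

Scanning the affected exon and the rest of the gene separately is what lets you say which motifs would be gained or lost with that exon.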
The results are output in a few formats. First and most important, MAltESERS generates a table with all the domains found and their scores.
It also generates three plots, each more confusing than the prior. The first shows all the domains lost and gained:
In this plot we see that CK2 phosphorylation was altered in four splice events (shown in the brackets). We can also see the score given for each event (the four points). Two of them were included (red) and two excluded (blue).
The next plot is very similar but with a twist:
Here we see the genes rather than the domains. For example, we see that thyroid hormone receptor beta (THRB) had two domains excluded with very high scores.
The final plot is a nice heatmap (who does not love heatmaps):
This heatmap shows the average score for each combination of detected domain and gene. This plot shows very clearly that THRB alternative splicing affects putative CK2 and PKC phosphorylation sites.
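For the curious, the averaging behind a heatmap like this can be sketched in a few lines. The rows, gene names, and scores below are placeholders rather than real MAltESERS output, and the actual table may use different columns.

```python
from collections import defaultdict

# Hypothetical (gene, domain, score) rows; all values are invented.
rows = [
    ("GENE_A", "CK2_PHOSPHO_SITE", 2.0),
    ("GENE_A", "CK2_PHOSPHO_SITE", 4.0),
    ("GENE_A", "PKC_PHOSPHO_SITE", 1.0),
    ("GENE_B", "CK2_PHOSPHO_SITE", -3.0),
]

def average_scores(rows):
    """Average the score of every (gene, domain) pair across splice events."""
    totals = defaultdict(lambda: [0.0, 0])  # (gene, domain) -> [sum, count]
    for gene, domain, score in rows:
        cell = totals[(gene, domain)]
        cell[0] += score
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

heatmap = average_scores(rows)
for (gene, domain), mean in sorted(heatmap.items()):
    print(gene, domain, mean)
```

Each key in `heatmap` corresponds to one cell of the plot; pairs that never occur are simply absent.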
This tool was fun to build, so feel free to try it out (who doesn’t have random splicing datasets at home?). Check it out at https://github.com/aLahat/maltese.