Training:
Analysis of alternative splicing with KisSplice

Brief introduction to KisSplice

Introductory slides pdf

The data

The fastq files can be found here.
They correspond to long polyA+ RNAs from the SKNSH cell line. The dataset was generated as part of the ENCODE project (SRA: SRR315315, SRR315316, SRR534309, SRR534310). There are two experimental conditions (treated with retinoic acid, untreated) and two replicates per condition. Only 10M reads per replicate are analysed here. For an analysis of the full dataset, see here.
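How the 10M-read subsets were drawn is not stated here. As a minimal sketch, assuming that keeping the first reads of each file is acceptable, an uncompressed fastq file can be truncated with head, since each read occupies exactly four lines. The function name take_first_reads and the input file name in the usage comment are illustrative, not part of the original dataset.

```shell
# Sketch: keep the first N reads of an uncompressed fastq file.
# Assumes the standard 4-lines-per-read layout (no wrapped sequences).
take_first_reads() {
    n_reads=$1
    fastq=$2
    head -n "$((n_reads * 4))" "$fastq"
}

# Hypothetical usage (the input file name is an assumption):
# take_first_reads 10000000 SknshRACellRep1.fastq > SknshRACellRep1_10M.fastq
```

Taking the first reads is the simplest option; a random subsample may be preferable if read order in the file is not arbitrary.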

Running KisSplice

The following command runs KisSplice on this dataset:

kissplice -r SknshRACellRep1_10M.fastq -r SknshRACellRep2_10M.fastq \
  -r SknshCellRep3_10M.fastq -r SknshCellRep4_10M.fastq

The job should take approximately 20 minutes. The file containing the bubbles that correspond to alternative splicing events can be found here: results_k41_coherents_type_1.fa
All output files of kissplice are here.
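A quick sanity check on the result file is to count the events it contains. The sketch below assumes that each bubble is written as two consecutive fasta records (one per path), so the number of events is half the number of header lines. The function name count_events is illustrative.

```shell
# Sketch: count events in a KisSplice fasta output.
# Assumption: every bubble contributes exactly two fasta records
# (one for each of its two paths).
count_events() {
    # grep -c prints 0 but exits non-zero when nothing matches
    headers=$(grep -c '^>' "$1" || true)
    echo "$((headers / 2))"
}

# Usage:
# count_events results_k41_coherents_type_1.fa
```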

Aligning to the reference genome

STAR can be used to align the alternative splicing events found by kissplice back to the reference genome. Here is the reference genome in fasta format: Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa
And the associated gtf: Homo_sapiens.GRCh38.84.gtf
Building the STAR index of the genome:

STAR --runMode genomeGenerate --genomeDir STAR_index --genomeFastaFiles Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa --sjdbGTFfile Homo_sapiens.GRCh38.84.gtf

The built index can be found here.


Aligning the KisSplice events to the reference genome:

STAR --genomeDir STAR_index --readFilesIn results_k41_coherents_type_1.fa

The alignment (in SAM format) can be found here.
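Before moving on, it can be useful to check how many of the submitted sequences were actually aligned. The sketch below counts distinct query names among SAM records whose unmapped flag (bit 0x4) is not set, so that a sequence reported at several locations is counted once. The function name count_aligned_queries is illustrative.

```shell
# Sketch: count distinct query names with at least one alignment.
# Header lines start with '@'; column 2 is the bitwise FLAG,
# and bit 0x4 marks an unmapped record.
count_aligned_queries() {
    awk -F'\t' '!/^@/ && int($2 / 4) % 2 == 0 { print $1 }' "$1" \
        | sort -u | wc -l | tr -d ' '
}

# Usage:
# count_aligned_queries Aligned.out.sam
```

Comparing this number with the number of sequences in the fasta file given to STAR shows how many paths were left unaligned.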

Running KisSplice2RefGenome

kissplice2refgenome -a Homo_sapiens.GRCh38.84.gtf Aligned.out.sam


The output file can be found here: k2rg-v1.0.1_sknsh.txt

Running kissDE

The latest version of kissDE can be downloaded here.

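#!/usr/bin/Rscript
library(kissDE)
# Load the counts from the KisSplice2RefGenome output
countsData <- kissplice2counts("k2rg-v1.0.1_sknsh.txt", k2rg = TRUE)
# One label per sample, in the order the fastq files were given to kissplice
conditions <- c("SknshRA", "SknshRA", "Sknsh", "Sknsh")
# Test each event for differential splicing between the two conditions
results <- diffExpressedVariants(countsData, conditions)
# Write the events passing the adjusted p-value and dPSI thresholds
writeOutputKissDE(results, adjPvalMax = 0.05, dPSImin = 0.1, output = "kissDE-output.tab")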
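The dPSImin threshold refers to the difference in PSI (percent spliced in) between the two conditions. As a rough illustration only, and ignoring any normalisation that kissDE applies to the counts, the PSI of one sample can be computed as the inclusion count divided by the sum of the inclusion and exclusion counts. The function names and the counts below are made up for the example.

```shell
# Sketch: PSI for one sample = inclusion / (inclusion + exclusion).
# Assumes at least one read supports the event (no division by zero).
psi() {
    awk -v inc="$1" -v exc="$2" 'BEGIN { printf "%.3f\n", inc / (inc + exc) }'
}

# Difference between two PSI values (second minus first).
delta_psi() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.3f\n", b - a }'
}

# Made-up counts: 30 inclusion / 10 exclusion reads in one condition,
# 20 / 20 in the other.
psi_a=$(psi 30 10)   # 0.750
psi_b=$(psi 20 20)   # 0.500
delta_psi "$psi_a" "$psi_b"   # prints -0.250
```

An event like this one, with an absolute dPSI of 0.25, lies above a 0.1 threshold.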

Here is the file kissDE-output.tab