RNA-seq analysis pipeline focusing on long ncRNAs (3/11/2016)
Fri, Mar 11, 2016 at 5:09 AM
Customer asks about an analysis pipeline for strand-specific RNA-seq data, with a focus on identifying long non-coding RNAs. He also asks how to analyze the protein-coding counterparts of the long non-coding transcripts.
Fri, Mar 11, 2016 at 10:42 AM
AccuraScience LB: Assuming that it is a model organism with a high-quality reference genome and annotation of protein-coding genes in Ensembl, a "typical" analysis pipeline for RNA-seq data focusing on identifying long non-coding RNAs (lncRNAs) would include the following steps, starting from the Cufflinks analysis results. (1) Annotate all transcripts based on the gene models documented in Ensembl, and produce lists of (a) transcripts that can be annotated as known protein-coding genes, (b) transcripts that can be annotated as known ncRNAs, and (c) transcripts that cannot be annotated as either known protein-coding genes or known ncRNAs. Transcripts in category (c) are called "novel transcripts". (2) Perform coding potential prediction on the novel transcripts, which produces lists of (c1) novel coding transcripts and (c2) novel non-coding transcripts. The complete list of ncRNAs is then (b)+(c2).
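The two steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the pipeline AccuraScience actually runs: the `Transcript` record, the `known_biotype` and `coding_score` fields, the 0.5 cutoff, and the short `NCRNA_BIOTYPES` set are all hypothetical. In practice the biotype would come from comparing the assembled transcripts against the Ensembl annotation, and the score from whichever coding potential predictor is used.

```python
from typing import Dict, List, NamedTuple, Optional

# Illustrative subset of non-coding biotype labels (an assumption, not a
# complete list of the biotypes used in Ensembl).
NCRNA_BIOTYPES = {"lincRNA", "antisense", "miRNA", "snoRNA", "snRNA", "rRNA"}


class Transcript(NamedTuple):
    transcript_id: str
    known_biotype: Optional[str]  # biotype of the matching known gene model, None if no match
    coding_score: float           # hypothetical coding-potential score in [0, 1]


def classify(transcripts: List[Transcript], cutoff: float = 0.5) -> Dict[str, List[str]]:
    """Sort transcript IDs into categories (a), (b), (c1), (c2) and (b)+(c2)."""
    groups: Dict[str, List[str]] = {
        "a": [], "b": [], "c1": [], "c2": [], "other_known": [],
    }
    for t in transcripts:
        if t.known_biotype == "protein_coding":
            groups["a"].append(t.transcript_id)            # (a) known protein-coding
        elif t.known_biotype in NCRNA_BIOTYPES:
            groups["b"].append(t.transcript_id)            # (b) known ncRNA
        elif t.known_biotype is not None:
            groups["other_known"].append(t.transcript_id)  # annotated, but neither (a) nor (b)
        elif t.coding_score >= cutoff:
            groups["c1"].append(t.transcript_id)           # (c1) novel, predicted coding
        else:
            groups["c2"].append(t.transcript_id)           # (c2) novel, predicted non-coding
    # Complete ncRNA list: known ncRNAs plus novel non-coding transcripts.
    groups["all_ncrna"] = groups["b"] + groups["c2"]
    return groups
```

The `other_known` bucket is an addition for the sketch only; it catches annotated transcripts, such as pseudogenes, that fit neither (a) nor (b).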
Most of the functionally characterized lncRNAs work in cis, i.e., they have “counterpart” protein-coding genes that reside in close proximity to them on the genome. We can readily produce a list of these neighboring protein-coding genes. It is important to note that some lncRNAs work in trans, that is, their “counterparts” reside in remote locations. Whether a lncRNA works in cis or in trans can only be determined through functional studies.
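Listing the nearby protein-coding genes for each lncRNA could look roughly like the sketch below. The `Feature` record, the `cis_candidates` function, and the 100 kb window are hypothetical choices made for illustration; the reply above does not specify a distance threshold. The output is only a list of candidates by proximity and says nothing about whether a given lncRNA actually acts in cis.

```python
from typing import Dict, List, NamedTuple


class Feature(NamedTuple):
    feature_id: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def gap(a: Feature, b: Feature) -> int:
    """Distance in bp between two features on the same chromosome (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def cis_candidates(
    lncrnas: List[Feature],
    coding_genes: List[Feature],
    window: int = 100_000,  # assumed window size, not taken from the source
) -> Dict[str, List[str]]:
    """Map each lncRNA ID to protein-coding gene IDs within `window` bp, nearest first."""
    result: Dict[str, List[str]] = {}
    for lnc in lncrnas:
        hits = [
            (gap(lnc, gene), gene.feature_id)
            for gene in coding_genes
            if gene.chrom == lnc.chrom and gap(lnc, gene) <= window
        ]
        result[lnc.feature_id] = [gene_id for _, gene_id in sorted(hits)]
    return result
```

A linear scan over all genes is fine for a sketch; for genome-scale lists, an interval index or a tool built for interval queries would be a more practical choice.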
Note: LB stands for Lead Bioinformatician. An AccuraScience LB is a senior bioinformatics expert and leader of an AccuraScience data analysis team.
Disclaimer: This text was selected and edited based on genuine communications that took place between a customer and the AccuraScience data analysis team at the specified dates and times. The editing was done to protect the customer’s privacy and for brevity. The edited text may or may not have been reviewed and approved by the customer. AccuraScience is solely responsible for the accuracy of the information reflected in this text.