Posts

Showing posts from February, 2015

Screening RNA-seq data for novel transcripts with samtools mpileup and UCSC genome browser

The unbiased nature of deep transcript sequencing makes it the ideal technology to discover novel, uncharacterised genes. Let's screen our favourite RNA-seq experiment (azacitidine-treated AML3 human cells, GSE55123) for novel expressed genes. I use Ensembl gene annotations.

We'll start by preparing BED files of exons and gene bodies:

# sort -u rather than uniq: exons shared between transcripts are not adjacent in the GTF
grep -w exon Homo_sapiens.GRCh38.76.gtf | tr '" ' '\t' \
| cut -f1,4,5,11 | sort -u > Homo_sapiens.GRCh38.76_exons.bed

grep -w gene Homo_sapiens.GRCh38.76.gtf | tr '" ' '\t' \
| cut -f1,4,5,11 > Homo_sapiens.GRCh38.76_genes.bed
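The commands above pick out the gene ID by column position, which works as long as gene_id is the first attribute on every line, and they carry the GTF coordinates over unchanged. GTF start positions are 1-based while BED start positions are 0-based, so a slightly more defensive variant might look like the sketch below. The function name gtf_to_bed is made up for illustration, and it assumes a tab-delimited GTF with the attributes in column 9.

```shell
# Hypothetical helper (not part of the original workflow):
#   gtf_to_bed FEATURE < annotation.gtf > features.bed
gtf_to_bed() {
  awk -F'\t' -v feature="$1" '
    $3 == feature {
      id = "."
      # look up the gene_id attribute by name instead of by column position
      if (match($9, /gene_id "[^"]+"/)) {
        id = substr($9, RSTART + 9, RLENGTH - 10)
      }
      # GTF starts are 1-based, BED starts are 0-based; ends stay the same
      printf "%s\t%d\t%d\t%s\n", $1, $4 - 1, $5, id
    }'
}

# Possible usage with the annotation file named in this post:
# gtf_to_bed exon < Homo_sapiens.GRCh38.76.gtf | sort -u > Homo_sapiens.GRCh38.76_exons.bed
# gtf_to_bed gene < Homo_sapiens.GRCh38.76.gtf > Homo_sapiens.GRCh38.76_genes.bed
```

Whether the one-base shift matters depends on what the BED files are used for downstream, so treat this as an option rather than a required change.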

Now for convenience, I'll merge the data from the three replicates with samtools.

samtools view -H SRR1171523_1.fastq.sort.bam > header.txt

samtools merge -h header.txt Ctrl.bam SRR1171523_1.bam SRR1171523_2.bam SRR1171524_1.bam SRR1171524_2.bam SRR1171525_1.bam SRR1171525_2.bam

samtools merge -h header.txt Aza.bam SRR1171526_1.bam SRR1171526_2.bam SRR1171527_1.bam SRR1171527_2.bam S…

ENCODE Users Meeting 2015

Just announced on the ENCODE mailing list: a workshop on analysing and making the most of ENCODE data. Attend if you can!

FW: [encode-announce] [Meeting announcement] ENCODE User's Meeting

Dear colleagues,

We would like to announce the first ENCODE User’s Meeting, which will be a 3-day workshop to learn how to navigate, analyze, use, and integrate ENCODE data. The NHGRI-sponsored meeting will be held from June 29 - July 1, 2015 at the Bolger Center in Potomac, MD.

The ENCODE project has generated 3500 datasets in human, more than 1000 datasets each in model organisms including mouse, fly and worm. It provides an extremely valuable resource of potential functional annotations of the human genome and represents an unprecedented opportunity for applications in a variety of biomedical problems. This meeting will have both scientific presentations and hands-on tutorial sessions with the goal of learning how ENCODE data has been used by the scientific community and providing opportuni…