Bioinformatics dance: February 2013

Adding permanent custom annotation tracks to a local UCSC Genome Browser installation

Posted by bioinfoman at 23:49 Tuesday, 26 February 2013 1 comments

As promised in the previous post, in this one I will show how one can "hack" into the MySQL database of a local UCSC Genome Browser installation in order to create permanent tracks (annotation, signal or other). Track hubs are of course one alternative, but they are not permanent in the sense that the user has to load them at least once before starting to work.
The concept is quite simple and follows the strategy that UCSC programmers use to host ENCODE tracks. In fact, the ENCODE signal tracks are not stored as separate coordinate tables in the database; they are external bigWig and/or BAM files, stored in a genome-specific location under the /gbdb directory. The relative path to each file is then stored in a table in the genome browser database. Please see this post for an explanation of the genome browser directory structure.
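To make the "table that only points to a file" idea concrete, here is a minimal Python sketch that builds the SQL for such a link table. It is an illustration under stated assumptions, not the script from this post: the single `fileName` column is my assumption about how these link tables look, and the table name `lincRnaEnsembl` and the file path are hypothetical placeholders. The function only produces SQL text and does not connect to MySQL.

```python
import re
from typing import List


def link_table_sql(table: str, gbdb_path: str) -> List[str]:
    """Build SQL statements for a table that holds only the path of an external file.

    ASSUMPTION: a single `fileName` column; inspect an existing link table in
    your own browser database before relying on this layout.
    """
    # The table name is interpolated into SQL, so accept only plain identifiers.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unsafe table name: {table!r}")
    # The path is embedded as a quoted string; reject characters that break quoting.
    if any(ch in gbdb_path for ch in "'\"\\;"):
        raise ValueError(f"unsafe path: {gbdb_path!r}")
    return [
        f"DROP TABLE IF EXISTS {table};",
        f"CREATE TABLE {table} (fileName VARCHAR(255) NOT NULL);",
        f"INSERT INTO {table} VALUES ('{gbdb_path}');",
    ]


if __name__ == "__main__":
    # Hypothetical track name and file location, for illustration only.
    for stmt in link_table_sql("lincRnaEnsembl", "/gbdb/hg19/bbi/lincRnaEnsembl.bam"):
        print(stmt)
```

Printing the statements first, rather than executing them directly, makes it easy to review them before touching a live browser database.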
So before reading and executing the following MySQL script, you should first create the signal or annotation files. I am going to use the same annotation tracks described in this post, so make sure you have first created the .bam files as explained there.
We have to modify two existing tables and create one new table per track. The steps are summarized below:

Creating track hubs for the UCSC genome browser with BAM files

Posted by bioinfoman at 12:42 Friday, 22 February 2013 0 comments

One of the projects that I am currently involved in deals with the detection of novel long non-coding RNAs regulating the WNT signaling pathway in colon cancer. Since we have had our own local installation of the UCSC genome browser for some time now (and it makes a huge difference), I decided that it would be useful to gather some current knowledge about annotated human and mouse lincRNAs from a few sources. These sources are:

Human

The Ensembl GRCh37 genome annotation
The NONCODE project
The Broad Institute lincRNA catalogue

Mouse

The Ensembl GRCm38 genome annotation
The NONCODE project

The NONCODE project contains lincRNAs mapped to the hg19 and mm10 genomes, while the Broad Institute lincRNA catalogue contains lincRNAs mapped to hg19. As our current browser installation also has mm9 and hg18, we will have to use the command-line liftOver tool and the corresponding chain files to convert the coordinates.
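As a rough sketch of the coordinate conversion step, the Python snippet below assembles liftOver command lines for the two conversions mentioned above (hg19 to hg18 and mm10 to mm9). It assumes the annotations are available as BED files and that the chain files follow the usual UCSC naming pattern (for example `hg19ToHg18.over.chain.gz`); the input file names and the chain directory are hypothetical placeholders. The commands are printed, not executed.

```python
import shlex
from typing import List


def chain_file_name(from_db: str, to_db: str) -> str:
    """Return the chain file name, e.g. hg19 -> hg18 gives hg19ToHg18.over.chain.gz."""
    return f"{from_db}To{to_db[0].upper()}{to_db[1:]}.over.chain.gz"


def liftover_command(in_bed: str, from_db: str, to_db: str, chain_dir: str) -> List[str]:
    """Build the argument list in the order: liftOver oldFile map.chain newFile unMapped."""
    stem = in_bed[:-4] if in_bed.endswith(".bed") else in_bed
    chain = f"{chain_dir.rstrip('/')}/{chain_file_name(from_db, to_db)}"
    return ["liftOver", in_bed, chain, f"{stem}.{to_db}.bed", f"{stem}.{to_db}.unmapped.bed"]


if __name__ == "__main__":
    # Hypothetical input files and chain directory, for illustration only.
    for src, dst in [("hg19", "hg18"), ("mm10", "mm9")]:
        cmd = liftover_command(f"lincRNA_{src}.bed", src, dst, "/path/to/chains")
        print(shlex.join(cmd))
```

To actually run a conversion, the returned list can be passed to `subprocess.run`. Records that cannot be mapped end up in the file given as the last argument, so it is worth checking its size afterwards.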

Installation of UCSC Genome Browser in a local server

Posted by bioinfoman at 12:00 Sunday, 10 February 2013 2 comments

A. Preface and MySQL configuration

The UCSC Genome Browser is one of the most essential tools in genomics research. Its value keeps increasing along with the current explosion in available Next Generation Sequencing data. Its installation is not a mainstream task and requires a lot of patience and a little more than basic knowledge of the Linux environment and MySQL. Before you try it, make sure that you know how to install Linux packages (including from source), how to perform a basic MySQL and Apache setup and how to run Perl and shell scripts. This guide is not exactly a step-by-step procedure, as it refers many times to external sources, blogs and wikis found around the web. Based on the work of others, I tried to install a version that is as customizable as possible on my server, to be used by several labs at the institution I am currently working in.

My installation is performed on an Ubuntu 12.04 LTS Server. You can adjust it for your distribution. Throughout this guide we assume that our base storage environment is /media/HD2/, so you will often see the shell variable $STORAGE, set to "/media/HD2". We also assume a temporary directory, $TEMP, which is by default the /tmp directory.
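If you drive parts of the setup from a script, the same two locations can be picked up from the environment. The following is only a small Python sketch under that assumption; the function name `base_dirs` is hypothetical, and the defaults mirror the values used in this guide.

```python
import os
from typing import Mapping, Tuple


def base_dirs(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (storage, temp), falling back to the guide's defaults when unset."""
    storage = env.get("STORAGE", "/media/HD2")
    temp = env.get("TEMP", "/tmp")
    return storage, temp


if __name__ == "__main__":
    storage, temp = base_dirs()
    print(f"STORAGE={storage}")
    print(f"TEMP={temp}")
```

Exporting STORAGE or TEMP in the shell before running the script overrides the defaults, which is handy when adapting the guide to a different disk layout.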

Welcome to another bioinformatics blog

Posted by bioinfoman at 03:50 Friday, 8 February 2013 0 comments

Welcome to another bioinformatics blog, my first attempt to finally start sharing advice and pieces of code for everyday computational work, something that many others have done so wonderfully before me.

According to Wikipedia, bioinformatics is "...an interdisciplinary field that develops and improves upon methods for storing, retrieving, organizing and analyzing biological data..." and its main activities include "...to develop software tools to generate useful biological knowledge...". For me, Bioinformatics has been and remains an inspiring motivation to apply what I learned in my first scientific discipline (Applied Mathematics) to something more "real" than all the theoretical and background knowledge I got during my studies. Others might choose finance and industry to do this. For me the choice was to switch fields (with lots of consequences, the very first being an immediate background knowledge gap) and try to work with biologists. This interaction over some years now (MSc, PhD etc.) has given me a lot of extra knowledge and the opportunity to apply what I learned during my first studies to a quickly expanding and challenging field.

For a long time now, I have been reading posts in biology and bioinformatics blogs and forums and using very useful things posted there, including pieces of code and a lot of advice. Now it's time that I start doing it too, after a long delay! I am pretty sure that I won't really offer anything groundbreaking, as plenty of smart people have been maintaining software, blogs and forums for a long time now. However, it will finally give me a little more of the feeling of sharing, as Bioinformatics is a discipline where, by default, the majority of tools and algorithms are public and open-source.

So stay tuned, and I hope that some day you will find something useful in this blog, something that will make you think that this guy has helped me a bit with my work by adding a very small stone to the pileup.
