Bioinformatics dance: Installation of UCSC Genome Browser in a local server

A. Preface and MySQL configuration

The UCSC Genome Browser is one of the most essential tools in genomics research. Its value keeps increasing, in proportion to the current explosion in available Next Generation Sequencing data. Installing it is not a mainstream task: it requires a lot of patience and a little more than basic knowledge of the Linux environment and MySQL. Before you try it, make sure that you know how to install Linux packages (also from source), how to perform a basic MySQL and Apache setup and how to run Perl and shell scripts. This guide is not exactly a step-by-step procedure, as it often refers to external sources, blogs and wikis found around the web. Based on the work of others, I tried to install a version that is as customizable as possible on my server, to be used by several labs at the institution I am currently working in.

My installation is performed on an Ubuntu 12.04 LTS Server. You can adjust it for your distribution. Throughout this guide we assume that our base storage location is /media/HD2/, so you will often see the shell variable $STORAGE, set to "/media/HD2". We also assume a temporary directory, $TEMP, which by default is the /tmp directory.
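A minimal sketch of how these two variables could be set in the shell you work in. The values are simply the ones assumed throughout this guide, so change them to match your own system.

```shell
# Values assumed from this guide; change them to match your own disks.
export STORAGE="/media/HD2"
export TEMP="/tmp"

# Warn early if the storage location is missing, instead of half-way through.
if [ ! -d "$STORAGE" ]; then
    echo "Warning: $STORAGE does not exist yet" >&2
fi
```

Note that in a command such as sudo mkdir $STORAGE/gbdb the variable is expanded by your own shell before sudo runs, so it does not have to be defined for root as well.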

If you have MySQL 5.5 or later (which is the default in Ubuntu >= 12.04) you must recompile it from source in order to enable the

load data local infile

MySQL command, which is disabled by default in more recent versions. To this end, you can follow the instructions here. In the cmake command, add:

-DWITH_SSL=yes -DMYSQL_UNIX_ADDR=/var/run/mysqld/mysqld.sock -DWITH_INNOBASE_STORAGE_ENGINE=1

Then, edit the /etc/mysql/my.cnf file, comment out all the SSL options under [mysqld] and change

lc-messages-dir = /usr/share/mysql

to

lc-messages-dir = /usr/local/mysql/share

While in my.cnf, add the following lines under [mysqld]

key_buffer              = 1024M
max_allowed_packet      = 64M
thread_stack            = 512K
thread_cache_size       = 32
table_cache             = 1024
query_cache_limit       = 16M
query_cache_size        = 1024M
sort_buffer_size        = 16M
read_buffer_size        = 16M
read_rnd_buffer_size    = 32M
myisam_sort_buffer_size = 512M
bulk_insert_buffer_size = 1024M
join_buffer_size        = 512M
innodb_flush_log_at_trx_commit  = 2
innodb_log_buffer_size  = 64M
innodb_log_file_size    = 512M
innodb_buffer_pool_size = 32768M # Watch this as I have 64GB of RAM!
innodb_thread_concurrency = 16
innodb_flush_method=O_DIRECT

the following under [myisamchk]

key_buffer              = 1024M
sort_buffer_size        = 512M
read_buffer             = 64M
write_buffer            = 64M

and the following under [isamchk]

key_buffer              = 1024M
sort_buffer_size        = 512M
read_buffer             = 64M
write_buffer            = 64M
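The innodb_buffer_pool_size above is half of the 64 GB of RAM in my machine. The following sketch shows how a value with the same ratio could be derived on another Linux box. The function name half_ram_mb is made up for this example, and the one-half ratio is only what I used, not a general recommendation.

```shell
# Print half of the given RAM (in kB, as reported in /proc/meminfo) in MB.
half_ram_mb() {
    echo $(( $1 / 1024 / 2 ))
}

# Linux-specific: MemTotal in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "innodb_buffer_pool_size = $(half_ram_mb "$total_kb")M"
fi
```

The printed line can then be compared with (or pasted over) the value in the [mysqld] section.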

Before starting the newly compiled MySQL server, follow EXACTLY the instructions here (and in a more friendly version here) to properly reconfigure InnoDB to store and process bigger bulk imports, as this is mandatory for faster operation and import of custom tracks. You should also lock the MySQL version you just installed so that it is not affected by Ubuntu's update system. This is quite easy and a little googling will show you how.

B. Installation of the UCSC Genome Browser web application and session system

This section contains instructions on how to install the Genome Browser application only. The visualization application, the session system and the other UCSC applications (e.g. the Table Browser) are independent of the background databases containing several genomic features. This section assumes basic knowledge of Apache, of installing packages from source and of MySQL administration.

Create $STORAGE/gbdb and $STORAGE/genomebrowser directories

sudo mkdir $STORAGE/gbdb
sudo mkdir $STORAGE/genomebrowser

Fetch the kent source tree to a $STORAGE/kent directory

sudo mkdir $STORAGE/kent
sudo git clone git://genome-source.cse.ucsc.edu/kent.git $STORAGE/kent

Copy $STORAGE/kent/src/product/scripts to $STORAGE/scripts

sudo mkdir $STORAGE/scripts
sudo cp -r $STORAGE/kent/src/product/scripts/. $STORAGE/scripts/

Open the synaptic package manager and install the libmysqlclient-dev package and, more generally, the other libmysql development packages to get the header files. At this point you should be careful not to interfere with the new MySQL installation of section A. It might require some time playing around, but generally it should work at the first attempt. If not, perform this step before installing MySQL from source, in section A.
Optionally, enable SSL in MySQL (see here for detailed instructions). Fix the AppArmor profile by following the instructions here.
Install SAMtools from source from here, or look for the samtools package in the Ubuntu synaptic application.
Edit both your account's .bashrc file and /root/.bashrc and add the line
```
export MACHTYPE=x86_64
```
(replace x86_64 with your machine's architecture, which can be found with uname -p) and reload them.
```
source ~/.bashrc
su # ...and then your password
source ~/.bashrc
```
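If you prefer not to edit the two .bashrc files by hand, the line can be appended from the shell. This is only a sketch: add_line_once is a helper name invented here, and using uname -m to obtain the architecture is my assumption (on some systems uname -p prints "unknown").

```shell
# Append a line to a file only if that exact line is not already present.
add_line_once() {
    file="$1"
    line="$2"
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (commented out so that nothing is modified by accident):
# add_line_once "$HOME/.bashrc" "export MACHTYPE=$(uname -m)"
```

Running it twice leaves a single export line in the file, so it is safe to repeat for your own account and for root.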

Edit $STORAGE/scripts/browserEnvironment.txt and set the following variables to the values shown below:

export KENTHOME=$STORAGE"/kenthome/"
export kentSrc=$STORAGE"/kent"
export GBDB=$STORAGE"/gbdb"
export BROWSERHOME=$STORAGE"/genomebrowser"
export HGSQL="mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD"
#export MYSQLLIBS="/usr/lib/x86_64-linux-gnu/libmysqlclient.a -lz"
export MYSQLLIBS="/usr/local/mysql/lib/libmysqlclient.a -lz"
export MYSQLINC="/usr/local/mysql/include"
export PNGLIB="/usr/lib/x86_64-linux-gnu/libpng.a"
export PNGINCL="-I/usr/include/libpng12"
export USE_BAM=1 (uncomment)
export KNETFILE_HOOKS=1 (uncomment)
export SAMDIR=/opt/NGSTools/SAMTools
export SAMINC=${SAMDIR} (uncomment)
export SAMLIB=${SAMDIR}/libbam.a (uncomment)
export AUTH_MACHINE="sevenofnine"
export AUTH_USER="root"
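Most build failures I ran into came from a library or include path in this file that did not exist on my system. A small sketch for checking such paths before building follows; missing_paths is a hypothetical helper, and the paths in the commented example are simply the ones from my configuration above.

```shell
# Print every given path that does not exist on this machine.
missing_paths() {
    for p in "$@"; do
        [ -e "$p" ] || printf '%s\n' "$p"
    done
}

# Example with the values used in this guide:
# missing_paths /usr/local/mysql/lib/libmysqlclient.a /usr/local/mysql/include \
#     /usr/lib/x86_64-linux-gnu/libpng.a /opt/NGSTools/SAMTools/libbam.a
```

An empty output means that all the listed files and directories are in place.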

Prepare Apache

Enable XBitHack

sudo a2enmod include

In /etc/apache2/apache2.conf add the line

XBitHack on

Create a virtual host file called my_prefered_host_name in /etc/apache2/sites-available and paste the following into it:

XBitHack on
# Virtual host for genomebrowser
<VirtualHost *:80>
 ServerAdmin your_admin_mail@yourdomain.com
 DocumentRoot /media/HD2/genomebrowser
 ServerName genomebrowser
 <Directory />
  Order deny,allow
  Deny from all
  Options FollowSymLinks
  AllowOverride None
 </Directory>
 <Directory /media/HD2/genomebrowser>
  AllowOverride AuthConfig
  Options +Includes
  Order allow,deny
  allow from all
 </Directory>

 ScriptAlias /cgi-bin/ /media/HD2/genomebrowser/cgi-bin/
 <Directory "/media/HD2/genomebrowser/cgi-bin">
  AllowOverride None
  Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
  Order allow,deny
  Allow from all
  AddHandler cgi-script .cgi .pl
 </Directory>

 ErrorLog /media/HD2/genomebrowser/logs/apache2/error.log
 CustomLog /media/HD2/genomebrowser/logs/apache2/access.log combined
 LogLevel warn

 Alias /doc/ "/usr/share/doc/"
 <Directory "/usr/share/doc/">
  Options Indexes MultiViews FollowSymLinks
  AllowOverride None
  Order deny,allow
  Deny from all
  Allow from 127.0.0.0/255.0.0.0 ::1/128
 </Directory>

 # Some security
 ServerSignature Off
</VirtualHost>

Add the following line to /etc/hosts

127.0.0.254     genomebrowser

Restart the networking service

sudo /etc/init.d/networking restart

Restart Apache

sudo /etc/init.d/apache2 restart
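To double check that the host name used in the virtual host really resolves to the address added to /etc/hosts, something like the following sketch can be used. The function name host_entry_ip is invented for this example.

```shell
# Print the IP address that a hosts file assigns to a given host name.
# $1 = hosts file, $2 = host name
host_entry_ip() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$1"
}

# Example: host_entry_ip /etc/hosts genomebrowser
```

On Ubuntu the virtual host file also has to be enabled (typically with a2ensite followed by the file name) before restarting Apache has any effect on it.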

Create a MySQL user using either the MySQL command line, webmin or phpMyAdmin (I created gbuser, password MY_PASSWORD, using webmin). You should have these tools anyway as they are very handy for managing your system.

Create the hg.conf file as described here, in Part 1: Genome Browser engine. Here is mine:

# Configuration file for the UCSC Human Genome server
#
# the format is in the form of name/value pairs, written as 'name=value'
#
# note that there is no space between the name and its value. Also, no blank lines should be in this file.
#
#--------------------------------------------------------------#
#
# db.host is the name of the MySQL host to connect to
db.host=localhost
#
# db.user is the username used when connecting to the host
db.user=THE_USER_CREATED_IN_STEP_9
#
#
# this is the password to use with the above hostname
db.password=THE_PASSWORD
#
db.trackDb=trackDb
# central.host is the name of the host of the central MySQL
# database where stuff common to all versions of the genome
# and the user database is stored.
central.db=hgcentral
central.host=localhost
central.user=THE_USER_CREATED_IN_STEP_9
central.password=THE_PASSWORD
central.domain=
backupcentral.db=hgcentral
backupcentral.host=localhost
backupcentral.user=THE_USER_CREATED_IN_STEP_9
backupcentral.password=THE_PASSWORD
backupcentral.domain=
# required to use hgLogin
login.systemName=hgLogin CGI
# url to server hosting hgLogin
wiki.host=genomebrowser
# name of cookie holding username - do not change!
wiki.userNameCookie=wikidb_mw1_UserName
# name of cookie holding user id - do not change!
wiki.loggedInCookie=wikidb_mw1_UserID
# title of host of browser, this text will be shown in the user interface of the login/sign up screens
login.browserName=UCSC Genome Browser @Fleming
# base url of browser install
login.browserAddr=http://genomebrowser
# signature written at the bottom of hgLogin system emails
login.mailSignature=Local administrator: Panagiotis Moulos
# from/return email address used for system emails
login.mailReturnAddr=your_admin_mail@yourdomain.com

The last lines (about login) will enable the independent login system of the browser so as to be able to host different users.
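As the header of the file says, hg.conf must contain no blank lines and no spaces around the = sign. A rough sketch for spotting such lines before the browser trips over them is given below; hgconf_problems is a name I made up, and the check covers only these two rules.

```shell
# Report blank lines and name/value lines with a space next to the first "=".
# $1 = path to hg.conf
hgconf_problems() {
    awk '
        /^[ \t]*$/ { print NR ": blank line"; next }
        /^#/       { next }
        /^[^=]*[ \t]=/ || /^[^=]*=[ \t]/ { print NR ": space next to =" }
    ' "$1"
}

# Example: hgconf_problems "$STORAGE/genomebrowser/cgi-bin/hg.conf"
```

Values that contain spaces further to the right, such as login.browserName above, are not reported, since only the characters directly around the first = are examined.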

Create a /root/bin/x86_64 directory and $STORAGE/kenthome/bin/x86_64 directory and a symbolic link

sudo mkdir -p /root/bin/x86_64
sudo mkdir -p $STORAGE/kenthome/bin/x86_64
sudo ln -s $STORAGE/kenthome/bin/x86_64 /root/bin/x86_64

Create the /gbdb symlink. Very important...
```
sudo ln -s $STORAGE/gbdb /gbdb
```
Before fetching the HTML files using updateHtml.sh, I edited the updateHtml.sh kent script in $STORAGE/scripts to also display the rsync output in stdout instead of in the log only. To do this, go to the ${RSYNC} commands towards the end of the script and replace
```
>> ${FETCHLOG} 2>&1
```
with
```
2>&1 | tee -a ${FETCHLOG}
```
Also add --verbose after ${RSYNC}. Save the file and then run it.
```
sudo sh updateHtml.sh ./browserEnvironment.txt
```
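For those not familiar with tee, here is a tiny self-contained illustration of the redirection used in the edited script. fake_rsync is just a stand-in for the real rsync call, and FETCHLOG points to a throwaway file here.

```shell
FETCHLOG="$(mktemp)"

# Stand-in for an rsync call: one line on stdout, one on stderr.
fake_rsync() {
    echo "receiving file list"
    echo "rsync warning" >&2
}

# 2>&1 comes BEFORE the pipe, so tee receives both streams,
# prints them on the terminal and appends them to the log.
fake_rsync 2>&1 | tee -a "${FETCHLOG}"
```

Keep in mind that the exit status of such a pipeline is that of tee and not of rsync, so a failed transfer has to be spotted in the output itself.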
Before fetching and compiling the source, we have to patch SAMtools to enable network support for BAM files. This has to be done manually, as SAMtools does not yet support it. The patch, as well as full instructions on how to apply it, can be found here. Please follow them carefully.
Now we have to run kentSrcUpdate.sh in order to fetch the latest code and build the binaries and CGIs from source. Open the kentSrcUpdate.sh script. Towards the end, replace the > daily.log etc. of the make commands with | tee -a (see also step 13) to display all messages in STDOUT. Then, run the script
```
sudo sh kentSrcUpdate.sh ./browserEnvironment.txt
```
We must now download the hgcentral database. We use the fetchHgCentral.sh script for that
```
sudo sh fetchHgCentral.sh go > $TEMP/hgcentral.sql
```

We must set up an SQL database to accept the file that we just downloaded, along with a genome browser user. Ideally, we should have one user with SELECT permissions and one user with ALL permissions... For now we set up a single user with ALL permissions, as the browser itself is graphical and does not allow for writing

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD \
-e "CREATE USER 'gbuser'@'localhost' identified by 'password'; FLUSH PRIVILEGES;"

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD -e "CREATE DATABASE hgcentral;"

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD \
-e "GRANT SELECT, INSERT, UPDATE, DELETE, INDEX, LOCK TABLES, \
CREATE, DROP, ALTER, CREATE TEMPORARY TABLES ON hgcentral.* \
TO 'gbuser'@'localhost'; FLUSH PRIVILEGES;"

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD \
-e "GRANT FILE ON *.* TO 'gbuser'@'localhost'; FLUSH PRIVILEGES;"

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD -e "CREATE DATABASE hgFixed;"

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD \
-e "GRANT SELECT ON hgFixed.* TO 'gbuser'@'localhost'; FLUSH PRIVILEGES;"

Import the hgcentral database
```
mysql -ugbuser -ppassword hgcentral < $TEMP/hgcentral.sql
```
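All the commands above pass the password on the command line, where other users of the machine can see it in the process list. A possible alternative, sketched here, is a private option file read through the --defaults-extra-file option of the mysql client. The helper name write_client_cnf and the file name in the example are my own choices.

```shell
# Create a MySQL client option file readable only by its owner.
# $1 = option file to create, $2 = MySQL user, $3 = password
write_client_cnf() {
    ( umask 077 && printf '[client]\nuser=%s\npassword=%s\n' "$2" "$3" > "$1" )
    chmod 600 "$1"
}

# Example (the option must come first on the mysql command line):
# write_client_cnf "$HOME/.gb-admin.cnf" USER_WITH_WRITE_PERMISSIONS PASSWORD
# mysql --defaults-extra-file="$HOME/.gb-admin.cnf" -e "SHOW DATABASES;"
```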
The basic genome browser session functionality should be almost ready. We need to create a couple more symbolic links to custom JavaScript and CSS files

Create the following symbolic links:

sudo ln -s $STORAGE/genomebrowser/cgi-bin $STORAGE/genomebrowser/htdocs/cgi-bin
sudo ln -s $STORAGE/genomebrowser/trash $STORAGE/genomebrowser/htdocs/trash

Create the /usr/local/apache/htdocs directory (nothing there) and then the following symbolic links:

sudo mkdir -p /usr/local/apache/htdocs
sudo ln -s $STORAGE/genomebrowser/htdocs/js /usr/local/apache/htdocs/js
sudo ln -s $STORAGE/genomebrowser/htdocs/style /usr/local/apache/htdocs/style
sudo ln -s $STORAGE/genomebrowser/htdocs/inc /usr/local/apache/htdocs/inc
sudo ln -s $STORAGE/genomebrowser/htdocs/images /usr/local/apache/htdocs/images
sudo mkdir -p /usr/local/apache/htdocs/goldenPath
sudo ln -s $STORAGE/genomebrowser/htdocs/goldenPath/help /usr/local/apache/htdocs/goldenPath/help
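The same links can be created in a loop, which is handy if you have to repeat the step after moving the installation. This is only a sketch: link_dirs is a hypothetical helper, and ln -sfn (GNU coreutils) replaces an existing link instead of failing.

```shell
# Link the named sub-directories of a source directory into a target directory.
# $1 = source directory, $2 = target directory, remaining arguments = names
link_dirs() {
    src="$1"
    dst="$2"
    shift 2
    mkdir -p "$dst"
    for name in "$@"; do
        ln -sfn "$src/$name" "$dst/$name"
    done
}

# Example (run as root):
# link_dirs "$STORAGE/genomebrowser/htdocs" /usr/local/apache/htdocs js style inc images
```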

At this point the website should be partially functional. Now we have to install some genome databases.

Change the ownership of the contents of the genomebrowser directory to www-data and restart apache
```
sudo chown -R www-data:www-data $STORAGE/genomebrowser
sudo /etc/init.d/apache2 restart
```

C. Installation of minimal genome databases

Create a file named my.minimal.db.list.txt and type the following (for 5 organisms):
```
hg18
hg19
mm9
mm10
dm3
hgFixed
```
Fetch the minimal gbdb information for these organisms by running the script fetchMinimalGbdb.sh. Before running it, edit the last lines, where fetchOne is called, and replace > with | tee -a to display information as before. Also add the --verbose option to the ${RSYNC} commands if additional information is essential to you (it was for me!).
```
sudo sh fetchMinimalGbdb.sh ./browserEnvironment.txt ./my.minimal.db.list.txt
```
Fetch the minimal golden path database information for these organisms by running the script fetchMinimalGoldenPath.sh. Before running, edit and replace lines for more verbosity, as in step 2.
```
sudo sh fetchMinimalGoldenPath.sh ./browserEnvironment.txt ./my.minimal.db.list.txt
```
The hg18 SQL table creation files have a syntax problem (at least with my MySQL version): they use the old TYPE= table option, which MySQL 5.5 no longer accepts in place of ENGINE=. Go to $STORAGE/genomebrowser/htdocs/goldenPath/hg18/database and run
```
sudo sed -i 's/TYPE=/ENGINE=/g' *.sql
```

Load the minimal golden path databases fetched with the script above

sudo sh loadDb.sh ./browserEnvironment.txt hg18
sudo sh loadDb.sh ./browserEnvironment.txt hg19
sudo sh loadDb.sh ./browserEnvironment.txt mm9
sudo sh loadDb.sh ./browserEnvironment.txt mm10
sudo sh loadDb.sh ./browserEnvironment.txt dm3
sudo sh loadDb.sh ./browserEnvironment.txt hgFixed
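Instead of typing one line per database, the commands can be generated from the database list file. A minimal sketch, where load_commands is a helper name invented for this guide and the list file is assumed to contain one database name per line:

```shell
# Print one loadDb.sh command per non-empty line of a database list file.
# $1 = file with one database name per line
load_commands() {
    while read -r db; do
        [ -n "$db" ] || continue
        echo "sudo sh loadDb.sh ./browserEnvironment.txt $db"
    done < "$1"
}

# Example: inspect the commands first, then pipe them to sh to run them.
# load_commands ./my.minimal.db.list.txt
# load_commands ./my.minimal.db.list.txt | sh
```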

Grant access to the new genome browser user

for DB in hg18 hg19 mm9 mm10 dm3
do
 mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD \
 -e "GRANT SELECT, INSERT, UPDATE, DELETE, INDEX, CREATE, DROP, ALTER, \
 CREATE TEMPORARY TABLES ON $DB.* TO 'gbuser'@'localhost'; FLUSH PRIVILEGES;" 
done

Now you should have basic track functionality in the local version of the UCSC Genome Browser. However, there are not many things that can be done apart from custom track exploration and sequence retrieval, as there are no gene annotations etc. The next section explains how we can customize the UCSC databases a bit further than the "take the minimum or all" approach of the kent scripts.

D. Installation of other genome database tables

As it is very costly in disk space (and most times useless) to install the full mirror of the Genome Browser databases, and there is no straightforward way to determine which feature corresponds to which table, I created a Perl script called fetchCustomDb.pl to fetch the tables we need. However, this is not completely automatic, as it requires some manual work to determine the tables for the required features (I did it using the UCSC Table Browser) and to note them down in the YAML configuration file required by the Perl script. The YAML configuration file is quite self-explanatory and contains these tables, in a format understandable by the Perl script, together with other variables. The script can be downloaded from here and the YAML parameter file from here.

I created the table list by fetching the tables for the features of interest in the UCSC Table Browser and then viewing the source of the page and copy-pasting the contents of the respective SELECT list. This could of course become more systematic by using some package to scrape the page (in the TODO list...). Built this way, the final table list contains a lot of duplicates, as many tables are interconnected, but the script takes care of that. As the tables and the way they are constructed across genomes are a mess (e.g. in some genomes features are split per chromosome, in others not), you need to explore the UCSC FTP server a bit to determine that.

Another script takes care of the external databases that have to be installed (e.g. GO, UniProt, etc.) by directly using mysqldump on the UCSC server. It can be downloaded from here. Once done, you can pass these tables to the parameter file and the script takes care of the rest. After you define all these, you just run
```
sudo perl fetchCustomDb.pl --param your_param_file.yml
```
The parameter file is optional if your needs are the same as mine (my settings are also loaded by default). It is advisable to use the --dry parameter first to see the total amount of data to be downloaded, as the UCSC data keep expanding.
```
sudo perl fetchCustomDb.pl --param your_param_file.yml --dry
```
The script uses a Perl interface for rsync. One would wonder why not use a simple shell script with multiple rsync lines. The answer for me is easy usage, reusability, elegance, system and maintenance!
Now we must reload the database tables. This will be done with the kent script loadDb.sh. However, keep in mind that in order to use this script, the databases created in step C.5 must be dropped as this script works only if the databases do not exist. This can be easily done using a GUI tool such as phpMyAdmin or webmin and sufficient user privileges, or even in command line by
```
mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD -e "DROP DATABASE genome_to_be_dropped;"
```
Do NOT drop the hgcentral database! The latest version of the aforementioned Perl script takes care of that for you. Just read the documentation.
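If you drop the databases from the command line, a small guard against dropping hgcentral by accident may be worth having. The sketch below only prints the SQL statements; drop_sql is a made-up helper and the list of protected names is where you would add your own.

```shell
# Print DROP statements for the given databases, refusing protected ones.
drop_sql() {
    for db in "$@"; do
        case "$db" in
            hgcentral)
                echo "skipping protected database: $db" >&2 ;;
            *)
                printf 'DROP DATABASE IF EXISTS %s;\n' "$db" ;;
        esac
    done
}

# Example: review the output, then feed it to mysql.
# drop_sql hg18 hg19 mm9 | mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD
```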

Set up a cron job in /etc/cron.weekly to clean the trash data from custom tracks. You can do this either in webmin or using the script below; name it clean-gb-trash and set its permissions to 755:

#!/bin/bash
# cron does not inherit your login shell variables, so set the base path here
STORAGE="/media/HD2"
find $STORAGE/genomebrowser/trash/ \! \( -regex "$STORAGE/genomebrowser/trash/ct/.*" \
 -or -regex "$STORAGE/genomebrowser/trash/hgSs/.*" \) -type f -amin +10080 -exec rm -f {} \;
find $STORAGE/genomebrowser/trash/    \( -regex "$STORAGE/genomebrowser/trash/ct/.*" \
 -or -regex "$STORAGE/genomebrowser/trash/hgSs/.*" \) -type f -amin +20160 -exec rm -f {} \;

and

sudo chmod 755 /etc/cron.weekly/clean-gb-trash
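Before letting a cleaner delete anything, it may be reassuring to list what it would remove. The sketch below uses the same -amin test as the script above (10080 minutes are 7 days, 20160 minutes are 14 days) but prints instead of deleting. list_stale is a name I invented.

```shell
# List the files under a directory that were last accessed more than
# the given number of minutes ago (nothing is deleted).
# $1 = directory, $2 = age threshold in minutes
list_stale() {
    find "$1" -type f -amin +"$2" -print
}

# Example: list_stale "$STORAGE/genomebrowser/trash" 10080
```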

Finally, download the processed Genbank files from UCSC FTP path /gbdb/genbank/./data/processed/* to $STORAGE/gbdb/genbank/./data/processed/

sudo rsync --archive --compress --partial --recursive --progress --stats --verbose --human-readable \
rsync://hgdownload.cse.ucsc.edu/gbdb/genbank/data/processed/* \
$STORAGE/gbdb/genbank/data/processed/

E. Setting up the custom track database

This section describes how to set up support for the custom track database, so as to avoid using the trash directories and achieve faster access. This feature was added to the UCSC Genome Browser at a later stage. You are advised to follow this step, as it is recommended for proper user sessions.

Enable the custom track database. This is not handled by the kent scripts. To do this, first create the customTrash database in MySQL
```
mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD -e "CREATE DATABASE customTrash;"
```

Create another user to work with custom tracks, e.g. ctgbuser (I created ctgbuser with password ctbguser@mydomain).

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD \
-e "CREATE USER 'ctgbuser'@'localhost' IDENTIFIED BY 'password'; FLUSH PRIVILEGES;"

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD \
-e "GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, ALTER ON customTrash.* \
TO 'ctgbuser'@'localhost'; FLUSH PRIVILEGES;"

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD \
-e "GRANT FILE ON *.* TO 'ctgbuser'@'localhost'; FLUSH PRIVILEGES;"

Create a temporary directory which is used by this functionality

sudo mkdir -p $STORAGE/genomebrowser/data/tmp
sudo chown -R www-data:www-data $STORAGE/genomebrowser/data/tmp

Enter the following items to hg.conf

customTracks.host=localhost
customTracks.user=ctgbuser
customTracks.password=password
customTracks.useAll=yes
customTracks.tmpdir=/media/HD2/genomebrowser/data/tmp

Make sure you do all things below as root, either with su or with sudo. Create a hidden directory .conf in $STORAGE/genomebrowser.
```
sudo mkdir $STORAGE/genomebrowser/.conf
```
In this directory, create the .ct.hg.conf file with the following contents:
```
db.host=localhost
db.user=ctgbuser
db.password=password
```
and set its permissions to 600
```
sudo chmod 600 $STORAGE/genomebrowser/.conf/.ct.hg.conf
```
Next, place a copy of hg.conf there too
```
sudo cp $STORAGE/genomebrowser/cgi-bin/hg.conf  $STORAGE/genomebrowser/.conf/.hg.conf
```
Finally, create two symbolic links in /root for these files
```
sudo ln -s $STORAGE/genomebrowser/.conf/.hg.conf /root/.hg.conf
sudo ln -s $STORAGE/genomebrowser/.conf/.ct.hg.conf /root/.ct.hg.conf
```

In /etc/cron.daily create the tmp cleaner script (cleans it daily) and name it clean-gb-tmp:

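#!/bin/bash
# cron does not inherit your login shell variables, so set the base path here
STORAGE="/media/HD2"
find $STORAGE/genomebrowser/data/tmp -type f -amin +1440 -exec rm -f {} \;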

and then

sudo chmod 755 /etc/cron.daily/clean-gb-tmp

Create the following script to be used with a cron job (better schedule it through webmin tool) to periodically clean the custom tracks database and name it clean-gb-ctdb
```
#!/bin/sh

# cron does not inherit your login shell variables, so set the base path here
STORAGE="/media/HD2"

DS=`date "+%Y-%m-%d"`
YYYY=`date "+%Y"`
MM=`date "+%m"`
export DS YYYY MM

# write the log inside the same directory that is created here
LOGDIR="$STORAGE/genomebrowser/data/trashLog/localhost/${YYYY}/${MM}"
mkdir -p ${LOGDIR}
RESULT="${LOGDIR}/${DS}.txt"
export RESULT

sudo $STORAGE/kenthome/bin/x86_64/dbTrash -age=168 -drop -verbose=2 \
> ${RESULT} 2>&1
```
and then
```
sudo chmod 755 /etc/cron.daily/clean-gb-ctdb
```
This runs daily and drops the custom track tables that are older than a week (-age is given in hours, and 168 = 7 x 24), keeping a log of each run. Don't forget to add $STORAGE/kenthome/bin/x86_64/dbTrash to your sudoers file, so that it does not ask for password confirmation. You can google how to do this, it's easy. The reason for this is that dbTrash uses .hg.conf, which is located under the /root home, and the default cron user (which is root by the way, strange...) cannot find it.

F. Setting up the Blat server (optional but required for most molecular biology labs)

Most tools required by blat have already been compiled (gfServer, gfClient, faToNib and blat). If some of them have not been compiled in step B.16, compile them separately (see also the first note in Notes).

Update the blatServers table in the hgcentral database with the address of your host (usually localhost).

mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD -e "USE hgcentral; \
UPDATE blatServers SET host='localhost' WHERE db LIKE '%hg18%'; \
UPDATE blatServers SET host='localhost' WHERE db LIKE '%hg19%'; \
UPDATE blatServers SET host='localhost' WHERE db LIKE '%mm9%'; \
UPDATE blatServers SET host='localhost' WHERE db LIKE '%mm10%'; \
UPDATE blatServers SET host='localhost' WHERE db LIKE '%dm3%';"

It would be good to back up this table first, or note down the initial entries, in case something goes wrong. You can also change the default ports (explore the relevant tables using the MySQL command line or phpMyAdmin).
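For more than a handful of genomes, the UPDATE statements can be generated rather than typed. A sketch follows, with blat_update_sql being a helper name of my own; it prints the same statements as above, so that they can be reviewed before being sent to MySQL.

```shell
# Print one UPDATE statement per database for the blatServers table.
# $1 = blat host, remaining arguments = database names
blat_update_sql() {
    host="$1"
    shift
    for db in "$@"; do
        printf "UPDATE blatServers SET host='%s' WHERE db LIKE '%%%s%%';\n" "$host" "$db"
    done
}

# Example: back up the table first, then apply the generated statements.
# mysqldump -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD hgcentral blatServers > blatServers.backup.sql
# blat_update_sql localhost hg18 hg19 mm9 mm10 dm3 | mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD hgcentral
```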

Everything else is straightforward. See also the instructions here on how to launch the blat server. I created a startup script and made it run every time my machine boots (with a certain delay, unfortunately). It also contains a lot of hardcoded paths; changing that is on the TODO list.
```
sudo chmod +x /etc/init.d/sblat
sudo update-rc.d sblat defaults
```

Notes

I faced a lot of problems with making the whole kent source tree. For example, one tool needed for cleaning the customTrash database, dbTrash, was not compiling... There was a problem with the mysql and sql libraries, so what I did was to install everything that had to do with dev packages from synaptic and then, in the /src directory of the /kent source, type make libs. I found this solution here. The tool was still not compiled at first, but when I visited /src/hg/dbTrash and typed make, it was. All of the above was done as root. In the same way, I compiled the hgsql tool.
It is recommended that you migrate the MySQL database storage folder from the default location, as the table sizes will explode fast, especially if you want to host a lot of features, so as to keep the filesystem light. The process is not very difficult and explained in many blogs/forums. Just google for it.
Be sure also to have a lot of available space for the gbdb directory, as all the big tracks (not suitable for a database, e.g. ENCODE tracks and genome files) are stored there.
To insert a new table in any Genome Browser database without loading everything from the beginning:
```
mysql -uUSER_WITH_WRITE_PERMISSIONS -pPASSWORD --local-infile -e \
"load data local infile '/path/to/my/table.ext' into table my_genome.table;"
```
The table must have been created first from the respective table.sql file! I will soon provide a couple of example scripts, as there have already been many times when I had to add extra tables (according to the needs of my colleagues) without rebuilding everything from the beginning.
There is a general problem with the $MACHTYPE environment variable outside the kent shell scripts that are used for the general Genome Browser build (the /src/product scripts). I fixed this by manually editing the makefile of each additional tool I wanted to use (e.g. the spToDb tool), replacing the line
```
MYLIBDIR = ../../../lib/$(MACHTYPE)
```
with
```
MYLIBDIR = ../../../lib/x86_64
```

TODO

Log rotation scripts for the logs maintained by most of the UCSC Genome Browser tools... If someone has done it, I would really appreciate some sharing!
Better scraping of the Table Browser pages for table names?

2 comments:

Unknown said...: Thanks God (Jim Kent) for make the Hubs exist!; 29 July 2013 at 20:58
Unknown said...: Thanks for this tuto. I have a issue in step B.14 when i run "sudo sh kentSrcUpdate.sh ./browserEnvironment.txt", system return this error :
git update report summary is in email to root
kentSrcUpdate.sh: 73: kentSrcUpdate.sh: mail: not found
make: execvp: ./hg/sqlEnvTest.sh: Permission denied
make: *** [hgLib] Error 127
make: *** Waiting for unfinished jobs....
/bin/sh: 1: ./machTest.sh: Permission denied
make: *** [topLibs] Error 126
egrep: daily.log: No such file or directory; 5 September 2014 at 05:30
