Revision 13 as of 2014-02-21 13:12:10, editor: NataliaNikitina

BOINC Project Setup for Virtual Drug Screening

This page introduces and summarises the use of BOINC to orchestrate tasks that dock small chemical compounds to a protein. Typically a flexible ligand is fitted to a rigid structure, or to a set of structures that capture the protein at different moments. Our ambition is to bring all components directly into a regular Debian package or to provide them as dependencies. The authors of this page each run their own web site with all components available in Debian, but the development needed to round it all up is still ongoing, and this documentation in particular is unfinished. To join in, please contact us. The corresponding package is maintained at

http://anonscm.debian.org/gitweb/?p=pkg-boinc/boinc-server-autodock.git

In the near future, the steps listed below are intended to be automated and performed via the GUI provided by the Raccoon software of MGLTools. Until then, we suggest relying on these instructions as the basis for setting up and running a BOINC project for virtual drug screening. If you do not have a BOINC server installed, you may find the corresponding page useful.

1. Conceptual Overview

2. Preparation of BOINC side

2.1. Add AutoDock Vina application to a BOINC server

2.2. Inform local database of available binaries

3. Preparation of Docking side

3.1. Make a database of receptor models for screening

3.2. Make a database of ligand models for screening

3.3. Set configuration parameters for docking

4. Management of running project

4.1. Implement an assimilator program for collecting docking results

For a quick start, we suggest simply using the sample assimilator provided in the BOINC source. With the result template that we created, it collects output files into the sample_results folder under the main project directory.

4.2. Create a bash script to generate workunits

To automate the generation of workunits based on the ligand and receptor libraries, we suggest the following script as a basis. It must be launched from your "source" project folder (see above).

    #!/bin/bash
    # Generation of BOINC workunits for the AutoDock Vina application
    set -e
    if [ -z "$BOINC_SOURCEDIR" ]; then BOINC_SOURCEDIR=$(dirname "$0"); fi
    # Set configuration parameters
    . "$BOINC_SOURCEDIR/autodockvina_set_config.sh"
    # Next batch number = highest batch already in the database + 1.
    # "|| true" keeps "set -e" from aborting before the error message below is printed.
    BATCH=$(mysql -u "$BOINC_DBUSER" -p"$BOINC_DBPASS" -N -s -e "use ${BOINC_PROJECTNAME}; select MAX(batch) from workunit;" 2> /dev/null) || true
    if [ -z "$BATCH" ]; then
      echo "E: Error selecting batch number from the database! Please check MySQL connection parameters."
      exit 1
    fi
    # An empty workunit table yields NULL, which the arithmetic treats as 0.
    BATCH=$((BATCH + 1))
    LIBRARY="${BOINC_HOMEDIR}/my_autodock_vina_library"
    cd "${BOINC_INSTALLROOT}/${BOINC_PROJECTNAME}"
    for lig_file in "$LIBRARY"/ligands/test_lib/*.pdbqt
    do
      lig_name=$(basename "$lig_file" .pdbqt)
      for conf_file in "$LIBRARY"/configs/test_config.txt
      do
        conf_name=$(basename "$conf_file" .txt)
        stamp=$(date '+%s')
        ligand_input=ligand_input_tmp_${lig_name}_${stamp}
        receptor_input=test_receptor.pdbqt
        config_input=config_tmp_${conf_name}
        cp "$lig_file" "$ligand_input"
        cp "$LIBRARY/receptors/test_receptor.pdbqt" "$receptor_input"
        cp "$conf_file" "$config_input"
        # Stage input files
        ./bin/stage_file --copy "$receptor_input"
        ./bin/stage_file --copy "$ligand_input"
        ./bin/stage_file --copy "$config_input"
        # Generate workunit; input files are passed in the order ligand, receptor, config
        wuname=test_${lig_name}_${BATCH}_${conf_name}_${stamp}
        ./bin/create_work --appname autodock-vina --batch "$BATCH" --wu_name "$wuname" \
                          --wu_template templates/autodockvina_wu_template.xml \
                          --result_template templates/autodockvina_result_template.xml \
                          --command_line "--cpu 1 --receptor receptor.pdbqt --ligand ligand.pdbqt --config config.txt --out vina_result.pdbqt" \
                          "$ligand_input" "$receptor_input" "$config_input"
        echo "$wuname is prepared successfully."
      done
    done

    echo "I: Workunits were successfully created for batch #${BATCH}."
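
Before queueing real work, it can help to preview which workunits the loop above would create. The following is only a minimal sketch and not part of the package: the function name list_workunit_names and its arguments are hypothetical, and it merely mirrors the naming scheme of the script (test_<ligand>_<batch>_<config>_<timestamp>) without calling stage_file or create_work.

```shell
# Dry-run sketch: print the workunit names the generation loop would use.
# list_workunit_names is a hypothetical helper, not part of BOINC.
# Arguments: LIGAND_DIR CONFIG_DIR BATCH [TIMESTAMP (default: now)]
list_workunit_names() {
  local lig_dir="$1"
  local conf_dir="$2"
  local batch="$3"
  local stamp="${4:-$(date '+%s')}"
  local lig_file conf_file
  for lig_file in "$lig_dir"/*.pdbqt; do
    [ -e "$lig_file" ] || continue        # glob matched nothing
    for conf_file in "$conf_dir"/*.txt; do
      [ -e "$conf_file" ] || continue
      echo "test_$(basename "$lig_file" .pdbqt)_${batch}_$(basename "$conf_file" .txt)_${stamp}"
    done
  done
}

# Example call (the paths follow the layout assumed by the script above):
#   list_workunit_names "$BOINC_HOMEDIR/my_autodock_vina_library/ligands/test_lib" \
#                       "$BOINC_HOMEDIR/my_autodock_vina_library/configs" 7
```

Unlike the script above, this sketch iterates over every *.txt file in the configuration directory, which is an assumption about how one might extend the library beyond test_config.txt.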

5. Result Collection

5.1. Create a bash script to filter out docking results

The names of the top-scoring compounds may be extracted from the output files with standard Linux utilities. We propose the following example:

    #!/bin/bash
    # Call: ./get_top_energies.sh [DIRECTORY WITH RESULT FILES (DEFAULT '.')] [NUMBER OF HITS (DEFAULT 10)]
    TOP=10
    OUTPUTSDIR='.'
    if [ $# -ge 1 ] ; then
      if [ -d "$1" ] ; then
        OUTPUTSDIR=$1
        if [ $# -ge 2 ] ; then
          if ! [[ $2 =~ ^[0-9]+$ ]] ; then
            echo "Please specify the number of hits as a non-negative integer."
            exit 1
          else
            TOP=$2
          fi
        fi
      else
        echo "Please specify the directory with result files first, then the number of hits."
        exit 1
      fi
    fi
    for f in "${OUTPUTSDIR}"/*_0     # output files with top binding affinities, generated by the BOINC assimilator
    do
      [ -e "$f" ] || continue        # no result files found
      # Print the file name and the first table row: the best mode for this compound (others may be useful later)
      echo -n "$(basename "$f")"; sed -n '/-+-/{n;p}' "$f"
    done | sed 's/  */ /g' | cut -f1,3 -d' ' | sort -n -k 2,2 | head -n "$TOP"

This bash script, executed in the directory with the output files or given that directory as a parameter, should print the ligand names together with their binding affinity values. This data is likely to be of first interest to the project owner, and the script may easily be extended to extract more detailed information about docking results from the output files and the database. From a social point of view, it can be interesting to obtain the list of top ligands together with the names of the users whose computers found them, and to display it on the project website.
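
To see what the filtering pipeline does, the sketch below runs the same sed/cut/sort steps on three made-up result files. The file names and affinity values are invented purely for illustration. The table layout is an assumption: it is taken to match the result table printed by AutoDock Vina, that is, a dashed separator line containing "-+-" followed by one row per binding mode. The names top_hits and write_log are hypothetical helpers.

```shell
# top_hits DIR N: print the N ligands with the lowest (best) affinity.
top_hits() {
  dir="${1:-.}"
  n="${2:-10}"
  for f in "$dir"/*_0; do
    [ -e "$f" ] || continue
    # File name, then the first table row after the "-----+-----" separator
    printf '%s' "$(basename "$f")"
    sed -n '/-+-/{n;p;}' "$f"
  done | sed 's/  */ /g' | cut -f1,3 -d' ' | sort -n -k 2,2 | head -n "$n"
}

# write_log FILE AFFINITY: write a fake Vina-style result table (illustration only).
write_log() {
  printf '%s\n' \
    'mode |   affinity | dist from best mode' \
    '     | (kcal/mol) | rmsd l.b.| rmsd u.b.' \
    '-----+------------+----------+----------' \
    "   1         $2      0.000      0.000" \
    '   2         -1.0      1.500      2.000' > "$1"
}

demo=$(mktemp -d)
write_log "$demo/lig_A_0" -7.2
write_log "$demo/lig_B_0" -9.1
write_log "$demo/lig_C_0" -5.4
top_hits "$demo" 2
# prints:
#   lig_B_0 -9.1
#   lig_A_0 -7.2
rm -rf "$demo"
```

Only the first row after the separator is kept for each file, so a compound is ranked by its best binding mode; sorting numerically in ascending order puts the most negative (strongest) affinities first.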