A note on learning to run AutoDock Vina

Author: Xiping Gong (xipinggong@hotmail.com, Department of Food Science and Technology, College of Agricultural and Environmental Sciences, University of Georgia, Griffin, GA, USA)

Date: 01/16/2025

Introduction

AutoDock Vina is an open-source molecular docking software designed for computationally predicting how small molecules, such as drug candidates, bind to a receptor of known 3D structure, typically a protein. It is a successor to AutoDock, focusing on speed, accuracy, and ease of use. Please check out the following for the details.

AutoDock Vina Website: https://vina.scripps.edu

Tutorial for the Installation & Practice: https://autodock-vina.readthedocs.io/en/latest/introduction.html

This note provides two examples. The first example is from the official website, and anther example is to dock the PFOA molecule into the human serum albumin protein. In the second example, this note can help us to learn how to dock a molecule into a protein automatically.

The first example from the official website

This is an example from the official documentation. This example shows how to dock a single molecule into a rigid receptor. Link: https://autodock-vina.readthedocs.io/en/latest/docking_basic.html.

Prepare the inputs

Two input files are necessary to get it started: They can be found from the GitHub: https://github.com/ccsb-scripps/AutoDock-Vina/tree/develop/example/basic_docking/data, including

1iep_ligand.sdf
1iep_receptorH.pdb

or, run the following in the bash to get the inputs.

$ wget https://raw.githubusercontent.com/ccsb-scripps/AutoDock-Vina/refs/heads/develop/example/basic_docking/data/1iep_receptorH.pdb
$ wget https://raw.githubusercontent.com/ccsb-scripps/AutoDock-Vina/refs/heads/develop/example/basic_docking/data/1iep_ligand.sdf

Run AutoDock Vina using bash

# 1. Preparing the receptor
# mk_prepare_receptor.py is a built-in script
$ mk_prepare_receptor.py -i 1iep_receptorH.pdb -o 1iep_receptor -p -v \
--box_size 20 20 20 --box_center 15.190 53.903 16.917

# 2. Preparing the ligand
# mk_prepare_ligand.py is a built-in script
$ mk_prepare_ligand.py -i 1iep_ligand.sdf -o 1iep_ligand.pdbqt

# 3. Preparing the box parameters
# Saving the following into the 1iep_receptor.box.txt for AutoDock Vina.
# The center of the box should be at the approximate center of the binding pocket.
# The size of the box defines the dimensions (in Ångströms) of the search space.
# Guidelines:
# The box should fully enclose the binding pocket.
# Add a margin of 5–10 Å on each side to allow for ligand flexibility.

center_x = 15.190
center_y = 53.903
center_z = 16.917
size_x = 20.0
size_y = 20.0
size_z = 20.0

# 4. Running the AutoDock Vina
$ vina --receptor 1iep_receptor.pdbqt \
       --ligand 1iep_ligand.pdbqt \
       --config 1iep_receptor.box.txt \
       --out 1iep_ligand_vina_out.pdbqt \
       --exhaustiveness=32 

# The output will be as follows,
Scoring function : vina
Rigid receptor: 1iep_receptor.pdbqt
Ligand: 1iep_ligand.pdbqt
Grid center: X 15.19 Y 53.903 Z 16.917
Grid size  : X 20 Y 20 Z 20
Grid space : 0.375
Exhaustiveness: 32
CPU: 0
Verbosity: 1

Computing Vina grid ... done.
Performing docking (random seed: -1745072807) ...
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1       -13.22          0          0
   2       -11.26      3.038      12.41
   3       -11.19     0.9483      1.658
   4       -11.16      3.875      12.35
   5       -10.61      2.572      12.63
   6       -9.815      3.007      12.56
   7       -9.209      2.975      13.09
   8       -8.883      3.969      12.69
   9       -8.822       3.51      12.14

Run AutoDock Vina using Python

Additionally, this example can be done in a Python script saved as a "test.py".

#!/usr/bin/python
# -*- coding: utf-8 -*-

# This is an example to show how to use the AutoDock Vina in Python.
# The code can be found in the following link.
# Link: https://autodock-vina.readthedocs.io/en/latest/docking_python.html

from vina import Vina

v = Vina(sf_name='vina')

# collect the input data
v.set_receptor('1iep_receptor.pdbqt')
v.set_ligand_from_file('1iep_ligand.pdbqt')
v.compute_vina_maps(center=[15.190, 53.903, 16.917], box_size=[20, 20, 20])

# Score the current pose
energy = v.score()
print('Score before minimization: %.3f (kcal/mol)' % energy[0])

# Minimized locally the current pose
energy_minimized = v.optimize()
print('Score after minimization : %.3f (kcal/mol)' % energy_minimized[0])
v.write_pose('1iep_ligand_minimized.pdbqt', overwrite=True)

# Dock the ligand
v.dock(exhaustiveness=32, n_poses=20)
v.write_poses('1iep_ligand_vina_out.pdbqt', n_poses=5, overwrite=True)

Run the script in your terminal, for example,

$ python test.py

# The following is the output.
Computing Vina grid ... done.
Score before minimization: -12.513 (kcal/mol)
Performing local search ... done.
Score after minimization : -13.170 (kcal/mol)
Performing docking (random seed: -1693342343) ...
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1       -13.28          0          0
   2       -11.31      3.053       12.4
   3       -11.22      1.086       1.81
   4       -11.09      3.918      12.29
   5       -11.02      1.487      2.015
   6       -10.65      2.579      12.62
   7       -10.34       1.79      13.54
   8       -9.934      3.556      12.29
   9       -9.732      2.543      12.54
  10       -9.615      2.732       12.6
  11       -9.261      2.397       13.8
  12       -9.098      1.855      12.89
  13       -9.012      3.897      12.66
  14       -8.884      3.586      12.14
  15       -8.514      1.513       2.24
  16       -8.508      4.137      6.491
  17       -8.217      3.699       12.5
  18       -8.174       2.47      4.095
  19       -7.967      3.133      6.051

View the docked pose

# Convert the pdbqt to pdb using the open babel, and then we can use the VMD to view pose.pdb
$ obabel 1iep_ligand_vina_out.pdbqt -O pose.pdb

A new example: PFOA - human serum albumin (hSA) protein

The goal of this example is to test whether using the AutoDock Vina can accurately predict the binding of PFOA with the hSA protein. To test it, I integrated all scripts (Python and Bash) together, so that we can automatically screen other potential PFAS molecules.

Background

Reference Maso, Lorenzo, et al. "Unveiling the binding mode of perfluorooctanoic acid to human serum albumin." Protein Science 30.4 (2021): 830-841. DOI: https://doi.org/10.1002/pro.4036

Alt text

Figure 1. Structure of hSA in complex with PFOA and Myr. Chemical structure (top) and composite omit maps depicting the (Fo−Fc) electron density (bottom) of PFOA (a) and Myr (b) contoured at 4σ; (c) Crystal structure of hSA-PFOA-Myr complex (white) obtained using a twofold molar excess of PFOA over Myr [PDB identification code: 7AAI]; (d) Superimposition of hSA-PFOA-Myr ternary complex (white) with aligned hSA-Myr binary complex (blue white) [PDB identification code: 7AAE]. The structure of hSA is organized in homologues domains (I, II and III), subdomains (A and B), fatty acids (FA) and Sudlow's binding sites. The α-helices of hSA are represented by cylinders. Bound PFOA and Myr are shown in a ball-and-stick representation with a semi-transparent van der Waals and colored by atom type (PFOA: carbon = dark salmon, oxygen = firebrick, fluorine = palecyan; Myr: carbon = smudge green, oxygen = firebrick). The electron density PFOA and Myr is shown as grey mesh. (Note: I switched the "7AAE" with "7AAI" after checking out both structures from the PDB database.)

A general script to run the docking

# 1. Download the PDB file
# PDBID = 7AAI. Link: https://www.rcsb.org/structure/7AAI
$ wget https://files.rcsb.org/pub/pdb/compatible/pdb_bundle/aa/7aai/7aai-pdb-bundle.tar.gz
$ tar -xvzf 7aai-pdb-bundle.tar.gz
$ cp 7aai-pdb-bundle1.pdb 7aai.pdb

# 2. Play with it and save the PDB file of hSA protein by using the VMD (if you do not have VMD, you can also ignore this step).In this PDB file, we have four PFOA binding pocket, and the pocket 1 has the strongest binding.
Protein: resid 3 to 583
PFOA1: resid 601
PFOA2: resid 604
PFOA3: resid 603
PFOA4: resid 602

# 3. Now, we are using a bash script to run our job.
# Here, I investigated how the box size impacts on the docking performance. I selected the pocket of PFOA1 as the binding pocket, and increase the box size to see whether it will predict the docking site accurately.

$ bash run.sh | tee run.log # Please see the run.sh script in the Appendix for the details.

# In this bash script, it shows a general way to run the docking jobs. It has a Python script "autodock_vina.py", which can be used to generate the best docking pose. The input files for this script have the receptor.pdbqt, ligand.pdbqt, and pocket_params.txt. The output files include the ligand_docked.pdbqt and ligand_minimized.pdbqt.

Analysis & Conclusion

Alt Text

Figure 2. Impact of box size on the docking performance of AutoDock Vina for PFOA docking in human serum albumin (HSA) (PDBID = 7AAI). The PFOA1 (Exp.) structure represents the experimentally determined native binding pose of PFOA in HSA. Key interacting residues, such as TYR and SER, are highlighted along with the measured distances between interacting atoms. AutoDock Vina was used to dock PFOA with varying box sizes. Box centers are defined as (8.6, 4.9, 19.2), and box sizes tested include (8.5, 8.5, 8.5), (13.5, 13.5, 13.5), (18.5, 18.5, 18.5), and (23.5, 23.5, 23.5). The docking results show that smaller box sizes better predicted the native location of PFOA, while larger box sizes resulted in predicted poses that were far from the experimentally observed binding site.

This suggests that how to choose the box parameters is a key to improve the prediction.

Question: What if we do not have any knowledge about the location of pocket? This prediction could have a low reliability.

Appendix

run.sh

#!/bin/bash

echo "# Effect of box size on the docking performance"
echo "# ============================================="
echo ''

SECONDS=0  # Reset the timer

# Clean the pdb file: PDBID = 7aai
echo "# Clean the pdb file: PDBID = 7aai, because 7aai.pdb has two conformations A and B, so conformation B was removed"
echo "$ python clean_pdb.py 7aai.pdb 7aai_cleaned.pdb"
        python clean_pdb.py 7aai.pdb 7aai_cleaned.pdb
echo ''

# Generate the pdb files
echo "# Cleaned pdb file -->  x_receptor.pdb and x_ligand.pdb"
echo "$ python get_pdb_byresid.py 7aai_cleaned.pdb --out x_receptor.pdb --start_resid   3 --end_resid 583"
        python get_pdb_byresid.py 7aai_cleaned.pdb --out x_receptor.pdb --start_resid   3 --end_resid 583
echo "$ python get_pdb_byresid.py 7aai_cleaned.pdb --out x_ligand.pdb   --start_resid 601 --end_resid 601"
        python get_pdb_byresid.py 7aai_cleaned.pdb --out x_ligand.pdb   --start_resid 601 --end_resid 601
echo "$ Saving native pose: $ mv x_ligand.pdb pose0.pdb"
        cp x_ligand.pdb pose0.pdb
echo ''

# Generate the pdbqt files
# - receptor
echo "# x_receptor.pdb --> x_receptor.pdbqt by using the mk_prepare_receptor.py"
echo "$ mk_prepare_receptor.py -i x_receptor.pdb -o x_receptor -p -v --box_size 20 20 20 --box_center 0.0 0.0 0.0 # ignore the box_size and box_center"
        mk_prepare_receptor.py -i x_receptor.pdb -o x_receptor -p -v --box_size 20 20 20 --box_center 0.0 0.0 0.0 # ignore the box_size and box_center
echo ''
# - ligand
echo "# x_ligand.pdb --> x_ligand.sdf using the obabel"
echo "$ obabel x_ligand.pdb -O x_ligand.sdf  -p 7.4 # pH 7.4"
        obabel x_ligand.pdb -O x_ligand.sdf  -p 7.4 # pH 7.4
echo "# x_ligand.sdf --> x_ligand.pdbqt using the mk_prepare_ligand.py"
echo "mk_prepare_ligand.py -i x_ligand.sdf -o x_ligand.pdbqt"
      mk_prepare_ligand.py -i x_ligand.sdf -o x_ligand.pdbqt
echo ''

# Define the array of values for the first argument
values=(0.0 5.0 10.0 15.0) # Add as many values as needed

# Counter for PDB file numbering
counter=1

# Loop through each value in the array
for value in "${values[@]}"; do
  echo -e "# Running: $ bash vina.sh $value"
  echo -e "# ------------------------------\n"

  echo -e "# Running an AutoDock Docking Job\n"

  margin=$value  # argument input

  # Generate the pocket parameters to dock
  echo "# Cleaned pdb file --> x_pocket_params.txt by selecting the residues"
  echo "$ python calc_pocket_params.py 7aai_cleaned.pdb --start_resid 601 --end_resid 601 --margin $margin | tee x_pocket_params.txt"
          python calc_pocket_params.py 7aai_cleaned.pdb --start_resid 601 --end_resid 601 --margin $margin | tee x_pocket_params.txt
  echo ''

  # Run AutoDock Vina
  echo "# Generate the best docking pose by using the docking_vina.py"
  python autodock_vina.py \
      --receptor x_receptor.pdbqt \
      --ligand x_ligand.pdbqt \
      --box_params x_pocket_params.txt \
      --output x_ligand_docked.pdbqt \
      --minimized x_ligand_minimized.pdbqt \
      --exhaustiveness 32 \
      --n_poses 1
  echo ''

  echo "# Convert the docking pose pdbqt into pdb"
  echo "$ obabel x_ligand_docked.pdbqt -O pose.pdb"
          obabel x_ligand_docked.pdbqt -O pose.pdb
  echo ''

  # Rename the output PDB file with an incremented number
  new_pose_file="pose${counter}.pdb"
  echo "# Renaming pose.pdb to ${new_pose_file}"
  mv pose.pdb "$new_pose_file"

  # Done 
  echo -e "# Finished the AutoDock Docking Job\n"

  # Increment the counter
  counter=$((counter + 1))

done

# Done
echo "# The total time spent: $SECONDS seconds."
echo "# -----------------"
echo ''

clean_pdb.py

import argparse

# Argument parser for input and output files
parser = argparse.ArgumentParser(description="Clean a PDB file by keeping only conformation 'A' or unmarked alternates.")
parser.add_argument("input_file", help="Path to the input PDB file")
parser.add_argument("output_file", help="Path to the output PDB file")

args = parser.parse_args()

# Process the input file and clean it
with open(args.input_file, "r") as infile, open(args.output_file, "w") as outfile:
    for line in infile:
        if line.startswith("ATOM") or line.startswith("HETATM"):
            alt_loc = line[16]  # Alternate location indicator (column 17)
            if alt_loc == 'A' or alt_loc == ' ':  # Keep conformation 'A' or no alternates
                # Replace the alternate location indicator with a blank space
                new_line = line[:16] + ' ' + line[17:]
                outfile.write(new_line)
        else:
            # Write non-ATOM/HETATM lines as is
            outfile.write(line)

get_pdb_byresid.py

import argparse

# Argument parser for specifying the residue range and output file
parser = argparse.ArgumentParser(description="Filter residues in a specified range from a PDB file.")
parser.add_argument("input_file", help="Path to the input PDB file")
parser.add_argument("--out", required=True, help="Path to the output PDB file")
parser.add_argument("--start_resid", type=int, required=True, help="Start of residue range (inclusive)")
parser.add_argument("--end_resid", type=int, required=True, help="End of residue range (inclusive)")

args = parser.parse_args()

# Read and process the PDB file
with open(args.input_file, "r") as infile, open(args.out, "w") as outfile:
    for line in infile:
        if line.startswith("ATOM") or line.startswith("HETATM"):
            resid = int(line[22:26].strip())  # Extract residue ID from columns 23-26
            if args.start_resid <= resid <= args.end_resid:  # Check if resid is within the range
                outfile.write(line)
        else:
            # Write non-ATOM/HETATM lines as is
            outfile.write(line)

calc_pocket_params.py

import argparse

def calculate_pocket_parameters(input_file, start_resid, end_resid, margin):
    """
    Calculate the center and size of the binding pocket for AutoDock Vina, with an optional margin.
    Args:
        input_file (str): Path to the PDB file.
        start_resid (int): Start of the residue range.
        end_resid (int): End of the residue range.
        margin (float): Margin to add to the box size (Ångströms).
    Returns:
        dict: Dictionary with center (x, y, z) and size (x, y, z) of the pocket.
    """
    x_coords, y_coords, z_coords = [], [], []

    with open(input_file, "r") as pdb_file:
        for line in pdb_file:
            if line.startswith("ATOM") or line.startswith("HETATM"):
                resid = int(line[22:26].strip())  # Extract residue ID from columns 23-26
                if start_resid <= resid <= end_resid:
                    x = float(line[30:38].strip())  # X coordinate
                    y = float(line[38:46].strip())  # Y coordinate
                    z = float(line[46:54].strip())  # Z coordinate
                    x_coords.append(x)
                    y_coords.append(y)
                    z_coords.append(z)

    # Calculate center
    center_x = (max(x_coords) + min(x_coords)) / 2
    center_y = (max(y_coords) + min(y_coords)) / 2
    center_z = (max(z_coords) + min(z_coords)) / 2

    # Calculate size with margin
    size_x = (max(x_coords) - min(x_coords)) + margin
    size_y = (max(y_coords) - min(y_coords)) + margin
    size_z = (max(z_coords) - min(z_coords)) + margin

    # Compute the maximum of size_x, size_y, and size_z and assign it
    max_size = max(size_x, size_y, size_z)
    size_x = max_size
    size_y = max_size
    size_z = max_size

    # Print the box size
    # print(f"The size of box is: {max_size}")

    return {
        "center_x": center_x,
        "center_y": center_y,
        "center_z": center_z,
        "size_x": size_x,
        "size_y": size_y,
        "size_z": size_z,
    }

if __name__ == "__main__":
    # Argument parser for dynamic inputs
    parser = argparse.ArgumentParser(description="Calculate binding pocket center and size for AutoDock Vina.")
    parser.add_argument("input_file", help="Path to the input PDB file")
    parser.add_argument("--start_resid", type=int, required=True, help="Start of residue range (inclusive)")
    parser.add_argument("--end_resid", type=int, required=True, help="End of residue range (inclusive)")
    parser.add_argument("--margin", type=float, default=0.0, help="Margin to add to the box size (Ångströms, default=0.0)")

    args = parser.parse_args()

    # Calculate parameters
    parameters = calculate_pocket_parameters(args.input_file, args.start_resid, args.end_resid, args.margin)

    # Output results in the desired format
    print(f"center_x = {parameters['center_x']:.3f}")
    print(f"center_y = {parameters['center_y']:.3f}")
    print(f"center_z = {parameters['center_z']:.3f}")
    print(f"size_x = {parameters['size_x']:.3f}")
    print(f"size_y = {parameters['size_y']:.3f}")
    print(f"size_z = {parameters['size_z']:.3f}")

autodock_vina.py

#!/usr/bin/python
import argparse
from vina import Vina

def parse_box_params(file_path):
    """Parse a text file to extract center and size parameters."""
    params = {}
    with open(file_path, 'r') as f:
        for line in f:
            key, value = line.strip().split('=')
            params[key.strip()] = float(value.strip())
    center = [params['center_x'], params['center_y'], params['center_z']]
    size = [params['size_x'], params['size_y'], params['size_z']]
    return center, size

def main():
    # Argument parser for flexible input
    parser = argparse.ArgumentParser(description="Run molecular docking using AutoDock Vina.")
    parser.add_argument("--receptor", required=True, help="Path to the receptor PDBQT file")
    parser.add_argument("--ligand", required=True, help="Path to the ligand PDBQT file")
    parser.add_argument("--box_params", required=True, help="Path to the text file specifying center and size parameters")
    parser.add_argument("--output", default="ligand_docked.pdbqt", help="Output file for the docked ligand")
    parser.add_argument("--minimized", default="x_ligand_minimized.pdbqt", help="Output file for the minimized ligand")
    parser.add_argument("--exhaustiveness", type=int, default=8, help="Exhaustiveness of the docking (default: 8)")
    parser.add_argument("--n_poses", type=int, default=1, help="Number of docking poses to generate (default: 1)")
    args = parser.parse_args()

    # Parse box parameters from the text file
    center, size = parse_box_params(args.box_params)

    # Initialize Vina
    v = Vina(sf_name='vina')

    # Set receptor and ligand
    v.set_receptor(args.receptor)
    v.set_ligand_from_file(args.ligand)

    # Set docking box
    v.compute_vina_maps(center=center, box_size=size)

    # Score the current pose
    energy = v.score()
    print('Score before minimization: %.3f (kcal/mol)' % energy[0])

    # Minimize the ligand
    energy_minimized = v.optimize()
    print('Score after minimization : %.3f (kcal/mol)' % energy_minimized[0])
    v.write_pose(args.minimized, overwrite=True)
    print(f"Minimized pose saved to: {args.minimized}")

    # Perform docking
    v.dock(exhaustiveness=args.exhaustiveness, n_poses=10)
    v.write_poses(args.output, n_poses=args.n_poses, overwrite=True)

    print(f"Docking complete. Results saved to: {args.output}")

if __name__ == "__main__":
    main()