This document explains how to run AlphaFold on the Hoffman2 High Performance Computing (HPC) cluster using the AlphaFold container, a container image that packages AlphaFold. The container runs on Hoffman2 through the Apptainer container runtime.
Downloading Data
Start a job on a compute node
You must first start a job on a compute node to use AlphaFold and Apptainer:
qrsh -l h_data=10G,h_rt=12:00:00
This example requests 10 GB of memory and a 12-hour time limit. A non-GPU node is sufficient if you are only downloading data. You can also submit the work as a non-interactive job with a qsub job script.
Setting Up the Data Directory
Before initiating AlphaFold, it's essential to download the required datasets. Start by setting the directory for data download:
export DOWNLOAD_DIR=$SCRATCH/alphafoldtest/data
Execute the data download script
Utilize the download scripts provided by AlphaFold for setting up databases. These scripts are located at /app/alphafold/scripts within the container. Refer to the AlphaFold GitHub repository for a detailed list of these scripts.
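To see which download scripts this particular container ships, one option is to list that directory from inside the container. The sketch below wraps the call in a small helper. The function name list_alphafold_scripts is hypothetical, and the sketch assumes that module load apptainer has already been run so that H2_CONTAINER_LOC is set, as in the other commands in this document.

```shell
# Hypothetical helper: list the download scripts bundled in the container.
# Assumes `module load apptainer` has been run so that H2_CONTAINER_LOC is set.
list_alphafold_scripts() {
    apptainer exec "$H2_CONTAINER_LOC/h2-alphafold.sif" ls /app/alphafold/scripts
}
```

After loading the module, calling list_alphafold_scripts prints the script names found in the container, which can then be compared with the list in the AlphaFold GitHub repository.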
Run the following command to download all necessary data:
module load apptainer
apptainer exec $H2_CONTAINER_LOC/h2-alphafold.sif /app/alphafold/scripts/download_all_data.sh $DOWNLOAD_DIR
Note: Ensure there is adequate storage space in the specified directory for the downloaded data.
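One way to act on this note is to check the free space before starting the download. The following is a minimal sketch, not part of AlphaFold or Hoffman2: the function name check_free_space and the threshold you pass to it are hypothetical, and the required size should be taken from the AlphaFold README for the databases you plan to download.

```shell
# Hypothetical helper: succeed only if DIR has at least REQUIRED_GB gigabytes free.
# Usage: check_free_space DIR REQUIRED_GB
check_free_space() {
    local dir="$1" required_gb="$2" avail_kb avail_gb
    # Create the directory if needed so that df can inspect it.
    mkdir -p "$dir" || return 1
    # df -P prints one POSIX-format line per filesystem;
    # with -k, column 4 is the available space in 1K blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    avail_gb=$((avail_kb / 1024 / 1024))
    echo "Available in $dir: ${avail_gb} GB (need ${required_gb} GB)"
    [ "$avail_gb" -ge "$required_gb" ]
}
```

A call such as check_free_space "$DOWNLOAD_DIR" REQUIRED_GB, with REQUIRED_GB replaced by a number you choose, returns a non-zero status when the space is insufficient. Note that df reports free space on the filesystem and does not account for any per-user quota.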
Running AlphaFold
Start a job on a compute node
You must first start a job on a compute node to use AlphaFold and Apptainer:
qrsh -l h_data=10G,h_rt=12:00:00,gpu,V100
This example requests 10 GB of memory, a 12-hour time limit, and a V100 GPU compute node. You can also submit the work as a non-interactive job with a qsub job script.
Setting Environment Variables
Establish the required environment variables for the data and output directories, as well as the path to your FASTA file:
export DOWNLOAD_DIR=$SCRATCH/alphafoldtest/data
export OUTPUT_DIR=$SCRATCH/alphafoldtest/output
export FASTA_PATHS=test.fasta
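Before launching a long run, it can help to confirm that these variables point at usable locations. The sketch below is one possible pre-flight check. The function name check_alphafold_inputs is hypothetical, and it assumes FASTA_PATHS names a single file, as in the example above.

```shell
# Hypothetical pre-flight check for the three variables used in this document.
# Assumes FASTA_PATHS holds a single file path.
check_alphafold_inputs() {
    local status=0
    # The data directory must already exist (it is filled by the download step).
    if [ ! -d "$DOWNLOAD_DIR" ]; then
        echo "DOWNLOAD_DIR is unset or not a directory: '$DOWNLOAD_DIR'" >&2
        status=1
    fi
    # The FASTA file must exist, be non-empty, and contain a '>' header line.
    if [ ! -s "$FASTA_PATHS" ]; then
        echo "FASTA_PATHS is unset, missing, or empty: '$FASTA_PATHS'" >&2
        status=1
    elif ! grep -q '^>' "$FASTA_PATHS"; then
        echo "No FASTA header line (starting with '>') in $FASTA_PATHS" >&2
        status=1
    fi
    # The output directory is created if it does not exist yet.
    if [ -z "$OUTPUT_DIR" ]; then
        echo "OUTPUT_DIR is unset" >&2
        status=1
    else
        mkdir -p "$OUTPUT_DIR" || status=1
    fi
    return "$status"
}
```

Running check_alphafold_inputs after the export lines prints a message for each problem it finds and returns a non-zero status if any check fails.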
Executing AlphaFold
AlphaFold can be run using the run_alphafold.sh script, located at /app/run_alphafold.sh within the container. Execute AlphaFold with the following command, assuming the use of a GPU for relaxation steps:
module load apptainer
apptainer exec --nv $H2_CONTAINER_LOC/h2-alphafold.sif /app/run_alphafold.sh \
--fasta_paths=$FASTA_PATHS \
--max_template_date=2022-01-01 \
--model_preset=monomer \
--db_preset=full_dbs \
--data_dir=$DOWNLOAD_DIR \
--output_dir=$OUTPUT_DIR \
--use_gpu_relax=TRUE
Adjust max_template_date to suit your needs, and choose the model_preset and db_preset that match the requirements of your project. Before running, verify that the variables FASTA_PATHS, DOWNLOAD_DIR, and OUTPUT_DIR are set correctly.
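For the non-interactive route mentioned earlier, the same run can be wrapped in a qsub job script. The sketch below writes such a script to a file. The file name alphafold_run.job, the log file name, and the path of the module initialization script are assumptions to check against the Hoffman2 documentation; the resource request and the AlphaFold command are copied from this document.

```shell
# Write a hypothetical job script to the current directory; review it before submitting.
cat > alphafold_run.job <<'EOF'
#!/bin/bash
#$ -cwd
#$ -j y
#$ -o alphafold_run.$JOB_ID.log
#$ -l h_data=10G,h_rt=12:00:00,gpu,V100

# Assumed location of the module initialization script on Hoffman2.
. /u/local/Modules/default/init/modules.sh
module load apptainer

# Assumes SCRATCH is defined inside the batch job, as in an interactive session.
export DOWNLOAD_DIR=$SCRATCH/alphafoldtest/data
export OUTPUT_DIR=$SCRATCH/alphafoldtest/output
export FASTA_PATHS=test.fasta

apptainer exec --nv "$H2_CONTAINER_LOC/h2-alphafold.sif" /app/run_alphafold.sh \
    --fasta_paths="$FASTA_PATHS" \
    --max_template_date=2022-01-01 \
    --model_preset=monomer \
    --db_preset=full_dbs \
    --data_dir="$DOWNLOAD_DIR" \
    --output_dir="$OUTPUT_DIR" \
    --use_gpu_relax=TRUE
EOF
```

The script would then be submitted with qsub alphafold_run.job. Because of the -cwd directive, test.fasta is resolved relative to the directory from which the job is submitted. The same pattern, with the GPU resources removed from the -l line and the download command in place of the run command, could serve for the data download.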