miladrayka/reindeer_software

Python 3.9 License: MIT

REINDEER Software

REINDEER is software for structure-based protein-ligand feature generation.

Currently, REINDEER provides the following four feature vectors:

1- Occurrence of Interatomic Contact (OIC) - Ref

2- Distance-Weighted Interatomic Contact (DWIC) - Ref

3- Extended Connectivity Interaction Feature (ECIF) - Ref

4- Multi-Shell Occurrence of Interatomic Contact (MS-OIC) - Ref
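
To give a rough intuition for what contact-based features look like, here is a minimal toy sketch. It is not REINDEER's implementation: the function name `contact_features`, the atom representation, and the `exponent` parameter are assumptions made for illustration. It counts ligand-protein atom pairs within a cutoff per element pair (the occurrence idea) or sums an inverse-distance weight (the distance-weighted idea). Any grouping by amino acid class or by distance shell is left out.

```python
import math
from collections import Counter


def contact_features(ligand_atoms, protein_atoms, cutoff=12.0, exponent=None):
    """Toy interatomic-contact features (illustrative only).

    Each atom is a tuple (element, (x, y, z)).
    With exponent=None every pair within the cutoff adds 1 (occurrence).
    Otherwise each pair adds 1 / d**exponent (distance weighting).
    """
    features = Counter()
    for lig_element, lig_xyz in ligand_atoms:
        for prot_element, prot_xyz in protein_atoms:
            d = math.dist(lig_xyz, prot_xyz)
            if 0.0 < d <= cutoff:
                weight = 1.0 if exponent is None else 1.0 / d ** exponent
                features[(lig_element, prot_element)] += weight
    return dict(features)


# Hypothetical coordinates, in the same length unit as the cutoff.
ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0))]
protein = [("N", (3.0, 0.0, 0.0)), ("C", (20.0, 0.0, 0.0))]

occurrence = contact_features(ligand, protein)
weighted = contact_features(ligand, protein, exponent=2)
```

In this toy input only the protein nitrogen lies within the cutoff of the two ligand atoms, so both results contain just the ("C", "N") and ("O", "N") pairs.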

Citation

This project is still in progress. I will update the repository and the paper in the coming months. For now, you can cite the paper below:

REINDEER: A Protein-Ligand Feature Generator Software for Machine Learning Algorithms

Contact

Milad Rayka, [email protected]

Install

1- First install Python (3.9), then create a virtual environment and activate it.

python -m venv env
.\env\Scripts\activate

Here, env is the location where the virtual environment is created.

2- Clone the reindeer_software GitHub repository.

git clone https://github.com/miladrayka/reindeer_software.git

3- Change your directory to reindeer_software.

4- Install required packages with pip.

pip install -r requirements.txt

Notes

1- The provided protein-ligand complexes should include hydrogen atoms.

2- The file formats for the protein and the ligand are .pdb and .mol2, respectively. For ECIF, an .sdf file should be provided instead of .mol2.

3- All protein-ligand complexes should be organized as in the example below:

./test
├── 1a1e
│   ├── 1a1e_ligand.mol2
│   ├── 1a1e_ligand.sdf
│   └── 1a1e_protein.pdb
├── 1a28
│   ├── 1a28_ligand.mol2
│   ├── 1a28_ligand.sdf
│   └── 1a28_protein.pdb
└── 1a30
    ├── 1a30_ligand.mol2
    ├── 1a30_ligand.sdf
    └── 1a30_protein.pdb
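
The layout above can be checked before starting a long feature-generation run. The following is a minimal sketch, not part of REINDEER: the helper name `find_incomplete_complexes` and its `ligand_ext` parameter are hypothetical, and it assumes each subfolder name is the complex ID used as the file-name prefix, as in the example tree.

```python
from pathlib import Path


def find_incomplete_complexes(root, ligand_ext="mol2"):
    """Return {complex_id: [missing file names]} for the subfolders of root.

    Assumes each subfolder <id> should contain <id>_protein.pdb and
    <id>_ligand.<ligand_ext> (use ligand_ext="sdf" when preparing for ECIF).
    """
    problems = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # ignore stray files next to the complex folders
        complex_id = folder.name
        expected = [
            f"{complex_id}_protein.pdb",
            f"{complex_id}_ligand.{ligand_ext}",
        ]
        missing = [name for name in expected if not (folder / name).is_file()]
        if missing:
            problems[complex_id] = missing
    return problems
```

Calling `find_incomplete_complexes("./test")` returns an empty dictionary when every complex folder contains the expected files. The check only looks at file names, so it cannot tell whether hydrogen atoms are present.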

Usage

REINDEER can generate features through a GUI, through a CLI, or directly from Python code.

Graphical User Interface (GUI)

After changing your directory to reindeer_software, run the following command to start the GUI:

python ./gui_launcher.py

For an example, see the Tutorial file.

Command Line Interface (CLI)

To access the CLI, run the following command from the reindeer_software directory:

python ./reindeer_software.py -h

The output looks like this:

usage: reindeer_software.py [-h] -m METHOD -d DIRECTORY -f FILE_NAME -n N_JOBS

Generate features for set of given structures

optional arguments:
  -h, --help            show this help message and exit
  -m METHOD, --method METHOD
                        Feature generation method. Only OIC, DWIC, ECIF, and
                        MS-OIC are implemented for now.
  -d DIRECTORY, --directory DIRECTORY
                        directory of structures files
  -f FILE_NAME, --file_name FILE_NAME
                        Name for saving generated features.
  -n N_JOBS, --n_jobs N_JOBS
                        Number of cpu cores for parallelization

Example for OIC:

python ./reindeer_software.py -m OIC -d ../test/ -f feature_vector_oic.csv -n -1
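
To run the CLI once per method, the argument lists can be assembled in Python. This is a minimal sketch using only the flags shown in the help output above. The helper `build_command`, the output file naming scheme, and the assumption that the CLI accepts the method names spelled exactly as in the help text are illustrative and not confirmed by the source.

```python
import sys

# Method names as listed in the help text; the exact accepted spelling is assumed.
METHODS = ("OIC", "DWIC", "ECIF", "MS-OIC")


def build_command(method, directory, n_jobs=-1):
    """Build the argument list for one reindeer_software.py run."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    # Hypothetical naming scheme, modelled on feature_vector_oic.csv above.
    file_name = f"feature_vector_{method.lower().replace('-', '_')}.csv"
    return [
        sys.executable, "./reindeer_software.py",
        "-m", method,
        "-d", directory,
        "-f", file_name,
        "-n", str(n_jobs),
    ]


commands = [build_command(method, "../test/") for method in METHODS]
```

Each list can then be passed to `subprocess.run(command, check=True)` from the reindeer_software directory. Passing arguments as a list avoids shell quoting problems with directory names.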

Within Python

REINDEER can also be used from within Python code.

Example for OIC:

from reindeer.feature_generators import oic_dwic
from reindeer.script import utils

oic = oic_dwic.InterAtomicContact(
    pathfiles="../test/",
    filename="oic_fv.csv",
    ligand_format="mol2",
    amino_acid_classes=utils.amino_acid_classes_OIC,
    cutoff=12.0,
    feature_type="OIC",
    exp=None,
)

The within_python_example.ipynb file provides examples of this usage.

Case Study

CaseStudy.ipynb contains all the code needed to reproduce the case study section of the paper on Google Colab.

System Specification

REINDEER has been tested on the following system:

OS: Windows 10
RAM: 8.00 GB
CPU: AMD FX-770K Quad Core Processor (3.5 GHz)

We do not expect problems when running REINDEER on macOS or Linux.

Development

To ensure code quality and consistency, the following VS Code extensions are used during development:

  • black
  • isort
  • pylance
  • pylint
  • flake8
  • AI python docstring generators

Original Repository

Following repositories were used for the development of REINDEER:

Copyright

Copyright (c) 2024, Milad Rayka