Note: I have also completed the bonus-point assignment
- Start from the repository of the previous session
- Start using Cursor!
- Create an `eval.py` with its config that tests the model given a checkpoint (a sketch follows this list)
- Integrate the `infer.py` you made in the last session with Hydra
- Make sure to integrate Codecov in your repo; coverage should be at least 70%
- Push the Docker image to GHCR; it should show up in the Packages section of your repo
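A minimal sketch of what such a Hydra-driven `eval.py` entrypoint can look like (the config name, keys, and instantiation targets are assumptions for illustration, not necessarily this repo's exact code):

```python
# src/eval.py (illustrative sketch)
import hydra
import lightning as L
from omegaconf import DictConfig


@hydra.main(version_base="1.3", config_path="../configs", config_name="eval")
def main(cfg: DictConfig) -> None:
    # Instantiate the datamodule, model, and trainer declared in eval.yaml
    datamodule = hydra.utils.instantiate(cfg.data)
    model = hydra.utils.instantiate(cfg.model)
    trainer: L.Trainer = hydra.utils.instantiate(cfg.trainer)

    # Evaluate the checkpoint passed via the config or a CLI override,
    # e.g. `python src/eval.py ckpt_path=model_storage/epoch_003.ckpt`
    trainer.test(model=model, datamodule=datamodule, ckpt_path=cfg.ckpt_path)


if __name__ == "__main__":
    main()
```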
Optional Assignment (Bonus Points - 500)
- Create another GitHub Actions workflow that uses the Docker image created above to train the cat-dog model for 5 epochs on a small backbone network (a config sketch follows this list)
- Your action should fail if accuracy is <95%
- The model checkpoint, model logs, and the config used should be presented as artifacts
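For context, a small-backbone experiment config for such a run (the `catdog_ex.yaml` described in the notes below) might look roughly like this sketch; the config keys and the timm backbone name are assumptions:

```yaml
# configs/experiment/catdog_ex.yaml (illustrative sketch)
# @package _global_

defaults:
  - override /data: catdog
  - override /model: catdog_classifier

# Swap in a small timm backbone so the CI run stays fast
model:
  base_model: mobilenetv3_small_100

trainer:
  max_epochs: 5
```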
**Image Build Command**

```bash
docker build -t light_train_test -f ./Dockerfile .
```

**Run training (checkpoint stored in `model_storage/`)**

```bash
docker run --rm -v /workspace/emlo4-session-05-ajithvcoder/:/workspace/ light_train_test python src/train.py
```

**Run evaluation (results are printed)**

```bash
docker run --rm -v /workspace/emlo4-session-05-ajithvcoder/:/workspace/ light_train_test python src/eval.py
```

**Run inference (results stored in `infer_images/`)**

```bash
docker run --rm -v /workspace/emlo4-session-05-ajithvcoder/:/workspace/ light_train_test python src/infer.py
```
- Added `train.yaml`, `infer.yaml`, and `eval.yaml` for their separate purposes
- There are two experiments: one for `dogbreed` and another for `catdog`
- In the data folder there are also two configs, one for `catdog` and another for `dogbreed`, so we can experiment with multiple datasets and dataloaders
- There are two classifier models, each used for its own purpose; we could also merge them into one
- Similarly, `trainer`, `paths`, and `logger` are set, but they are common and usually won't change
- In callbacks, we can declare the checkpoint `path`, `summary`, and `early stopping` criteria
- These criteria can be overridden from any YAML file; for example, I have overridden the values of the `model_checkpoint` callback in `experiment/dogbreed_ex_train.yaml` (see the sketch after this list)
- By default, checkpoints are stored in the `model_storage` folder
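To illustrate the override mechanism, a Hydra experiment file can replace individual callback fields while everything else keeps its defaults; a minimal sketch (the specific keys and values are assumptions, not the repo's exact config):

```yaml
# configs/experiment/dogbreed_ex_train.yaml (illustrative sketch)
# @package _global_

defaults:
  - override /data: dogbreed
  - override /model: dogbreed_classifier

# Only the fields listed here are overridden; the rest of the
# model_checkpoint callback keeps its defaults from configs/callbacks/
callbacks:
  model_checkpoint:
    dirpath: ${paths.root_dir}/model_storage/
    filename: "dogbreed_epoch_{epoch:03d}"
    monitor: "val/acc"
    mode: "max"
    save_top_k: 1
```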
- Datamodules, models, and the train, eval, and infer scripts are tested

**Order in which tests are run** (eval should run after train)

- Packages like `pytest-order` and `pytest-dependency` are used for readability and for enforcing the ordering (see the sketch after this command)
- Command:

```bash
pytest --collect-only -v
```
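A minimal sketch of how the two plugins can enforce that ordering (the test names and bodies are placeholders for illustration):

```python
# tests/test_pipeline_order.py (illustrative sketch)
import pytest


@pytest.mark.order(1)                       # pytest-order: run first
@pytest.mark.dependency(name="train")       # pytest-dependency: register result
def test_train():
    # Run a short training job and assert that a checkpoint was written
    ...


@pytest.mark.order(2)                       # run after test_train
@pytest.mark.dependency(depends=["train"])  # skipped if test_train failed
def test_eval():
    # Evaluation only makes sense once training has produced a checkpoint
    ...
```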
**Overall Test Coverage Command**

```bash
pytest --cov-report term --cov=src/ tests/
```

**Individual Module Test Command**

```bash
pytest --cov-report term --cov=src/models/ tests/models/test_timm_classifier.py
```
- Using GitHub Actions and `docker-build.yml`, we build the Docker image and push it to GitHub's GHCR
- Added an additional experiment, `catdog_ex.yaml`, with a lightweight MobileNet backbone; it is used to trigger the `catdog` experiment:

  ```bash
  python src/train.py --config-name=train experiment=catdog_ex trainer.max_epochs=5
  ```

- Manual tests are written in the workflow to collect the logs and check the accuracy using the `grep` command: if the accuracy is more than 95% the job passes, otherwise it fails (see the sketch after this list). Results for both the passing and failing cases are attached at the bottom
- Artifacts are stored by GitHub Actions in the pipeline run; you can view them in the results section
- In the CI pipeline (`ci-pipeline.yml`), `docker-build.yml` is triggered first, then `docker-test.yml`. The minimum accuracy and the Docker image name can be passed as parameters to the `docker-test.yml` file
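To make the gating and parameterization concrete, here is a rough sketch of a reusable `docker-test.yml` with a grep-based accuracy check; the input names, log format, and grep pattern are assumptions for illustration, not the repo's exact workflow:

```yaml
# .github/workflows/docker-test.yml (illustrative sketch)
name: docker-test

on:
  workflow_call:
    inputs:
      image:           # Docker image name, passed in from ci-pipeline.yml
        required: true
        type: string
      min_accuracy:    # gate threshold in percent, e.g. "95"
        required: true
        type: string

jobs:
  train-and-gate:
    runs-on: ubuntu-latest
    steps:
      - name: Train for 5 epochs and capture the logs
        run: |
          docker run --rm -v ${{ github.workspace }}:/workspace/ ${{ inputs.image }} \
            python src/train.py --config-name=train experiment=catdog_ex trainer.max_epochs=5 \
            | tee train.log

      - name: Fail if accuracy is below the threshold
        run: |
          # Assumes the log contains a line like "test_acc: 0.97"
          acc=$(grep -oP 'test_acc:\s*\K[0-9.]+' train.log | tail -1)
          echo "test accuracy: ${acc}"
          # Exit non-zero (failing the job) when acc * 100 < min_accuracy
          awk -v a="$acc" -v t="${{ inputs.min_accuracy }}" 'BEGIN { exit !(a * 100 >= t) }'

      - name: Upload checkpoint, logs, and config as artifacts
        if: always()   # upload even when the accuracy gate fails
        uses: actions/upload-artifact@v4
        with:
          name: train-artifacts
          path: |
            model_storage/
            train.log
```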
- Learnt about Hydra, pytest, and artifact creation
- Hydra Configs: located in the `configs` folder
- Test Coverage Report
- Docker Image Build and Push
- Artifact Successful Generation with Accuracy Above 50%
- Artifact Unsuccessful Generation with Accuracy Below 95%
- Ajith Kumar V (myself)
- Aakash Vardhan
- Anvesh Vankayala
- Manjunath Yelipeta
- Abhijith Kumar K P
- Sabitha Devarajulu