EMLOV4-Session-05 Assignment - PyTorch Lightning - II


Note: I have also completed the bonus-point assignment

Requirements

  • Start from the repository of the previous session
  • Start using Cursor!
  • Create an eval.py with its config that tests the model given a checkpoint
  • Integrate the infer.py you made in the last session with Hydra
  • Make sure to integrate Codecov in your repo; coverage should be at least 70%
  • Push the Docker image to GHCR; it should show up in the Packages section of your repo

Optional Assignment (Bonus Points - 500)

  • Create another GitHub Actions Workflow to use the Docker image created to train the cat-dog model for 5 epochs on a small backbone network
  • Your actions should fail if accuracy is <95%
  • The model checkpoint, model logs, and the config used should be presented as artifacts

Development Method

Build and Run Commands

Image Build Command

  • docker build -t light_train_test -f ./Dockerfile .

Run training (the checkpoint is stored in model_storage/)

  • docker run --rm -v /workspace/emlo4-session-05-ajithvcoder/:/workspace/ light_train_test python src/train.py

Run evaluation (results are printed to the console)

  • docker run --rm -v /workspace/emlo4-session-05-ajithvcoder/:/workspace/ light_train_test python src/eval.py

Run inference (results are stored in infer_images/)

  • docker run --rm -v /workspace/emlo4-session-05-ajithvcoder/:/workspace/ light_train_test python src/infer.py

Hydra Configuration

  • Added train.yaml, infer.yaml, and eval.yaml for separate purposes
  • There are two experiments: one for dogbreed and another for catdog
  • The data config folder likewise has two configs, one for catdog and one for dogbreed, so we can experiment with multiple datasets and dataloaders
  • There are two classifier model configs, one for each task; they could also be merged into a single configurable model
  • Similarly, trainer, paths, and logger configs are defined; they are shared and usually won't change
  • In callbacks, we declare the model checkpoint settings, model summary, and early-stopping criteria
  • These values can be overridden from any experiment YAML; for example, the model_checkpoint callback values are overridden in experiment/dogbreed_ex_train.yaml (see the sketch after this list)
  • By default, checkpoints are stored in the model_storage folder
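
A minimal sketch of what the eval config and an experiment override can look like, assuming the usual Hydra config-group layout; the group names (catdog, dogbreed), metric names, and keys below are illustrative and may differ from the actual files in the repo:

    # --- configs/eval.yaml (hypothetical sketch) ---
    defaults:
      - data: catdog
      - model: catdog_classifier
      - trainer: default
      - paths: default
      - _self_

    # checkpoint produced by train.py; stored under model_storage/ by default
    ckpt_path: model_storage/best_model.ckpt

    # --- configs/experiment/dogbreed_ex_train.yaml (hypothetical sketch) ---
    # @package _global_
    defaults:
      - override /data: dogbreed
      - override /model: dogbreed_classifier

    callbacks:
      model_checkpoint:
        dirpath: model_storage/      # checkpoints are written here
        monitor: val/acc             # assumed metric name
        mode: max
        save_top_k: 1
      early_stopping:
        monitor: val/acc
        patience: 3
        mode: max

    trainer:
      max_epochs: 10

The experiment can then be selected from the command line, e.g. python src/train.py experiment=dogbreed_ex_train.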

Pytest and Coverage

  • The datamodules, models, and the train, eval, and infer scripts are all tested

Order in which tests are run (eval should run after train)

  • The pytest-order and pytest-dependency packages are used to enforce this ordering and keep the test dependencies readable
  • To inspect the collection order: pytest --collect-only -v

Overall Test Coverage Command

  • pytest --cov-report term --cov=src/ tests/

Individual Module Test Command

  • pytest --cov-report term --cov=src/models/ tests/models/test_timm_classifier.py

CI Pipeline Development

  • GitHub Actions, via docker-build.yml, builds the Docker image and pushes it to GHCR (GitHub Container Registry), so it appears under the repository's Packages section; a sketch of such a workflow follows
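
A minimal sketch of such a build-and-push workflow using the docker/login-action and docker/build-push-action steps; the trigger, tag, and file names below are assumptions and the actual docker-build.yml may differ:

    # .github/workflows/docker-build.yml (hypothetical sketch)
    name: docker-build
    on:
      push:
        branches: [main]
      workflow_call:                  # lets ci-pipeline.yml reuse this workflow

    jobs:
      build-and-push:
        runs-on: ubuntu-latest
        permissions:
          contents: read
          packages: write             # required to push to GHCR
        steps:
          - uses: actions/checkout@v4
          - name: Log in to GHCR
            uses: docker/login-action@v3
            with:
              registry: ghcr.io
              username: ${{ github.actor }}
              password: ${{ secrets.GITHUB_TOKEN }}
          - name: Build and push the image
            uses: docker/build-push-action@v6
            with:
              context: .
              file: ./Dockerfile
              push: true
              # GHCR image names must be lowercase
              tags: ghcr.io/${{ github.repository }}:latest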

Bonus Points - Hydra, Tests, and CI Development

  • Added an additional experiment config, catdog_ex.yaml, which uses a lightweight MobileNet backbone; it is used to trigger the catdog experiment:

    python src/train.py --config-name=train experiment=catdog_ex trainer.max_epochs=5
  • Checks are written in the workflow to collect the training logs and extract the accuracy with grep. If the accuracy is at least 95%, the check passes; otherwise, it fails. Results for both the passing and failing cases are attached at the bottom

  • Artifacts are uploaded by GitHub Actions during the pipeline run; you can view them in the Results section below

  • In the CI pipeline (ci-pipeline.yml), docker-build.yml is triggered first, then docker-test.yml. The minimum accuracy and the Docker image name are passed as parameters to docker-test.yml (a sketch of both workflows follows)
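
A minimal sketch of how ci-pipeline.yml can chain the two reusable workflows and how docker-test.yml can expose those parameters; the input names, the log pattern matched by the grep step, and the artifact paths are assumptions and may differ from the actual workflows:

    # .github/workflows/ci-pipeline.yml (hypothetical sketch)
    name: ci-pipeline
    on: [push]
    jobs:
      build:
        uses: ./.github/workflows/docker-build.yml
      test:
        needs: build
        uses: ./.github/workflows/docker-test.yml
        with:
          image_name: ghcr.io/${{ github.repository }}:latest
          min_accuracy: "0.95"

    # .github/workflows/docker-test.yml (hypothetical sketch)
    name: docker-test
    on:
      workflow_call:
        inputs:
          image_name:
            type: string
            required: true
          min_accuracy:
            type: string
            default: "0.95"

    jobs:
      train-and-check:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          # (log in to GHCR first if the package is private)
          - name: Train catdog for 5 epochs and capture the logs
            run: |
              docker run --rm -v ${{ github.workspace }}:/workspace ${{ inputs.image_name }} \
                python src/train.py --config-name=train experiment=catdog_ex trainer.max_epochs=5 \
                | tee train.log
          - name: Fail if accuracy is below the threshold
            run: |
              # assumes the test accuracy appears in the logs, e.g. "test/acc    0.97"
              acc=$(grep -oE 'test/acc[^0-9]*[0-9.]+' train.log | grep -oE '[0-9.]+$' | tail -1)
              echo "accuracy=$acc (threshold=${{ inputs.min_accuracy }})"
              awk -v a="$acc" -v t="${{ inputs.min_accuracy }}" 'BEGIN { exit !(a >= t) }'
          - name: Upload checkpoint, logs and config as artifacts
            uses: actions/upload-artifact@v4
            with:
              name: train-artifacts
              path: |
                model_storage/
                train.log
                configs/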

Learnings

  • Learnt about Hydra, pytest, and artifact creation with GitHub Actions

Results

  • Hydra Configs

    • Located in the configs folder
  • Test Coverage Report

    (screenshot: test coverage report)

  • Docker Image Build and Push

    (screenshot: CI pipeline success)

  • Artifact Successful Generation with Accuracy Above 50%

    View Action Run

    (screenshot: CI pipeline success)

  • Artifact Unsuccessful Generation with Accuracy Below 95%

    View Action Run

    (screenshot: CI pipeline failure)

Group Members

  1. Ajith Kumar V (myself)
  2. Aakash Vardhan
  3. Anvesh Vankayala
  4. Manjunath Yelipeta
  5. Abhijith Kumar K P
  6. Sabitha Devarajulu