Added docker & docker-compose to the project #31

Open
rowellz wants to merge 3 commits into main
Conversation

@rowellz commented Nov 5, 2024

Added support for docker-compose. This Docker container enables users to run both acezero & nerfstudio, so Gaussian splats can be created from just a few commands. I haven't fully tested all of acezero's features, but I am aware that some functionality, such as --render_visualization, doesn't work inside the Docker container.

I only added two files, the Dockerfile & docker-compose.yml, and updated two files, the .gitignore & README.

Hoping this will make ace0 more accessible and easier to contribute to!

Also, this was tested on an RTX 3060 12GB card on an Ubuntu host machine. I was able to successfully train Gaussian splats from datasets of 6,000 and even 10,000 images; roughly 70% of the images in each dataset were matched, and the camera calibration results were quite impressive judging by the nerfstudio splat. I'm going to try this on a larger GPU so I can hopefully train the splat at a higher resolution, since my 3060 can only handle a downscale value of 6.
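For readers skimming the PR, a rough sketch of the intended workflow (the service name ace0 below is only a placeholder; the actual commands are the ones documented in the updated README):

# Build the image and start the container in the background
docker compose build
docker compose up -d

# Open a shell inside the running container to launch acezero / nerfstudio
docker compose exec ace0 bash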

@amonszpart (Member) left a comment


Hello, thanks for your contribution and for the effort to make ACE0 easier to use!

I left a few optional suggestions; I hope you'll consider them.

Cheers,
Aron

Dockerfile Outdated
https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& mkdir /root/.conda \
&& bash Miniconda3-latest-Linux-x86_64.sh -b \
&& rm -f Miniconda3-latest-Linux-x86_64.sh
amonszpart (Member):

Would you consider using a free conda implementation, e.g. Miniforge from conda-forge?

It's hard to predict what will happen to the permissiveness of Anaconda products: https://www.theregister.com/2024/08/08/anaconda_puts_the_squeeze_on/

If not, please add an echo or some sort of warning for users to check Anaconda's licensing terms.
Thanks!
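For reference, a minimal sketch of what a Miniforge-based install could look like, assuming the same unattended-install approach as the quoted snippet (the /opt/conda prefix is my choice here, not something from this PR):

RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh \
    && bash Miniforge3-Linux-x86_64.sh -b -p /opt/conda \
    && rm -f Miniforge3-Linux-x86_64.sh
ENV PATH="/opt/conda/bin:${PATH}"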

Dockerfile Outdated
conda run -n ace0 python setup.py install

#Install ace0 deps
RUN apt-get install -y libnvidia-egl-wayland1
amonszpart (Member):

Suggestion: Could you please combine as many apt-get calls as possible into a single RUN layer? See also https://docs.docker.com/build/building/best-practices/#apt-get

In general, it's good to have fewer RUN lines and to group instructions with && \ at the end of each line. This keeps a layer from containing changes that a subsequent command would undo, and results in smaller Docker images that are easier to carry around.
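As an illustration, the two apt-get installs quoted in this review could collapse into a single layer along these lines (the package list is taken from the snippets in this thread; the real Dockerfile may need more):

RUN apt-get update && apt-get install -y --no-install-recommends \
        libnvidia-egl-wayland1 \
        libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*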

Dockerfile Outdated

RUN apt-get install -y libglib2.0-0

RUN conda run -n nerfstudio ns-install-cli
amonszpart (Member):

I think all these RUNs up to L40 could be combined (or if not, very few splits would be needed). See advice from the Docker blog:

At a high level, it works by analyzing the layers of a Docker image. With every layer you add, more space will be taken up by the image. Or you can say each line in the Dockerfile (like a separate RUN instruction) adds a new layer to your image.
(source: https://www.docker.com/blog/reduce-your-image-size-with-the-dive-in-docker-extension/)

# TODO: merge with earlier apt-get calls
RUN apt-get install -y libglib2.0-0

# consider making this an export inside the RUN if it's not needed in the final image
ENV TCNN_CUDA_ARCHITECTURES="50;52;60;61;70;75;80;86"

RUN conda create --name nerfstudio -y python=3.8 && \
  conda run -n nerfstudio pip install --upgrade pip && \
  conda run -n nerfstudio pip install "torch==2.1.2+cu118" "torchvision==0.16.2+cu118" --extra-index-url https://download.pytorch.org/whl/cu118 && \
  conda run -n nerfstudio conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit && \
  conda run -n nerfstudio pip install ninja "git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch" && \
  conda run -n nerfstudio pip install nerfstudio && \
  conda run -n nerfstudio ns-install-cli

tty: true
stdin_open: true
ports:
- "7771:7007"
amonszpart (Member):

Suggestion: make this the default "7007:7007", or something closer to it, e.g. "7008:7007".
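i.e. something like:

ports:
  - "7007:7007"  # host port matches nerfstudio's default viewer port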

count: all
capabilities: [gpu]
extra_hosts:
- "host.docker.internal:host-gateway"
amonszpart (Member):

Suggestion: put a comment above explaining what this does and why it's needed.
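For example (comment wording based on the explanation rowellz gives later in this thread):

extra_hosts:
  # Allows the nerfstudio client to be reached via the host machine's
  # IP address & localhost during the Gaussian splat training phase.
  - "host.docker.internal:host-gateway"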

ports:
- "7771:7007"
volumes:
- ./images:/workspace/images
amonszpart (Member):

Suggestion: remove this, it's not needed for the final container (unless I'm mistaken)

devices:
  - driver: 'nvidia'
    count: all
    capabilities: [gpu]
amonszpart (Member):

I usually use docker run --gpus='all,"capabilities=compute,utility,graphics"' .... That might not be the most up-to-date way, but I think capabilities=all would be nice to have; see also here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/docker-specialized.html#driver-capabilities

rowellz (Author):

I think the docker-compose docs recommend using capabilities: [gpu] for docker-compose files:

https://docs.docker.com/compose/how-tos/gpu-support/
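If the extra driver capabilities (e.g. graphics) do turn out to be needed, one compose-friendly option that can sit alongside capabilities: [gpu] is setting NVIDIA_DRIVER_CAPABILITIES on the service, per the NVIDIA docs linked above. A sketch only, not tested in this PR (the service name is a placeholder):

services:
  ace0:
    environment:
      - NVIDIA_DRIVER_CAPABILITIES=all   # compute, utility, graphics, ...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]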

@rowellz commented Dec 2, 2024

@amonszpart Thanks for reviewing this! I almost forgot about this PR, haha. I'll try to resolve most, if not all, of your feedback by the end of this week; my apologies, I have other priorities at the moment!

@rowellz commented Dec 17, 2024

@amonszpart Hey! I think I took care of most, if not all, of your feedback. I did forget to mention that the extra_hosts param allows the nerfstudio client to be accessed via the host machine's IP address & localhost during the Gaussian splat training phase. Just let me know if you want anything else tweaked so it's good to merge!

Cheers 👍
