AMD Support #104

Closed
GrahamboJangles opened this issue Apr 1, 2023 · 13 comments · Fixed by #1954
Labels
enhancement New feature or request

Comments

@GrahamboJangles

GrahamboJangles commented Apr 1, 2023

Does this work on AMD cards? What are the GPU requirements for inference?

@merrymercy
Member

Please refer to https://github.com/lm-sys/FastChat#vicuna-weights and https://github.com/lm-sys/FastChat#serving.
We only use PyTorch, so it should be easy to port to AMD if PyTorch supports AMD well.

@Askejm

Askejm commented Apr 4, 2023

You can attempt to use ROCm on Linux, but it's far from a smooth experience.

@merrymercy changed the title from "AMD?" to "AMD Support" on Apr 5, 2023
@merrymercy added the enhancement label on Apr 8, 2023
@kira-bruneau

kira-bruneau commented Apr 15, 2023

I tried it out on my RX 7900 XTX and it loaded the whole Vicuna 13B model in 8bit mode into VRAM - but segfaulted after loading the checkpoint shards.

(I'm guessing since the card isn't officially supported by ROCm yet 😅: ROCm/ROCm#1973)

I also set up the ROCm + Vicuna development environment using a Nix flake, but there are a few more tweaks I want to make before publishing it (e.g. writing Nix packages for accelerate and gradio).

@kira-bruneau

kira-bruneau commented Apr 19, 2023

🎉 I managed to get this running on my RX 7900 XTX!! I just tracked down all the development commits that added gfx11 support to ROCm and built it all from source.

I pushed my Nix flake here if anyone wants to try it out themselves: https://github.com/kira-bruneau/FastChat/commit/75235dac0365e11157dbd950bc1a4cf528f8ddc6.

(I have it hard-coded to target gfx803 & gfx1100, so you might want to change that if you have a different AMD card: https://github.com/kira-bruneau/FastChat/commit/75235dac0365e11157dbd950bc1a4cf528f8ddc6#diff-206b9ce276ab5971a2489d75eb1b12999d4bf3843b7988cbe8d687cfde61dea0R24)
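To find out which gfx target your own card reports before changing the hard-coded list, one option is to scan the output of ROCm's `rocminfo` tool. The following is only a minimal sketch and not part of the flake: the helper names `parse_gfx_targets` and `detect_gfx_targets` are hypothetical, and it assumes `rocminfo` is on your PATH and prints target names of the form `gfx...`.

```python
import re
import shutil
import subprocess

# Matches target names such as gfx803, gfx90a or gfx1100.
GFX_PATTERN = re.compile(r"\bgfx[0-9a-f]+\b")


def parse_gfx_targets(text):
    """Return the unique gfx target names found in text, in order of first appearance."""
    seen = []
    for name in GFX_PATTERN.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


def detect_gfx_targets():
    """Run rocminfo if it is installed; return [] when it is missing or cannot be run."""
    exe = shutil.which("rocminfo")
    if exe is None:
        return []
    try:
        result = subprocess.run([exe], capture_output=True, text=True, check=False)
    except OSError:
        return []
    return parse_gfx_targets(result.stdout)
```

Calling `detect_gfx_targets()` on a machine with a working ROCm install should list the targets to put into the flake; on a machine without `rocminfo` it simply returns an empty list.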

Steps:

  1. Install Nix
  2. Enable Nix flakes
  3. Load the development environment (this will build ROCm and PyTorch and will take multiple hours, so I recommend letting it run overnight):

     nix develop github:kira-bruneau/FastChat/gfx1100

  4. Run the model:

     python -m fastchat.serve.cli --model-path <path-to-model> --num-gpus 1 --load-8bit
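
If you would rather launch the last step from a script, the command can be assembled in Python. This is a rough sketch under the assumption that FastChat is installed in the current interpreter's environment; `build_cli_command` and `run_cli` are hypothetical helper names, and only the flags shown in the command above are used.

```python
import subprocess
import sys


def build_cli_command(model_path, num_gpus=1, load_8bit=True):
    """Build the argv list for the FastChat CLI using the flags from the steps above."""
    cmd = [
        sys.executable, "-m", "fastchat.serve.cli",
        "--model-path", str(model_path),
        "--num-gpus", str(num_gpus),
    ]
    if load_8bit:
        cmd.append("--load-8bit")
    return cmd


def run_cli(model_path, **kwargs):
    """Start the interactive CLI and return its exit code."""
    return subprocess.call(build_cli_command(model_path, **kwargs))
```

Passing the arguments as a list avoids shell quoting problems when the model path contains spaces.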

[asciicast recording]

@Gaolaboratory

Gaolaboratory commented Apr 23, 2023

I can confirm FastChat with the Vicuna 13B model runs fine in 8-bit mode on a single AMD 6800 card. System: Ubuntu 20 LTS with ROCm 5.4.2 installed, then PyTorch with ROCm 5.4.2 support. No need to build from source; it works directly with all official packages.

@kira-bruneau

kira-bruneau commented Apr 25, 2023

Oh yep sorry, for all other supported AMD cards you shouldn't need to build from source. I only had to because the RX 7900 XTX isn't supported in the release builds of ROCm yet.

@aseok

aseok commented Jun 1, 2023

@kira-bruneau Is it still necessary to build from source for the RX 570 (gfx803)?

@kira-bruneau

kira-bruneau commented Jun 1, 2023

@aseok Oh nope! It was only necessary before the 5.5 ROCm release to support gfx1100.

Although... there are still some problems in nixpkgs that mean you still have to compile some parts from source if you want to use the flake (see NixOS/nixpkgs#230881). Right now the builder fails to cache rocfft, so you'd still have to compile PyTorch from source 😞.

If you want to avoid building from source completely, I'd recommend using the official PyTorch releases (https://pytorch.org) or trying to find a Docker image set up for it (which would be a little more involved). Hopefully the fixes will get upstreamed soon though!

@fubuki4649

fubuki4649 commented Jun 18, 2023

(@kira-bruneau) Can someone create instructions to install + run with ROCm please? It seems that there is no flag for it to run in ROCm mode.

@JonLiuFYI
Contributor

Perhaps there should be a note in the README about AMD compatibility?

I successfully reproduced @Gaolaboratory's results. I managed to run Vicuna 13B on my RX 6800 XT with the --load-8bit option, using some packages from my OS (Fedora 38).

@onyasumi

> Can someone create instructions to install + run with ROCm please?

  1. Install the ROCm package for your OS. Fedora 38 example below:

     sudo dnf install rocm-opencl rocm-opencl-devel
    
  2. Follow PyTorch's instructions to get the ROCm version of PyTorch. Remember to verify that PyTorch detects your graphics card.

    Side note: as of writing, Fedora 38 installs ROCm 5.5.1 while PyTorch's ROCm builds only go up to 5.4.2, but it still worked for me.

  3. Install FastChat according to the README.

  4. Download a model and try it out in the CLI.
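
For the verification in step 2, a quick check could look like the sketch below. It relies on `torch.version.hip` being set on ROCm builds of PyTorch and on ROCm devices being exposed through the `torch.cuda` API; the function name `describe_torch_backend` is hypothetical, and the function falls back gracefully when PyTorch is not installed.

```python
def describe_torch_backend():
    """Report which backend the installed PyTorch was built for and whether a GPU is visible."""
    try:
        import torch
    except ImportError:
        return {"installed": False, "backend": None, "gpu_visible": False, "device": None}

    # ROCm builds set torch.version.hip; CUDA builds set torch.version.cuda instead.
    hip = getattr(torch.version, "hip", None)
    cuda = getattr(torch.version, "cuda", None)
    backend = "rocm" if hip else ("cuda" if cuda else "cpu")

    # ROCm devices are reported through the torch.cuda namespace.
    visible = bool(torch.cuda.is_available())
    device = torch.cuda.get_device_name(0) if visible else None
    return {"installed": True, "backend": backend, "gpu_visible": visible, "device": device}


print(describe_torch_backend())
```

If `backend` is not `rocm` or `gpu_visible` is false, FastChat will not be able to use the AMD card, so it is worth sorting that out before downloading a model.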

> It seems that there is no flag for it to run in ROCm mode.

If PyTorch isn't installed, running pip install fschat will try to install it. If you specifically install the ROCm version of PyTorch beforehand, pip will notice that and skip installing PyTorch during the FastChat install. As a result, FastChat will just use ROCm through your install of PyTorch.
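
To confirm that a ROCm build of PyTorch is already in place before running pip install fschat, you can inspect the installed package metadata. This is a sketch only: the helper names are hypothetical, and it assumes that ROCm wheels carry a local version tag such as `+rocm5.4.2`, which may not hold for every build or distribution package.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def local_build_tag(version):
    """Return the part after '+' in a version string (e.g. 'rocm5.4.2'), or None."""
    if version is None or "+" not in version:
        return None
    return version.split("+", 1)[1]


def torch_install_summary():
    """Describe the PyTorch install that pip would see when installing FastChat."""
    version = installed_version("torch")
    tag = local_build_tag(version)
    if version is None:
        return "torch is not installed, so installing fschat would try to install it"
    if tag and tag.startswith("rocm"):
        return f"torch {version} looks like a ROCm build"
    return f"torch {version} is installed, but its version string carries no rocm tag"


print(torch_install_summary())
```

A missing `rocm` tag does not prove the build lacks ROCm support (OS packages may be versioned differently), so treat the summary as a hint rather than a definitive answer.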

@merrymercy
Member

@JonLiuFYI could you contribute a pull request to add some notes about AMD?

@fubuki4649

@JonLiuFYI Thank you, this seemed to work for me. I can open a PR later to document this in the README.

@merrymercy
Member

@JonLiuFYI @onyasumi please go ahead. Thanks!


8 participants