v0.4.0

@joecummings released this 14 Nov 15:37
· 123 commits to main since this release

Highlights

Today we release v0.4.0 of torchtune with some exciting new additions! Notable ones include full support for activation offloading, support for Llama3.2V 90B with QLoRA variants, new documentation, and Qwen2.5 models!

Activation offloading (#1443, #1645, #1847)

Activation offloading is a memory-saving technique that asynchronously moves checkpointed activations to the CPU while they are not in use. Right before the GPU needs those activations for a microbatch's backward pass, they are prefetched back from the CPU. To enable it, set the following options in your config:

enable_activation_checkpointing: True
enable_activation_offloading: True

In experiments with Llama3 8B, activation offloading used roughly 24% less memory at the cost of a slowdown of under 1%.
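To make the idea concrete, here is a minimal sketch of offloading saved tensors to the CPU with PyTorch's `torch.autograd.graph.saved_tensors_hooks`. It is not torchtune's implementation: the copies are synchronous, nothing is prefetched, and the helper names (`pack_to_cpu`, `unpack_from_cpu`, `compute_grads`) and the tiny model are made up for illustration. It only shows that tensors saved for backward can live on the CPU without changing the gradients.

```python
import torch
from torch.autograd.graph import saved_tensors_hooks


def pack_to_cpu(tensor: torch.Tensor):
    # Called whenever autograd saves a tensor for backward. Remember the
    # original device and keep only a CPU copy. Note: autograd also saves
    # non-activation tensors (e.g. weights); this sketch does not filter them.
    return tensor.device, tensor.to("cpu")


def unpack_from_cpu(packed):
    # Called when backward needs the saved tensor: move it back.
    device, cpu_tensor = packed
    return cpu_tensor.to(device)


def compute_grads(model, inputs, offload):
    # Hypothetical helper: one forward/backward pass, optionally offloaded.
    model.zero_grad()
    if offload:
        with saved_tensors_hooks(pack_to_cpu, unpack_from_cpu):
            loss = model(inputs).sum()
    else:
        loss = model(inputs).sum()
    loss.backward()
    return [p.grad.detach().clone() for p in model.parameters()]


# Falls back to CPU when no GPU is present (the hooks then copy nothing).
device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
).to(device)
inputs = torch.randn(4, 16, device=device)

baseline = compute_grads(model, inputs, offload=False)
offloaded = compute_grads(model, inputs, offload=True)
grads_match = all(torch.allclose(a, b) for a, b in zip(baseline, offloaded))
print("gradients match:", grads_match)
```

A blocking copy like this would cost far more than the sub-1% slowdown reported here; overlapping the transfers with compute is what keeps the overhead low in practice.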

Llama3.2V 90B with QLoRA (#1880, #1726)

We added model builders and configs for the 90B version of Llama3.2V, which outperforms the 11B version across common benchmarks. Because the 90B model is much larger, we also added the ability to run it using QLoRA and FSDP2.

# Download the model first
tune download meta-llama/Llama-3.2-90B-Vision-Instruct --ignore-patterns "original/consolidated*"
# Run with e.g. 4 GPUs
tune run --nproc_per_node 4 lora_finetune_distributed --config llama3_2_vision/90B_qlora
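For a rough sense of why quantization and sharding matter at this scale, the back-of-the-envelope sketch below estimates the memory taken by the base weights alone. The assumptions are ours, not measured results: a nominal 90e9 parameters, 16 bits per weight for bf16 versus 4 bits per weight for QLoRA's quantized weights, and perfectly even sharding across the 4 GPUs from the command above. It ignores quantization metadata, LoRA adapter weights, activations, gradients, and optimizer state, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 90e9  # assumption: nominal parameter count of the 90B model
NUM_GPUS = 4       # matches --nproc_per_node 4 above

bf16_gib = weight_memory_gib(NUM_PARAMS, bits_per_param=16)
nf4_gib = weight_memory_gib(NUM_PARAMS, bits_per_param=4)

for name, total in [("bf16", bf16_gib), ("4-bit", nf4_gib)]:
    # Assumption: sharding splits the weights evenly across GPUs.
    per_gpu = total / NUM_GPUS
    print(f"{name}: {total:.1f} GiB total, {per_gpu:.1f} GiB per GPU")
```

Under these assumptions, 4-bit weights take a quarter of the bf16 footprint, and sharding divides whatever remains by the number of GPUs.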

Qwen2.5 model family has landed (#1863)

We added builders for Qwen2.5, the cutting-edge models from the Qwen family! In their own words: "Compared to Qwen2, Qwen2.5 has acquired significantly more knowledge (MMLU: 85+) and has greatly improved capabilities in coding (HumanEval 85+) and mathematics (MATH 80+)."

Get started with the models easily:

tune download Qwen/Qwen2.5-1.5B-Instruct --ignore-patterns None
tune run lora_finetune_single_device --config qwen2_5/1.5B_lora_single_device

New documentation on using custom recipes, configs, and components (#1910)

We heard your feedback and wrote up a simple page on how to customize configs, recipes, and individual components! It is available in the torchtune documentation.

What's Changed

New Contributors

Full Changelog: v0.3.1...v0.4.0