Feat: Drop long samples and shuffle rl samples #2040

NanoCode012 · 2024-11-12T10:21:49Z

Description

This PR aims to solve three issues:

TRL not dropping long samples: ˈɛmpti reported that TRL does not drop samples that exceed the sequence length, which could lead to them being truncated.
Shuffling issues with KTO: Caitlyn G. and ˈɛmpti also reported shuffling issues with KTO. KTOTrainer no longer shuffles the dataset before training and uses a SequentialSampler instead. See https://github.com/huggingface/trl/pull/2248/files
Logging the number of dropped samples for both SFT and RL: this was requested a while back by someone on Discord.
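
For readers unfamiliar with the three behaviours listed above, here is a minimal, library-free sketch of the idea. It is not the code in this PR: the function names `drop_long_samples` and `shuffle_samples`, the `prompt`/`chosen`/`rejected` field layout, and the whitespace token counter are all assumptions made for illustration, and a real run would count tokens with the model's tokenizer.

```python
import logging
import random
from typing import Callable, Dict, List

LOG = logging.getLogger(__name__)

Sample = Dict[str, str]


def drop_long_samples(
    samples: List[Sample],
    sequence_len: int,
    count_tokens: Callable[[str], int],
) -> List[Sample]:
    """Hypothetical helper: keep a preference pair only if both
    prompt+chosen and prompt+rejected fit within sequence_len."""
    kept = [
        s
        for s in samples
        if count_tokens(s["prompt"]) + count_tokens(s["chosen"]) <= sequence_len
        and count_tokens(s["prompt"]) + count_tokens(s["rejected"]) <= sequence_len
    ]
    dropped = len(samples) - len(kept)
    if dropped:
        # Warn so that silently shrinking datasets are visible in the logs.
        LOG.warning("Dropped %d long samples from dataset", dropped)
    return kept


def shuffle_samples(samples: List[Sample], seed: int) -> List[Sample]:
    """Hypothetical helper: return a shuffled copy, reproducible for a given seed."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def whitespace_tokens(text: str) -> int:
    # Stand-in for a real tokenizer (assumption for this sketch only).
    return len(text.split())


data = [
    {"prompt": "a b", "chosen": "c", "rejected": "d"},
    {"prompt": "a b c d", "chosen": "e f g", "rejected": "h"},  # too long
    {"prompt": "a", "chosen": "b", "rejected": "c d"},
]
kept = drop_long_samples(data, sequence_len=4, count_tokens=whitespace_tokens)
train = shuffle_samples(kept, seed=42)
```

Shuffling once up front, with a fixed seed, keeps the order reproducible even when the trainer iterates with a sequential sampler.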

Motivation and Context

How has this been tested?

Ran preprocess and checked the number of dropped samples reported in the log.

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

* feat: LOG warn if samples are dropped due to seq length
* feat: add drop long samples for RL
* feat: add ipo
* fix: remove num_proc for map as subprocesses are prone to die
* feat: shuffle rl dataset
* fix: support preprocess for kto
* chore: use set instead of list
* feat: add simpo

NanoCode012 added 8 commits November 12, 2024 17:05

feat: LOG warn if samples are dropped due to seq length

904d214

feat: add drop long samples for RL

5a38d49

feat: add ipo

65ccdac

fix: remove num_proc for map as subprocesses are prone to die

2b5dd5f

feat: shuffle rl dataset

071b65b

fix: support preprocess for kto

d8a85db

chore: use set instead of list

dfd6f2f

feat: add simpo

71173df

NanoCode012 marked this pull request as ready for review November 12, 2024 11:37

winglian approved these changes Nov 19, 2024

winglian added the ready to merge label Nov 19, 2024

winglian merged commit f007c38 into main Nov 19, 2024
11 checks passed

winglian deleted the feat/drop_shuffle_rl branch November 19, 2024 15:18
