Pull requests: huggingface/trl

Pull requests list

🏷️ Model badges: select only TRL models
#2178 opened Oct 4, 2024 by qgallouedec
5 tasks
🕊️ Migration PPOv2 -> PPO
#2174 opened Oct 4, 2024 by qgallouedec Draft
5 tasks
🃏 Model card: "unsloth" tag
#2173 opened Oct 4, 2024 by qgallouedec
5 tasks
Update incorrect data processing in DataCollatorForChatML
#2172 opened Oct 4, 2024 by ruijunfeng
4 of 5 tasks
Rename trainer arg tokenizer to processing_class
#2162 opened Oct 3, 2024 by qgallouedec
17 of 18 tasks
[CGPO] Mixture of judges judge
#2159 opened Oct 3, 2024 by gaetanlop
4 tasks done
populate SUPPORTED_COMMANDS cli
#2157 opened Oct 2, 2024 by grumpyp
4 of 5 tasks
[CGPO] Calibrated reward
#2155 opened Oct 2, 2024 by gaetanlop
4 tasks done
minor KTO setting changes + KL batch size
#2153 opened Oct 2, 2024 by kawine
CLI SFT Quantization Fix
#2151 opened Oct 1, 2024 by August-murr
5 tasks
[Open discussion] Multistep dataset
#2148 opened Oct 1, 2024 by qgallouedec Draft
4 tasks
Refactor ScriptArguments
#2145 opened Sep 30, 2024 by qgallouedec Draft
5 tasks
[DPO] Adding weighted preference optimization (WPO)
#2141 opened Sep 29, 2024 by gaetanlop
2 tasks done
DPO trainer supports num_logits_to_keep to save memory
#2129 opened Sep 26, 2024 by xyangk
3 of 5 tasks
Process-supervised RM Trainer
#2127 opened Sep 26, 2024 by gaetanlop
5 tasks done
[SCoRE] initial score stage 1
#2115 opened Sep 24, 2024 by kashif Draft
Fix RLOO checkpointing
#2114 opened Sep 24, 2024 by bartoszzuk
Remove deprecated args in trainers
#2036 opened Sep 8, 2024 by qgallouedec Draft
5 tasks
feat: add support for packing tokenized datasets
#2011 opened Sep 3, 2024 by kmehant
2 of 5 tasks
allow masking on consecutive messages with same roles
#2000 opened Aug 31, 2024 by lsy641
4 of 5 tasks
added initial TPO implementation
#1965 opened Aug 24, 2024 by sahsaeedi
4 of 5 tasks
Add SRPO algorithm.
#1772 opened Jun 25, 2024 by frasermince
1 of 7 tasks