Pull requests: huggingface/trl

Pull requests list

add JEPO trainer
#5411 opened Mar 31, 2026 by zbills
3 of 7 tasks
Add DistillationTrainer for efficient on-policy distillation
#5407 opened Mar 30, 2026 by cmpatino
8 tasks
Add length-normalized sigmoid loss type to DPO trainer
#5406 opened Mar 30, 2026 by BrownianNotion
5 of 8 tasks
Remove xfail for Qwen3VL CI tests
#5402 opened Mar 30, 2026 by albertvillanova
Add per-sample tool filtering to GRPOTrainer via tools column
#5398 opened Mar 27, 2026 by lailanelkoussy
3 tasks done
Better test consistency RLOO vs GRPO
#5396 opened Mar 27, 2026 by qgallouedec
Add tool calling support to RLOOTrainer
#5395 opened Mar 27, 2026 by qgallouedec
Remove xfail for ZeRO 2 and 3 + SFT + PEFT test
#5383 opened Mar 27, 2026 by qgallouedec
Fix DAPO token-level loss to use prompt-level aggregation
#5381 opened Mar 26, 2026 by matdou
2 of 5 tasks
Remove truncation_mode from DPO
#5372 opened Mar 25, 2026 by albertvillanova
add more generic device support for CI tests
#5357 opened Mar 24, 2026 by kaixuanliu
Enable Tensor Parallelism in SFT script
#5331 opened Mar 21, 2026 by songhappy
(5/5) async grpo metrics
#5322 opened Mar 20, 2026 by AmineDiro
(3/5) Cancel Stale inflight tasks
#5320 opened Mar 20, 2026 by AmineDiro