Pull requests: THUDM/slime
Support partial rollout resume in Search-R1 example
#2128
opened Jun 23, 2026 by
OLIVER-XYP
Reduce entropy logging memory when entropy coef is zero
#2127
opened Jun 23, 2026 by
none0663
Contributor
fix(data): prompt-length filtering crashes on VLM dataset with apply_chat_template
#2126
opened Jun 23, 2026 by
Meihan-chen
Add test for megatron server
run-ci-changed
#2123
opened Jun 23, 2026 by
zhuzilin
Contributor
fix(partial-rollout): cap max_new_tokens by prior response length
#2122
opened Jun 23, 2026 by
none0663
Contributor
fix(retool): coerce list prompt to str in reward_func
#2120
opened Jun 23, 2026 by
mvanhorn
fix(delta-sync): surface failed engine apply results instead of silently discarding them
#2119
opened Jun 22, 2026 by
tanishkasinghhh
fix(rm_hub): grade the final ###Response segment in deepscaler reward
#2116
opened Jun 22, 2026 by
SuperMarioYL
fix(rm_hub): guard deepscaler reward against a missing response
#2115
opened Jun 21, 2026 by
vjsai
fix(ppo): stop corrupting the logged rollout/kl metric
#2114
opened Jun 21, 2026 by
EazyReal
Contributor
fix(gpt-oss): update _patch_bridge_expert_cache_to_cpu to match Megatron-Bridge API
#2113
opened Jun 21, 2026 by
aoshen02
Contributor
Fix(rollout): Fail closed on unknown SGLang model names
#2112
opened Jun 21, 2026 by
Baiyu-Su
Contributor
fix(train): support eval-only mode (--num-rollout 0)
#2109
opened Jun 20, 2026 by
EazyReal
Contributor
feat(examples/strands_sglang): update to strands-sglang 0.4.2
#2106
opened Jun 20, 2026 by
Lawhy
Contributor
feat(tracking): add MLflow tracking support alongside W&B
#2099
opened Jun 17, 2026 by
rrranlyu
fix(dist): preserve new_group options across reloadable group reload
#2095
opened Jun 17, 2026 by
EazyReal
Contributor
fix(scripts): correct model config source path in FP8 low_precision scripts
#2094
opened Jun 17, 2026 by
aoshen02
Contributor
2 tasks done
fix(fully-async): respect partial_rollout=False when requeuing ABORTED groups
#2092
opened Jun 16, 2026 by
Kagura-0001
feat(loss): add --loss-aggregation for the four ScaleRL pg_loss modes
#2090
opened Jun 16, 2026 by
EazyReal
Contributor
fix(opd): score teacher logprobs at rollout temperature, not 0
#2085
opened Jun 15, 2026 by
EazyReal
Contributor
feat(rl): add off-policy IS correction hook (current policy vs rollout)
#2084
opened Jun 15, 2026 by
EazyReal
Contributor
feat(rl): add REINFORCE advantage estimator
#2083
opened Jun 15, 2026 by
EazyReal
Contributor
fix(rollout): isolate per-trajectory exceptions in generate_and_rm_group
#2078
opened Jun 15, 2026 by
aoshen02
Contributor