Spaces: InosLihka/rhythm_env

Running

1.89 MB

3 contributors

History: 54 commits

Add SFT v3 + GRPO refine results to README + results.md

666b4ce about 8 hours ago

Name | Size | Last commit | Updated
docs | - | Add SFT v3 + GRPO refine results to README + results.md | about 8 hours ago
plots | - | Add SFT v3 + GRPO refine results to README + results.md | about 8 hours ago
scripts | - | Add SFT v3 + GRPO refine results to README + results.md | about 8 hours ago
server | - | Refactor grader to use openenv.core.rubrics.WeightedSum + Rubric subclasses | 2 days ago
tests | - | Refactor grader to use openenv.core.rubrics.WeightedSum + Rubric subclasses | 2 days ago
training | - | Move blog to root as BLOG.md (per Meta mentor guidance) | 3 days ago
ui | - | refactor: rewrite blog around product vision; fix UI for Gradio 6 | 4 days ago
.dockerignore | 92 Bytes | Initial commit: RhythmEnv daily planning RL environment | 20 days ago
.env.example | 441 Bytes | Algorithm Distillation: grader v2 with belief_accuracy + SFT pipeline | 3 days ago
.gitattributes | 218 Bytes | Post-deadline: full eval results + bigger plots via Git LFS | 2 days ago
.gitignore | 200 Bytes | Algorithm Distillation: grader v2 with belief_accuracy + SFT pipeline | 3 days ago
BLOG.md | 9.48 kB | Move blog to root as BLOG.md (per Meta mentor guidance) | 3 days ago
Dockerfile | 2.79 kB | Rebuild as Life Simulator: 5 meters, 3 hidden profiles, GRPO training pipeline | 4 days ago
README.md | 18.1 kB | Add SFT v3 + GRPO refine results to README + results.md | about 8 hours ago
__init__.py | 724 Bytes | env: enrich observation with history, anomalies, and discovery bonus | 4 days ago
client.py | 5.04 kB | client: surface ALL observation fields (was dropping deltas, anomalies, last_action, step_history) | 3 days ago
eval_baselines_v2.json | 284 Bytes | Algorithm Distillation: grader v2 with belief_accuracy + SFT pipeline | 3 days ago
inference.py | 13.4 kB | iter4: fix the 'constant belief = free reward' bug + 6 other deep issues | 3 days ago
models.py | 4.17 kB | Algorithm Distillation: grader v2 with belief_accuracy + SFT pipeline | 3 days ago
openenv.yaml | 93 Bytes | Initial commit: RhythmEnv daily planning RL environment | 20 days ago
pyproject.toml | 909 Bytes | Rebuild as Life Simulator: 5 meters, 3 hidden profiles, GRPO training pipeline | 4 days ago
uv.lock | 576 kB | Initial commit: RhythmEnv daily planning RL environment | 20 days ago