The Bestiary (Collection): Decensored language models made using Heretic (https://github.com/p-e-w/heretic) • 5 items • Updated 2 days ago • 113
Online RLHF (Collection): Datasets, code, and models for online RLHF (i.e., iterative DPO) • 19 items • Updated Jun 12, 2024 • 5