Project Swallow Asahidata

community

AI & ML interests

None defined yet.

Recent Activity

Taishi-N324 authored a paper 2 days ago

On the Optimal Reasoning Length for RL-Trained Language Models

Taishi-N324 authored a paper 4 months ago

MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

kazukifujii authored a paper 6 months ago

Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs

models 0

None public yet

datasets 0

None public yet