Sunny Sanyal
Sunny111
AI & ML interests
Efficient Training Recipes for Large Models (mostly LLMs)
Recent Activity
posted an update about 20 hours ago
Are you familiar with reverse residual connections or looping in language models?
Excited to share my Looped-GPT blog post and codebase!
https://github.com/sanyalsunny111/Looped-GPT
TL;DR: looping during pre-training improves generalization.
The plot shows GPT-2 LMs pre-trained on 15.73B OpenWebText (OWT) tokens.
P.S. This is my first post here; I have ~4 followers and zero expectations for reach.
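For readers unfamiliar with looping: below is a minimal sketch of the idea, assuming it means re-applying the same weight-tied transformer block several times per forward pass so effective depth grows without adding parameters. The class name, hyperparameters, and loop count here are illustrative, not taken from the Looped-GPT repo; see the linked codebase for the actual recipe.

```python
import torch
import torch.nn as nn

class LoopedBlock(nn.Module):
    """Sketch of weight-tied looping: one transformer layer is applied
    num_loops times, reusing the same weights on every iteration.
    All hyperparameters are illustrative, not from Looped-GPT."""
    def __init__(self, d_model=768, n_head=12, num_loops=4):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_head, batch_first=True
        )
        self.num_loops = num_loops

    def forward(self, x):
        for _ in range(self.num_loops):
            x = self.layer(x)  # same weights reused each pass
        return x

# Usage: batch of 2 sequences, length 16, hidden size 768
h = torch.randn(2, 16, 768)
out = LoopedBlock()(h)
print(out.shape)  # torch.Size([2, 16, 768])
```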
upvoted a paper 29 days ago
Pre-training Small Base LMs with Fewer Tokens
liked a model about 1 month ago
GuminiResearch/Gumini-1.5B-Base