zhang
kekueknu2
AI & ML interests
None yet
Recent Activity
upvoted a paper 28 days ago
daVinci-Dev: Agent-native Mid-training for Software Engineering
upvoted an article about 1 year ago
From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning
upvoted an article over 1 year ago
Illustrating Reinforcement Learning from Human Feedback (RLHF)