arxiv:2604.13740
Michal Valko
AI & ML interests
large language models, reasoning, fine-tuning, test-time computation, reinforcement learning with human feedback, world models
Recent Activity
updated a dataset about 13 hours ago
misovalko/my-research-papers
authored a paper 4 days ago
Spectral Thompson sampling
authored a paper 4 days ago
Covariance-adapting algorithm for semi-bandits with application to sparse rewards