Learn to Reason Efficiently with Adaptive Length-based Reward Shaping
Published in ICLR 2026, 2025
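Recommended citation: Wei Liu, Ruochen Zhou, Yiyun Deng, Yuzhen Huang, Junteng Liu, Yuntian Deng, Yizhe Zhang, Junxian He. "Learn to Reason Efficiently with Adaptive Length-based Reward Shaping." arXiv preprint arXiv:2505.15612, 2025. https://arxiv.org/abs/2505.15612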
BibTeX:
@article{liu2025laser,
title={Learn to Reason Efficiently with Adaptive Length-based Reward Shaping},
author={Liu, Wei and Zhou, Ruochen and Deng, Yiyun and Huang, Yuzhen and Liu, Junteng and Deng, Yuntian and Zhang, Yizhe and He, Junxian},
journal={arXiv preprint arXiv:2505.15612},
year={2025}
}
