A lightweight large language model inference framework that performs structured, fine-grained natural language reasoning without requiring complex search or external tools.
Aug 1, 2025
This paper introduces an efficient method for fine-tuning Large Language Models (LLMs) using off-policy reinforcement learning, aiming to improve performance while minimizing computational cost.
Jul 9, 2025
Jun 15, 2025
This paper investigates how competitive mechanisms can enhance the reasoning capabilities of Large Language Models (LLMs), leading to improved performance on complex tasks.
May 31, 2025