Jinyi Liu (刘金毅) | PhD Candidate in RL, LLMs, and Agentic Systems

Reliable reasoning and decision-making
with LLM post-training, reinforcement learning, and agents.

I am a Ph.D. candidate at Tianjin University and a member of the TJU DRL Lab. I work with Jianye Hao, Yan Zheng, and Hongyao Tang on reinforcement learning, LLM post-training, and agentic systems that make language models reason more reliably, act more effectively, and support scientific discovery.

Current focus

Scaling post-training for language models, building reliable reasoning frameworks, and designing agentic systems that connect decision-making with real-world scientific and practical workflows.

Open to

Collaborations, research internships, and conversations about LLM post-training, RL, agents, and AI for science. The TJU DRL Lab also welcomes interns and prospective MS/PhD students.

Signature Themes

Research Pillars

Three directions that define how I think about reliable language-model systems and their real-world use.

Reliable reasoning

I design reasoning frameworks that make language models more consistent, more controllable, and more trustworthy on complex tasks.

Fine-grained reasoning, Reliability, Decision-making

LLM post-training

I study reinforcement-learning-based post-training methods that improve capability while reducing cost and instability.

RL tuning, Efficiency, Verifiable rewards

Agentic systems for science

I build agent systems that translate language-model reasoning into useful workflows for research, analysis, and discovery.

AI for science, Multi-agent systems, Real-world tools

Research

Selected Publications

A curated set of papers that best represent my current work across reasoning, agent systems, and AI for science.

Featured Reasoning

SCALR@COLM 2025

From Chaos to Order: The Atomic Reasoner Framework for Fine-grained Reasoning in Large Language Models

Breaks complex reasoning into small atomic steps so language models can reason more reliably without relying on heavyweight search or tool orchestration.

02 AI for Science

The Fourteenth International Conference on Learning Representations (ICLR 2026 Poster) 2026

CellAgent: LLM-Driven Multi-Agent Framework for Natural Language-Based Single-Cell Analysis

Lets researchers run end-to-end single-cell and spatial transcriptomics analysis through natural language, reducing programming overhead without sacrificing result quality.

03 Reinforcement Learning

Proceedings of the AAAI Conference on Artificial Intelligence 2024

OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments

Improves exploration in noisy reinforcement learning settings by separating genuinely useful uncertainty from stochastic noise that would otherwise mislead the agent.

Systems

Selected Projects

Selected systems and learning artifacts that turn research ideas into tools people can actually use.

LLM Agent Tutorial

Project page for the LLM Agent Tutorial website, highlighting the public learning resource, companion materials, and entry points for readers.

LLM Agent Tutorial

CellAgent

An LLM-driven multi-agent system that lets researchers run end-to-end single-cell analysis through natural language while preserving high-quality scientific outputs.

LLM Agent, CellAgent

Uni-RLHF

An integrated RLHF platform that makes it easier to study, compare, and operationalize reinforcement learning from diverse human and synthetic feedback.

PbRL, RLHF

Updates

Recent News

Recent milestones across papers, systems, tutorials, and community work.

📝 2026-01 Three papers accepted by ICLR 2026: ReMix, CellAgent, and From Seeing to Doing.
🧠 2026-01 Two workshop papers accepted at ICLR 2026: AgentMemoryBench and DistRLVR.
📝 2026-01 Paper accepted by ACM TheWebConf 2026 Industry: AFE-Master.
🔥 2025-12 Released our beginner-friendly LLM Agent tutorial (website & PDF).
🎤 2025-12 Hosted the 137th RLCHINA Paper Seminar.

Trajectory

Experience Snapshot

A brief view of research training, industry collaboration, and selected recognition.

Experience

Aug 2025 - Present

Algorithm Research Intern

Shanghai AI Lab (advised by Shuyue Hu)

Oct 2024 - Aug 2025

Algorithm Research Intern (Project Collaboration)

Kuaishou (advised by Hangyu Mao)

Jun 2022 - Mar 2024

Algorithm Research Intern

NetEase (advised by Yujing Hu)

Selected Honors

2025

CSIG Science and Technology Progress Award, First Prize (2025年度CSIG科技进步奖一等奖)

CSIG

2025

Distinguished PC Member, AAMAS 2025

AAMAS 2025

2024

Academic First-class Scholarship (Top 10%)

Tianjin University