A unified framework for large language model-guided reinforcement learning in digital twin industrial environments

Published in Robotics and Computer-Integrated Manufacturing, 2025

Digital twin (DT) optimization in industrial environments faces persistent challenges, including sample inefficiency, extensive training requirements, and limited cross-domain adaptability. This paper presents a unified three-phase framework that integrates large language models (LLMs) with reinforcement learning (RL) via imitation learning (IL). The proposed approach comprises three key components: (1) offline expert demonstration collection using LLM-generated multi-agent coordination strategies, (2) offline and supervised IL to clone these strategies using a centralized training and decentralized execution (CTDE) architecture, and (3) lightweight RL fine-tuning to optimize the pre-trained policy. The system resolves equipment assignment conflicts and leverages coordination history for adaptive decision-making. Experiments in multi-agent industrial scenarios, including human–machine collaboration and fatigue-aware maintenance, demonstrate that our IL+RL hybrid reduces online training time by up to 96% while maintaining over 66% of optimal task performance, using only 4% of the training episodes required by standard RL. The approach also achieves 30%–40% task completion in zero-shot cross-domain settings (e.g., warehouse, manufacturing), and up to 99.7% with minimal fine-tuning. Conceptually, the framework establishes a new paradigm of "language-conditioned IL," where reasoning from general-purpose LLMs serves as an adaptive prior for efficient multi-agent coordination in DT. The results highlight how LLM-guided demonstrations can bridge symbolic reasoning and adaptive learning, offering both conceptual and practical advances for scalable, sample-efficient decision-making in Industry 5.0 systems.

Recommended citation: Fan, Haolin, et al. "A unified framework for large language model-guided reinforcement learning in digital twin industrial environments." Robotics and Computer-Integrated Manufacturing (2026).

Haolin (Oliver) Fan
