Large Language Models (LLMs) such as the GPT and LLaMA series have revolutionized natural language processing, excelling as general-purpose AI assistants. However, their dominant interaction paradigm is passive and reactive: they faithfully execute user commands, but the user always leads the conversation. This passivity fundamentally limits their applicability in scenarios that require proactive guidance toward complex goals.
Abstract
Proactive conversational agents, which aim to steer conversations toward complex objectives, are a key frontier of autonomous agent research. However, although existing large language models are effective passive responders, they exhibit a fundamental capability deficit in autonomously planning and executing long-horizon, goal-oriented conversational strategies. The severe scarcity of high-quality real-world training data further exacerbates this limitation.

To address this challenge, we propose FSTI (From Simulation to Interaction), a novel paradigm for building proactive conversational agents. The framework first uses a four-agent simulation architecture that decouples role-playing, process advancement, and quality assessment to synthesize large-scale, high-fidelity data for proactive financial mediation dialogues under realistic adversarial conditions. FSTI then applies a three-stage progressive training scheme that efficiently distills the procedural knowledge and strategic behaviors emerging from the simulation data into a compact final model.

Experimental results show that the 8B-parameter model trained with FSTI deeply internalizes the complex mediation process. On the highly challenging proactive financial mediation dialogue task, it not only significantly outperforms top-tier closed-source models such as GPT-4o but also meets the strict latency requirements of real-world deployment. FSTI thus offers an effective and scalable path toward truly autonomous and efficient conversational agents.
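The four-agent simulation loop described in the abstract can be sketched as follows. This is a minimal illustration only: the agent roles, stage labels, and function signatures are assumptions introduced here, and the role-playing and assessment agents, which would be LLM-backed in the actual pipeline, are replaced by placeholders.

```python
from dataclasses import dataclass

STAGES = ["opening", "fact-finding", "negotiation", "settlement"]  # illustrative mediation stages

@dataclass
class Turn:
    speaker: str
    text: str
    stage: str

def role_play(speaker: str, stage: str) -> str:
    # Role-playing agent (placeholder): an LLM call in the real pipeline.
    return f"[{speaker} speaks during '{stage}']"

def advance_process(stage_idx: int) -> int:
    # Process-advancement agent: decide whether to move to the next stage.
    return min(stage_idx + 1, len(STAGES) - 1)

def assess_quality(dialogue: list[Turn]) -> bool:
    # Quality-assessment agent: keep only dialogues that cover every stage.
    return {t.stage for t in dialogue} == set(STAGES)

def simulate_dialogue() -> list[Turn]:
    # Two role-playing agents (mediator and disputing party) alternate turns
    # while a separate agent drives the mediation procedure forward.
    dialogue, stage_idx = [], 0
    for _ in range(len(STAGES)):
        stage = STAGES[stage_idx]
        dialogue.append(Turn("mediator", role_play("mediator", stage), stage))
        dialogue.append(Turn("party", role_play("party", stage), stage))
        stage_idx = advance_process(stage_idx)
    return dialogue

# Synthesize a small corpus, retaining only quality-approved dialogues.
corpus = [d for d in (simulate_dialogue() for _ in range(3)) if assess_quality(d)]
```

Decoupling the four responsibilities this way means each agent can be improved or swapped independently, which is presumably what lets the simulation scale to large, high-fidelity corpora.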
Comparison of Datasets for Proactive Dialogue Systems.
Performance of different models on the proactive mediation dialogue task, with and without the mediation procedure included in the system prompt.
Scores are assigned by an LLM-as-a-judge based on our weighted rubric. The highest score in each column across both conditions is highlighted. Our model's scores are marked in blue; the best competitor scores are in gray.
Note: Proc. Adh. stands for Process Adherence, Impartial. for Impartiality and Neutrality, and Emo. Adpt. for Emotional Adaptability. The numbers in parentheses indicate the maximum score for each metric.