Farewell to "Blind Confidence": CCD, a New SOTA for Diffusion Language Model Inference
机器之心· 2025-12-13 01:13
Core Insights
- The article introduces Coherent Contextual Decoding (CCD), a new decoding algorithm for Diffusion Language Models (DLMs) that addresses slow inference and weak logical coherence in any-order decoding [2][7][19]
- CCD uses predictions from earlier diffusion steps to inform the current decoding choice, correcting the "short-sightedness" of conventional DLM inference strategies [9][19]

Group 1: Research Background
- Open-source diffusion language models such as Dream and LLaDA have shown general capabilities comparable to autoregressive LLMs, with advantages in global planning and bidirectional context understanding [5]
- Mainstream DLM inference algorithms suffer from local "overconfidence": they make suboptimal sampling choices that can trigger cascading errors [7][19]

Group 2: Core Innovations
- CCD introduces a "history buffer" that rejects short-sighted predictions by using predictions from past diffusion steps to correct the current decoding choice [9]
- An adaptive sampling strategy (CCD-DS) dynamically adjusts decoding speed based on context, breaking the trade-off between generation speed and quality [10][19]

Group 3: Experimental Results
- The research team ran comprehensive experiments with mainstream open-source DLMs (Dream-7B and LLaDA-8B) on tasks including mathematical reasoning, code generation, and planning [13]
- With the adaptive strategy (CCD-DS), both inference speed and model performance improved significantly; on the Trip Plan task, Dream's inference speed increased by 3.48 times and its performance improved by 3.91% [16]

Group 4: Case Study
- A mathematical reasoning case study illustrates CCD's advantage: the algorithm distinguishes grammatical fluency from semantic importance, which leads to correct reasoning trajectories [17]

Group 5: Conclusion and Outlook
- CCD provides a theoretically sound and practical way to improve inference in diffusion language models, paving the way for their use in more complex reasoning tasks [19]
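
The history buffer and adaptive sampling ideas described under Core Innovations can be illustrated with a toy sketch. This summary does not give CCD's actual scoring rule, so the code below makes its own assumptions: the buffer keeps the per-position probability distributions from the last few diffusion steps, confidence is the top probability of their plain average, and every masked position whose averaged confidence reaches a threshold is committed in the same step. The names `average_history` and `select_positions`, the `threshold` parameter, and the example numbers are all hypothetical and are not taken from the paper or any library.

```python
from collections import deque


def average_history(history):
    """Average per-position probability distributions over the buffered steps."""
    steps = len(history)
    num_positions = len(history[0])
    vocab_size = len(history[0][0])
    return [
        [sum(step[pos][tok] for step in history) / steps for tok in range(vocab_size)]
        for pos in range(num_positions)
    ]


def select_positions(history, masked, threshold=0.7):
    """Choose which masked positions to commit at this step.

    Assumption (not from the source): confidence is the top probability of the
    history-averaged distribution. Every masked position at or above
    `threshold` is committed; if none qualifies, the single most confident
    position is committed so that decoding always makes progress.
    Returns a dict mapping position -> token id.
    """
    avg = average_history(history)
    scored = []
    for pos in masked:
        token = max(range(len(avg[pos])), key=lambda t: avg[pos][t])
        scored.append((avg[pos][token], pos, token))
    confident = [item for item in scored if item[0] >= threshold]
    if not confident:
        confident = [max(scored)]
    return {pos: token for _, pos, token in confident}


# Hypothetical buffer holding the last two diffusion steps
# (2 masked positions, vocabulary of 3 tokens).
history = deque(maxlen=2)
history.append([[0.40, 0.30, 0.30], [0.10, 0.80, 0.10]])  # earlier step
history.append([[0.05, 0.90, 0.05], [0.10, 0.70, 0.20]])  # current step

# Position 0 spikes to 0.90 only in the current step, while position 1 is
# consistently confident, so only position 1 passes the averaged threshold.
print(select_positions(history, masked=[0, 1], threshold=0.7))
```

In this toy setup, a selector that looked only at the current step would commit position 0 first because of its 0.90 peak. Averaging over the buffer lowers that position's score to about 0.60, which is one simple way to model rejecting a locally overconfident prediction. Because the number of committed positions varies with how many clear the threshold, the step size adapts to the context instead of being fixed.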