Diffusion Language Models
Yao Class legend Chen Lijie joins OpenAI: guaranteed admission to Tsinghua at 16, UC Berkeley assistant professor at 30
36Kr · 2026-01-15 01:43
Core Insights
- Chen Lijie, a prominent figure from Tsinghua University's Yao Class and an assistant professor at UC Berkeley, has joined OpenAI to focus on mathematical reasoning [1][2].

Group 1: Chen Lijie's Background
- Chen Lijie was born in 1995 and won a gold medal at the national informatics Olympiad at the age of 16, earning guaranteed admission to Tsinghua University [7].
- He became an assistant professor at UC Berkeley in 2025, specializing in theoretical computer science and computational complexity theory [7][19].
- His career in informatics competitions culminated in a gold medal at the International Olympiad in Informatics in 2013 [8][10].

Group 2: Research Contributions
- Chen's recent research focuses on Diffusion Language Models, in line with the current evolution of generative models [2].
- He has made substantial contributions to theoretical computer science, including solving a long-standing open problem in quantum information during his time at MIT [13][14].
- His work has been recognized with multiple honors, including Best Student Paper awards at FOCS and STOC in 2019 [14][19].

Group 3: OpenAI Involvement
- OpenAI has acknowledged Chen's previous research: a paper he co-authored was cited in the company's 2022 publication on language model hallucinations [3].
- His role at OpenAI will involve exploring AI safety, drawing on his expertise in computational complexity and its applications to quantum physics and AI [19].
Farewell to "blind confidence": CCD sets a new SOTA for diffusion language model inference
机器之心 · 2025-12-13 01:13
Core Insights
- The article introduces Coherent Contextual Decoding (CCD), a new decoding algorithm for Diffusion Language Models (DLMs) that addresses slow inference and weak logical coherence in any-order decoding modes [2][7][19].
- CCD leverages historical prediction information to improve current decoding choices, correcting the "short-sightedness" of traditional DLM inference strategies [9][19].

Group 1: Research Background
- Open-source diffusion language models such as Dream and LLaDA have demonstrated general capabilities comparable to autoregressive LLMs, with advantages in global planning and bidirectional context understanding [5].
- Current mainstream DLM inference algorithms suffer from a critical flaw of local "overconfidence," leading to suboptimal sampling choices that can cascade into errors [7][19].

Group 2: Core Innovations
- CCD introduces a "history buffer" mechanism that uses predictions from past diffusion steps to reject short-sighted choices and correct the current decoding step [9].
- An adaptive sampling strategy (CCD-DS) dynamically adjusts decoding speed based on context, breaking the trade-off between generation speed and quality [10][19].

Group 3: Experimental Results
- The research team ran comprehensive experiments on mainstream open-source DLMs (Dream-7B and LLaDA-8B) across tasks including mathematical reasoning, code generation, and planning [13].
- Under the adaptive strategy (CCD-DS), both inference speed and model performance improved significantly: on the Trip Plan task, Dream's inference speed increased 3.48x and performance improved by 3.91% [16].

Group 4: Case Study
- A case study in mathematical reasoning illustrates the strength of CCD: the algorithm distinguishes grammatical fluency from semantic importance, leading to correct reasoning trajectories [17].

Group 5: Conclusion and Outlook
- CCD provides a theoretically sound and practical route to better inference in diffusion language models, paving the way for their application to more complex reasoning tasks [19].
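The article describes CCD's history-buffer idea only at a high level, so the following is a minimal toy sketch, not the paper's actual algorithm: the names `ccd_step`, `history`, and the blending weight `w` are all assumptions. It illustrates the core intuition that averaging the current step's token probabilities with earlier diffusion steps can veto a single overconfident step.

```python
def ccd_step(history, w=0.5):
    """Hypothetical sketch (not the paper's exact method): blend the
    current step's token probabilities with the mean of earlier steps
    kept in a history buffer, then pick the argmax of the blend, so
    that one overconfident step cannot dominate the decoding choice."""
    current = history[-1]
    if len(history) == 1:
        blended = current
    else:
        n = len(history) - 1
        # Mean of all past steps, per vocabulary position.
        past = [sum(step[i] for step in history[:-1]) / n
                for i in range(len(current))]
        blended = [w * c + (1 - w) * p for c, p in zip(current, past)]
    return max(range(len(blended)), key=blended.__getitem__)

# Toy example: three diffusion steps over a 4-token vocabulary.
history = [
    [0.05, 0.70, 0.15, 0.10],  # earlier steps consistently favour token 1
    [0.05, 0.65, 0.20, 0.10],
    [0.10, 0.20, 0.65, 0.05],  # current step is overconfident on token 2
]
print(ccd_step(history))  # → 1: the history buffer overrules the greedy pick
```

A plain argmax on the final step alone would pick token 2; the blended score keeps token 1 because two earlier steps agreed on it. The real CCD presumably operates on full model logits across many masked positions, but the correction principle is the same.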