AI Large-Model Hallucination Test: Musk's Grok Gets Every Answer Right, Leaving Chinese AI Models Bowing in Defeat?

Group 1
- Musk, a co-founder of OpenAI, is developing an AI assistant named Grok through his company xAI. xAI is currently involved in a $300 million equity transaction that values the company at $113 billion [1]
- On the X platform, Musk expressed frustration about the amount of "garbage" in foundation models trained on uncorrected data, and said he plans to use Grok 3.5 or Grok 4 to rewrite the human knowledge corpus and improve data accuracy [1][2]
- The industry currently mitigates AI hallucinations with methods such as RAG frameworks and external knowledge integration, whereas Musk's approach aims to build a reliable knowledge base in the first place [2][35]

Group 2
- A recent evaluation of several AI models, including Grok, found that some models still hallucinate, while Grok performed well and gave accurate answers [3][11][21]
- The tests highlighted the importance of enabling deep thinking modes and web search to improve the accuracy of AI-generated content: models such as Doubao and Tongyi performed better when these features were turned on [7][21][37]
- The evaluation also indicated that AI hallucinations persist but are becoming less frequent, and that Grok gave correct answers consistently across multiple tests [33][38]

Group 3
- Critics, including Gary Marcus, argue that Musk's plan to rewrite the human knowledge corpus may introduce bias and compromise the objectivity of the AI model [38]
- The ongoing development of AI models suggests that adding new content-verification mechanisms may reduce hallucinations more effectively than rewriting the knowledge base [38]
- Research indicates that retaining some level of AI hallucination can be beneficial in fields such as abstract creation and scientific research, as demonstrated by the recent Nobel Prize-winning work that made use of AI's "error folding" [38]