AI Hallucination
We Had GPT Play Werewolf, and It Especially Liked Killing Players 0 and 1. Why?
Hu Xiu· 2025-05-23 05:32
Core Viewpoint
- The discussion highlights the potential dangers and challenges posed by AI, emphasizing the need for awareness and proactive measures in addressing AI safety issues.

Group 1: AI Safety Concerns
- AI has inherent issues such as hallucinations and biases, which require serious consideration despite the perception that the risks are distant [10][11].
- The phenomenon of adversarial examples poses significant risks, where slight alterations to inputs can lead AI to make dangerous decisions, such as misinterpreting traffic signs [17][37].
- The existence of adversarial examples is acknowledged, and while they are a concern, many AI applications implement robust detection mechanisms to mitigate risks [38].

Group 2: AI Bias
- AI bias is a prevalent issue, illustrated by incidents where AI mislabels individuals based on race or gender, leading to significant social implications [40][45].
- The root causes of AI bias include overconfidence in model predictions and the influence of training data, which often reflects societal biases [64][72].
- Efforts to mitigate bias through data manipulation have limited effectiveness, as inherent societal structures and language usage continue to influence AI outcomes [90][91].

Group 3: Algorithmic Limitations
- AI algorithms primarily learn correlations rather than causal relationships, which can lead to flawed decision-making [93][94].
- The reliance on training data that lacks comprehensive representation can exacerbate biases and inaccuracies in AI outputs [132].

Group 4: Future Directions
- The concept of value alignment is crucial as AI systems become more advanced, necessitating a deeper understanding of human values to ensure AI actions align with societal norms [128][129].
- Research into scalable oversight and superalignment is ongoing, aiming to develop frameworks that enhance AI's compatibility with human values [130][134].
- The importance of AI safety is increasingly recognized, with initiatives being established to integrate AI safety into public policy discussions [137][139].
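The adversarial-example point in Group 1 can be made concrete with a toy case. The sketch below is not taken from the article: it assumes a hypothetical linear "stop sign" scorer with made-up weights and applies a fast-gradient-sign-style perturbation, shifting every input feature by at most `epsilon`, to show how a small, bounded change can flip a decision.

```python
def score(weights, bias, x):
    """Linear decision score; a positive value means the toy model says 'stop sign'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def perturb(weights, x, epsilon):
    """Move each feature by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to x is simply
    `weights`, so stepping against sign(weights) is the worst-case bounded change.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical weights and input, chosen only for illustration.
weights = [0.9, -0.4, 0.7, -0.6]
bias = -0.2
x = [0.5, 0.3, 0.4, 0.2]

x_adv = perturb(weights, x, epsilon=0.2)
clean_score = score(weights, bias, x)      # positive: classified as a stop sign
adv_score = score(weights, bias, x_adv)    # negative: the decision has flipped
print(clean_score > 0, adv_score > 0)
```

Real image classifiers are nonlinear and far higher-dimensional, but the mechanism is the same in spirit: many small per-feature shifts add up to a large change in the model's score.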
How Does the Search Company Behind 60% of China's AI Applications View the AI Hallucination Problem? | AI Hallucination Catcher
21 Shi Ji Jing Ji Bao Dao· 2025-05-23 00:08
Core Viewpoint
- The concept of "AI hallucination" refers to AI generating inaccurate information, which is attributed to limitations in model generation and training data, but the role of search engines in providing accurate information is often overlooked [1][3].

Group 1: AI Hallucination and Search Engines
- AI hallucination is a persistent issue that cannot be completely eliminated, primarily due to the inherent problems with information sources [3][4].
- The accuracy of AI-generated responses is influenced by the quality of the information retrieved from search engines, which can also contain inaccuracies [4][6].
- The search engine's role is likened to that of a supplier of ingredients for a chef, where the quality of the ingredients (information) directly impacts the final dish (AI output) [1].

Group 2: Company Insights and Technology
- Bocha, a startup based in Hangzhou, provides search services for over 60% of AI applications in China, with a daily API call volume exceeding 30 million, comparable to one-third of Microsoft's Bing [1][2].
- The company employs a dual approach of "model + human" to filter information, using a model to assess credibility before human intervention for verification [4][5].
- Bocha's search engine prioritizes "semantic relevance," allowing it to return results based on the full context of user queries rather than just keywords [6][7].

Group 3: Challenges and Future Outlook
- The company faces challenges in building a large-scale index library, with a target of reaching 500 billion indexed items, which requires significant infrastructure and resources [14][15].
- The anticipated future demand for AI search services is expected to exceed human search volumes by 5 to 10 times, indicating a growing need for robust search capabilities in AI applications [14].
- Bocha aims to establish a new content collaboration mechanism that rewards high-quality content providers, moving away from traditional paid ranking systems [9][10].
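The "model + human" filtering described in Group 2 can be pictured as a triage step. The sketch below is an assumption-laden illustration, not Bocha's actual pipeline: the `Page` type, the `credibility` score, the two thresholds, and the example URLs are all hypothetical. A model score sorts pages into auto-accept, auto-reject, and a middle band that goes to human reviewers.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    credibility: float  # hypothetical model-assigned score in [0, 1]


def triage(pages, accept_at=0.8, reject_at=0.3):
    """Split pages into (accepted, needs_review, rejected) by model score.

    The thresholds are illustrative assumptions; only the uncertain middle
    band is handed to human reviewers for verification.
    """
    accepted, needs_review, rejected = [], [], []
    for page in pages:
        if page.credibility >= accept_at:
            accepted.append(page)
        elif page.credibility < reject_at:
            rejected.append(page)
        else:
            needs_review.append(page)
    return accepted, needs_review, rejected


pages = [
    Page("https://example.com/official-report", 0.93),
    Page("https://example.com/forum-rumor", 0.12),
    Page("https://example.com/blog-summary", 0.55),
]
accepted, needs_review, rejected = triage(pages)
print([p.url for p in needs_review])
```

The design point is that human effort is spent only where the model is unsure, which is what makes such a scheme workable at tens of millions of calls per day.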
After Trying Kimi's New Feature, I'm Worried for Moonshot AI
Hu Xiu· 2025-04-30 13:56
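DeepSeek R1 burst onto the scene and became the rising star; Tencent Yuanbao, Doubao, Quark, and others hitched a ride on DeepSeek and are thriving; and Alibaba's Tongyi Qianwen, determined to go head to head with DeepSeek R1 on technology, keeps reporting wins... Only Kimi, last year's ad-spending champion that was plastered across every ad slot, seems to have suddenly gone quiet.

In the past few days, though, we finally got Kimi's "big move." On April 28, Kimi announced a partnership with Caixin Media: when users ask Kimi finance-related questions, Kimi "will draw on professional reporting from Caixin Media and generate answers through the model, providing you with timely, credible, and verifiable high-quality financial information."

Well, just when we thought Kimi had given up and was coasting, it turns out it had been quietly working away behind the scenes.

By partnering with Caixin to push into the finance vertical, Kimi has clearly done some fresh thinking about the development path for AI tools. After all, on model capability alone Kimi cannot match DeepSeek, which can be integrated for free. But teaming up with a professional financial media outlet, and perhaps later expanding to professional media in more verticals as information sources, can strengthen Kimi's credibility in specific verticals and holds real promise in the long run.

However, once Kimi announced the partnership, I immediately tested Kimi with the new feature. Judging from the results, I rather want to ...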
Human Hallucinations Are Far Worse Than AI's
Hu Xiu· 2025-04-17 04:45
Group 1
- The article discusses the phenomenon of AI hallucinations, where AI models provide seemingly accurate but fabricated information due to issues with training data quality and completeness [2][3].
- Google's official explanation attributes AI hallucinations to two main reasons: the quality of training data and the model's difficulty in accurately understanding real-world knowledge [2].
- A study by Vectara in March 2025 found that leading AI models have low hallucination rates, with Gemini-2.0-Flash-001 achieving a 0.7% hallucination rate, indicating high accuracy in document processing [3].

Group 2
- The article compares the hallucination rates of AI models to human error rates, noting that top AI models outperform human experts in knowledge-intensive tasks but still lag in open-ended creative tasks [7].
- In the medical field, the World Health Organization reported an average misdiagnosis rate of 30%, highlighting that human cognitive biases lead to more significant errors than AI hallucinations [8].
- Human cognitive biases, such as confirmation bias and anchoring effects, contribute to a higher incidence of misjudgment compared to AI, as illustrated by historical examples like the Titanic disaster and the Chernobyl accident [9][10].
"AI Hallucination" Strains Compliance Defenses; the "Large Models Aren't Finance-Ready" Dilemma Awaits a Solution
第一财经· 2025-04-11 14:53
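2025.04.11 · Word count: 1,807; reading time: about 3 minutes

On the legal front, the Interim Measures for the Management of Generative Artificial Intelligence Services (the "Measures"), issued by seven departments including the Cyberspace Administration of China, formally took effect back in August 2023. The Measures explicitly require generative AI service providers to establish six mechanisms, including data compliance, algorithm transparency, and generated-content management. With the Measures in force, governance and standardization in China's AI industry have steadily developed and matured.

Lead: Because of its high data density and strong specialization, the financial sector has exposed the shortage of vertical-industry data available to large models.

By Qi Qi, 第一财经

2025 is the first year of AI applications, and the financial industry is undergoing a deep transformation centered on "verticalized AI." According to the latest EY report, China's fintech market has surpassed US$4.59 trillion and is expected to reach US$9.97 trillion by 2030, a compound annual growth rate of 13.8%.

At present, financial institutions including banks, insurers, and fund managers have completed local deployments of multiple general-purpose large models. Industry insiders told the reporter that combining large models with specialized knowledge bases is the future trend for putting AI into practice.

Knowledge infrastructure for financial AI: from general-purpose to dedicated

Specifically, AI is gradually permeating the financial sector, from risk management to customer service, and from investment decisions to payment security.

Liu Wei, head of fintech at E Fund's investment advisory unit (易方达投顾), told 第一财经 that the emergence of DeepSeek lets financial institutions apply AI technology more cost-effectively, ...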
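The growth figures quoted from the EY report above can be sanity-checked with the standard compound-growth formula. The sketch below assumes nothing beyond the three quoted numbers (US$4.59 trillion, US$9.97 trillion, 13.8%); the article does not state the base year, so the time horizon is inferred here rather than sourced.

```python
import math


def years_implied(start, end, cagr):
    """Years needed to grow from start to end at a constant annual rate."""
    return math.log(end / start) / math.log(1 + cagr)


def project(start, cagr, years):
    """Value after compounding start at cagr for the given number of years."""
    return start * (1 + cagr) ** years


# Figures quoted in the excerpt, in trillions of US dollars.
implied = years_implied(4.59, 9.97, 0.138)
projected = project(4.59, 0.138, 6)
print(round(implied, 2), round(projected, 2))
```

The implied horizon comes out at almost exactly six years, so the three figures are mutually consistent if the US$4.59 trillion baseline refers to 2024; that base year is an inference, not something the excerpt states.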
Xinhuanet Culture Watch | Literary and Artistic Creation: Cool-Headed Reflection Amid the AI Craze
Xin Hua She· 2025-03-31 03:52
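Xinhuanet reporters Li Xin, Wang Kunshuo, and Ding Zishuo

This year, a wave of DeepSeek-powered applications has risen. In literature and the arts, discussion of topics such as "AI writing" and "AI creation" has stayed red-hot. When AI can output the elegance of the Book of Songs, the martial moves of Jin Yong, and the brushstrokes of Monet... what changes will AI bring to literature and art? Amid the digital tide, where are the boundaries of creation? Where will literature and art go from here?

[Image generated by AI]

"AI has already 'broken down the door'"

Xinhuanet, Beijing, March 31. Title: Literary and Artistic Creation: Cool-Headed Reflection Amid the AI Craze

"Applause for DeepSeek!" This was a WeChat Moments post by Ning Yingxue, a planner at a cultural and creative company, during the Spring Festival. The accompanying image was a poem she wrote with DeepSeek, titled "There Are Tides Inside the Hourglass": "When verdigris climbs the dial, the minute hand is slicing open / the belly of a silver fish. Grains of sand fall into a deep well / the pendulum reaps every unfinished question with its arc / dusk migrates within the bones of migratory birds..."

It is not only modern poetry. Spring Festival couplets, greetings, how-to guides, even classical poetry, song lyrics, scripts, essays, novels... AI is entering ordinary people's lives with unprecedented speed and depth, sparking a nationwide "passion for creation." Quite a few people even joke about "using AI to write a continuation of the last forty chapters of Dream of the Red Chamber."

On the creation side, web fiction platforms may have been the first to feel the impact. Many editors at web fiction platforms report that their review workload surged after the Lunar New Year. In some sections of platforms such as Fanqie Novel (番茄小说), the number of new book debuts rose more than 50% from the previous period. Some analysts say this may be related to large numbers of newcomers starting to write with AI ...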
Apart from Not Being Son-in-Law Material, How Does DeepSeek Fall Short of Dong Yuhui?
36氪· 2025-03-11 13:48
Core Viewpoint
- DeepSeek is emerging as a new consumption decision-making tool for young consumers, providing personalized recommendations that challenge traditional influencer-led shopping methods [3][5][46].

Group 1: DeepSeek's Functionality and Impact
- DeepSeek offers personalized recommendations based on user-specific queries, such as skin type or reading preferences, providing detailed reports that include product features and suitability [4][12].
- The platform is seen as a more comprehensive alternative to traditional live-streaming influencers, as it utilizes a deep thinking model to deliver tailored suggestions [5][6].
- As of February 9, DeepSeek's app has surpassed 110 million downloads, with weekly active users reaching nearly 97 million, indicating its growing popularity among young consumers [9].

Group 2: Comparison with Traditional E-commerce Platforms
- Traditional e-commerce platforms like Taobao and JD have attempted to integrate AI for personalized shopping but have not prioritized these features in their main app interfaces, limiting user engagement [7][8][22].
- DeepSeek's recommendations are based on a broader range of sources compared to existing e-commerce AI assistants, which often rely on fewer references, leading to less comprehensive suggestions [29][30].
- Despite the advantages of DeepSeek, traditional influencers still hold a significant role in the market due to their established trust and the ability to provide curated selections backed by professional institutions [19][20].

Group 3: Challenges and Limitations
- DeepSeek faces challenges such as "AI hallucination," where the AI may produce inaccurate or biased recommendations based on its training data, necessitating human oversight for quality control [17][18].
- The platform's current model requires users to transition to e-commerce sites for purchases, which contrasts with the seamless shopping experience offered by influencers [20][21].
- E-commerce platforms are cautious about integrating DeepSeek due to concerns over data sensitivity and the potential disruption of existing business models [40][41][42].

Group 4: Future Prospects
- The shift towards AI-driven recommendations is seen as a significant trend in e-commerce, with DeepSeek positioned to capture the preferences of younger consumers [46].
- There is a need for e-commerce platforms to adapt and potentially collaborate with AI technologies like DeepSeek to enhance their offerings and maintain competitiveness in the evolving market landscape [47][48].
Apart from Not Being Son-in-Law Material, How Does DeepSeek Fall Short of Dong Yuhui?
商业洞察· 2025-03-09 08:04
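The following article is from 字母榜, by Xue Yaping.

Chen Peng, 26, works in product operations. Hoping to broaden his horizons through reading, he asked DeepSeek, "What are the ten books most worth reading?" DeepSeek likewise gave him a reading list, grouped by category with reasons attached. Chen Peng chose to buy several of the books.

This used to be the job of Li Jiaqi, Dong Yuhui, and their peers in the livestream room: shopping guide.

Over the past few years, top streamers built trust with users and constructed a streamer-centered product distribution mechanism, recommending to their fans the things they consider best and most suitable.

But this recommendation system is now being deconstructed by DeepSeek. The long chain of thought in DeepSeek's deep-thinking mode gives users more comprehensive and precise answers, which in turn produces one-on-one personalized recommendations. Applied to shopping, it has effectively become "D选": DeepSeek's picks.

The essence of "D选" is the "AI shopping guide," which helps users make purchasing decisions efficiently. The "AI shopping guide" scenario is not unfamiliar.

Years ago, some e-commerce platforms were already trying to use large models to close the transaction loop of "product discovery + purchase."

For example, Taobao's AI assistant "淘宝问问" has long been connected to Tongyi Qianwen, with features that include personalized recommendations and generated buying advice. JD's 言犀 large model has also been brought into shopping-guide scenarios, and "京东京言" is explicitly positioned as a "dedicated AI shopping assistant." The AI search in the Douyin app ...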