How to Make Artificial Intelligence More "Reliable"
Xin Lang Cai Jing· 2026-01-26 22:08
Core Insights
- The reliability of AI chatbots is increasingly questioned: they can give incorrect information and show "blind confidence" in their responses [1][2]
- The 40th annual conference of the Association for the Advancement of Artificial Intelligence, held in Singapore in late January, made reliable and responsible AI one of its major topics [1]

Group 1: AI Reliability Issues
- A study titled "The Trap of Blind Confidence" found that when an AI appears highly confident, users are more likely to accept its suggestions even though these may well be wrong; when it appears hesitant, users may reject correct suggestions [1]
- Many AI systems' confidence values are not properly calibrated and therefore send misleading signals to users [1]

Group 2: Cognitive Fatigue in AI
- Researchers from the University of South Carolina identified a phenomenon called "cognitive fatigue," in which AI models drift from their original instructions and produce unreliable information over time [2]
- The team designed a system that visualizes when an AI begins to show cognitive fatigue, allowing real-time interventions to keep the conversation on track [2]

Group 3: Human-AI Collaboration
- Experts emphasize the need to focus on human-AI collaboration rather than letting AI operate independently, as the boundaries of AI actions are currently too broad [3]
- There is a pressing need for systematic scientific study of AI's internal mechanisms to understand its efficiency and vulnerabilities [3]
[Global Finance] How to Make Artificial Intelligence More "Reliable"
Xin Lang Cai Jing· 2026-01-26 10:46
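Xinhua Finance, Singapore, January 26 (reporter Shu Chang): More and more people are realizing that AI chatbots are not always reliable. They may give irrelevant answers, contradict themselves, and sometimes even fabricate nonexistent information with a straight face. At the 40th annual conference of the Association for the Advancement of Artificial Intelligence, held in Singapore in late January, how to make AI more "reliable" and more "responsible" was one of the major topics of this international AI academic conference.

In researchers' eyes, the flaws of AI break down into more specific problems, for example whether an AI's confidence values match reality. A research team from Italy reminded attendees at this year's conference that computing AI confidence values properly is very important.

In the study, titled "The Trap of Blind Confidence," participants solved logical reasoning problems with the help of an AI. The results showed that when the AI appeared very confident, participants were more likely to adopt its suggestions, even though those suggestions may well be wrong; if the AI appeared hesitant, however, users might reject correct suggestions.

Caterina Fregosi, a member of the research team at the University of Milano-Bicocca in Italy, said that both situations reflect a real problem: the confidence values of many AI systems are not properly calibrated, and so they send misleading signals to users.

"Cognitive fatigue can be detected, predicted, and intervened upon," said 玛尔瓦. The team designed a system that tracks three key indicators, including attention decay, to visualize when an AI begins to "fatigue," and to provide a variety of real-time intervention ...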