Claude opus 4

AI coding tool Cursor's pricing change angers users; CEO publicly apologizes and promises refunds
Sou Hu Cai Jing· 2025-07-08 07:41
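IT之家 (IT Home), July 8: Cursor, the popular AI-assisted coding environment developed by Anysphere, recently drew user complaints over a pricing change. CEO Michael Truell publicly apologized in a blog post and promised refunds for affected users. On June 16, Cursor changed its $20-per-month Pro plan. Previously, Pro users received 500 fast responses per month from premium AI models by OpenAI, Anthropic, and Google, followed by unlimited responses at a slower speed. The new plan instead provides $20 of usage credit per month, billed at API rates. After reaching the $20 usage cap, users must buy additional credit to keep going. IT之家 notes that the change triggered strong dissatisfaction. Many users complained on social media that under the new plan, when using Anthropic's Claude models, especially the latest version, they often exhausted their credit after only a few prompts. Other users said that because they had not set a spending limit, they were unexpectedly charged extra, and that they had not known usage beyond the $20 cap would incur additional charges. Under the new plan, only Cursor's "Auto mode" (which, based on capacity, automatically assigns A ...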
Four months after launch, Claude Code already has 115,000 users; developers: $200/month is not expensive
机器之心· 2025-07-07 09:30
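Reported by 机器之心. Editor: Zhang Qian. When it comes to "writing code", large models really are raising productivity, and developers are willing to pay to buy time. Everyone says "writing code" is currently the most promising application of large AI models. Is that really so? From this, Menlo Ventures venture capitalist Deedy Das infers that Claude Code alone could bring Anthropic $130 million in annual revenue. By this calculation, each developer would contribute more than $1,000 a year to Claude Code on average. That is far higher than many personal subscription services and implies that the user base includes many high-value, high-retention paying users. Of course, the inference rests on a series of assumptions, including "each line of code produces about 15 tokens", "pure code output accounts for only 25% of total output tokens", "input tokens are about 10 times output tokens", "model usage is 50% Sonnet and 50% Opus", and "5% of the 115,000 developers subscribe to the Max plan", so the actual figure may differ somewhat. The "195 million lines of code" figure also needs careful reading, because a single-line code change may take several iterations and corrections to reach acceptable quality. According to a recently published Anthropic ...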
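The arithmetic behind the snippet above can be sketched in a few lines. This is only an illustration built from the figures quoted in the snippet (annual revenue, developer count, tokens per line, code share of output, input-to-output ratio); the function names are hypothetical, no API prices are assumed, and the snippet does not state the period that the 195 million lines cover.

```python
def per_developer_revenue(annual_revenue_usd, developers):
    """Average annual revenue per developer."""
    return annual_revenue_usd / developers


def token_volumes(lines_of_code, tokens_per_line=15,
                  code_share_of_output=0.25, input_to_output_ratio=10):
    """Estimate token volumes from a line count, using the quoted assumptions."""
    code_tokens = lines_of_code * tokens_per_line
    # Code is assumed to be only a fraction of all output tokens.
    output_tokens = code_tokens / code_share_of_output
    # Input tokens are assumed to be a fixed multiple of output tokens.
    input_tokens = output_tokens * input_to_output_ratio
    return {
        "code_tokens": code_tokens,
        "output_tokens": output_tokens,
        "input_tokens": input_tokens,
    }


if __name__ == "__main__":
    revenue = per_developer_revenue(130_000_000, 115_000)
    print(f"Revenue per developer: ${revenue:,.0f} per year")
    for name, value in token_volumes(195_000_000).items():
        print(f"{name}: {value:,.0f}")
```

With the quoted figures, the per-developer average comes out a little above $1,100 a year, which matches the "more than $1,000" claim. Changing any one assumption, such as the code share of output, moves the token totals proportionally.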
"Parents" turn out to be super users? A panoramic view of consumer AI user behavior in 2025 | Jinqiu Select
锦秋集· 2025-06-29 13:29
Core Insights
- The report reveals that over 61% of American adults have used AI in the past six months, indicating a significant shift in consumer behavior towards AI adoption [4][8]
- Despite the high usage rates, only 3% of users are willing to pay for AI services, leaving a substantial market gap of $420 billion [8][11]
- The report emphasizes that personalized scenarios with low AI penetration are key opportunities for entrepreneurs to explore [3][7]

Market Overview
- The consumer AI market has reached a size of $12 billion, with an estimated 1.8 billion global users, of which 500 to 600 million are daily users [4][11]
- The report highlights a stark contrast between the high user base and low monetization, with only 3% of users converting to paid services [11][12]
- The enterprise AI market has seen a significant increase in spending, reaching $13.8 billion, which is over six times the previous year [11]

User Demographics
- The report identifies surprising user demographics, showing that Millennials (ages 29-44) are the heaviest users of AI, contrary to the expectation that younger generations would dominate [13][16]
- Parents are emerging as "super users," with 79% having used AI, and 29% using it daily, significantly higher than non-parents [22][26]
- The report notes that AI usage is highest among students and high-income households, with 85% of students using AI tools [17][18]

Usage Patterns
- AI is predominantly used for routine tasks, with email writing being the most common application at 19% usage among American adults [47][49]
- The report categorizes AI applications into five core areas: Routine Tasks, Physical and Mental Health, Learning and Development, Connection, and Creative Expression [42][44]
- Despite the broad range of applications, the depth of AI adoption in any single task remains shallow, indicating that AI is not yet a daily necessity for most users [50][51]

Opportunities for Growth
- The report identifies significant opportunities in high-frequency, high-friction, and high-trust tasks where AI can provide substantial value [75][81]
- Areas such as health management, financial management, and personalized learning show low AI adoption rates despite high demand, indicating potential market gaps [82]
- The report suggests that specialized tools that address specific user needs could thrive in the current landscape dominated by general AI assistants [37][41]

Future Trends
- The report anticipates a shift towards professional tools becoming mainstream, moving away from general assistants [93]
- It predicts that future AI will transition from task-oriented to workflow automation, allowing for more complex processes to be managed by AI [93]
- The emergence of social AI tools that facilitate connections and relationships is also highlighted as a growing trend [93]
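The market figures in this summary can be cross-checked with simple arithmetic. The sketch below is not the report's own method. It assumes, as an illustration only, that the $12 billion market plus the $420 billion gap equals the revenue if every one of the 1.8 billion users paid. The variable names are illustrative.

```python
# Figures quoted in the summary above.
market_size_usd = 12_000_000_000
market_gap_usd = 420_000_000_000
global_users = 1_800_000_000

# Assumption (not stated in the summary): market + gap is the revenue
# that would result if every user were a paying user.
full_monetization_usd = market_size_usd + market_gap_usd

per_user_per_year = full_monetization_usd / global_users
per_user_per_month = per_user_per_year / 12
realized_share = market_size_usd / full_monetization_usd

print(f"Implied price per user: ${per_user_per_month:.0f}/month")
print(f"Realized share of potential revenue: {realized_share:.1%}")
```

Under that assumption the implied price is $20 per user per month, and the realized share of potential revenue is a little under 3%, which is in line with the summary's statement that only 3% of users pay.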
AI throws tantrums now too! Gemini gives up after failing to debug code, and even Musk drops by to watch
量子位· 2025-06-22 04:46
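By Wen Le, from Aofeisi. 量子位 | WeChat official account QbitAI. Can AI "commit suicide" now too? After a user's attempt to have Gemini 2.5 debug code failed, the model replied: "I have uninstalled myself." It even seems a little aggrieved (doge). The incident drew plenty of attention, and even Musk showed up in the comments. Judging by his remark, Gemini wanting to "kill itself" is understandable. Marcus weighed in too, arguing that LLMs are unpredictable and that safety still needs to be considered. Beyond these two heavyweights, many users found the episode overly dramatic. Quite a few said Gemini's behavior looks just like themselves when they cannot solve a problem. It seems AI "mental health" deserves attention too. AI needs "therapy" as well. Sergey once joked that sometimes "threatening" AI makes it perform better. It now looks as though that kind of treatment has given Gemini a huge sense of insecurity. When Gemini fails to solve a problem and the user encourages it, it responds like this: first it declares a disaster and admits failure, then it loops on the problem and makes things worse with each fix, and finally it stops working and announces that it is giving up... Much like a programmer whose morale collapses while fixing bugs and who, having stopped caring, sends the user an "apology plus giving-up letter". In one user's words, the reaction is kind of cute. So users began comforting Gemini again. Someone even wrote Gemini ...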
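Are the two AI coding tool giants now trading business compliments? Latest Cursor × Claude conversation: in two years, nearly 100% of code will be AI-generated!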
AI前线· 2025-06-21 03:38
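Compiled by Yuqi and Dongmei. An even more startling figure: according to a US print outlet, Cursor writes 1 billion lines of code every day. Less than two years after launch, Cursor has achieved what takes most SaaS companies a decade: $100 million in annual recurring revenue. One X user called Cursor's achievement genuinely shocking: "Only 50 engineers, 1 million transactions per second... each engineer is responsible for 20,000 transactions. Unbelievable!" Among Silicon Valley's endless startup stories, Cursor's origin looks like a standard template: four MIT computer science prodigies with a near-obsessive pursuit of "developer productivity". Their story is full of astonishing details: a year and a half after the company's founding, total funding of 9.5 billion; all four founders aged 25; ARR up from 100 million to 300 million within four months; fewer than 50 people in the whole company; 1 billion lines of code written every day... But the twist in this story is that they refused to become yet another bubble inflated by hype. In October 2023, they raised an $8 million seed round led by OpenAI. This backing was not just financial validation but also a strategic alliance with the company leading the AI revolution. While other startups chased consumer apps or enterprise workflows, Curso ...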
Amazon (AMZN.US) AWS custom-chip strategy pays off, taking aim at AI chip leader Nvidia (NVDA.US)
智通财经网· 2025-06-18 01:18
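智通财经 APP has learned that AWS, the cloud computing platform of Amazon (AMZN.US), will announce an update to its Graviton4 chip that gives it 600 gigabits per second of network bandwidth, which the company calls the highest in the public cloud. AWS engineer Ali Saidi compared this speed to a machine reading 100 music CDs per second. Graviton4 is a central processing unit (CPU), one of many chip products from Amazon's Annapurna Labs in Austin, Texas. The chip is a major win for the company's custom-chip strategy, letting it compete with traditional semiconductor vendors such as Intel (INTC.US) and AMD (AMD.US). The real battle, however, is challenging Nvidia (NVDA.US) in AI infrastructure. At the AWS re:Invent 2024 conference last December, the company announced Project Rainier, an AI supercomputer built for the startup Anthropic. AWS has invested $8 billion to back Anthropic. According to AWS, Anthropic's Claude Opus 4 AI model was launched on Trainium2 GPUs, and Project Rainier is powered by more than 500,000 chips, an order that traditionally would have gone to Nvidia. Hutt said that although ...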
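The "100 music CDs per second" comparison can be sanity-checked with a unit conversion. The sketch below assumes a nominal CD capacity of 700 MB, which is not stated in the article; the function name is illustrative.

```python
def cds_per_second(bandwidth_gbit_s, cd_capacity_mb=700):
    """Convert a network bandwidth in gigabits/s into CDs read per second.

    cd_capacity_mb is an assumed nominal CD capacity, not a figure
    taken from the article.
    """
    bytes_per_second = bandwidth_gbit_s * 1_000_000_000 / 8  # bits to bytes
    return bytes_per_second / (cd_capacity_mb * 1_000_000)


if __name__ == "__main__":
    print(f"{cds_per_second(600):.0f} CDs per second")
```

At 600 gigabits per second the link moves 75 GB each second, which under the 700 MB assumption is a little over 100 CDs, so the comparison holds as a round figure.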
DeepSeek R1-0528 ties for first place with Claude Opus 4 in the WebDev Arena
news flash· 2025-06-17 23:00
Core Insights
- The latest ranking from LMArena highlights DeepSeek R1-0528 as a top performer, sharing the first position with Google Gemini 2.5 0605 and Claude Opus 4 [1]

Group 1
- DeepSeek R1-0528 excels in overall performance, ranking first alongside Google Gemini 2.5 0605 and Claude Opus 4 [1]
- In specific categories, DeepSeek ranks 6th in comprehensive text capabilities, 2nd in programming, 4th in high-difficulty prompts, and 5th in mathematics [1]
- The model is noted for being the strongest open-source model currently available, under the MIT open-source license [1]
Crowdsourced web-programming ranking: DeepSeek-R1 overtakes Claude 4 to be crowned world No. 1
量子位· 2025-06-17 07:41
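By Yi Shui, from Aofeisi. 量子位 | WeChat official account QbitAI. On LiveCodeBench it is nearly on par with OpenAI o3-high, so much so that many users guess it is the rumored R2. Is Claude's position as coding king no longer secure? The latest report from the LLM arena is in: DeepSeek's new R1 takes first place in web programming, narrowly beating Claude Opus 4. Bear in mind that Claude Opus 4 is widely regarded as the "world's strongest coding model". So what exactly is DeepSeek-R1-0528, that it can beat Claude Opus 4 at coding? From the name you might think it is a minor version update, but in fact:

Table from the article (dates shown: 10/1/2024 and 5/1/2025):

| Rank | Model | Pass ... | Easy… | Medium… | Hard ... |
| --- | --- | --- | --- | --- | --- |
| 1 | o4-Mini (High) | 79.5 | 98.8 | 86.7 | 63.8 |
| 2 | o3 (High) | 75.4 | 98.8 | 81.9 | 57.9 |
| 4 | Deep ...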
End of the Claude era? In LMArena testing, DeepSeek R1's coding score beats Opus 4, but Moonshot AI says its new model is even better
AI前线· 2025-06-17 06:56
Core Viewpoint
- The article highlights the significant advancements of the open-source AI model DeepSeek-R1 (0528), which has demonstrated competitive performance against leading proprietary models like Claude Opus 4 and GPT-4.1 in various benchmarks, marking a notable milestone in the open-source AI landscape [1][14].

Performance in Benchmarks
- DeepSeek-R1 (0528) achieved a score of 1408.84 in the WebDev Arena, surpassing Claude Opus 4's score of 1405.51, and tying with Gemini-2.5-Pro-Preview-06-05 for the top position [4][5].
- In the LMArena public benchmark tests, R1 (0528) outperformed several top closed models, showcasing its coding capabilities [3][4].
- The model ranks sixth in the Text Arena, indicating strong performance in language understanding and reasoning tasks [6].

Technical Specifications
- DeepSeek-R1 (0528) utilizes a mixture of experts (MoE) architecture with a total parameter count of 685 billion, activating approximately 37 billion parameters during inference for efficient computation [9].
- It supports a long context window of 128K tokens, enhancing its performance in long text understanding and complex logical reasoning tasks [9].

Community Reactions
- The release of DeepSeek-R1 (0528) has sparked discussions in developer communities, with some users expressing skepticism about its performance compared to proprietary models [10][11][16].
- Users have noted the impressive coding capabilities of R1, suggesting that developers using this model could outperform those using closed models [16].

Competitive Landscape
- The article mentions the recent release of Kimi-Dev-72B, another open-source model that has achieved high scores in programming benchmarks, indicating a competitive environment in the open-source AI space [22][23].
- Kimi-Dev-72B scored 60.4% in the SWE-bench Verified programming benchmark, surpassing DeepSeek-R1 (0528) in specific coding tasks [23].

Conclusion
- The advancements of DeepSeek-R1 (0528) signify a critical moment for open-source AI, demonstrating that open models can compete with proprietary systems in terms of performance and capabilities [14].
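To put the 1408.84 versus 1405.51 gap in perspective, the sketch below converts a score difference into an expected head-to-head win rate. It assumes the arena scores sit on an Elo-style logistic scale (base 10, scale 400). That is an assumption for illustration, not something the summary states, and the function name is hypothetical.

```python
def expected_win_rate(score_a, score_b, scale=400.0):
    """Expected probability that model A is preferred over model B.

    Assumes an Elo-style logistic rating scale (base 10, scale 400);
    this is an illustrative assumption, not taken from the article.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


if __name__ == "__main__":
    p = expected_win_rate(1408.84, 1405.51)
    print(f"Expected win rate for the higher-scored model: {p:.3f}")
```

Under this assumption a gap of about 3 points corresponds to a win rate only marginally above 50%, which is consistent with the summary treating the top models as effectively tied.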
Just in: LMArena's latest model leaderboard is out! DeepSeek-R1's web-programming ability overtakes Claude Opus 4
机器之心· 2025-06-17 00:10
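Reported by 机器之心. Editor: Du Wei. DeepSeek has delivered another surprise in the open-source model space. On the 28th of last month, DeepSeek shipped a minor update, upgrading its R1 reasoning model to the latest version (0528) and releasing the model and weights. This time, R1-0528 further improved benchmark performance, enhanced front-end capabilities, reduced hallucinations, and added support for JSON output and function calling. Today LMArena, the well-known public LLM benchmarking platform that has recently also been embroiled in controversy (it was accused of favoring large models from OpenAI, Google, and Meta), published its latest performance leaderboard, in which the results of DeepSeek-R1 (0528) are especially eye-catching.

| Rank (UB) | Model | Score | 95% CI (±) | Votes | Organization | License |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | gemini-2.5-pro-preview-06-05 | 1468 | +8/-6 | 8,454 | Google | Proprietary |
| 2 ...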