量子位

14% of papers partly written by AI? Nature: one in every seven hides ChatGPT signature words
量子位· 2025-07-04 07:02
Core Insights
- The article discusses the increasing prevalence of AI-assisted writing in biomedical research: roughly 14% of the 1.5 million abstracts published on PubMed in 2024 exhibit characteristics typical of large language models (LLMs), and in some countries and disciplines the share exceeds 20% [1][3].

Group 1: AI Usage in Biomedical Research
- A significant portion of biomedical papers, approximately 14%, has been identified as written with LLM assistance within the span of a single year [1].
- LLM usage in certain countries and disciplines has surpassed 20%, indicating a growing trend in AI-assisted writing [3][5].
- The study estimated that 10%-11% of 2024 abstracts used LLMs, with some sub-corpora showing rates as high as 30% [11][15].

Group 2: Characteristics of LLM Writing
- LLMs favor stylistic verbs and adjectives that add no content but alter the writing style, leading to frequent use of words like "intricate" and "notably" [2][11].
- The analysis revealed that 66% of the words overused in 2024 were verbs and 16% were adjectives, indicating a preference for certain types of language [11][21].
- Authors are beginning to adjust their writing to avoid obvious LLM markers, which complicates assessing LLMs' impact on academic output [5][21].

Group 3: Variability in LLM Usage
- Reliance on LLMs varies significantly across disciplines; fields like computational biology and bioinformatics see around 20% usage due to rapid technological change [15][16].
- In non-English-speaking countries such as China and South Korea, LLM usage can reach 15% as researchers seek help with English writing [16].
- Open-access journals with simpler review processes show higher LLM usage (up to 24%) than prestigious journals like Nature and Science, where rates sit between 6% and 8% [16].

Group 4: Detection and Adaptation
- Researchers are exploring ways to detect LLM-generated text, but current detection tools may not reliably distinguish human from AI-generated content; a minimal sketch of the marker-word counting idea follows below [27][28].
- The study indicates that while authors can modify their writing to reduce LLM characteristics, complete avoidance is difficult [25][28].
- Future research aims to quantify AI's impact on academic literature more comprehensively, moving beyond single-text analyses to broader statistical evaluations [28].
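The detection idea the summary describes boils down to counting how much more often certain stylistic words appear once LLMs became widespread. Below is a minimal, hypothetical Python sketch of that excess-frequency comparison; the marker list mixes words the article names ("intricate", "notably") with commonly reported examples, and is an assumption rather than the study's statistically derived vocabulary.

```python
# Hedged sketch: estimate "excess usage" of candidate LLM marker words by
# comparing their document frequency in recent abstracts against a pre-LLM
# baseline corpus. Marker list and corpora are illustrative assumptions.
import re
from collections import Counter

MARKERS = {"intricate", "notably", "delves", "underscores", "pivotal"}

def doc_freq(abstracts):
    """Count, for each marker word, how many abstracts contain it."""
    counts = Counter()
    for text in abstracts:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        counts.update(tokens & MARKERS)
    return counts

def excess_usage(baseline, recent):
    """Ratio of per-abstract frequency in recent vs. baseline texts (>1 means overused)."""
    base, new = doc_freq(baseline), doc_freq(recent)
    nb, nr = len(baseline), len(recent)
    return {w: (new[w] / nr) / max(base[w] / nb, 1e-9) for w in MARKERS}

# Toy usage: two hand-written corpora standing in for millions of abstracts.
pre_llm = ["We measured protein levels in mice.",
           "Notably, the treatment effect was significant in both cohorts."]
post_llm = ["This study delves into intricate signaling pathways.",
            "Notably, this underscores a pivotal role for p53."]
print(excess_usage(pre_llm, post_llm))
```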
This is how Silicon Valley's enterprise AI makes money | 2025 State of AI Report
量子位· 2025-07-04 04:40
Core Insights
- The report emphasizes a shift toward "monetization" in companies' AI development strategies [3].
- Companies are increasingly adopting multi-model strategies, pairing OpenAI's models with one or two other suppliers to optimize performance across applications [4][10][39].

Group 1: AI Product Strategy
- AI product strategies have entered a new phase of value transformation [8][31].
- Companies are reshaping product and service pricing, moving toward hybrid models that combine subscription fees with usage-based billing (a toy billing sketch follows below) [43][46].
- A significant share of companies (40%) currently do not plan to change their pricing strategies, while 37% are exploring new pricing models based on usage and ROI [49][50].

Group 2: Talent and Investment
- There is a notable shortage of suitable AI talent; many companies struggle to fill AI-related positions, particularly AI/ML engineers, whose average recruitment cycle exceeds 70 days [51][56].
- Companies are allocating 10-20% of their R&D budgets to AI, with plans to increase investment by 2025, indicating that AI is becoming a core element of product strategy [60][61].

Group 3: AI Tools and Ecosystem
- The AI tools ecosystem is maturing: about 70% of employees at surveyed companies have access to internal AI tools, though only around half use them regularly [70][72].
- High-growth companies experiment with and adopt new AI tools more proactively, treating AI as a strategic lever for improving internal workflows [82].

Group 4: AI Spending and Cost Structure
- Companies with annual revenues around $500 million spend approximately $100 million on AI annually, with monthly model-training costs ranging from $160,000 to $1.5 million depending on product maturity [16][19][69].
- As AI products scale, talent costs typically fall as a share of total spending, while infrastructure and compute costs tend to rise [12].
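As a concrete illustration of the hybrid pricing model mentioned above, here is a toy Python billing function. Every rate and tier in it is invented for the example; the report does not publish pricing formulas.

```python
# Toy sketch of the subscription-plus-usage hybrid: a flat base fee covers a
# bundled allowance, and usage beyond it is metered. All numbers are invented.
def monthly_invoice(base_fee: float, included_tokens: int,
                    tokens_used: int, overage_per_1k: float) -> float:
    """Flat subscription plus a metered charge for usage beyond the bundle."""
    overage = max(tokens_used - included_tokens, 0)
    return base_fee + (overage / 1_000) * overage_per_1k

# Example: $99/month with 1M tokens included and $0.02 per extra 1K tokens.
print(monthly_invoice(99.0, 1_000_000, 1_450_000, 0.02))  # 108.0
```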
Physicists turn to biology to uncover the source of AI creativity: it traces back to a "technical flaw"
量子位· 2025-07-04 04:40
Core Viewpoint
- The creativity exhibited by AI, particularly in diffusion models, is hypothesized to be a consequence of the model architecture itself, rather than a flaw or limitation [1][3][19].

Group 1: Background and Hypothesis
- AI systems, especially diffusion models like DALL·E and Stable Diffusion, are trained to reproduce their training data yet routinely produce novel images instead [3][4].
- Researchers have been puzzled by this apparent creativity, questioning how the models generate new samples rather than merely memorizing data [8][6].
- The hypothesis from physicists Mason Kamb and Surya Ganguli is that the denoising process in diffusion models loses information, akin to assembling a puzzle without its instructions [8][9].

Group 2: Mechanisms of Creativity
- The study draws parallels between self-assembly processes in biological systems and the workings of diffusion models, focusing on local interactions and symmetry [11][14].
- Locality and equivariance in diffusion models are seen as both limitations and sources of creativity, because they force the model to work on small pixel neighborhoods without access to the complete picture (a minimal sketch of these two properties follows below) [15][19].
- The researchers built a system called the Equivariant Local Score (ELS) machine to test their hypothesis; it matched the outputs of trained diffusion models with 90% accuracy [18][19].

Group 3: Implications and Further Questions
- The findings suggest that diffusion models' creativity may be an emergent property of their operational dynamics rather than a separate, higher-level phenomenon [19][21].
- Questions remain about the creativity of other AI systems, such as large language models, which do not rely on the same mechanisms of locality and equivariance [21][22].
- The research suggests that both human and AI creativity may stem from an incomplete model of the world, which can yield novel and valuable outputs [21][22].
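Locality and equivariance are concrete mathematical properties, and a few lines of code can show what they mean. The numpy sketch below illustrates the two properties only; it is not Kamb and Ganguli's ELS machine. Each output pixel depends only on a small neighborhood (locality), and the same rule is applied at every position, so shifting the input shifts the output (translation equivariance).

```python
# Minimal numpy sketch of the two properties the article highlights:
# locality (each output pixel depends only on a small neighborhood) and
# translation equivariance (the same rule is applied at every position).
import numpy as np

def local_equivariant_denoise(img: np.ndarray, k: int = 3) -> np.ndarray:
    """Denoise each pixel using only its k*k neighborhood, applying the same
    averaging rule everywhere, so shifting the input shifts the output."""
    pad = k // 2
    padded = np.pad(img, pad, mode="wrap")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

noisy = np.random.rand(8, 8)
# Equivariance check: denoise(shift(x)) == shift(denoise(x))
a = local_equivariant_denoise(np.roll(noisy, 2, axis=0))
b = np.roll(local_equivariant_denoise(noisy), 2, axis=0)
print(np.allclose(a, b))  # True, because the same local rule runs everywhere
```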
Zhou Bowen and others to speak at Shanghai Jiao Tong University on how AI is reshaping the industrial landscape, as the SAIF MBA curriculum gets a full upgrade
量子位· 2025-07-04 04:40
Yunzhong, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

AI is rewriting business logic and reshaping the industrial landscape, and we stand at the intersection of technological revolution and investment opportunity. How does one come out a winner in this transformation?

The Shanghai Advanced Institute of Finance (SAIF) at Shanghai Jiao Tong University is the first to offer an answer. On the afternoon of July 6, in Shanghai, a deep conversation about the future will unfold at the themed forum "The Best Investment in the Age of AI Innovation," which doubles as the launch event for the fully upgraded SAIF MBA curriculum.

Top scientists from AI laboratories will demystify the technology's evolution; alumni who made their names in finance and tech will share the transformation pains and breakthroughs they lived through; and industry and research leaders will jointly unpack one question: when algorithms begin to replace experience, what competitive strengths remain irreplaceable?

How to register: scan the QR code below, or click "Read the original" to follow the link.

Agenda

Here, scientist-entrepreneurs will show how academic research turns into innovative momentum, industry pioneers will examine the opportunities and challenges the technological revolution brings, and investment experts will redefine the standards for judging value in the technology era. Every conversation goes straight to the core question of the AI age: amid the waves of change, how do we fix our own coordinates?

That day they will also release the SAIF MBA curriculum upgrade plan, the latest answer to these questions: as traditional business education takes a dimensional hit from AI, they have chosen to rebuild the knowledge system, fusing financial thinking deeply with technology genes.

Time & location

Shanghai Advanced Institute of Finance, Shanghai Jiao Tong ...
NVIDIA crowned the most valuable stock in history! Jensen Huang's net worth hits $138.8 billion, and even newly hired researchers caught the ride
量子位· 2025-07-04 04:40
Xifeng, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

A $3.92 trillion market capitalization, a new all-time global record, set by the AI chip hegemon: NVIDIA.

For perspective: LSEG data show the figure exceeds the combined market capitalization of the Canadian and Mexican stock markets, and also exceeds the total market capitalization of all listed companies in the UK.

The previous record holder was Apple, which set an all-time closing high of $3.915 trillion on December 26 last year.

At the moment NVIDIA set its record, Microsoft ranked second among Wall Street-listed companies at $3.7 trillion, with Apple third at $3.19 trillion.

The surge in NVIDIA's market value shows how hot the AI race has become. NVIDIA's market capitalization first broke $1 trillion only in 2023; over the two boom years since, the number has multiplied at lightning speed.

By Forbes' count, Jensen Huang's personal net worth now stands at $138.8 billion, up $1.8 billion, or 1.31%, placing him 10th on the global rich list.

A $3.92 trillion market cap, with shares at $160.98

According to Reuters, NVIDIA's shares jumped 2.4% in early trading to $160.98 apiece, lifting its total market capitalization to $3.92 trillion and making it "the most valuable company in history."

The gain later narrowed to 1.5%, with shares quoted at $159.60, corresponding to a market capitalization of $3.89 trillion ...
Million-yuan salaries everywhere as Meta pay packages keep leaking! AI talent's market price keeps climbing
量子位· 2025-07-04 04:40
Core Viewpoint
- The article discusses the significant salary increases for positions at Meta, particularly in AI and software engineering, amid a competitive talent-acquisition landscape in the tech industry [1][2].

Salary Disclosure
- A federal filing revealed base salaries for various positions at Meta, including AI research scientists, software engineers, and product managers [2][7].
- One highlighted example is a machine learning engineer with an annual base salary of $350,000 (approximately 2.51 million RMB) and a total package of $20 million (approximately 140 million RMB) over four years [3][9].

Salary Ranges
- Software engineers at Meta earn base salaries from $120,000 (approximately 860,000 RMB) to $480,000 (approximately 3.44 million RMB) [10].
- Machine learning engineers can earn up to $440,000 (approximately 3.15 million RMB) annually [11].
- Other AI-related roles, such as AI research scientists and AI product marketing managers, have salaries ranging from $170,000 to $230,000 [13].

Competitive Landscape
- The current competition for talent has driven substantial salary increases across the tech industry, with Meta among the most aggressive [19][26].
- Other tech giants, such as Google and OpenAI, are also offering competitive salaries, with OpenAI reportedly paying up to $650,000 for technical staff [35][36].

Domestic Competition
- The talent competition is not limited to the U.S.: Chinese companies like Tencent and Huawei are also aggressively recruiting, with reports of salaries reaching one million RMB [38][39].
- Various Chinese tech firms have launched recruitment programs with no upper salary limit, signaling fierce competition for skilled professionals [40].

Conclusion
- The overall trend points to significantly higher salaries for tech professionals, particularly in AI and machine learning, driven by a limited talent pool and high demand from major tech companies [21][37].
Lossless acceleration for vision-language model inference! Easily prune redundant visual tokens | Tencent AI Lab
量子位· 2025-07-04 01:42
Core Insights
- The article discusses the challenges facing large vision-language models (LVLMs): visual token counts have grown exponentially, driving up inference costs and creating performance bottlenecks [1][2][3].
- Tencent AI Lab and CMU propose a solution called VScan, which improves inference efficiency without modifying the model architecture or retraining, achieving up to 2.91x acceleration [2][5][38].

Group 1
- The growth in visual tokens (LLaVA-NeXT processes up to 2,880 tokens; Qwen2.5-VL handles up to 16,384) leads to quadratic growth in computational complexity during inference [2][4].
- VScan has been empirically validated on multiple mainstream LVLMs, including LLaVA-1.5, LLaVA-NeXT, Qwen2.5-VL, and Video-LLaVA, across tasks such as image question answering and video understanding [4][5].
- VScan's two-stage token-filtering mechanism reduces visual token input while maintaining accuracy, making it suitable for a range of resource-constrained environments [5][28].

Group 2
- Existing visual-token-pruning methods fall into text-agnostic and text-aware approaches, but both often lack a comprehensive view of the cross-stage information flow in LVLMs [8][9].
- VScan's design rests on a systematic analysis of visual token contributions across the entire inference process, from visual encoding to language decoding [10][12][19].
- The article argues that effective pruning should weigh the dynamic value of tokens across the whole encoding process rather than relying solely on final-layer attention [15][22].

Group 3
- VScan employs a dual scanning mechanism: a global scan retains semantically critical tokens, while a local scan captures detail from overlooked regions (a minimal sketch of this two-stage idea follows below) [30][26].
- The first pruning phase happens during visual encoding; the second targets text-irrelevant visual information during language decoding, optimizing when pruning happens [27][24].
- Experiments show that VScan significantly reduces visual token counts and inference time while maintaining high accuracy, outperforming existing methods [29][28].

Group 4
- VScan has been tested on LVLMs including LLaVA-1.5 and Qwen2.5-VL across multiple benchmark datasets, remaining robust even at high compression rates [28][34].
- In practice, VScan achieved a 1.37x inference speedup on LLaVA-1.5-7B and a 2.05x speedup on LLaVA-NeXT-7B with minimal performance degradation [36][38].
- The solution is open-sourced on GitHub, allowing the community to validate and extend this efficient pruning paradigm [6][39].
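To make the dual global/local scanning idea concrete, here is a hedged PyTorch sketch of two-stage visual token pruning. The scoring rule, the 50/50 budget split, and the window size are illustrative assumptions; VScan's actual criteria are specified in the paper and its GitHub release.

```python
# Hedged sketch of two-stage visual token pruning in the spirit of the
# global/local scans described above. Scoring and budgets are assumptions.
import torch

def prune_visual_tokens(tokens: torch.Tensor, cls_attn: torch.Tensor,
                        keep: int, window: int = 4) -> torch.Tensor:
    """tokens: (N, D) visual tokens; cls_attn: (N,) attention each token
    receives from a global query. Returns at most `keep` surviving tokens."""
    n_global = keep // 2
    # Stage 1 (global scan): keep the tokens the global query attends to most.
    global_idx = cls_attn.topk(n_global).indices
    # Stage 2 (local scan): in each window of consecutive tokens, keep the
    # locally strongest one, preserving detail the global scan overlooked.
    local_best = [start + int(cls_attn[start:start + window].argmax())
                  for start in range(0, tokens.size(0), window)]
    local_idx = torch.tensor(local_best)[: keep - n_global]
    idx = torch.unique(torch.cat([global_idx, local_idx]))
    return tokens[idx]

# Usage: prune 576 LLaVA-style patch tokens down to at most 64 before the LLM.
toks, attn = torch.randn(576, 1024), torch.rand(576)
print(prune_visual_tokens(toks, attn, keep=64).shape)
```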
Scientist Ilya never wanted to be CEO; Zuckerberg forced his hand
量子位· 2025-07-04 01:42
Cressy, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

Blame Meta's ruthless poaching; blame the pay packages Zuckerberg offers that no one can refuse.

And so Ilya and Altman have stepped into the same river, aired the same grievances, and even served their employees nearly identical reassurances.

After Zuckerberg pried away his co-founder, Ilya has just responded, confirming that Daniel Gross has left SSI. Daniel retweeted Ilya's post, saying he was honored to have helped SSI get off the ground and that SSI's future will be bright. Interestingly, neither of them mentioned Meta at all.

Ilya has been pushed into being the company's CEO. Yes, even after changing AI, and the world, twice over, Ilya has always been a researcher and chief scientist... and this time he has no choice but to serve as CEO of the company he founded.

Compared with Altman's hysteria after Zuckerberg hit a nerve, Ilya and Daniel have at least kept up appearances. In reality, though, a rift may have opened between the two, and not overnight. Reading between the lines, Ilya seems to suggest that even without Zuckerberg's poaching, Gross would not have stayed at SSI much longer. Quite the drama...

Zuckerberg's acquisition fell through, but he pried loose a CEO

Looking more closely at Ilya's response, it boils down to three things: co-founder Daniel Gross left SSI on June 29; Ilya himself will take over as CEO; and another co-founder ...
LeCun's team reveals the essence of LLM semantic compression: extreme statistical compression sacrifices detail
量子位· 2025-07-04 01:42
Core Viewpoint
- The article discusses how semantic-compression strategies differ between large language models (LLMs) and human cognition: LLMs optimize for statistical compression, while humans prioritize detail and context [4][17].

Group 1: Semantic Compression
- Semantic compression lets us organize knowledge efficiently and categorize the world quickly [3].
- The authors propose a new information-theoretic framework for comparing human and LLM compression strategies (a toy sketch of the compression/fidelity tradeoff follows below) [4].
- The study reveals fundamental differences in compression efficiency and semantic fidelity: LLMs lean toward extreme statistical compression [5][17].

Group 2: Research Methodology
- The team built a robust human concept-classification benchmark from classic cognitive-science studies, covering 1,049 items across 34 semantic categories [5][6].
- The dataset includes category membership and human "typicality" ratings, reflecting deep structure in human cognition [6][7].
- More than 30 LLMs were evaluated, with parameter counts from 300 million to 72 billion, ensuring a fair comparison against the human benchmark [8].

Group 3: Findings and Implications
- LLMs' concept classifications align with human semantic categories significantly better than chance, confirming their basic capacity for semantic organization [10][11].
- LLMs nonetheless struggle with fine-grained semantic distinctions, indicating a mismatch between their internal concept structures and humans' intuitive category assignments [14][16].
- LLMs prioritize eliminating redundant information, whereas humans emphasize adaptability and richness, preserving contextual integrity [17].

Group 4: Research Contributors
- The research was a collaboration between Stanford University and New York University, with Chen Shani as lead author [19][20].
- Yann LeCun, a prominent figure in AI and a co-author of the study, has significantly influenced the evolution of AI technologies [24][25][29].
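The tradeoff the study formalizes can be illustrated with a toy computation: any grouping of items into categories buys compression (fewer bits to name a category) at the cost of lost within-category detail. The numpy sketch below is a simplified stand-in for the paper's information-theoretic framework, not its actual objective.

```python
# Toy sketch of the compression/fidelity tradeoff: entropy of the category
# code versus detail lost by collapsing items onto category centroids.
import numpy as np

def cluster_tradeoff(embeddings: np.ndarray, labels: np.ndarray):
    """embeddings: (N, D) item vectors; labels: (N,) category ids.
    Returns (complexity_bits, distortion): entropy of the category code and
    mean squared distance of items to their category centroid."""
    n = len(labels)
    complexity, sq_err = 0.0, 0.0
    for c in np.unique(labels):
        members = embeddings[labels == c]
        p = len(members) / n
        complexity -= p * np.log2(p)                             # bits to name a category
        sq_err += ((members - members.mean(axis=0)) ** 2).sum()  # lost detail
    return complexity, sq_err / n

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
coarse = np.repeat([0, 1], 20)       # 2 broad categories
fine = np.repeat([0, 1, 2, 3], 10)   # 4 finer categories (a refinement)
print(cluster_tradeoff(X, coarse))   # fewer bits, more distortion
print(cluster_tradeoff(X, fine))     # more bits, less distortion
```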
vivo cracks the on-device AI deployment problem: sidestepping MoE architecture limits to run smoothly on Snapdragon 8 Elite | ICCV 2025
量子位· 2025-07-03 09:00
Core Viewpoint
- The article emphasizes the importance of deploying large models on mobile devices, with a focus on preserving pure-language capability while adding multimodal functionality.

Group 1: Challenges in Current MLLM Deployment
- Existing mobile multimodal large language models (MLLMs) face significant challenges, including a drop of more than 10% in pure-language task accuracy once multimodal functions are added [3][4][6].
- Current mobile NPU platforms do not support the Mixture of Experts (MoE) architecture commonly used to preserve language capability during multimodal training [7][8].

Group 2: GenieBlue Contributions and Technical Highlights
- GenieBlue retains the original language capability during multimodal training by freezing the original LLM parameters and adding replicated Transformer layers plus lightweight LoRA modules (a minimal sketch of the frozen-base-plus-LoRA recipe follows below) [3][19].
- With extensive fine-tuning, GenieBlue reaches multimodal capability comparable to mainstream MLLMs while fully preserving pure-language performance [3][19].
- GenieBlue sidesteps the MoE limitation with a non-shared-base inference strategy, running smoothly on devices with Qualcomm's Snapdragon 8 Elite (4th-generation) chip [3][19].

Group 3: Training Data and Model Structure Analysis
- Simply adding pure-text data to maintain language capability has limits: high-quality data is hard to collect, and training time grows [9][12].
- Adding pure-text data has limited impact on multimodal capability, and while it helps on objective NLP tasks, it does not significantly help subjective ones [11][12].

Group 4: GenieBlue Design and Deployment
- GenieBlue's design builds on the CogVLM structure, separating text and multimodal information processing while avoiding the MoE architecture [19][21].
- The deployment strategy freezes the original LLM during training and uses a non-shared-base approach, effectively preserving the original language model's behavior [24][26].
- GenieBlue's multimodal and pure-language accuracy has been validated, showing competitive performance while remaining efficient for mobile NPU deployment [30][31][35].

Group 5: Performance and Efficiency
- GenieBlue's multimodal accuracy is slightly below Qwen2.5-VL-3B but retains approximately 97% of BlueLM-V-3B's performance [31].
- Its pure-language accuracy shows no decline, in contrast to Qwen2.5-VL-3B, which degrades on subjective tasks [33].
- On Snapdragon 8 Elite, loading time and memory use rise slightly, but at 30 tokens per second GenieBlue meets everyday mobile usage needs [35].
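The frozen-base-plus-LoRA recipe attributed to GenieBlue above can be sketched in a few lines of PyTorch. The layer sizes and rank below are illustrative assumptions; the point is that text-only requests bypass the adapter and therefore reproduce the frozen LLM exactly.

```python
# Minimal sketch of the recipe described above: freeze the original LLM
# weights and route multimodal updates through a lightweight LoRA adapter,
# so pure-text behavior is untouched when the adapter is bypassed.
import torch
import torch.nn as nn

class FrozenLinearWithLoRA(nn.Module):
    """One linear layer of the recipe: frozen base weights plus a trainable
    low-rank adapter that only the multimodal path goes through."""
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # original LLM weights stay frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # adapter starts as a no-op

    def forward(self, x: torch.Tensor, multimodal: bool = True) -> torch.Tensor:
        out = self.base(x)
        if multimodal:                        # text-only requests skip the adapter
            out = out + self.lora_b(self.lora_a(x))
        return out

layer = FrozenLinearWithLoRA(nn.Linear(1024, 1024), rank=8)
x = torch.randn(2, 1024)
# The text-only path is identical to the frozen base model:
print(torch.equal(layer(x, multimodal=False), layer.base(x)))  # True
```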