Large Language Models

HKMA and HKUST Sign Memorandum of Cooperation to Drive Cybersecurity Innovation in Hong Kong's Financial Industry
Zhi Tong Cai Jing· 2025-05-29 03:26
Core Viewpoint
- The Hong Kong Monetary Authority (HKMA) and the Hong Kong University of Science and Technology (HKUST) Business School have signed a memorandum of cooperation to enhance collaboration in cybersecurity research, addressing the needs of the Hong Kong financial industry [1][2]

Group 1: Collaboration Details
- The memorandum establishes a strategic cooperation framework focused on cybersecurity, aiming to promote relevant research and knowledge growth [1]
- The collaboration will utilize advanced technologies such as large language models to explore innovative supervisory technology (Suptech) and regulatory technology (Regtech) solutions [1]
- The goal is to enhance the HKMA's regulatory capabilities and strengthen the financial sector's cybersecurity resilience [1]

Group 2: Objectives and Impact
- The partnership aims to develop practical application solutions, increase industry awareness of emerging threats, and cultivate cybersecurity professionals to support the ongoing development of the financial industry [1]
- The HKMA and HKUST will actively engage with financial institutions to validate research outcomes and gain deeper insights into the evolving cybersecurity needs and challenges faced by the industry [1]
- The collaboration is expected to strengthen the resilience of Hong Kong's financial ecosystem by addressing real-world cybersecurity challenges [2]
Joe Tsai: Most Robots Don't Need to Look Human, and Young People Should Pick a Boss Over a Position
Sou Hu Cai Jing· 2025-05-26 03:36
Source: Lieyunwang (猎云网)

The fifth BEYOND International Technology Innovation Expo (BEYOND Expo 2025) was held from May 21 to 24. At the closing ceremony on May 24, Alibaba Group chairman Joe Tsai appeared on stage and noted that Alibaba has made some adjustments to its organizational structure.

Tsai said Alibaba will focus on a few core businesses: first, e-commerce; second, cloud computing; and third, ensuring that artificial intelligence permeates every part of the business, both customer-facing and internal.

Tsai also shared his views on youth employment. He believes young people should work in order to gain more skills and knowledge; that, in his view, is the point of work.

He added that combining robotics with artificial intelligence brings very exciting things to mind. For example, a robot could brew coffee for you, or come to your home and clean the floors. But he also argued that most intelligent robots in the world do not need to look human. If you want a robot to clean your carpet, or come home and tidy your kitchen or living room, do you really want something that looks like a person? "I would be frightened. I just want something that looks like a vacuum cleaner and can intelligently clean the room."

"When we talk about robots, we always think of the movies we watched as children. They all look like people, but they clearly are not. Are we now striving toward machines that are exactly like humans? I think that is really just one kind of technology. There are many more ...
Tencent Hunyuan TurboS Technical Report Fully Released for the First Time: 560B-Parameter Hybrid Mamba Architecture with Adaptive Long-Short CoT Fusion
AI前线· 2025-05-22 19:57
Core Viewpoint
- Tencent's Hunyuan TurboS model ranks 7th globally in the latest Chatbot Arena evaluation, showcasing its advanced capabilities and innovative architecture [1][2]

Group 1: Model Architecture and Innovations
- Hunyuan TurboS employs a hybrid Transformer-Mamba architecture, balancing performance and efficiency by combining Mamba's long-sequence processing with the Transformer's contextual understanding [2][7]
- The model features 128 layers and uses an innovative interleaved module pattern of "AMF" (Attention → Mamba2 → FFN) and "MF" (Mamba2 → FFN) blocks, maintaining high computational efficiency at a total of 560 billion parameters [7][14]
- An adaptive long-short thinking-chain mechanism allows the model to switch dynamically between quick-response and deep-thinking modes based on problem complexity, optimizing resource allocation [2][7]

Group 2: Training and Evaluation
- The model was trained on a dataset of 16 trillion tokens, significantly enhancing its performance compared to previous iterations [10][13]
- Hunyuan TurboS achieved an overall score of 1356 in the LMSYS Chatbot Arena, placing it 7th among the 239 models evaluated [2][49]
- The model performed strongly across benchmarks, excelling in multi-task capability and multilingual support, and ranking first in Chinese, French, and Spanish [4][42]

Group 3: Post-Training Strategies
- The post-training process comprises four key modules: Supervised Fine-Tuning (SFT), Adaptive Long-Short CoT Fusion, Multi-Round Deliberation Learning, and Two-Stage Large-Scale Reinforcement Learning [8][22]
- SFT data was meticulously curated across multiple themes, ensuring high-quality training samples [24][26]
- The adaptive long-short CoT fusion method lets the model choose between long and short reasoning chains based on task complexity, enhancing its reasoning capabilities [26][29]

Group 4: Performance Metrics
- Hunyuan TurboS outperformed many leading models in key areas such as mathematical reasoning, logical reasoning, and knowledge-intensive tasks, particularly in Chinese evaluations [41][42]
- The model generates output cost-effectively, using only 52.8% of the tokens of comparable models while maintaining performance [43][45]
- The architecture and training optimizations yield a 1.8x inference speedup over pure Transformer MoE models [47]
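The interleaved "AMF"/"MF" module pattern described above can be sketched as a simple layer-schedule generator. This is an illustrative reconstruction of the block alternation only; the report's actual block ratios, ordering, and layer accounting may differ:

```python
def build_block_schedule(total_sublayers: int) -> list[list[str]]:
    """Alternate 'AMF' (Attention -> Mamba2 -> FFN) and 'MF'
    (Mamba2 -> FFN) blocks until the sub-layer budget is used up."""
    amf = ["Attention", "Mamba2", "FFN"]
    mf = ["Mamba2", "FFN"]
    schedule, used, use_amf = [], 0, True
    while True:
        block = amf if use_amf else mf
        if used + len(block) > total_sublayers:
            break  # next block would overflow the budget
        schedule.append(block)
        used += len(block)
        use_amf = not use_amf
    return schedule

# With a 128-sub-layer budget, cheap Mamba2-only blocks are mixed
# with periodic full-attention blocks:
sched = build_block_schedule(128)
print(len(sched), sum(len(b) for b in sched))
```

The design intuition the report points at: Mamba2 blocks handle long-sequence state cheaply, while the periodic attention blocks preserve precise contextual recall, so only a fraction of sub-layers pay the quadratic attention cost.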
New Work from Kaiming He et al. Embraces Simplicity: Replacing Instantaneous Velocity with Average Velocity Improves One-Step Generation by 70%
量子位· 2025-05-21 06:31
Core Viewpoint
- The article introduces a new model called MeanFlow, which uses average velocity to build a one-step generation framework, significantly improving the state of the art (SOTA) in image generation tasks [1][5][10]

Group 1: Model Development
- MeanFlow is trained from scratch without any pre-training, distillation, or curriculum learning, achieving a Fréchet Inception Distance (FID) score of 3.43, a notable improvement over previous one-step diffusion/flow models [3][10][13]
- The model introduces the concept of average velocity to represent flow fields, in contrast to the instantaneous velocity used in flow-matching methods [5][9]

Group 2: Experimental Results
- Experiments on ImageNet at 256×256 resolution showed that MeanFlow achieves a 50% to 70% relative improvement in FID over previous state-of-the-art one-step methods [13][19]
- An ablation study evaluated various configurations and their corresponding FID scores, with the best results achieved under specific parameter settings [15][19]

Group 3: Scalability and Comparison
- MeanFlow scales well with model size, with different configurations yielding competitive FID scores compared to other generative models [16][19]
- A comparison with other generative models indicates that MeanFlow significantly narrows the gap between one-step diffusion/flow models and their multi-step predecessors [19][20]

Group 4: Research Team and Background
- The research was conducted by a team from MIT and CMU, including PhD student Zhengyang Geng and other students of Kaiming He [21][22][23]
- The team aims to bridge generative modeling and physical simulation, addressing multi-scale simulation problems [20]
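The average-velocity idea can be made concrete. The following is a sketch consistent with the flow-matching setup described above; the notation is ours and the MeanFlow paper's exact conventions may differ:

```latex
% Average velocity over [r, t], vs. the instantaneous velocity v:
u(z_t, r, t) = \frac{1}{t - r} \int_r^t v(z_\tau, \tau)\, d\tau

% Differentiating (t - r)\,u(z_t, r, t) with respect to t gives an
% identity relating u to v, usable as a training target:
u(z_t, r, t) = v(z_t, t) - (t - r)\,\frac{d}{dt} u(z_t, r, t)

% One-step generation: traverse the whole flow in a single jump,
% e.g. with (t, r) = (1, 0):
z_r = z_t - (t - r)\, u(z_t, r, t)
```

Because $u$ already integrates the trajectory over the interval, a single evaluation replaces the many small instantaneous-velocity steps of a standard flow/diffusion sampler, which is what enables one-step generation.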
A Worrying Outlook: Apple (AAPL.US) Reportedly Suffering Repeated Setbacks in AI
Zhi Tong Cai Jing· 2025-05-18 23:53
Core Insights
- Apple's ongoing struggles in the AI sector may jeopardize its dominance in the smartphone market and threaten its broader ambitions in robotics and next-generation hardware [1]
- Despite initial optimism after the 2018 hiring of John Giannandrea to lead AI strategy, Apple has failed to keep pace with competitors in generative AI and large language models [1][3]

Group 1: AI Strategy and Developments
- In 2024, Apple announced "Apple Intelligence," promising smarter writing tools, summarization features, and an upgraded Siri, but the rollout has faced delays and internal testing issues [2]
- Apple's slow progress in AI is attributed to a reluctance to make large-scale investments, internal cultural resistance, and strict data-privacy policies that limit AI model training [3]
- Apple is restructuring, with leadership of Siri and related product development shifting from John Giannandrea to Mike Rockwell, head of the Vision Pro headset project [3]

Group 2: Future Outlook and Challenges
- Engineers are working on a complete overhaul of Siri's architecture to create a new system based on large language models, with internal testing of a proprietary chatbot aimed at matching ChatGPT's capabilities [4]
- Apple plans to separate Siri from the broader "Apple Intelligence" brand to repair its damaged reputation, while taking a conservative approach at WWDC 2025 and focusing on incremental improvements rather than groundbreaking features [4]
- Despite significant challenges, insiders believe Apple can still catch up thanks to its hardware-integration advantages, large global user base, and brand influence, though many acknowledge that Apple can no longer afford to be a "latecomer" in the AI field [4]
[China Matters] Russian Expert: China-Russia AI Cooperation Leaps Over the "Small Yard, High Fence" to Build a Fair New Global Order in Science and Technology
Huan Qiu Wang Zi Xun· 2025-05-10 05:18
Kolonin also noted that the rapid development of artificial intelligence has raised concerns about the misuse of AI and of artificial general intelligence (AGI). Some countries exploit their dominant position in AI to coerce others and to obstruct their cooperation with countries viewed as threats. Given this, countries that hope to build a fair world order need to deepen cooperation with one another, for example within the BRICS framework, upholding the principle of mutual benefit and jointly improving the global technology governance system.

Kolonin stressed that the Russian scientific community is open to working with China and other like-minded countries to promote the coordinated development and effective governance of AI and AGI worldwide. Other countries are welcome to take part in open events such as the Russian AGI community seminars and joint conferences on topics such as mathematical AI, and all parties are expected to gradually refine a joint strategy for managing AI technology.

According to related reports, Timofei Bordachev, program director of the Valdai Discussion Club, an international debate club composed of leading foreign experts, likewise pointed out that AI is a frontier technology field in which both China and Russia possess the requisite technology and talent. Through cooperation in this field, the two countries can set an example of scientific and technological collaboration and contribute to the emancipation of Global South countries in science, culture, and education. This would not only open entirely new areas of cooperation for the two countries but also give real impetus to South-South cooperation, which is crucial for building a more balanced and just world order. ...
Beyond Copper Cable and Optical Fiber: A Third Option
半导体行业观察· 2025-05-08 01:49
Core Viewpoint
- The article discusses the limitations of copper and fiber interconnects in next-generation data centers and introduces a third option, e-Tube, intended to support the growing bandwidth demands of AI workloads [1][10][16]

Group 1: Challenges in Data Center Expansion
- Data center AI accelerator clusters face growing complexity as new technologies emerge, particularly generative AI and large language models (LLMs), which push data bandwidth beyond traditional interconnects: rates are rapidly doubling to 800G and will soon reach 1.6T [1]
- The need for improved performance, cost control, and energy efficiency presents significant challenges for network operators [4]

Group 2: Limitations of Current Technologies
- Data centers currently rely on 400G and 800G network equipment, using copper cables for short distances and optical fiber for long distances, but both technologies are approaching their limits at terabit interconnect speeds [3][6]
- Copper cables, while cost-effective and reliable over short distances, suffer channel loss from the skin effect, limiting their transmission range and scalability in high-density data centers [3][6]

Group 3: Transition to Optical Interconnects
- Large-scale operators are shifting toward optical interconnects such as Active Optical Cables (AOC), which can span several kilometers but bring greater complexity, power consumption, and cost, up to five times that of copper cables [8]
- Optical links are less reliable because performance varies with temperature and optical components eventually fail, and they can also introduce significant latency [8]

Group 4: Introduction of e-Tube Technology
- The e-Tube platform offers a scalable multi-terabit interconnect that transmits radio-frequency data through plastic waveguides, overcoming the limitations of both copper and fiber [10][12]
- e-Tube cables, made from low-density polyethylene (LDPE), transmit data efficiently without the high-frequency losses of copper, supporting data rates from 56G to 224G and beyond [12]

Group 5: Advantages of e-Tube
- e-Tube technology delivers a tenfold increase in cable reach, a fivefold reduction in weight, a twofold reduction in thickness, a threefold reduction in power consumption, and a thousandfold reduction in latency, while cutting costs by a factor of three [14]
- The technology is positioned as an ideal alternative to copper as data centers move to 1.6T and 3.2T speeds, offering strong power efficiency and compatibility with existing network infrastructure [14][16]
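The skin-effect loss behind copper's reach limit can be quantified: at high frequency, current crowds into a thin surface layer whose depth shrinks as the square root of frequency, so AC resistance (and channel loss) climbs with data rate. A minimal sketch using the standard skin-depth formula; the sample frequencies are illustrative, not taken from the article:

```python
import math

def skin_depth(freq_hz: float,
               resistivity: float = 1.68e-8,    # copper, ohm*m
               mu: float = 4 * math.pi * 1e-7   # permeability of free space
               ) -> float:
    """Skin depth delta = sqrt(rho / (pi * f * mu)), in meters."""
    return math.sqrt(resistivity / (math.pi * freq_hz * mu))

for f in (1e9, 10e9, 56e9):
    print(f"{f/1e9:5.0f} GHz -> skin depth {skin_depth(f)*1e6:.2f} um")
```

At 1 GHz the skin depth of copper is only about 2 µm, and it shrinks as 1/√f from there, which is why copper's usable reach collapses as lane rates climb toward the 224G signaling the article mentions.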
Uber (UBER)
2025-05-07 15:20
Summary of Uber's Q1 2025 Earnings Call

Company Overview
- **Company**: Uber Technologies, Inc. (UBER.US)
- **Date**: May 7, 2025

Key Points

Financial Performance
- Uber reported a strong Q1 2025, with total bookings and trip volume both up; adjusted EBITDA reached $1.9 billion, a 35% year-over-year increase, and free cash flow hit a record $2.3 billion [1][2]
- Monthly active users grew 14% to 170 million, trip volume rose 18%, and global retention reached an all-time high [2]

Autonomous Vehicle Initiatives
- Uber partnered with Waymo to deploy approximately 100 autonomous vehicles in Austin, achieving high utilization rates and positive consumer feedback, with average utilization exceeding 99% of that of human drivers [3][4]
- Plans are underway to expand the autonomous fleet in Austin and in other regions such as Atlanta [4]

Pricing Strategy and Market Dynamics
- Uber observed that price elasticity remains in line with past trends: a $1 price increase reduces transaction volume, but consumers are adapting to stable pricing [5]
- Competition in the U.S. ride-hailing market is intense, with rivals such as Bolt and DK&D in international markets, yet Uber maintains a leading position [6]

Growth Outlook
- Uber anticipates stronger revenue and profitability growth in Q2 2025, laying a solid foundation for the peak season in the second half of the year [7]
- The company is focused on providing high-quality services and has set clear strategies and ambitious goals for future growth [7]

Delivery Business Performance
- The gross margin of Uber's delivery business expanded to 3.7%, up 70 basis points year-over-year, driven by advertising revenue and economies of scale [3][10]
- The delivery business posted strong profitability, with a 9% contribution margin in Q1, indicating robust growth potential in grocery and retail [10]

Insurance Costs and Innovations
- Uber expects moderate increases in insurance costs in 2025 but aims to ease cost pressure through innovation and policy adjustments [3][11]
- The company is implementing driver-behavior scoring to improve safety and reduce insurance costs, with positive feedback so far [11]

Macroeconomic Environment
- The macroeconomic environment has not materially changed audience growth, and the frequency of service usage remains stable [12][13]
- Uber's diverse service categories, spanning dining and transportation, are less exposed to macroeconomic uncertainty [13]

International Market Developments
- In Europe, Uber has built a leading position in the UK food-delivery market through organic growth, with France and Germany identified as key markets for future expansion [16]

Emerging Market Opportunities
- Sparse mobility markets present growth opportunities: 20% of trips now come from these areas, which are growing faster than urban core markets [18][19]
- Uber plans to launch in hundreds of new cities by 2025, focusing on achieving sustainable profitability in these markets [18]

Future of Autonomous Driving
- The autonomous driving sector is evolving, with companies like Waymo leading the way, and Uber is collaborating with multiple partners to develop and deploy autonomous technologies in Europe [11][15]

Conclusion
- Uber's strategic focus on service quality, expansion of autonomous vehicle initiatives, and management of competitive pressure positions the company for continued growth and profitability in the evolving mobility landscape [7][19]
ICML 2025 | Massive Values in the Attention Mechanism: The Key to Unlocking Contextual Understanding in Large Language Models
机器之心· 2025-05-06 04:11
Core Insights
- The article discusses a significant phenomenon in large language models (LLMs): massive values concentrate in the self-attention mechanism, specifically in the query (Q) and key (K) representations, and this concentration is crucial for contextual knowledge understanding [1][3][4]

Research Highlights
- The study reveals that massive values are highly concentrated in Q and K, contrary to the expectation that each attention head operates independently; this consistency across layers and heads is demonstrated visually [3][4]
- The phenomenon appears specifically in models using Rotary Position Embedding (RoPE), such as LLaMA, Qwen, and Gemma, while models without RoPE, such as GPT-2 and OPT, do not exhibit the pattern [4]
- The research establishes a direct link between the presence of massive values in Q and K and the ability to understand contextual knowledge [4]

Key Findings
1. **Concentration of Massive Values**: Massive values are concentrated in specific regions of each attention head, indicating a surprising degree of consistency [3][4]
2. **Impact on Contextual Knowledge Understanding**: The presence of massive values is critical for understanding contextual knowledge, as demonstrated through destructive experiments that reset these values to their average [5][6]
3. **Quantization Techniques**: Quantization methods that handle massive values explicitly, such as AWQ and SmoothQuant, preserve contextual knowledge understanding better than methods that do not [7]
4. **Origin of the Concentration Phenomenon**: The concentration of massive values is attributed to RoPE, which affects the low-frequency dimensions of Q and K, so the phenomenon appears from the model's early layers onward [8]

Experimental Results
- The experiments reveal a stark contrast in how massive values affect different knowledge tasks:
  - **Resilience in Parametric Knowledge Retrieval**: Tasks relying on parametric knowledge lose only 15-20% accuracy when massive values are disrupted, retaining 76%-88% accuracy [10]
  - **Catastrophic Decline in Contextual Knowledge Tasks**: Tasks requiring contextual understanding collapse, with accuracy on key retrieval tasks plummeting from 100% to near 0% when massive values are disrupted [11]
  - **Control Experiments**: When only non-massive values are disrupted, task performance remains stable, confirming the unique importance of massive values for contextual understanding [12]

Future Directions
- The research opens several avenues for further exploration: enhancing or adjusting the distribution of massive values to improve contextual understanding, examining the universality of the phenomenon across architectures, and designing targeted quantization methods that protect the massive values tied to contextual understanding [16]
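The destructive intervention described above (resetting the largest-magnitude entries to their average) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the tensor layout and the choice of k are assumptions:

```python
import numpy as np

def reset_massive_values(q: np.ndarray, k: int = 5) -> np.ndarray:
    """For each attention head, replace the k largest-magnitude entries
    of the query (or key) tensor with that head's mean value, mimicking
    the paper's destructive experiment. q has shape (heads, seq, dim)."""
    q = q.copy()
    for h in range(q.shape[0]):
        head = q[h]
        flat = head.ravel()  # contiguous slice, so this is a view
        top_idx = np.argpartition(np.abs(flat), -k)[-k:]
        flat[top_idx] = head.mean()  # reset massive values to the average
    return q

# Plant one "massive value" and confirm the intervention flattens it:
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4, 8))
q[0, 0, 0] = 100.0
out = reset_massive_values(q, k=1)
print(abs(out[0, 0, 0]) < 10.0)  # the planted spike is gone
```

In the paper's setup, feeding the perturbed Q/K back through the model is what separates the two task families: parametric-knowledge tasks degrade mildly, while contextual tasks such as key retrieval collapse.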
Over the Past Four Weeks AI Inference Has Exploded, GPUs Are Burning, and Nvidia Remains in Short Supply
硬AI· 2025-04-29 00:18
According to a report released on the 25th by Morgan Stanley's Joseph Moore team, the main driver of this strong demand is growth in token generation, which has risen more than fivefold since the start of the year. This has put enormous pressure on the ecosystem and triggered a surge of investment in processing these workloads.

Morgan Stanley notes that Nvidia faces a GPU shortage driven by large language models' enormous demand for inference chips. Weighing the negatives of continued supply constraints and gross-margin pressure, Morgan Stanley slightly lowered its Nvidia price target to $160; over the long term, the company's growth trajectory remains strong.

Author | Zhang Yaqi, Editor | 硬AI

Over the past four weeks, investor sentiment has deteriorated on macroeconomic and supply-chain risks, yet demand for Nvidia GPUs has soared on the back of major large language models' (LLMs') enormous appetite for inference chips, and that demand spans all regions.

Several AI companies report explosive user growth. Data from API companies such as Open Router show that many firms, scrambling to meet the massive demand from inference software, are competing for GPU resources, to the point where only one "last GB200" reportedly remained for 2025.

Morgan Stanley views this demand for inference as the key. It is driven by the part of the market that actually uses models and generates revenue, proving that the scaling of inference models is real, in contrast to demand that relies solely on venture capital ...