Huawei Pangu Large Model
Digital-Intelligence Empowerment: Transformation Breakthrough and Future-Building in the Construction and Real Estate Industry
机器之心· 2025-09-24 07:48
Released by Ji Qi Zhi Xin. Companies that keenly captured this trend have already launched their transformations, and Huawei, a customer-centric high-tech giant, has become a key partner in the industry's transformation thanks to its deep understanding of what makes a "good product" and its own digitalization practice. The China Academy of Building Research has joined hands with Huawei, progressively deepening industry-level vertical product cooperation from top-level planning and a big-data platform to "one network, one cloud" construction; Lianfa Group has drawn on Huawei's digital capabilities to create its "New Youth Good Housing" series, using a "1+2+3*N" digital-intelligence blueprint to deliver a model innovation of "attractive total price, high quality, strong operations, and smart features."

Digital intelligence drives leaps in both efficiency and quality across the whole chain. Optimizing processes and organization is only the starting point; the core value of new quality productive forces lies in using digital-intelligence technology to achieve an efficiency revolution and quality upgrade across the full "investment-financing-construction-management-operation" chain, and Huawei is accelerating this process with its technology and ecosystem. The China Index Academy predicts: "AI empowerment will upgrade from tool assistance to intelligent decision-making across the entire industry chain, and the future focus of competition will shift to space and asset operation capabilities."

On the design side, large-model technology is reshaping the logic of creativity and review: the deep integration of Huawei Ascend computing power with the industry knowledge of Gouli Technology (构力科技) has produced a "knowledge-driven drawing-review agent" that not only raises review efficiency but also builds a design-quality flywheel running from "problem discovery" to "knowledge empowerment."

As a foundational industry of human civilization, construction and real estate is both a core pillar of the global economy and has shown strong resilience through the tides of the era: the restructuring of global supply chains is spawning population ...
Shenzhen: Pathfinder | Caijing Cover Story
Cai Jing Wang· 2025-08-18 12:08
Economic Performance
- Shenzhen's GDP reached 1.832226 trillion yuan in the first half of the year, marking 5.1% year-on-year growth despite challenges such as US-China trade tensions and domestic economic pressures [1][2]
- Since the establishment of the Shenzhen Special Economic Zone 45 years ago, GDP has grown from 270 million yuan to nearly 4 trillion yuan, an increase of over 13,000 times [6]

Reform and Innovation
- The release of the "Opinions" by the Central Committee and the State Council aims to deepen reform and expand openness in Shenzhen, focusing on the integration of education, technology, and talent [2][3]
- Shenzhen is encouraged to implement new reform measures and innovative experiments to enhance its role as a key engine of the Guangdong-Hong Kong-Macao Greater Bay Area [2][3]

Infrastructure and Connectivity
- The interconnection of the Shenzhen and Dongguan metro systems reflects rapid urban integration and infrastructure development in the region [4]
- Shenzhen's proactive planning of modern infrastructure has positioned it as a crucial gateway for trade and economic activity in China [9]

Industry Development
- Shenzhen has established a complete industrial chain in the new energy vehicle sector, with over 30% of national enterprises in this field having a presence in the city [16]
- The city is also a hub for the robotics industry, with significant growth in both industrial and service robots and a robust ecosystem of innovation and production [24][25]

Talent and Investment
- Shenzhen's total talent pool has surpassed 7 million, including over 400,000 skilled workers and more than 22,000 returnees from studying abroad, fueling its innovation-driven economy [22]
- The city has seen a substantial increase in venture capital, with over 97 billion yuan invested in more than 20,000 projects [29]

Challenges and Future Outlook
- Competition is intensifying, raising concerns about maintaining Shenzhen's unique advantages against other rising cities [20][28]
- The city must balance its historical successes with the need for continuous innovation and adaptation to global market changes [33][34]
Guotai Haitong | Industry: Huawei's Pangu Large Model and the Ascend AI Computing Platform Jointly Build an Integrated Software-Hardware AI Technology System
Core Viewpoint
- Huawei is charting a path to full-stack AI competitiveness through software-hardware collaborative innovation, transitioning from merely catching up with industry SOTA models to customizing model architectures to better leverage its self-developed Ascend hardware [1][2]

Group 1: AI Development Strategy
- Huawei's AI development strategy has shifted toward a dual evolution path that addresses systemic issues in the large-scale application of AI models, focusing on a technology system composed of hardware-software collaborative architecture, operators, and software stacks [1]
- The evolution of the Pangu large model aims to solve efficiency challenges in large-scale distributed systems, particularly the systemic bottleneck of expert load imbalance in the transition from dense architectures to mixture-of-experts (MoE) sparse architectures [1][2]

Group 2: Innovative Paths for Large Models
- Huawei has launched two innovative paths at the large-model level: Pangu Pro MoE, which introduces a Mixture of Grouped Experts (MoGE) architecture to tackle load imbalance, and Pangu Ultra MoE, which optimizes the model architecture through system-level enhancements to better fit Ascend hardware [2]
- The physical foundation for this software-hardware collaborative innovation is the new-generation AI infrastructure CloudMatrix, whose unified bus (UB) network reduces performance discrepancies in cross-node communication [2]

Group 3: Hardware and Software Synergy
- CloudMatrix not only provides a physical basis for software innovations like the Prefill-Decode-Caching (PDC) decoupled architecture but also enables high parallelism and low latency through large-scale expert parallelism (LEP) and operator-level optimizations such as AIV-Direct [2]
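The load-balancing idea attributed to MoGE above can be illustrated with a toy sketch. This is not Huawei's implementation; it is a minimal pure-Python illustration, with hypothetical function and argument names, of group-constrained top-k routing: experts are partitioned into equal groups (say, one group per device), and each token activates a fixed number of experts within every group, so no single device can be overloaded.

```python
def moge_route(logits, n_groups, k_per_group):
    """Group-constrained top-k routing (illustrative names, not
    Huawei's code): experts are split into equal groups, and each
    token activates exactly k_per_group experts inside every group,
    so all groups carry an identical activation load."""
    group_size = len(logits) // n_groups
    chosen = []
    for g in range(n_groups):
        start = g * group_size
        block = list(range(start, start + group_size))
        # local top-k by router score, restricted to this group only
        block.sort(key=lambda i: logits[i], reverse=True)
        chosen.extend(block[:k_per_group])
    return sorted(chosen)

# One token, 8 experts, 4 groups of 2, 1 activated expert per group:
token_logits = [0.9, 0.1, 0.3, 0.8, 0.2, 0.7, 0.4, 0.6]
experts = moge_route(token_logits, n_groups=4, k_per_group=1)
print(experts)  # every group contributes exactly one expert
```

With plain global top-k routing, all the activated experts could land on one device; the group constraint trades a little routing freedom for a guaranteed-even load per group.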
The Large-Model "Shell" Controversy: Where Is the Boundary Between Original Research and Borrowed Strength?
Sou Hu Cai Jing· 2025-07-17 01:39
Core Viewpoint
- The debate over "original research" versus "shell models" in the AI field has intensified, focusing in particular on the similarities between Huawei's Pangu model and Alibaba Cloud's Qwen model [1][2]

Group 1: Development and Trends in AI Models
- The rise of large models traces back to the Transformer architecture released by Google Brain in 2017, with three main types dominating the field: Decoder-only (like GPT), Encoder-Decoder (like T5), and Encoder-only (like BERT) [2]
- The launch of ChatGPT in November 2022, based on GPT-3.5, attracted millions of users, bringing large language models (LLMs) into public awareness and prompting many companies to enter the market [2]
- The open-source era that began in 2023 has led more teams to train models on open-source frameworks, facilitating technological exchange and iteration [1][4]

Group 2: Shell Model Controversies
- Early shell models often involved simple API wrapping without any secondary development, but regulatory scrutiny has increased, leading to penalties for such practices [3]
- Despite regulatory action, shell models continue to emerge, with some criticized for "GPT-like" responses that raise questions about their originality [3][4]
- "Data distillation," in which a strong "teacher model" generates high-quality data for training a "student model," has drawn attention, especially after ByteDance was reported to have used OpenAI's API for data generation [4]

Group 3: Open Source and Compliance Issues
- The open-source movement has sparked debate over whether secondary development on open-source model architectures constitutes shelling, with divergent views on compliance and ethical boundaries [4][8]
- A notable incident involved the Yi-34B model, which prompted discussions about compliance with the LLaMA open-source license, highlighting how hard it is to draw the line between shell models and original research [5][7]
- The lower development barriers of the open-source era have produced both genuine advances and questionable shelling, fueling ongoing discussion of the moral and legal implications of such practices [8][9]

Group 4: Industry Perspectives
- Major companies may lack foundational training expertise in model development, leading them to leverage open-source technologies for quicker progress [9]
- The AI industry broadly accepts the use of open-source technology but insists on clear documentation and on not misrepresenting such work as original research [9]
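The data-distillation workflow mentioned above can be sketched as a toy pipeline. The `teacher_answer` function below is a hypothetical stand-in for a call to a strong model's API (no real API is used); the sketch only shows the shape of the workflow: collect teacher responses into the JSONL format typically used to fine-tune a student model.

```python
import json

def teacher_answer(prompt):
    """Stand-in for a strong 'teacher' model. A real pipeline would
    call a model API here; this toy just returns canned answers."""
    canned = {
        "What is 2 + 2?": "2 + 2 = 4.",
        "Capital of France?": "The capital of France is Paris.",
    }
    return canned.get(prompt, "I don't know.")

def build_distillation_set(prompts):
    """Collect (prompt, response) pairs in a JSONL layout commonly
    used for supervised fine-tuning of a 'student' model."""
    lines = []
    for p in prompts:
        record = {"prompt": p, "response": teacher_answer(p)}
        lines.append(json.dumps(record))
    return "\n".join(lines)

dataset = build_distillation_set(["What is 2 + 2?", "Capital of France?"])
print(dataset)
```

The compliance question raised in the text is exactly about this step: whether the provider's terms of use permit harvesting teacher outputs as training data for a competing student model.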
A History of Large-Model "Shelling"
Hu Xiu· 2025-07-14 09:26
Core Viewpoint
- The article reviews the AI industry's ongoing debate over "original research" versus "shelling," in the context of the rise of large language models (LLMs) and the practices surrounding their development and deployment [1][2]

Group 1: Historical Context of Model Development
- The current wave of AI traces back to the 2017 release of the Transformer architecture by Google Brain, which remains the foundation of today's large models [3]
- The introduction of ChatGPT in November 2022 was a watershed moment, triggering a surge of new models, many of which resorted to "shelling" to monetize access to ChatGPT's capabilities [4][5]

Group 2: Shelling Practices and Controversies
- By the end of 2022, numerous imitation ChatGPT platforms had emerged, with developers simply repackaging APIs for profit, drawing regulatory scrutiny [6][7]
- In May 2023, concerns arose over the iFlytek Spark model, which allegedly claimed in its outputs to have been developed by OpenAI, highlighting the "identity confusion" that training-data contamination can cause [8][9]

Group 3: Data Distillation and Model Training
- Data distillation, in which a powerful "teacher model" generates high-quality data for a "student model" to learn from, has become common practice in the industry [9][10]
- The controversy over ByteDance's use of OpenAI's API for data generation raised questions about compliance with usage terms, illustrating the blurred line between legitimate use and shelling [10]

Group 4: The Open Source Era
- The shift to open source began in 2023, with many companies releasing their models to foster innovation and collaboration in the developer community [13][16]
- Open-source models have fueled debate over the legitimacy of building new models on existing architectures, as in the cases of Baichuan-7B and Yi-34B [13][14]

Group 5: Industry Dynamics and Future Outlook
- The AI industry is in the midst of a "hundred-model war," with roughly 90% of models built on open-source frameworks, letting smaller teams innovate without starting from scratch [16][17]
- Lightweight fine-tuning methods have lowered the barriers to model development, enabling more companies to improve operational efficiency [17][18]
- Ongoing debate over the ethical boundary between shelling and original research highlights the complexity of intellectual property and innovation in a rapidly evolving AI landscape [22][23]
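The best-known of the lightweight fine-tuning methods mentioned above is LoRA, which freezes a weight matrix W and trains only a low-rank update W' = W + B·A. A minimal sketch (the dimensions below are illustrative and not tied to any model named in the article) shows why this lowers the barrier: the trainable parameter count collapses.

```python
def lora_param_counts(d_in, d_out, rank):
    """Parameter counts for full fine-tuning of one weight matrix
    versus a LoRA-style low-rank update W' = W + B @ A, where A is
    (rank, d_in) and B is (d_out, rank). Only A and B are trained;
    the original W stays frozen."""
    full = d_in * d_out
    lora = rank * d_in + d_out * rank
    return full, lora

# A 4096x4096 attention projection with a rank-8 adapter:
full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=8)
print(full, lora, full // lora)  # the low-rank update is 256x smaller
```

For a whole model the ratio is similar: a small team can fine-tune only the adapters on modest hardware, which is exactly why the barrier to producing a derivative model has fallen so far.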
Pangu and Tongyi Qianwen: Who Copied Whom?
Core Viewpoint
- The controversy around Huawei's Pangu 3.5 and Alibaba's Tongyi Qianwen 1.5-7B centers on a correlation score of 0.927 produced by the "LLM-Fingerprint" technique, suggesting possible similarity or derivation between the two models [1][14][16]

Group 1: Technical Analysis
- The "LLM-Fingerprint" technique analyzes model responses to specific trigger words, generating a unique identity for each large model [12][11]
- A report found the 0.927 correlation between Pangu 3.5 and Tongyi Qianwen 1.5-7B to be far higher than the scores between other mainstream model pairs, which are generally below 0.1 [14][15]
- Outside observers deemed Huawei's defense against the allegations unscientific, noting that similarly high correlations can also be found among different versions of the Tongyi Qianwen models [19][20]

Group 2: Open Source Culture and Ethics
- The debate highlights the tension between "reuse" and "plagiarism" in the AI open-source ecosystem, raising questions about the ethics of model development [22][21]
- The high cost of developing large models, estimated at $12 million for effective training, makes building on existing open-source models common practice [25][26]
- The line between "reuse" and "plagiarism" remains ambiguous, particularly regarding model parameters and adherence to open-source licenses [28][29]

Group 3: Competitive Landscape
- The incident reflects the intense competition between Huawei and Alibaba in the Chinese AI market, with Alibaba's Tongyi series currently serving 90,000 enterprises [37][42]
- The Pangu model is crucial to Huawei's strategy of building a comprehensive AI ecosystem, while Alibaba has leveraged its cloud infrastructure and open-source ecosystem for competitive advantage [32][36]
- The silence of Alibaba's Tongyi Qianwen team amid the controversy suggests a strategic decision to avoid escalating it into a public dispute [40][47]

Group 4: Industry Implications
- The controversy serves as a "stress test" for the AI open-source ecosystem, exposing its vulnerabilities and the lag in governance [52]
- The industry is urged to establish clearer rules on model citation and derivation standards, akin to plagiarism-detection systems in academia [53]
- There are calls for greater transparency in model development, including wider adoption of "Model Cards" and data disclosure [54]
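The fingerprinting methodology is not fully specified in the summary above; one published analysis compared per-layer statistics of attention parameters between models via correlation. A minimal sketch under that assumption follows; all the numbers are made up for illustration, and a real fingerprint would be extracted from checkpoint tensors rather than typed in.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-layer standard deviations of attention weights for
# three models (fabricated for illustration; one value per layer):
model_a = [0.021, 0.018, 0.025, 0.030, 0.027, 0.022]
model_b = [0.020, 0.019, 0.024, 0.031, 0.026, 0.023]  # tracks model_a
model_c = [0.035, 0.012, 0.040, 0.015, 0.033, 0.011]  # unrelated profile

print(round(pearson(model_a, model_b), 3))  # near 1: suspiciously similar
print(round(pearson(model_a, model_c), 3))  # low: dissimilar profile
```

The critique quoted in the summary applies directly to such a sketch: a high correlation alone does not establish derivation, since closely related checkpoints of the same model family can also correlate strongly.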
[Industrial Internet Weekly] Huawei's Pangu model accused of plagiarism; the AI talent war intensifies as DeepSeek recruits aggressively overseas; Microsoft reportedly ties "AI usage" directly to employee performance reviews; ...
Tai Mei Ti APP· 2025-07-08 03:37
Group 1
- Huawei's Pangu team announced the open-source release of the Pangu 7B dense and 72B mixture-of-experts models, but faced plagiarism allegations relative to Alibaba's Tongyi Qwen-2.5 14B model, with a high similarity score of 0.927 in attention-parameter distributions [2][3]
- Huawei's Noah's Ark Lab responded that the Pangu Pro MoE model was developed and trained on its Ascend hardware platform and not based on other vendors' models [2]
- An article published on GitHub by a self-identified member of Huawei's Pangu team claimed that the team fabricated technological breakthroughs and used competitors' models for training [3]

Group 2
- Tencent responded to user complaints about WeChat's new "AI search" feature, clarifying that it integrates public information without using private user data [4][5]
- Baidu announced its largest search overhaul in a decade, allowing search queries of over 1,000 characters and integrating AI writing and image-generation capabilities [6]

Group 3
- The 2025 Global Digital Economy Conference released a list of the top 100 talents in the AI field, with significant representation of Chinese researchers [7]
- DeepSeek is reportedly ramping up overseas recruitment, aiming to attract talent for positions focused on artificial general intelligence (AGI) [9]

Group 4
- ByteDance has produced over 1,000 robots in two and a half years, with the long-term goal of achieving embodied intelligence [10]
- Zhipu AI released and open-sourced the GLM-4.1V-Thinking series, a multimodal model with 9 billion parameters that performs strongly across benchmark tests [10]

Group 5
- Yonyou Network Technology submitted an H-share listing application to the Hong Kong Stock Exchange, a significant step in its internationalization strategy [14]
- Wisdom Eye was included in KPMG's inaugural "China Health Technology Top 50" list for its innovative healthcare AI applications [14]

Group 6
- Baidu officially open-sourced the Wenxin large model 4.5 series, which spans a range of parameter configurations [15]
- DingTalk launched over 100 free templates for the e-commerce industry, integrating AI functionality for various business needs [16]

Group 7
- Siemens and other EDA companies confirmed the lifting of U.S. export restrictions on chip-design software for China, restoring access to their technologies [17][18]
- Trump announced new tariffs set to take effect on August 1, with rates potentially reaching 70% [19]

Group 8
- Microsoft is set to lay off nearly 9,000 employees as part of a restructuring plan aimed at streamlining processes and reducing management layers [20]
- Elon Musk's xAI completed a $10 billion funding round to further develop its AI solutions and data centers [20]

Group 9
- Google announced the global availability of its latest AI video-generation model, Veo3, which significantly enhances video production capabilities [21]
- CoreWeave became the first AI cloud provider to deploy NVIDIA's GB300 NVL72 system, boasting high AI performance [22]

Group 10
- Cursor apologized for a pricing-communication issue around its Pro plan and offered refunds to affected users [23]
- Cursor's developer Anysphere hired two former Anthropic executives to strengthen its leadership team [25]

Group 11
- Microsoft is incorporating AI usage into employee performance evaluations, reflecting its commitment to integrating AI tools into daily operations [26]
- Apple is considering using AI technology from Anthropic or OpenAI for its Siri assistant, signaling a potential shift in its AI strategy [27]

Group 12
- Meta established a new department, the "Meta Superintelligence Lab," recruiting several prominent figures from the AI industry [28]
- Multiple European companies urged the EU to pause implementation of the upcoming AI Act, citing concerns over its impact on innovation [29]

Group 13
- Figma submitted its IPO application, aiming to list on the NYSE after a previous failed acquisition attempt by Adobe [31]
- Remark completed a $16 million Series A round to expand its online retail-guidance services [32]

Group 14
- Zhiyu Technology went public in Hong Kong, raising roughly 320 million HKD for research and international market expansion [37]
- Domestic GPU maker Sunrise raised nearly 1 billion RMB to support its high-performance GPU development [38]
Huawei responds to Pangu plagiarism allegations; DeepSeek recruits overseas; Musk announces the founding of the "America Party" to contest next year's election | AI Weekly
AI前线· 2025-07-06 04:03
Core Viewpoint
- The article surveys developments across the AI industry, including the controversy over Huawei's Pangu model, DeepSeek's recruitment push, and major personnel changes at companies such as ByteDance and Microsoft.

Group 1: Huawei and AI Models
- Huawei's Pangu team responded to plagiarism allegations around its open-source models, stating that its MoE model is its own development and not based on other companies' models [1][2]
- The Pangu family spans multiple parameter specifications, from the Pangu E series for mobile applications to the Pangu S series for super-large models, aimed at advancing AI applications across sectors [5]

Group 2: Recruitment and Personnel Changes
- DeepSeek has recently begun recruiting overseas talent, a strategic move to attract skilled AI professionals [6][7]
- ByteDance's AI product lead, Wang Xuan, has left the company to pursue a new venture in AI hardware, backed by a prominent investment firm [8]
- The core product lead of the AI programming project "Xinyan Yima" has secured new funding, doubling the company's valuation to several hundred million USD [9]

Group 3: Microsoft and AI Integration
- Microsoft announced a second round of layoffs affecting roughly 9,000 positions, focused on cost control and streamlined operations [11][12]
- The company is folding AI usage into employee performance evaluations, underscoring the importance of AI tools in daily work [12][13]

Group 4: Other Industry Developments
- Apple is considering using AI technology from Anthropic or OpenAI for Siri, potentially sidelining its internal models [13]
- The U.S. has lifted export restrictions on EDA software for China, allowing major chip-software companies to resume supply [16]
- AMD's CEO received a significant salary increase and stock options, reflecting the company's strong market position [17]
- ByteDance has reportedly produced over 1,000 robots, focusing on logistics applications and aiming for advances in embodied intelligence [18][19]
Why DeepSeek Is Cheap to Deploy at Scale but Expensive to Run Locally
AI前线· 2025-07-04 06:10
Core Insights
- The article examines the trade-off between throughput and latency in AI inference services, focusing on models like DeepSeek-V3 that are said to be fast and cheap at scale but slow and expensive to run locally [1][12]
- It highlights the role of batch processing in GPU efficiency: larger batches yield higher throughput but add latency while requests wait for the batch to fill [2][12]

Batch Processing and GPU Efficiency
- Batch processing lets many tokens be processed simultaneously, exploiting the GPU's strength at large matrix multiplications [3][4]
- GPU efficiency is maximized when a large matrix multiplication executes as a single command, reducing launch overhead and memory-access time compared with many smaller operations [4][12]
- Inference servers use a "collect window" to queue user requests, balancing low latency (5-10 milliseconds) against the throughput gains of larger batches [5][12]

Mixture-of-Experts Models and Pipeline Efficiency
- Mixture-of-experts models like DeepSeek-V3 need larger batches to keep the GPU busy, since their many independent weight blocks lead to low throughput if not properly batched [6][12]
- Deep models with many layers must avoid "pipeline bubbles" by keeping the batch size above the number of pipeline stages; otherwise idle stages cause inefficiency and added latency [8][12]
- Keeping the queue full is hard because each user's tokens must be generated sequentially, which limits how requests from the same user can be batched [9][10]

Implications for Inference Providers
- Providers must choose batch sizes that optimize throughput while managing latency, since larger batches can mean significant delays for users awaiting their tokens [12]
- The responsiveness of models from OpenAI and Anthropic suggests they use more efficient architectures or advanced inference techniques to achieve faster responses than models like DeepSeek [12]
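The collect-window trade-off described above can be made concrete with a toy cost model. Every constant below is an illustrative assumption, not a measurement of any real system: requests arrive at a fixed rate, a batch launches once it fills, and one fused GPU launch costs a fixed overhead plus a small per-item cost.

```python
def batch_stats(batch_size, arrival_ms=2.0, launch_overhead_ms=10.0,
                per_item_ms=0.5):
    """Toy model of an inference server's collect window: requests
    arrive every arrival_ms; a batch launches once batch_size requests
    have queued; the whole batch finishes together after a fixed
    launch overhead plus a small per-item cost."""
    fill_ms = (batch_size - 1) * arrival_ms        # waiting for the window to fill
    service_ms = launch_overhead_ms + per_item_ms * batch_size
    avg_latency_ms = fill_ms / 2 + service_ms      # average queue wait + compute
    throughput_rps = batch_size / (fill_ms + service_ms) * 1000.0
    return avg_latency_ms, throughput_rps

lat1, thr1 = batch_stats(batch_size=1)
lat32, thr32 = batch_stats(batch_size=32)
print(f"batch=1  latency={lat1:.1f} ms  throughput={thr1:.0f} req/s")
print(f"batch=32 latency={lat32:.1f} ms  throughput={thr32:.0f} req/s")
# Bigger batches amortize the launch overhead (higher throughput) but
# each request waits longer for the window to fill (higher latency).
```

Under these toy constants the big batch multiplies throughput several times over while also multiplying average latency, which is the same tension a provider resolves when it picks a 5-10 ms collect window.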
A National First! Shenzhen Longgang's Smart-Education AI Platform Is the First to Integrate Huawei's Pangu Model
Nan Fang Du Shi Bao· 2025-07-02 08:58
Core Insights
- Huawei announced the open-source release of its 7-billion-parameter Pangu model and the 72-billion-parameter Pangu Pro MoE model, alongside Ascend-based model inference technology [1]
- The Longgang District Education Bureau in Shenzhen has become the first government department in China to deploy the open-source Pangu model, a significant step in its "AI + Education" strategy [1][5]

Strategic Foundation
- Longgang recognizes that moving education from "digitalization" to "intelligentization" requires robust, secure, and innovative AI technology, making Huawei's Pangu model a natural choice [3]
- The collaboration with Huawei lets Longgang actively shape a model tailored to its educational needs, integrating local educational resources and methodologies [3]

Industry Collaboration
- The Longgang Education Bureau is linking local AI technology resources to ensure effective implementation of the Pangu model, with several tech companies developing training and optimization solutions [4]
- Longgang's existing smart-education AI platform already supports various intelligent applications, laying the groundwork for AI-enabled education [4]

Scenario Empowerment
- The introduction of the Pangu model aims to reshape the educational ecosystem for both teachers and students [6]
- Teachers will receive support in lesson preparation, grading, and personalized assignments, freeing them to focus on creative teaching [6]
- Students will gain a "one-on-one intelligent companion" offering 24/7 assistance, personalized learning paths, and comprehensive growth tracking [6]

Future Vision
- Adopting the Pangu model is a key move in Longgang's "All in AI" city strategy, showcasing a model of technological innovation paired with educational application [8]
- The Longgang Education Bureau plans to explore further AI applications in areas such as moral education and family-school collaboration, focusing on developing future learners skilled in intelligent collaboration and complex judgment [8]