TurboDiffusion
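Letting AI Settle In: Beijing Forges the No. 1 City for Artificial Intelligence
At 9:30 a.m. on January 8, applause broke out in the trading hall of the Hong Kong Stock Exchange as Zhipu AI officially listed. More than two thousand kilometers away in Haidian, Beijing, a giant electronic screen in front of the Dongsheng Building lit up: "Zhipu, the world's first listed large-model company." The "AI Origin Community" logo stood out at the upper left. This is where artificial intelligence first took root in Beijing.
New quality productive forces growing faster, industrial upgrading advancing steadily, the capital's functions continuing to improve... During the "14th Five-Year Plan" period, we witnessed the solid pace of Beijing's high-quality development. Backed by a regional GDP of more than 5.2 trillion, the city has set out on the new journey of the "15th Five-Year Plan". On the occasion of Beijing's 2026 "Two Sessions", we are launching the special report series "Riding the '15th Five-Year Plan': Beijing's Lead and Firsts", focusing on Beijing's pioneering explorations and leading practices in technological innovation, industrial upgrading, urban governance, and improving people's livelihoods, and on the ideas and measures for charting a new picture of high-quality development.
At a public lecture in January 2026, Tang Chen, associate professor at Tsinghua University's School of Humanities, showed the audience an unusual star chart in which Venus coincides with the Pleiades in the night sky. It was his high-precision reconstruction of a celestial event from 757 CE, made with the Shidian Guji (ancient books) assistant.
In the same city, another kind of collaboration is under way. Driven by the Beijing Academy of Artificial Intelligence (BAAI), a system software stack called FlagOS has become the "common language" connecting domestic AI chips and large models.
The two stories seem unrelated, yet they point to the same fact: amid the fierce winnowing of global AI competition, Beijing is ...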
Beijing Business Today · 2026-01-26 14:29
Group 1
- Beijing's regional GDP has exceeded 5.2 trillion yuan as the city enters the "15th Five-Year Plan" period [1]
- The city aims to establish itself as the "Artificial Intelligence Capital" by drawing on its talent density, full-stack ecosystem, and industrial clusters [3][4]
- The goal is a core AI industry exceeding 1 trillion yuan within two years, with a focus on source innovation and full-chain implementation [3][12]

Group 2
- The "AI Origin Community" in Haidian is one of the first four AI innovation districts, reflecting the area's concentration of educational and research resources [4][5]
- Beijing has 15,000 AI scholars, 30% of the national total, of whom 148 are recognized as among the most influential globally [5]
- The city ranks second globally in AI innovation, supported by two national AI laboratories and numerous AI-focused educational institutions [7]

Group 3
- The FlagOS system software stack connects domestic AI chips with large models, easing collaboration among AI enterprises [6][9]
- The RoboBrain 2.5 project showed that domestic GPUs can train advanced AI models, a milestone for local technology [9][11]
- High-quality data is essential for AI development, and companies such as Lightwheel AI (Guanglun Intelligent) use simulated data for large-scale training [10]

Group 4
- Beijing's AI industry is projected to more than double, from approximately 450 billion yuan in 2025 to over 1 trillion yuan by 2027, driven by efficiency improvements and market demand [13][14]
- The consumer sector is emerging as a new growth driver, with products such as Kuaishou's Kling AI achieving significant revenue growth and user engagement [14]
- A focus on practical applications and technological refinement is shaping a systematic innovation landscape in Beijing's AI sector [14]
From the Lab to Global Infrastructure: IonQ Brings 100-Qubit Computing Power to South Korea
Guotai Haitong Securities · 2026-01-03 08:21
Investment Rating
- The report does not explicitly provide an investment rating for the industry or the companies discussed.

Core Insights
- The technology industry recorded 196 financing events globally from December 22, 2025, to January 1, 2026, with 181 in China and 15 abroad, concentrated in advanced manufacturing, artificial intelligence, and enterprise services [9].
- In semiconductors, the global release of high-purity P-type SiC substrates and the development of the first 12-inch high-quality silicon carbide epitaxial wafer are expected to raise production efficiency and reduce costs [33][38].
- In artificial intelligence, new technologies include the TurboDiffusion framework, which can accelerate video generation by up to 200 times, and the "Shan Hai" S30FP/S30P SPU IP, which provides security solutions for high-performance computing chips [4][41].

Summary by Sections

Financing Overview
- A total of 196 financing events occurred in the technology sector during the period, led by advanced manufacturing, artificial intelligence, and enterprise services in number of events [9].

IPO Updates
- **Insilico Medicine** listed on the Hong Kong Stock Exchange. It focuses on AI-driven drug discovery and has reduced drug development time from an average of 4.5 years to 12-18 months [11][12].
- **Tiansu Measurement** listed on the Shenzhen Stock Exchange. It provides independent third-party measurement and testing services across various industries [15][16].
- **Nobikang** also listed on the Hong Kong Stock Exchange. It specializes in AI solutions for railway and power companies [18][19].

Semiconductor Sector Developments
- **SuperChip** launched a high-purity P-type SiC substrate that addresses critical impurities which have historically hindered the industry, improving the reliability of high-voltage IGBT devices [33][36].
- **Hantian Technology** developed the world's first 12-inch high-quality silicon carbide epitaxial wafer, which is expected to significantly improve production efficiency and lower costs [38][40].
- **Arm Technology** introduced the "Shan Hai" S30FP/S30P SPU IP, which strengthens security for high-performance computing applications [41][42].

AI and Quantum Technology Innovations
- The report highlights AI advances including a collaboration between Shengshu Technology and Tsinghua University to accelerate video generation, and IBM's new framework for large language model planning [4][41].
- In quantum technology, IonQ has established a significant presence in South Korea with its quantum computing capabilities, a strategic expansion of its global infrastructure [4][6].
Computer Industry Weekly: MiniMax Releases the MiniMax M2.1 Large Model, Tsinghua University Releases TurboDiffusion-20251231
Huaxin Securities· 2025-12-31 13:00
Investment Rating
- The report maintains a "Buy" rating for the companies mentioned, including Weike Technology (301196.SZ), Nengke Technology (603859.SH), and Hehe Information (688615.SH) [9][58].

Core Insights
- The AI industry is advancing quickly, particularly with the release of MiniMax M2.1, which achieved state-of-the-art performance in multilingual code evaluation along with stronger problem-solving capabilities [2][6][30].
- TurboDiffusion, released by Tsinghua University, is a breakthrough in AI video generation: it cuts generation time from minutes to seconds while maintaining video quality [3][32][39].
- Financing for AI startups remains robust, as shown by LemonSlice's $10.5 million seed round for interactive video avatars, which signals a shift toward multimodal AI interaction [4][44][48].

Summary by Sections

1. Computing Power Dynamics
- MiniMax released the M2.1 model, which scored 72.5% on the SWE-bench Multilingual evaluation, surpassing competitors such as Gemini 3 Pro and Claude Sonnet 4.5 [23][24].
- Rental prices for computing power remain stable, with specific configurations priced at approximately 28.64 CNY/hour on Tencent Cloud and 31.58 CNY/hour on Alibaba Cloud [22][23].

2. AI Application Dynamics
- Discord's weekly traffic increased by 9.44%, indicating growing engagement with AI applications [31].
- TurboDiffusion speeds up AI video generation by up to 200 times without compromising quality [32][36].

3. Financing Trends
- LemonSlice secured $10.5 million in seed funding to develop its real-time interactive avatar technology, showing the potential for AI to move from text-based to multimodal interaction [44][46].

4. Market Review
- The AI application index and the AI computing power index both performed positively, with notable gains in companies such as Yingwei Technology and Shengyi Technology [50][56].

5. Investment Recommendations
- The report highlights NVIDIA's strategic acquisition of Groq for approximately $20 billion, which reinforces NVIDIA's leadership in AI chips and underlines the high demand for efficient computing solutions [6][56].
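Musk Comments on Unitree Robot's "Low Blow" / OpenAI Co-Founder: "Never Felt This Far Behind" / Go Tournament Player Wearing AI Glasses Sparks Controversy | Hunt Good Weekly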
Sohu Finance · 2025-12-28 07:28
Group 1
- Elon Musk commented on a clip of Unitree's G1 humanoid robot unexpectedly kicking a test engineer during a demonstration, which had gone viral on social media [1][2]
- The eighth "Pig Killing Conference" Go tournament drew controversy after amateur player Li Meng, who had beaten several professional players, was found wearing AI glasses and accused of cheating [2][12]
- The Beijing Economic-Technological Development Area announced a humanoid robot half marathon for April 19, 2026, with both autonomous navigation and remote control categories [12][13][15]

Group 2
- OpenAI acknowledged that its AI browser, Atlas, is vulnerable to prompt injection attacks, in which hidden instructions manipulate the AI into actions such as sending a resignation email [16][17]
- Market share among generative AI tools is shifting: ChatGPT's share has dropped to 68% from 87.2% a year ago, while Google's Gemini has risen to 18.2% [17][20]
- Microsoft CEO Satya Nadella has taken a hands-on role in improving the Copilot AI assistant, after expressing dissatisfaction with its performance relative to competitors [21][23]

Group 3
- Turing Award winner Yoshua Bengio expressed concern about AI risks, stressing the need for responsible development and the danger of AI systems that resist shutdown [42][44]
- Former Tesla AI director Andrej Karpathy described a significant transformation of the programming profession under AI, saying programmers must adapt to new tools and methods [45][48]
- Leonardo DiCaprio discussed AI's impact on filmmaking, arguing that AI can enhance creativity but that true artistry must originate in human experience [50]
Video Generation's DeepSeek Moment! Tsinghua and Shengshu Open-Source a Framework with a 200x Speedup, Earning 2k Stars in a Week
Synced (Jiqizhixin) · 2025-12-26 04:35
Core Insights
- The article covers the launch of TurboDiffusion, an open-source framework developed by Tsinghua University's TSAIL team and Shengshu Technology, which cuts the time needed to generate a video from minutes to seconds [1][3][7].

Group 1: Technological Breakthrough
- TurboDiffusion marks a shift from rendering and waiting to near real-time generation, addressing the high inference latency that has limited practical use of video generation models [3][7].
- The framework achieves roughly 200 times acceleration for high-quality video, producing a 5-second 720p video in just 24 seconds on a single RTX 5090 GPU [26][43].
- It combines four core techniques: two forms of attention acceleration (SageAttention and Sparse-Linear Attention), efficient step distillation, and W8A8 linear layer quantization, which together raise generation efficiency without compromising quality [13][20][21].

Group 2: Implementation and Performance
- Mixed attention acceleration comprises SageAttention and Sparse-Linear Attention (SLA), which optimize the attention mechanism for faster processing [14][17].
- Efficient step distillation reduces the number of sampling steps from 100 to as few as 3 or 4 while maintaining high video quality [20].
- W8A8 linear layer quantization compresses model size by about 50% and uses INT8 Tensor Cores for faster linear layer computation [21].

Group 3: Industry Impact
- TurboDiffusion lowers the computational barrier for high-end video creation, making it accessible to individual creators on consumer-grade GPUs [51].
- Near real-time generation supports creative exploration by giving instant feedback on prompt adjustments [52].
- Advances such as TurboDiffusion are expected to enable applications that need immediate feedback, such as AI video live streaming and AR/VR content rendering [52].
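The W8A8 idea named in this summary (8-bit weights, 8-bit activations) can be illustrated with a small sketch. The summary does not describe TurboDiffusion's actual scaling scheme, so the symmetric per-tensor absmax scaling below, and the helper names `quantize_int8` and `w8a8_linear`, are assumptions chosen for illustration only. It is plain Python rather than a Tensor Core kernel.

```python
def quantize_int8(xs):
    """Symmetric per-tensor quantization (assumed scheme, not from the source).

    Returns integer codes in [-127, 127] and the scale that maps them back.
    """
    absmax = max(abs(x) for x in xs)
    scale = absmax / 127 if absmax > 0 else 1.0
    codes = [max(-127, min(127, round(x / scale))) for x in xs]
    return codes, scale


def w8a8_linear(weights, activations):
    """Linear layer y = W x with both W and x quantized to 8-bit codes.

    weights: list of rows (out_features x in_features); activations: list of floats.
    """
    n = len(activations)
    flat_codes, w_scale = quantize_int8([w for row in weights for w in row])
    a_codes, a_scale = quantize_int8(activations)
    outputs = []
    for r in range(len(weights)):
        row_codes = flat_codes[r * n:(r + 1) * n]
        # Integer multiply-accumulate: the part INT8 hardware would speed up.
        acc = sum(w * a for w, a in zip(row_codes, a_codes))
        # A single float rescale per output recovers the real-valued result.
        outputs.append(acc * w_scale * a_scale)
    return outputs


weights = [[0.5, -1.0], [0.25, 0.75]]
activations = [2.0, -1.0]
approx = w8a8_linear(weights, activations)
exact = [sum(w * a for w, a in zip(row, activations)) for row in weights]
print("quantized:", approx)
print("float:    ", exact)
```

Each 8-bit code occupies one byte, so if the baseline stores weights in a 16-bit format (an assumption; the summary does not state the baseline precision), the roughly 50% size reduction quoted above follows directly.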
Tencent Research Institute AI Express 20251226
Tencent Research Institute · 2025-12-25 16:57
Group 1
- Nvidia has reached a non-exclusive licensing agreement with AI chip startup Groq, reportedly worth $20 billion, and is bringing on Groq founder Jonathan Ross and his engineering team [1]
- Groq focuses on LPU chips for inference, reaching an output speed of 500 tokens per second per card, ten times faster than Nvidia's GPUs, and uses a temporal instruction set architecture to sidestep HBM shortages and reduce costs [1]
- The transaction follows a "technology licensing + talent acquisition" model: Groq continues its cloud business independently, while Nvidia aims to strengthen its inference computing capabilities against Google's TPU [1]

Group 2
- Tsinghua's TSAIL Laboratory and Shengshu Technology have jointly open-sourced the TurboDiffusion video generation acceleration framework, which cuts the processing time of a 1.3B 480P model on a single RTX 5090 from 184 seconds to 1.9 seconds, a 97-fold speedup [2]
- The framework integrates four core technologies: SageAttention2++ quantization, SLA sparse linear attention, rCM step distillation, and W8A8 quantization, reducing end-to-end latency from 900 seconds to 8 seconds [2]
- SageAttention has been integrated into NVIDIA TensorRT and deployed on platforms such as Huawei Ascend and Moore Threads, and companies including Tencent, ByteDance, and Alibaba already use it [2]

Group 3
- The Shanghai Municipal Planning and Resources Bureau and SenseTime have launched "Yunyu Xingkong", the first 600-billion-parameter foundation model in the national planning and resources field, which can answer questions, adjust maps, compute statistics, recognize images, and generate reports [3]
- The model is trained on the Kunyu Jinglue corpus and integrated with the government intranet's professional version and core business systems, reaching 98% accuracy on specialized terms and a 95% approval rate in human-evaluated Q&A [3]
- It uses a "1+6" (base + vertical) model system and an intelligent scheduling engine, supports natural language calls on 2D and 3D spatial data, and explores a new paradigm for data productization and service-oriented government models [3]

Group 4
- Tencent Cloud and Anhui Yilu Weixing have launched "Assistant Agent", the first AI assistant in the ETC field, built on Tencent's Hunyuan model; it has served over one million users since internal testing began in April [4]
- The assistant supports both text and voice input through multimodal interaction, reaches 95% Q&A accuracy and a 90% problem-resolution rate, and handles complex requests such as device inquiries, traffic record checks, and invoicing [4]
- It runs 105 state-monitoring algorithms on real-time device data, enabling voice interaction, proactive status reporting (a "service finds the user" capability), and voice control of devices [4]

Group 5
- Dexmal has proposed the GeoVLA framework, a dual-stream architecture that retains VLM semantic understanding while giving robots 3D geometric perception through a point cloud embedding network and a spatially aware action expert [6]
- It achieved a 97.7% success rate on the LIBERO-90 long-horizon multi-task test, surpassing OpenVLA-OFT, with an average success rate of 77% on ManiSkill2 and an overall average of 86.3% on real-world tasks [6]
- In out-of-distribution robustness tests it maintained a 60% success rate with varying basket heights and a 70% success rate with a 45° viewpoint shift, indicating an understanding of true 3D spatial structure [6]

Group 6
- The SciMaster team, composed of Shanghai Jiao Tong University's TSAIL Laboratory, the Shanghai Algorithm Innovation Research Institute, and DP Technology, has launched ML-Master 2.0, which tops the MLE-bench leaderboard with a 56.44% medal rate [7]
- The system targets real machine learning engineering and introduces a hierarchical cognitive caching mechanism that models context as Experience, Knowledge, and Wisdom [7]
- It uses a "generate-validate" protocol for ultra-long-horizon autonomy, has been applied in theoretical computational physics and embodied intelligence, and is open for waiting list applications on the SciMaster platform [7]

Group 7
- Jim Fan, head of embodied intelligence at Nvidia, said Tesla's FSD v14 is the first AI to pass the physical Turing test; Elon Musk noted that "perception is maturing", and the software has launched in seven countries including the US [9]
- Tesla has built 14 technical barriers, including a 4-6 year sensor freeze to accumulate data, an instant value judgment engine for intelligent data filtering, and a Neural Codec for processing raw Bayer data [9]
- An end-to-end transformer maps photon input to motor torque output, with hardware-in-the-loop quantization training run on the vehicle chip in the Cortex supercomputer; 12 versions shipped within 77 days, although problems remain in lane switching and lane change decisions [9]
A Video in 2 Seconds on a Single GPU: Tsinghua and Shengshu Open-Source TurboDiffusion, and Video's DeepSeek Moment Arrives
36Kr · 2025-12-25 12:12
Core Insights
- The article covers the launch of TurboDiffusion, an open-source framework developed by Tsinghua University's TSAIL lab and Shengshu Technology, which accelerates video generation by up to 200 times while maintaining quality [1][14].

Group 1: Technology and Performance
- TurboDiffusion generates a 5-second 480P video on a single RTX 5090 GPU in just 1.9 seconds, compared with 184 seconds for the original model, a 97-fold speedup [5][6].
- For the larger 14B video generation model, generation time falls to 38 seconds at 720P and 9.9 seconds at 480P [5][6].
- The framework relies on four key technologies: SageAttention, Sparse-Linear Attention (SLA), rCM step distillation, and W8A8 quantization, which together raise performance and cut computational load [9][10][11][12].

Group 2: Industry Impact
- TurboDiffusion makes real-time video generation practical, so individual creators and small businesses can produce high-quality content quickly [14].
- A roughly 100-fold reduction in inference time lets cloud service providers serve far more users with the same computational resources, lowering operating costs [14].
- The technology is compatible with domestic AI chip architectures, promoting self-sufficiency in China's AI infrastructure [14][15].

Group 3: Future Implications
- The framework signals a paradigm shift in which high-quality AI video no longer comes at the cost of efficiency, moving AI from a post-production tool toward a creative partner [16].
- As generation time falls below 5 seconds and approaches human reaction time, real-time interactive video creation becomes feasible, expanding creative possibilities [16].
A Video in 2 Seconds on a Single GPU! Tsinghua and Shengshu Open-Source TurboDiffusion, and Video's DeepSeek Moment Arrives
QbitAI · 2025-12-25 11:51
Core Viewpoint
- The article introduces TurboDiffusion, an open-source framework developed by Tsinghua University's TSAIL lab and Shengshu Technology, which makes video generation up to 200 times faster while maintaining high quality [2][3][39].

Group 1: Speed and Efficiency
- TurboDiffusion generates a 5-second 480P video in just 1.9 seconds on a single RTX 5090 GPU, compared with approximately 184 seconds originally [3][13].
- A 720P video can be generated in 24 seconds, a substantial improvement over previous models [12].
- The framework brings generation close to real time, cutting the delay for high-quality 1080P video from 900 seconds to just 8 seconds [16][39].

Group 2: Technical Innovations
- TurboDiffusion combines four key technologies: SageAttention, Sparse-Linear Attention (SLA), rCM step distillation, and W8A8 quantization [22][24][32].
- SageAttention2++ reduces the computational load of the attention mechanism, running 3-5 times faster while halving memory usage [25][27].
- SLA concentrates computation on the important pixels and keeps complexity linear, adding further speedup when combined with SageAttention [28][29].

Group 3: Industry Impact
- TurboDiffusion is expected to lower cloud inference costs significantly, letting the same computational power serve 100 times more users [42].
- The technology is compatible with domestic AI chip architectures, promoting self-sufficiency in China's AI infrastructure [42].
- The framework opens possibilities for real-time video editing, interactive video generation, and automated short film production, which could lead to new product forms in the AIGC sector [42].
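The sparse attention idea in this summary, spending computation only on the positions that matter, can be sketched with a much simpler stand-in: top-k attention, where each query attends only to its k highest-scoring keys. This is not the SLA algorithm itself, whose details the summary does not give; the function name `topk_sparse_attention` and the toy vectors are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def topk_sparse_attention(query, keys, values, k):
    """Attend only to the k keys with the highest scaled dot-product score.

    Simplified stand-in for sparse attention: this toy version still scores
    every key, whereas a real kernel would select what to keep more cheaply.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, key)) / scale for key in keys]
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in keep])
    dim = len(values[0])
    # Weighted sum over the kept positions only; the rest are skipped.
    return [sum(w * values[i][d] for w, i in zip(weights, keep)) for d in range(dim)]


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

sparse = topk_sparse_attention(query, keys, values, k=1)  # only the best key
dense = topk_sparse_attention(query, keys, values, k=3)   # same as full attention
print("k=1:", sparse)
print("k=3:", dense)
```

With k equal to the number of keys the function reduces to ordinary softmax attention, so k is the knob that trades accuracy for work in the value-aggregation step.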
The Tsinghua Camp's DeepSeek Moment Arrives and Silicon Valley Is Abuzz: A 200x Speedup on a Single GPU Takes Video into the Seconds Era
36Kr · 2025-12-23 10:46
Core Insights
- The launch of TurboDiffusion by Tsinghua University and Shengshu Technology is a significant advance in AI video generation, cutting generation time from minutes to seconds while maintaining high quality [1][3][7].

Group 1: TurboDiffusion Overview
- TurboDiffusion is an open-source acceleration framework built specifically for diffusion-based video generation, achieving 100-200 times speedups on consumer-grade GPUs such as the RTX 5090 [8][24].
- The framework supports both image-to-video (I2V) and text-to-video (T2V) generation and holds up well even for high-resolution, long-duration videos [8][14].

Group 2: Performance Metrics
- In practical tests, TurboDiffusion generated a 5-second video in just 1.9 seconds, compared with 184 seconds for the standard implementation, a speedup of approximately 97 times [10].
- For a 14B model generating a 5-second 720P video, TurboDiffusion cut generation time from over 4549 seconds to just 38 seconds, a speedup of about 120 times [14][17].

Group 3: Core Technologies
TurboDiffusion employs four key technologies:
1. SageAttention, which applies low-bit quantization to the attention mechanism to raise GPU throughput [24].
2. Sparse-Linear Attention (SLA), which skips redundant attention computation through sparsity, further accelerating inference [24].
3. rCM step distillation, which minimizes the number of diffusion steps without sacrificing quality [24].
4. W8A8 INT8 quantization of linear layers, which improves speed and reduces memory usage [24][26].

Group 4: Industry Impact
- TurboDiffusion is seen as a pivotal moment for AI video generation, turning a niche, time-consuming process into an accessible and rapid content creation tool [29].
- The technology has already been integrated into products from several leading tech companies, showing its potential for significant economic benefit [26].
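The speedup figures quoted in this summary follow from dividing the baseline time by the accelerated time, which a few lines of Python can confirm. The timings are the ones reported above; the function name `speedup` is only illustrative, and because the 14B baseline is given as "over 4549 seconds", the second ratio is a lower bound.

```python
def speedup(baseline_seconds, accelerated_seconds):
    """Ratio of baseline time to accelerated time."""
    return baseline_seconds / accelerated_seconds


# Timings as reported in the summary (single RTX 5090, 5-second clips).
small_model = speedup(184, 1.9)   # 5-second video: 184 s -> 1.9 s
large_model = speedup(4549, 38)   # 14B model at 720P: 4549 s -> 38 s

print(f"small model: {small_model:.1f}x")  # 96.8x, reported as ~97x
print(f"14B model:   {large_model:.1f}x")  # 119.7x, reported as ~120x
```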