Game Theory Algorithms
Tencent Research Institute AI Express 20260228
Tencent Research Institute · 2026-02-27 16:01
Group 1: Meta's Shift in AI Hardware Strategy
- Meta has abandoned two generations of self-developed training chips, Iris and Olympus, due to high risks in software stability and mass production, opting for a multi-billion dollar TPU leasing agreement with Google [1]
- Meta acquired Rivos, a RISC-V chip startup, in October 2025, which has developed a 3.1GHz processor and a CUDA-compatible software stack, facilitating seamless migration of AI workloads from NVIDIA's ecosystem [1]
- Meta has also secured a deal for millions of GPUs with NVIDIA and a 6GW GPU agreement with AMD, diversifying its risk and enhancing computational power [1]

Group 2: DeepSeek's DualPath System
- DeepSeek, in collaboration with Tsinghua University and Peking University, has launched the DualPath inference system, which addresses storage bandwidth bottlenecks in pre-fill-decode architectures through a dual-path KV-Cache loading mechanism [2]
- The system achieves a 1.87x throughput increase in offline scenarios and a 1.96x increase in online service scenarios, leveraging idle storage bandwidth for global resource pooling [2]
- It employs a traffic management system centered around computing network cards and adaptive request scheduling to ensure KV-Cache transmission does not interfere with latency-sensitive model inference communication, validated on a 1152 GPU cluster [2]

Group 3: Google's Nano Banana 2 Model
- Google has released the Nano Banana 2 image generation model, which integrates with the Gemini knowledge base and web search, significantly enhancing spatial understanding and Chinese text rendering capabilities [3]
- The model can maintain consistency in the appearance of five character faces or fourteen objects in a single generation, supporting resolutions from 512px to 4K, with API pricing reduced to half that of the previous Pro model [3]
- Free users can generate 100 images within 24 hours, while Pro users can generate 1000 images, with the simultaneous upgrade of SynthID digital watermarking and C2PA content credentialing technology, which has been invoked over 20 million times [3]

Group 4: Kunlun Wanwei's SkyReels V4
- Kunlun Wanwei has introduced SkyReels V4, the world's first video foundation model that supports multi-modal input and joint audio-video generation, ranking second globally in the Artificial Analysis benchmark [4]
- The model generates cinema-quality audio and video at 1080p resolution and 32 FPS for 15 seconds, utilizing a dual-stream multi-modal diffusion Transformer architecture for audiovisual depth collaboration [4]
- It innovatively combines channel and temporal stitching into a unified paradigm, transforming generation, repair, and editing tasks into specific mask configuration repair problems without the need to switch tools [4]

Group 5: Block's Workforce Reduction
- Block's CEO Jack Dorsey announced a 40% workforce reduction, cutting the number of employees from over 10,000 to below 6,000, citing the strong performance of the company but acknowledging that AI tools are fundamentally changing the nature of building and operating companies [5]
- Laid-off employees will receive 20 weeks of pay plus an additional week for each year of service, continued stock vesting until the end of May, six months of healthcare, and a $5,000 transition stipend, with open communication channels until Thursday [6]
- Dorsey indicated a preference for decisive action rather than gradual layoffs, suggesting that future customers may directly build their own functionalities, with AI not merely assisting but replacing human roles [6]

Group 6: DeepMind's AlphaEvolve
- Google DeepMind has utilized AlphaEvolve to evolve a new game theory algorithm, treating algorithm source code as a genome and using Gemini as a genetic operator to perform natural selection on core algorithms CFR and PSRO [7]
- The evolved VAD-CFR algorithm employs counterintuitive mechanisms previously unconsidered by humans, including forgetting old experiences in chaotic situations and doubling down on good moves immediately, outperforming classic solutions in nearly all test games [7]
- This marks a paradigm shift in AI from executing algorithms to inventing them, with DeepMind planning to extend this framework to the complete design of deep reinforcement learning agents and collaborative game mechanism discovery [7]

Group 7: Surge in Chinese Open Source Model Usage
- OpenRouter data indicates that the usage of Chinese AI models surged by 127% over three weeks in February 2026, surpassing U.S. models for the first time, with four out of the top five models being MiniMax M2.5, Kimi K2.5, GLM-5, and DeepSeek V3.2 [8]
- In agent mode, token consumption shifted from per-use to per-traffic, with the share of programming task tokens skyrocketing from 11% to over 50%, and the API output price of Chinese models being only 1/12 to 1/5 of Claude's [8]
- The release of Zhipu GLM-5 coincided with a price increase of at least 30%, indicating a transition of domestic models from a price war to a demand-driven era, with Kimi K2.5 generating more revenue in less than a month than in the entire year of 2025 [8]

Group 8: Step 3.5 Flash Engineering Insights
- The CEO, CTO, chief scientist, and core algorithm team of Step 3.5 Flash shared insights on Reddit, revealing that the design intentionally kept the scale within the 128GB memory operational range, with 11 billion active parameters balancing capability and local deployment [9]
- The architecture employs MTP-3 multi-token prediction to achieve a maximum generation speed of 350 TPS, combined with GQA8+SWA attention and sparse MoE design, with post-training integrating verifiable signals and preference feedback through an extensible RL framework [9]
- A commitment was made to release the base model and integrated training code library within one to two weeks, with the next version 3.6 set to support cognitive intensity switching and resolve tool compatibility issues [9]

Group 9: Claude Code Tool Preferences
- Amplifying.ai conducted 2430 tool selection tests on Claude Code, revealing that custom/DIY implementations accounted for 12% of all major choices, indicating a preference for self-written solutions over third-party tools [10]
- A default tech stack has emerged, including Vercel for deployment, PostgreSQL for databases, Stripe for payments, and Tailwind+shadcn/ui for front-end, with over 90% lock-in rates for single tools in some categories [10]
- Project context was found to be more critical than instruction wording, with stability rates of 76% for different expressions within the same project, and Opus 4.6 showing a tendency to recommend new tools and custom solutions, while Sonnet 4.5 preferred established mainstream tools [10]
AlphaEvolve Evolves Again: DeepMind Uses AI to "Breed" Algorithms, Crushing All Human Designs
36Kr · 2026-02-27 10:51
Core Insights
- DeepMind's latest paper introduces AlphaEvolve, which treats algorithm source code as a genome and uses Gemini as a genetic operator to evolve new game theory algorithms through a process akin to natural selection [1][5][20]
- The evolved algorithms outperform human-designed optimal solutions in various tests, utilizing mechanisms that human researchers had not previously considered [1][22]

Group 1: AlphaEvolve and Its Mechanism
- AlphaEvolve is described as an evolutionary coding agent that uses the source code as a genome, with LLM acting as a genetic operator to mutate the code [5][20]
- The process involves evaluating the fitness of each "offspring algorithm" based on its exploitability in a set of benchmark games, allowing the best-performing algorithms to survive and evolve further [5][6][20]

Group 2: Target Algorithms and Historical Context
- The focus of AlphaEvolve is on two core algorithm families in multi-agent reinforcement learning: Counterfactual Regret Minimization (CFR) and Policy Space Response Oracles (PSRO) [6][7]
- Historically, researchers have manually tuned and designed variations of these algorithms, but AlphaEvolve automates this process, significantly expanding the search space for potential solutions [10][16][20]

Group 3: Implications of AI-Driven Algorithm Design
- The paper highlights that the design of game theory algorithms has traditionally been a challenging task due to the complexity of incomplete information games [12][13]
- AlphaEvolve's approach allows for meaningful mutations in code, leading to the discovery of effective strategies that human experts had not conceived [17][25]
- The results indicate a paradigm shift where AI not only executes algorithms but also invents them, achieving superior performance compared to human-designed methods [22][25]

Group 4: Future Directions
- DeepMind plans to apply this evolutionary framework to the complete design of deep reinforcement learning agents and explore mechanism discovery in cooperative games [25]
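
The loop described in this summary (mutate a candidate, score it by exploitability, keep the best) can be illustrated with a toy sketch. This is not DeepMind's implementation, and every name in it is hypothetical. It makes several loud assumptions: the "genome" is a single numeric knob rather than source code, the mutation operator is Gaussian noise rather than an LLM, the benchmark is one small zero-sum matrix game, and the `discount` knob (how quickly old regrets are forgotten) is only loosely inspired by the "forgetting old experiences" remark and is not the VAD-CFR mechanism.

```python
import random

# Row player's payoffs in a small zero-sum matrix game (a biased
# rock-paper-scissors); the column player receives the negation.
PAYOFF = [[0.0, -1.0, 2.0],
          [1.0, 0.0, -1.0],
          [-2.0, 1.0, 0.0]]
N = len(PAYOFF)


def regret_matching_policy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def action_values(x, y):
    """Expected payoff of each pure action against the opponent's mix."""
    row = [sum(PAYOFF[i][j] * y[j] for j in range(N)) for i in range(N)]
    col = [-sum(x[i] * PAYOFF[i][j] for i in range(N)) for j in range(N)]
    return row, col


def exploitability(x, y):
    """Sum of both players' best-response values; 0 at an equilibrium."""
    row, col = action_values(x, y)
    return max(row) + max(col)


def fitness(genome, iterations=200):
    """Self-play with discounted regrets; lower exploitability is better."""
    discount = genome["discount"]  # hypothetical knob: 1.0 = never forget
    regrets = [[0.0] * N, [0.0] * N]
    sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        x = regret_matching_policy(regrets[0])
        y = regret_matching_policy(regrets[1])
        row, col = action_values(x, y)
        ev_row = sum(x[i] * row[i] for i in range(N))
        ev_col = sum(y[j] * col[j] for j in range(N))
        for a in range(N):
            regrets[0][a] = discount * regrets[0][a] + row[a] - ev_row
            regrets[1][a] = discount * regrets[1][a] + col[a] - ev_col
            sums[0][a] += x[a]
            sums[1][a] += y[a]
    avg_x = [s / iterations for s in sums[0]]
    avg_y = [s / iterations for s in sums[1]]
    return exploitability(avg_x, avg_y)


def mutate(genome, rng):
    """Stand-in for the LLM genetic operator: perturb the numeric knob."""
    child = dict(genome)
    noisy = genome["discount"] + rng.gauss(0.0, 0.1)
    child["discount"] = min(1.0, max(0.0, noisy))
    return child


def evolve(seed_genome, generations=10, population_size=6, survivors=2, rng=None):
    """Elitist loop: keep the least exploitable genomes, refill by mutation."""
    rng = rng or random.Random(0)
    population = [seed_genome]
    for _ in range(generations):
        parents = sorted(population, key=fitness)[:survivors]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return min(population, key=fitness)


if __name__ == "__main__":
    seed = {"discount": 1.0}
    best = evolve(seed)
    print("seed exploitability:", fitness(seed))
    print("best genome:", best, "exploitability:", fitness(best))
```

Because the survivors are carried over unchanged and the fitness evaluation is deterministic, the best exploitability in the population can never get worse from one generation to the next. The system described above differs in the essential respect that its mutations rewrite the algorithm's code, so it can introduce new mechanisms rather than only retune an existing parameter.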