Coding
Tang Jie's First Public Appearance After Zhipu's IPO: The "Chat War" Is Over, and Betting on Coding Was Exactly the Right Call
IPO早知道· 2026-01-12 02:04
Core Viewpoint
- The article discusses Zhipu AI's recent advances and future plans, emphasizing its focus on innovation and coding capabilities in the context of AGI development and competition with U.S. models [2][9].

Group 1: Company Developments
- Zhipu AI's GLM-4.5 model integrates reasoning, coding, and agent capabilities, marking a significant step in AI model development [5].
- The GLM-4.7 model, launched in December 2025, achieved top rankings in various coding assessments, outperforming competitors like GPT-5.2 and Claude Sonnet 4.5 [7].
- Zhipu AI's AutoGLM model gained rapid popularity, reaching 10,000 stars on GitHub within three days, indicating strong community interest [7].

Group 2: Market Position and Competition
- Despite the success of Chinese models in open-source rankings, there is a recognition that the gap between Chinese and U.S. models may still be widening due to the latter's closed-source developments [9].
- The company aims to grow its cloud revenue through high-performance coding tools like GLM CodingPlan and AutoGLM, which are expected to have a significant impact in 2026 [8].

Group 3: Future Focus Areas
- Zhipu AI plans to concentrate on scaling known and unknown paradigms, technical innovations, and multi-modal capabilities to enhance AI's functionality in real-world applications [11].
- The company anticipates 2026 to be a pivotal year for AI in scientific applications, driven by improved capabilities and the potential for AI to perform long-term tasks in human environments [11].
X @Elon Musk
Elon Musk· 2025-12-23 16:06
RT X Freeze (@XFreeze)
xAI just launched the Grok Collections API
A built-in RAG system that lets you upload and search entire datasets - from PDFs and Excel sheets to codebases, without building your own indexing system
Why it’s huge:
• Easy file uploads & updates with auto reindexing
• Powerful search for finance, legal & code data
• Free first week of indexing & storage, then $2.50 / 1K searches
In head-to-head tests, Grok Collections outperformed Gemini 3 Pro and GPT-5.1 on key retrieval tasks pulling more acc ...
Google just dropped Gemini 3 FLASH! ⚡⚡⚡
Matthew Berman· 2025-12-18 00:19
Model Performance & Cost
- Gemini 3 Flash is presented as a highly competitive model, achieving near parity with Gemini 3 Pro in various benchmarks while being significantly cheaper and faster [1][6][18]
- The model excels in coding, even surpassing Gemini 3 Pro on the SWE-bench Verified benchmark [1][8]
- Industry analysis indicates Gemini 3 Flash uses tokens efficiently, requiring fewer tokens on average to achieve comparable results [19]
- Benchmarks show Gemini 3 Flash achieving high scores in areas like multimodal understanding and reasoning (MMMU Pro) and code execution [7]

Economic Viability & Market Impact
- The input price for Gemini 3 Flash is $0.50 per million tokens, a quarter of the price of Gemini 3 Pro ($2 per million tokens) and a fraction of the cost of GPT-5.2 and Claude Sonnet 4.5 [6]
- Google is positioning Gemini 3 Flash as the default model in its Gemini app and AI mode in Google search, making it widely accessible at no cost to users [14][15]
- The industry views Gemini 3 Flash as potentially the most economically viable model due to its balance of performance, speed, and cost-effectiveness [14]
- The model's speed and efficiency are particularly beneficial for computer use models, addressing previous slowness issues [20]

Strategic Positioning
- Google's strategic advantage lies in its combination of high-performing, cost-effective models, extensive distribution channels, vast data resources, and custom silicon [16]
- The company's decision to offer Gemini 3 Flash for free is seen as a critical competitive move, potentially disrupting the market for agentic coding and AI applications [15][17]
- Industry experts suggest Gemini 3 Flash is becoming the new default for vibe coding and is being rapidly adopted across Google's product suite [16][13]
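To make the quoted input prices concrete, the following is a minimal sketch of the arithmetic, not an official pricing calculator. The two per-million-token prices come from the summary above; the dictionary keys are illustrative labels rather than real API model identifiers, and the 40-million-token workload is a hypothetical figure chosen only for the example. Output-token pricing is not given in the summary and is left out.

```python
# Input prices in USD per million tokens, as quoted in the summary above.
# The keys are illustrative labels, not official API model identifiers.
INPUT_PRICE_PER_MILLION = {
    "gemini-3-flash": 0.50,
    "gemini-3-pro": 2.00,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Return the input-side cost in USD for the given number of tokens."""
    return INPUT_PRICE_PER_MILLION[model] * input_tokens / 1_000_000


# Hypothetical workload (an assumption for illustration): 40M input tokens.
workload_tokens = 40_000_000
flash = input_cost_usd("gemini-3-flash", workload_tokens)
pro = input_cost_usd("gemini-3-pro", workload_tokens)

print(f"Flash: ${flash:.2f}, Pro: ${pro:.2f}, ratio: {flash / pro:.2f}")
```

Under these assumptions the ratio comes out to one quarter regardless of workload size, which matches the "a quarter of the price" claim in the summary.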
Gemini 3 FLASH is insane
Matthew Berman· 2025-12-17 20:11
Model Performance & Pricing
- Gemini 3 Flash is significantly cheaper, costing a fourth of Gemini 3 Pro's price and about a third of GPT-5.2's [1]
- Gemini 3 Flash achieves an ARC-AGI score of 33.6%, surpassing all other models except GPT-5.2 [1]
- On the SWE-bench Verified coding benchmark, Gemini 3 Flash scores 78%, outperforming Gemini 3 Pro's 76% [2]

Technical Specifications & Capabilities
- Gemini 3 Flash requires 17,000 tokens compared to Gemini 3 Pro's 24,000 tokens [3]
- Gemini 3 Flash has multimodal reasoning capabilities, processing video, images, and audio [3]

Market Impact & Strategy
- Google has made Gemini 3 Flash the default model in Google Search's AI mode [3]
- The industry considers speed crucial for search, suggesting Google aims to gain a competitive edge [3]
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-12 19:00
Grok Rankings Update – Dec 12
Grok Code Fast 1 (The Market Dominator)
This model remains the leading choice for high-volume, cost-efficient coding workflows and continues to drive strong developer adoption.
#1 Overall Position on the OpenRouter Leaderboard (757B tokens, leading the second position by over 300B tokens)
#1 in Categories Token Share (31.2 percent dominance)
#1 in Market Share on OpenRouter (xAI vendor share: 17.0 percent)
#1 on Kilo Code Leaderboard (Top Coding App)
#1 on Cline Leaderboard (Top Codin ...
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-10 20:02
Grok Rankings Update December 11
Grok 4.1 Fast (The Agentic and Volume Model)
This model is currently leading overall volume and excels in complex agentic workflows.
#1 Overall Position on OpenRouter Leaderboard by total token usage
#1 on τ²-Bench Telecom agentic tool use benchmark
#1 on Berkeley Function Calling Benchmark
#2 in Tool Calls showing strong adoption among agent developers
#2 in Multilingual Usage overall token share
Grok Code Fast 1 (The Market Dominator)
This model remains the definitive leader for hi ...
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-03 21:07
Grok Rankings Update: December 3
xAI continues to lead across multiple benchmarks and usage categories. Here is the full breakdown:
## Grok 4.1 Fast: The Agentic Model
Specialized for tool-calling, long-context, and high-speed performance.
#1 on τ²-Bench Telecom (Agentic Tool Use Benchmark)
#1 on Berkeley Function Calling Benchmark
#1 in Programming Category (Overall Token Share)
#1 in Multilingual Usage (Overall Token Share)
#2 on OpenRouter Overall Leaderboard (By Token Usage)
## Grok Code Fast 1: The Market Do ...
X @Avi Chawla
Avi Chawla· 2025-11-26 06:31
Learn about AI coding bottlenecks here: https://t.co/hEwlahsyRi ...
X @Anthropic
Anthropic· 2025-11-24 18:55
RT Claude (@claudeai)
Introducing Claude Opus 4.5: the best model in the world for coding, agents, and computer use.
Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done. https://t.co/mid2Z1qzIf ...