LLM
X @Avi Chawla
Avi Chawla· 2025-12-22 20:25
RT Avi Chawla (@_avichawla) I built my own ChatGPT from scratch, and you can too. Karpathy's nanochat is a single, clean, minimal, and hackable codebase to build a modern LLM. By setting this up, you'll learn how to:
> train a tokenizer from the ground up
> pre-training: master next-word prediction
> mid-training: teach the model to hold conversations
> sft: fine-tune on high-quality dialogue datasets
> evaluate and log every step of the process
I've done this on a LightningAI studio, and you can reproduce everythin ...
X @Nick Szabo
Nick Szabo· 2025-12-20 03:39
RT Nick Szabo (@NickSzabo4) @SeanParnellUSA You're just parroting word-for-word what Hegseth said. An LLM has more originality. ...
From Arc to Dia: Lessons learned building AI Browsers – Samir Mody, The Browser Company of New York
AI Engineer· 2025-12-19 18:15
Product Development & Strategy
- The Browser Company's mission is to rethink how people use the internet, believing the browser is a critical piece of software that hasn't evolved with changing user needs [2]
- The company shifted from building Arc, an incremental improvement, to Dia, an AI-native browser, recognizing AI's transformative potential for internet use [5][7]
- Dia aims to provide an AI assistant within the browser to personalize the experience, enhance productivity, and improve app usage [8]
- The company emphasizes optimizing tools and processes for faster iteration, building, shipping, and learning, especially in the context of AI-native products [10]
- Model behavior is treated as a craft and discipline, focusing on behavior design, data collection for measurement and training, and model steering to shape the AI assistant's personality [24][25]
AI Security
- Prompt injections are a critical security concern for browsers, as they can lead to data exfiltration, malicious command execution, or ignoring safety rules [32]
- The company addresses prompt injections by blending technological approaches with user experience design, such as implementing confirmation steps for actions like autofill, scheduling events, and writing emails [38][39][40]
Company Culture & Innovation
- The company fosters a culture of broad participation in product ideation and refinement, enabling employees from various roles to contribute to AI product development [14][16][30]
- Embracing technological shifts with conviction is crucial for companies, requiring changes in building processes, team structures, and approaches to security [44]
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-18 19:22
Market Position
- Grok Code Fast 1 is the most deployed and trusted model in production [1]
- Ranked #1 overall on OpenRouter [2]
- The model holds 39% market share based on 508 billion tokens processed [2]
Usage & Popularity
- Grok Code Fast 1 is the most-used model by developers globally [2]
- The model is #1 in Token Usage & Market Share [2]
- The model is the most popular LLM for English usage [2]
Daily Institutional Analysis: December 18
Sou Hu Cai Jing· 2025-12-18 10:41
Group 1
- ANZ forecasts Malaysia's GDP to grow by 4.5% in 2026, driven by strong domestic demand, AI-driven electronic exports, and prudent fiscal policies focusing on tax reform and spending restraint, with the ringgit expected to strengthen to 4.00 against the USD by year-end [1]
- Maybank Securities predicts the Philippine peso may weaken in the second half of 2026 due to a stronger USD and ongoing domestic negative factors, including corruption scandals affecting government spending and foreign investment confidence, potentially leading to an additional 50 basis points rate cut by the central bank [1]
- LPL Financial's chief economist suggests that current inflation above target is temporary, with demand cooling in the coming months expected to ease price pressures, providing relief for the market [1]
Group 2
- Bank of America notes that tariffs are raising goods inflation while healthcare factors may lead to a slowdown in services inflation, potentially prompting the Federal Reserve to maintain rates in January [2]
- Bank of America highlights India as a leading AI consumer market due to low data costs and a large young population, although local startups face increased competition from international giants [2]
- Yuanta Bank's economist emphasizes that relying solely on non-core measures will not curb the depreciation of the Korean won, urging authorities to take substantial actions to stabilize the currency [2]
Group 3
- Zerohedge reports that large withdrawals from JPMorgan are disrupting liquidity across the U.S., reminiscent of the 2019 repo market crisis, prompting the Federal Reserve to consider "light QE" measures [3]
- State Street indicates that the recent weakness of the USD is primarily due to U.S. investors significantly reducing their overseas investment currency hedging, rather than foreign capital increasing U.S. asset holdings [3]
With 2026 Approaching, Have World Models Actually Become More "World"?
机器之心· 2025-12-13 02:30
Core Viewpoint
- The recent launch of GWM Worlds and GWM Robotics by Runway pushes video generation towards an interactive "world simulation" paradigm, reigniting discussions on the definition and scope of "world models" as interfaces for creation and interaction, simulators for training and evaluation, or cognitive frameworks for reasoning and decision-making [1].
Group 1: Evolution of World Models
- Over the past two years, world models have evolved to be considered on par with LLMs in the AGI landscape, transitioning from a narrow definition focused on reinforcement learning to a broader understanding that includes generative modeling [4].
- Initially, world models were seen as internal environment models for agents, predicting future states based on current conditions and actions, allowing for internal simulation and decision-making [5].
- The engineering perspective defined world models as a combination of three capabilities: compressing high-dimensional perception into usable representations, predicting future states over time, and utilizing predictions for planning and decision-making [6].
- By 2024, the understanding of world models expanded to encompass general world evolution modeling, with a trend from language generation to image generation, and ultimately to 3D and world generation [6].
- The boundaries of the world model concept have become more ambiguous, with ongoing debates about the nature of representations, the incorporation of physical laws, and the organization of input relationships [6].
Group 2: Industry Layout and Trends
- Major companies are investing in world models, raising the question of whether they are enhancing their "data engines" or building new frameworks for "spatiotemporal cognition" [3].
- In February 2024, OpenAI referred to video generation models such as Sora as "world simulators," emphasizing their ability to learn the three-dimensional structure and physical laws of the real world [6].
- Concurrently, LeCun introduced V-JEPA, which focuses on predicting masked video segments in abstract representation space, allowing for higher training efficiency by discarding unpredictable information [6].
- The current discourse has shifted from whether to develop world models to how to model them, with debates on whether to abstract from pixel levels or to directly operate in abstract spaces [7].
- There is a recognition that existing approaches may only capture partial physical laws, indicating a need for representations of isolated objects and a priori laws of change across space and time to achieve a coherent world model [7].
Group 3: Definition and Ambiguity of World Models
- By 2025, world models are positioned alongside LLMs, with companies like Google DeepMind, Meta, and Nvidia shifting focus from pure LLMs to world models, aiming for "Physical AI + superintelligence" due to stagnation in LLM advancements [8].
- The distinction between world models and existing generative AI lies in the former's goal to construct internal representations of environments that include physical, temporal, and spatial dimensions for planning and decision-making [9].
- The term "world model" has become ambiguous, referring to latent states within systems, game-like simulators for training agents, or any content pipeline capable of generating navigable 3D scenes [9].
- An analysis from Entropy Town in November 2025 categorized world models into three technical routes: interface, simulator, and cognitive framework, highlighting the ongoing ambiguity in the field [9].
X @Demis Hassabis
Demis Hassabis· 2025-12-12 01:51
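Technological Advancement
- Starcloud-1 successfully trained the first LLM in space using an Nvidia H100 [1]
- Starcloud-1 ran a version of Google's Gemini model in space for the first time [1]
- The technology is an important step toward moving almost all computing into space [1]
Environmental Impact
- The goal is to stop draining Earth's energy resources [1]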
X @Tesla Owners Silicon Valley
Tesla Owners Silicon Valley· 2025-12-11 21:08
Market Leadership (Grok Code Fast 1)
- Grok Code Fast 1 recaptures the number one overall spot as the high-volume, cost-efficient leader for all developer agents [1]
- Grok Code Fast 1 ranks #1 overall on the OpenRouter Leaderboard with 880 billion tokens, nearly double the nearest competitor [1]
- Grok Code Fast 1 achieves #1 in Categories Token Share with 36.8 percent dominance [1]
- Grok Code Fast 1 leads in Languages Token Share with 16.3 percent [1]
- Grok Code Fast 1 is the #1 most popular LLM for English by overall usage volume [1]
Agentic Capabilities (Grok 4.1 Fast)
- Grok 4.1 Fast remains xAI's specialized tool-calling and agentic system, focused on high-value, complex tasks [1]
- xAI ranks #2 in vendor market share on OpenRouter with 19.4 percent [1]
- Grok 4.1 Fast ranks #1 on the τ²-Bench Telecom agentic tool use benchmark and the Berkeley Function Calling Benchmark [1]
Reasoning and Communication (Grok 4.1 Thinking Mode)
- Grok 4.1 Thinking Mode is the top choice for complex reasoning, personality, and human-preferred communication [1]
- Grok 4.1 Thinking Mode ranks #1 overall on the LMArena Text Arena human preference Elo score and the EQ-Bench3 emotional intelligence benchmark [1]
Trace OpenRouter Calls to LangSmith — No Code Changes Needed
LangChain· 2025-12-11 17:13
Hey, I'm Tanish from LangChain. Today I'm going to go through how to use OpenRouter's new broadcast feature with LangChain to send traces to LangSmith. The cool thing about broadcast is that it stores destination information, which in this case is LangSmith, server side. So the only thing that you need to worry about in your code is your OpenRouter API key. Let's go through how to set this up. So let me walk you through a quick code snippet. This uses LangChain's init_chat_model in order to initialize a model and ...
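The transcript is cut off before the snippet appears, so what follows is only a minimal sketch of what such a setup might look like, not the code from the video. It assumes OpenRouter is reached through its OpenAI-compatible endpoint using LangChain's init_chat_model with the openai provider (which needs the langchain and langchain-openai packages). The helper names, the OPENROUTER_API_KEY environment variable, and the model slug are illustrative assumptions. No tracing code appears because, as described, broadcast keeps the destination on OpenRouter's side.

```python
import os

# OpenRouter's OpenAI-compatible endpoint.
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def openrouter_model_kwargs(model: str, api_key: str) -> dict:
    """Build keyword arguments for init_chat_model (hypothetical helper)."""
    return {
        "model": model,
        "model_provider": "openai",  # use the OpenAI-compatible chat client
        "base_url": OPENROUTER_BASE_URL,
        "api_key": api_key,
    }


def build_chat_model(model: str = "openai/gpt-4o-mini"):
    """Create a chat model routed through OpenRouter (model slug is illustrative)."""
    # Imported lazily so the helper above works without LangChain installed.
    from langchain.chat_models import init_chat_model

    api_key = os.environ["OPENROUTER_API_KEY"]  # assumed variable name
    return init_chat_model(**openrouter_model_kwargs(model, api_key))


if __name__ == "__main__":
    # Only call the API when a key is actually configured.
    if os.environ.get("OPENROUTER_API_KEY"):
        llm = build_chat_model()
        print(llm.invoke("Say hello in one word.").content)
```

Keeping the API key as the only secret in the code mirrors the point made in the transcript: the LangSmith destination is configured in OpenRouter, not in the application.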
How to debug voice agents with LangSmith
LangChain· 2025-12-09 21:39
Voice is one of the most natural ways to interact with AI. And as the models are getting better, I'm excited about new use cases and interaction patterns that it's going to unlock, especially in industries like education and customer service. It's surprisingly easy to get started building a voice agent. And so let's go through that in this video. I'm Tannushri and I'm going to show you how to build a voice agent, specifically a French tutor, with this framework called Pipecat. I'm going to walk through how it wor ...