Large Language Models (LLM)
ICLR 2026's Strictest Rules Ever: Undisclosed LLM Use in a Paper Means Desk Rejection
36Kr · 2025-08-29 03:23
Core Points
- ICLR 2026 has introduced strict regulations regarding the use of Large Language Models (LLMs) in paper writing and reviewing, requiring explicit acknowledgment of LLM usage [1][15][16]
- The new policies aim to ensure accountability among authors and reviewers, mandating that they take full responsibility for their contributions [16][20]

Group 1: New Regulations
- The ICLR 2026 committee has established two main policies regarding LLM usage: all LLM usage must be clearly stated, and authors and reviewers must be accountable for their contributions [15][16]
- The policies are in line with ICLR's ethical guidelines, which emphasize the importance of acknowledging all research contributions [15][16]
- Violations of these policies will result in immediate rejection of submissions, reflecting the committee's commitment to maintaining ethical standards [17]

Group 2: Submission Details
- The submission deadlines for ICLR 2026 are set, with the abstract deadline on September 19, 2025, and the paper deadline on September 24, 2025 [9]
- ICLR 2025 received 11,565 submissions with a 32.08% acceptance rate, indicating a growing trend in submission volume [3][5]

Group 3: Ethical Concerns
- There have been instances of authors using hidden prompts to manipulate reviewer feedback, which is considered a serious ethical violation [21][24]
- The committee has highlighted the potential risks associated with LLMs, including the possibility of generating false information or breaching confidentiality [20][24]

Group 4: AI in Review Process
- In a trial of LLM use in the review process, AI suggestions were adopted in 12,222 instances, and 26.6% of reviewers updated their evaluations based on AI feedback [29][32]
- The integration of LLMs has been shown to enhance the quality of reviews and increase engagement during the rebuttal phase [32][34]
LatePost Exclusive: Li Auto's Self-Developed Autonomous Driving Chip Enters Vehicle Road Testing, with Some Compute Performance Exceeding NVIDIA's Thor-U
晚点LatePost· 2025-08-28 06:09
Core Viewpoint
- Li Auto's self-developed autonomous driving chip M100 has passed key pre-mass-production stages and is expected to enter mass production next year, aiming to improve the efficiency and cost-effectiveness of its autonomous driving algorithms [4][6]

Summary by Sections

Chip Development
- The M100 chip has completed functional and performance testing, demonstrating significant computational capability: it matches the effective computing power of 2 NVIDIA Thor-U chips on large language model tasks and 3 Thor-U chips on traditional vision tasks [4][6]
- The company has allocated a budget of several billion dollars for its in-house chip project, reflecting the high cost of chip development [6]

Strategic Approach
- Li Auto is adopting a dual strategy: relying on external partners like NVIDIA and Horizon for current market competitiveness while developing its own chip for future core advantages [7][8]
- Li Auto CTO Xie Yan is leading a strategy that combines hardware and software development to maximize chip performance and efficiency [6]

Market Positioning
- In its current electric vehicle lineup, Li Auto uses NVIDIA's high-performance chips in flagship models, while its range-extended models use either NVIDIA Thor-U or Horizon Journey 6M chips depending on the autonomous driving version [8]
- The core reason for developing its own chip is to optimize performance specifically for Li Auto's algorithms, enhancing cost-effectiveness and efficiency [8]
Down Over 3% After Hours! Nvidia's Blackwell Ramps Up in Q2 and the Data Center Remains the Core Business, So Why Did the Stock Still Dive? (Q2 Earnings Details Attached)
美股IPO· 2025-08-27 23:46
Core Viewpoint
- Nvidia's Q2 revenue growth slowed to its lowest rate in over two years, yet still exceeded analyst expectations, with a significant inventory release of $180 million for H20 in China [1][18][20]

Financial Performance
- Q2 revenue reached $46.743 billion, a year-on-year increase of 56%, surpassing analyst expectations of $46.23 billion and Nvidia's own guidance of $44.1–45.9 billion [9][18]
- Non-GAAP EPS for Q2 was $1.05, up 54% year-on-year, exceeding analyst expectations of $1.01 [9][19]
- Adjusted gross margin for Q2 was 72.7%, down 3 percentage points year-on-year but above analyst expectations of 72.1% [10][18]

Business Segment Performance
- Data center revenue was $41.1 billion, up 56% year-on-year but below analyst expectations of $41.29 billion [10][22]
- Gaming and AI PC revenue reached $4.3 billion, up 49% year-on-year, exceeding analyst expectations of $3.82 billion [11][22]
- Automotive and robotics revenue was $586 million, up 69% year-on-year, slightly below analyst expectations [13][26]

Guidance and Market Outlook
- Q3 revenue guidance is set at $54 billion, with a range of $52.92–55.08 billion, slightly above analyst expectations of $53.46 billion [15][27]
- The Q3 guidance does not account for any sales of H20 chips to China, indicating potential future revenue upside [27][28]
- CEO Jensen Huang said China could represent a $50 billion market opportunity this year, with an expected annual growth rate of around 50% [28]

Shareholder Returns
- Nvidia announced an additional $60 billion in stock buyback authorization, with no set deadline for execution [30][32]
- In the first half of fiscal 2026, Nvidia returned $24.3 billion to shareholders through stock buybacks and dividends [31]
Desk-Rejection Warning: Quietly Padding Papers with Large Models Is Now Blocked, as ICLR's Strictest Rules Yet Arrive
机器之心· 2025-08-27 08:36
Core Viewpoint
- The article discusses the newly established policies on the use of large language models (LLMs) in academic research, particularly at the ICLR conference, aimed at safeguarding academic integrity and mitigating the risks associated with LLMs [2][4][14]

Group 1: ICLR Conference Policies
- ICLR 2026 has introduced specific policies for the use of LLMs, grounded in the conference's ethical guidelines [2][4]
- The conference received 11,565 submissions in 2025, with an acceptance rate of 32.08% [2]
- The policies stipulate that any use of LLMs must be disclosed, and that authors and reviewers remain ultimately responsible for their contributions [6][7]

Group 2: Specific Policy Applications
- Authors must disclose the use of LLMs for writing assistance and are responsible for all content, including any errors generated by the LLM [9]
- When LLMs are used for research ideas or data analysis, authors must verify the validity and accuracy of the LLM's contributions [9]
- Reviewers must also disclose their use of LLMs in writing reviews and are responsible for maintaining the confidentiality of submitted papers [11]

Group 3: Prohibited Practices
- The article highlights the prohibition of "prompt injection", in which authors manipulate the review process through hidden prompts; this is treated as collusion and serious academic misconduct [12]
- Violations of these policies can lead to severe consequences, including desk rejection of submissions [7]

Group 4: Broader Context
- ICLR is not alone in implementing such policies; other major conferences such as NeurIPS and ICML have also established guidelines for LLM usage [13][15]
- The increasing reliance on LLMs raises concerns about academic integrity, including false citations and plagiarism, prompting the need for clear guidelines [14]
Squeezing Every Last Drop of GPU Performance: ZTE's Mariana Breaks Through the GPU Memory Barrier
量子位· 2025-08-26 05:46
Core Insights
- The article discusses the challenge of scaling Key-Value Cache (KV Cache) storage for large language models (LLMs), highlighting the tension between inference efficiency and memory cost [1]
- It emphasizes the need for innovative solutions that expand KV Cache storage without compromising performance [1]

Industry Exploration
- Nvidia's Dynamo project implements a multi-level caching algorithm over the storage system, but faces complex data migration and latency issues [2]
- Microsoft's LMCache system is compatible with mainstream inference frameworks but has limited support for distributed storage and capacity [3]
- Alibaba proposed a remote storage solution that extends KV Cache into the Tair database, which scales easily but struggles to meet the low-latency requirements of LLM inference [3]

Emerging Technologies
- CXL (Compute Express Link) is presented as a promising high-speed interconnect technology that could alleviate memory bottlenecks in AI and high-performance computing [5]
- Research on using CXL to accelerate LLM inference is still limited, indicating a significant opportunity for exploration [5]

Mariana Exploration
- ZTE Corporation and East China Normal University introduced Mariana, a distributed shared KV storage technology designed for high-performance distributed KV indexing [6]
- Mariana's architecture is tailored for GPU and KV Cache storage, achieving 1.7 times higher throughput and 23% lower tail latency than existing solutions [6]

Key Innovations of Mariana
- The Multi-Slot lock-based Concurrency Scheme (MSCS) enables fine-grained concurrency control at the entry level, significantly reducing contention and improving throughput [8]
- The Tailored Leaf Node (TLN) design optimizes data layout for faster access, improving read speed by allowing key arrays to be loaded into SIMD registers in one pass [10]
- An adaptive caching strategy based on the Count-Min Sketch algorithm identifies and caches hot data efficiently, improving read performance (see the sketch after this summary) [11]

Application Validation
- Mariana supports large-capacity storage by distributing data across remote memory pools, theoretically allowing unlimited storage space [13]
- Experimental results indicate that Mariana significantly improves read/write throughput and tail latency in KV Cache scenarios [14]

Future Prospects
- Mariana's design is compatible with future CXL hardware, allowing seamless migration and full use of CXL's advantages [18]
- The advances in Mariana and CXL technology could enable large models to run efficiently on commodity hardware, broadening access to AI capabilities across applications [18]
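To make the hot-data idea concrete, here is a minimal, illustrative Python sketch of the Count-Min Sketch technique mentioned above: a fixed-size counter matrix estimates per-key access frequency, and keys whose estimate crosses a threshold are promoted to a faster cache tier. The class, the threshold, and the `on_access` helper are hypothetical; Mariana's actual implementation is not described in the article.

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counter backed by a depth x width matrix."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, key):
        # One bucket per row, derived from independently salted hashes of the key.
        for row in range(self.depth):
            digest = hashlib.blake2b(key.encode(), salt=row.to_bytes(8, "little")).digest()
            yield row, int.from_bytes(digest[:8], "little") % self.width

    def add(self, key, count=1):
        for row, col in self._buckets(key):
            self.table[row][col] += count

    def estimate(self, key):
        # Counts are only ever over-estimated, so the minimum across rows is closest.
        return min(self.table[row][col] for row, col in self._buckets(key))


# Hypothetical usage: promote a KV-cache block to a local hot cache once its
# estimated access count exceeds a threshold (threshold value is an assumption).
HOT_THRESHOLD = 32
sketch = CountMinSketch()
hot_cache = {}  # stand-in for a fast local cache tier

def on_access(block_id, fetch_from_remote):
    sketch.add(block_id)
    if block_id in hot_cache:
        return hot_cache[block_id]
    value = fetch_from_remote(block_id)
    if sketch.estimate(block_id) >= HOT_THRESHOLD:
        hot_cache[block_id] = value
    return value
```

The appeal of this structure for a cache admission policy is that its memory footprint is fixed regardless of how many distinct keys are seen, at the cost of occasional over-counting.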
Six Months into Power Reform "Document No. 136": The Matthew Effect Widens in the Renewable Asset Post-Service Sector
Core Viewpoint
- The implementation of "Document No. 136" marks a significant shift in China's renewable energy sector from a policy-driven model to a market-driven one, leading to a substantial increase in renewable energy installations and a transformation of the post-service market for renewable assets [1][3]

Industry Overview
- In the first half of the year, China's renewable energy installed capacity increased by 268 million kilowatts, up 99.3% year-on-year and accounting for 91.5% of newly installed capacity [1]
- The post-service market for renewable energy is evolving from an internal production function into an independent operational service sector, requiring comprehensive asset operation capabilities covering maintenance, trading, and digitalization [1][4]

Company Insights
- Beijing Xiehe Operation and Maintenance Wind Power Technology Co., Ltd. (Xiehe Operation and Maintenance) has transitioned from an internal service department into a leading provider of professional operational services in the renewable energy post-service industry, managing over 40 GW of renewable assets and over 8 GW of trading assets [2][4]
- The company recently received a new round of equity investment from Xinjing Holdings, highlighting growing investor interest in the renewable energy post-service market [3]

Market Dynamics
- The post-service market for renewable energy is projected to exceed 100 billion yuan, with the wind and solar operation and maintenance service market expected to surpass 70 billion yuan by 2024 [4]
- Demand for asset management services is rising as the investor base diversifies beyond major state-owned enterprises to include local state-owned assets, city investment companies, equipment manufacturers, and individual investors [4][5]

Competitive Landscape
- The competitive landscape shows a "Matthew effect": leading companies can leverage scale advantages to build barriers, while mid-sized companies struggle to develop market-facing service capabilities [6]
- Smaller companies tend to focus on basic services such as parts replacement and cleaning, relying heavily on local resources and customer relationships for survival [6]

Value Reassessment
- The value logic of renewable assets has fundamentally changed, with comprehensive operational capabilities becoming essential for long-term stability in the market [7]
- The integration of AI technology and digital tools is improving operational efficiency and profitability, with companies like Xiehe Operation and Maintenance developing systems to analyze operational data and optimize performance [7][8]

Future Directions
- The renewable energy post-service market will see increasing automation and application of AI in operational processes, although human expertise will remain crucial for complex maintenance tasks [8]
- The continued expansion of renewable installations and the deepening of market reforms will keep asset operation capability a core determinant of project profitability [9]
Is Li Auto's VLA Really a VLA?
自动驾驶之心· 2025-08-21 23:34
Core Viewpoint
- The article examines the capabilities of the MindVLA model in autonomous driving, emphasizing its more advanced scene understanding and decision-making compared with traditional E2E (end-to-end) models.

Group 1: VLA Capabilities
- The VLA model demonstrates effective defensive driving, particularly in scenarios with obstructed views, smoothly adjusting speed based on the remaining distance [4][5]
- In congested traffic, VLA shows improved decision-making by choosing to change lanes rather than following the typical detour logic of E2E models [7]
- VLA exhibits stronger lane centering in non-standard lane widths, significantly reducing erratic driving behavior [9][10]

Group 2: Scene Understanding
- VLA's decision-making reflects a deeper understanding of traffic scenarios, allowing more efficient lane changes and route selection [11]
- The model's stability in trajectory generation is attributed to its use of diffusion models, which enhances performance across varied driving conditions [10]

Group 3: Comparison with E2E Models
- E2E models struggle with nuanced driving behaviors and often produce abrupt maneuvers, while VLA delivers smoother, more context-aware responses [3][4]
- VLA's architecture allows parallel optimization across different scenarios, enabling faster iteration and improvement than E2E models [12]

Group 4: Limitations and Future Considerations
- Despite its advances, VLA is still classified as assisted driving rather than fully autonomous driving and requires human intervention in certain situations [12]
- The article raises questions about the model's performance in specific scenarios, indicating areas for further development and refinement [12]
$30 Million in Funding, 20% Paid Conversion: How Did Voice Input Tool Wispr Flow Nail Its PMF?
Founder Park· 2025-08-21 07:30
Core Insights
- Wispr Flow successfully pivoted from hardware to software, focusing on a voice input tool that meets user needs, resulting in $30 million in funding and a 20% conversion rate to paid users [2][11]
- The company sees high user engagement, with active users averaging 100 dictations per day and keyboard input dropping to 25–30% of total input [2][13]

Group 1: Company Transformation
- The company initially developed a hardware device that lacked a clear consumer market, leading to its eventual failure [7][10]
- The pivot was driven by the realization that the software ecosystem was not ready for the hardware product, prompting a shift to focus solely on the Wispr Flow software [9][10]
- The transition involved significant layoffs, reducing the team from 40 to 5 employees to create a focused and stable environment for the remaining staff [12][19]

Group 2: Product Market Fit (PMF)
- The company accelerated its launch timeline, shipping Wispr Flow within six weeks; the release garnered millions of views and topped the charts on Product Hunt [13][14]
- The product resonated strongly with users, reaching a paid conversion rate of nearly 20%, far above the industry average of 3–4% [13][14]

Group 3: Key Lessons Learned
- Rapid decision-making and execution are crucial to avoid stagnation and to lead effectively during transitions [17]
- Decisive staffing cuts are necessary to provide clarity and stability for the remaining team [18]
- Gathering genuine customer feedback is vital, as assumptions about product desirability can lead to misguided effort [20]
A 10,000-Word Guide to Building a Personal AI Assistant: From 0 to 1, Turning AI into a Top-Tier Thinking Partner
36Kr · 2025-08-20 07:10
Group 1
- The article argues that AI should be treated as a collaborative partner rather than a replacement for human skills, stressing the importance of providing context for effective use [7][84]
- It recounts the author's initial resistance to AI tools, which changed after seeing their potential to improve productivity and efficiency in tasks such as writing user stories [4][5]
- The author shares experiences of using AI to streamline complex tasks, showing how it can turn chaotic thoughts into structured outputs [6][10]

Group 2
- The article outlines a framework for building an AI assistant: "hiring" the assistant, onboarding it with relevant knowledge, and starting projects through dedicated chat threads (a minimal sketch of this setup follows this summary) [12][13][35]
- The AI assistant can help with strategic decision-making, brainstorming, and even emotional support, enhancing overall productivity [10][11]
- Continuous context updates and knowledge sharing with the AI assistant are emphasized to keep it effective and relevant [74][81]

Group 3
- The article gives practical steps for using AI in project management, including creating project knowledge bases and using specific prompts to guide the AI's responses [22][36]
- It highlights the importance of an ongoing dialogue with the AI to keep it informed about changes and developments within the organization [65][66]
- It suggests that future AI assistants could become more proactive and connected, offering reminders and insights based on user activities [83][84]
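As a rough illustration of the "onboard with knowledge, then work in project threads" idea, here is a minimal Python sketch that assembles project notes into a knowledge base and prepends it to each request. The project name, directory layout, and the `call_llm` placeholder are assumptions for illustration, not the guide's actual setup or any specific vendor API.

```python
from pathlib import Path

# Hypothetical role description; "Atlas" is a made-up project name.
ROLE = (
    "You are my assistant for the Atlas project. "
    "Use the knowledge base below and ask before assuming missing context."
)

def build_knowledge_base(doc_dir):
    """Concatenate markdown notes into one context block for onboarding."""
    parts = []
    for path in sorted(Path(doc_dir).glob("*.md")):
        parts.append(f"## {path.stem}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)

def build_messages(doc_dir, user_request):
    """Assemble the messages for one turn of a dedicated project chat thread."""
    knowledge = build_knowledge_base(doc_dir)
    system = f"{ROLE}\n\n# Project knowledge base\n{knowledge}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_request},
    ]

# Example turn (call_llm is a placeholder for whichever chat API you use):
# messages = build_messages("notes/atlas", "Draft user stories for the onboarding flow.")
# reply = call_llm(messages)
```

Keeping the knowledge base as plain files makes the "continuous context updates" habit cheap: editing a note is enough to change what the assistant sees on the next turn.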
Large Models Can't Be Trusted to Judge Themselves: New Research from Shanghai Jiao Tong University Reveals Flaws in the LLM-as-a-Judge Mechanism
量子位· 2025-08-17 03:43
Core Viewpoint
- The article discusses the evolution of large language models (LLMs) from tools into evaluators, focusing on their ability to judge AI-generated content, a role whose reliability and consistency with human judgment has not been thoroughly validated [1][6]

Group 1: Research Background
- A fundamental question is whether AI evaluators can accurately identify who is speaking in a dialogue before assessing a model's performance [2]
- The paper "PersonaEval: Are LLM Evaluators Human Enough to Judge Role-Play?" by a team from Shanghai Jiao Tong University introduces a new benchmark, PersonaEval, for evaluating LLMs' ability to identify speakers in dialogues [2][11]

Group 2: Testing Results
- Even the best-performing model, Gemini-2.5-pro, achieved an accuracy of only 68.8%, while human participants averaged 90.8% [4][15]
- This significant gap highlights the current limitations of LLMs in judging role-play scenarios (a toy scoring sketch follows this summary) [17]

Group 3: Model Evaluation and Challenges
- The paper finds that LLMs tend to latch onto superficial language style rather than the underlying intent and context of the dialogue, leading to misjudgments [9][10]
- The PersonaEval benchmark is designed to align evaluation with human judgment and includes carefully selected distractors to challenge the models [13][12]

Group 4: Improvement Strategies
- The authors explored two common improvement strategies: training-time adaptation and test-time compute [18][20]
- Interestingly, fine-tuning models on role-related data did not improve identification ability and could even degrade performance, suggesting that rote memorization of character knowledge interferes with general reasoning [20][22]

Group 5: Future Directions
- The research calls for rethinking how to build AI systems aligned with human values and judgment, emphasizing reasoning-oriented enhancement methods rather than merely adding character knowledge [24][25]
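For readers unfamiliar with this style of benchmark, the toy Python sketch below shows how a speaker-identification accuracy like the 68.8% vs. 90.8% figures could be computed: each item pairs a dialogue excerpt with the true speaker plus distractor candidates, the judge picks one name, and accuracy is the fraction of correct picks. The field names, example items, and the `ask_judge` callable are placeholders, not PersonaEval's actual interface or data.

```python
from dataclasses import dataclass

@dataclass
class Item:
    dialogue: str          # the excerpt to be judged
    candidates: list[str]  # true speaker plus hard distractors
    answer: str            # the true speaker's name

def accuracy(items, ask_judge):
    """ask_judge(dialogue, candidates) -> one candidate name; returns fraction correct."""
    correct = 0
    for item in items:
        prediction = ask_judge(item.dialogue, item.candidates)
        if prediction == item.answer:
            correct += 1
    return correct / len(items)

# Toy run with a naive "judge" that always picks the first candidate.
items = [
    Item("'The game is afoot.'", ["Sherlock Holmes", "Hercule Poirot"], "Sherlock Holmes"),
    Item("'I have a bad feeling about this.'", ["Hercule Poirot", "Han Solo"], "Han Solo"),
]
print(accuracy(items, lambda dialogue, candidates: candidates[0]))  # prints 0.5
```

In practice the `ask_judge` step would prompt an LLM with the excerpt and candidate names, which is exactly where the paper reports models drifting toward surface style instead of speaker intent.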