Language Modeling
ICLR 2026 Oral | Revela: Redefining Dense Retriever Training with Language Modeling
机器之心· 2026-03-26 11:41
Core Insights
- The article discusses the development of Revela, a new approach to training dense retrievers within retrieval-augmented generation (RAG) systems, which has won prestigious awards for its innovative methodology [2][24].

Group 1: Challenges in Dense Retriever Training
- Training high-quality dense retrievers is challenging due to reliance on manually annotated data, which is costly in specialized fields like law and code [4].
- The difficulty of negative sample mining introduces additional complexity, as random negative samples provide weak signals [4].
- There is a disconnect between contrastive loss and mainstream language model pre-training objectives, making it hard to leverage pre-trained knowledge effectively [4].

Group 2: Revela's Approach
- Revela unifies the training objective of dense retrievers under a language modeling framework, allowing for a more natural training path [6].
- It introduces an in-batch attention mechanism that dynamically references other relevant documents during the prediction of the next token, enhancing the similarity scores between text chunks [6][13].
- The architecture consists of a retriever for encoding text and calculating similarity, and a language model providing training signals, both optimized together [10].

Group 3: Advantages of Revela
- The training objective aligns closely with language modeling, activating existing semantic understanding capabilities in pre-trained models [11].
- It is fully self-supervised, significantly reducing the need for manual annotations, which is advantageous in data-scarce professional domains [11].
- Revela demonstrates strong scalability, with performance improving as the retriever size and batch size increase [11].

Group 4: Experimental Results
- In code retrieval (CoIR), Revela-3B achieved an average nDCG@10 of 60.1, surpassing several supervised models trained on large annotated datasets [18].
- In reasoning-intensive retrieval (BRIGHT), Revela-3B outperformed commercial APIs, achieving an average nDCG@10 of 20.1 with only Wikipedia text for training [21].
- For general retrieval (BEIR), Revela-3B matched the performance of a weakly supervised baseline while using significantly less training data and resources [22].

Group 5: Future Directions
- Revela opens avenues for dynamic index construction, which could enhance semantic relevance in batch processing but poses computational challenges [24].
- There is potential for further model and data expansion, which could lead to performance improvements [24].
- The insights gained from the retriever could also inform improvements in language model training, suggesting a reciprocal enhancement potential [24].
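
The in-batch attention idea described under Revela's approach can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes cosine similarity between retriever embeddings, a softmax over the other chunks in the batch (a chunk never references itself), and a hypothetical `temperature` parameter. The function names `in_batch_weights` and `mix_context` are made up for illustration, and the toy embeddings stand in for language model hidden states. It has no gradients; in actual training the next-token loss would flow back through these weights into the retriever.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_weights(embeddings, temperature=1.0):
    """Row i holds the weights chunk i places on every chunk in the batch.

    Assumption: softmax over cosine similarities, with self masked out.
    Requires at least two chunks in the batch.
    """
    n = len(embeddings)
    weights = []
    for i in range(n):
        # Scores against every other chunk; the chunk itself is excluded.
        scores = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        peak = max(scores.values())  # subtract the max for numerical stability
        exps = {j: math.exp(s - peak) for j, s in scores.items()}
        total = sum(exps.values())
        weights.append([exps[j] / total if j != i else 0.0 for j in range(n)])
    return weights


def mix_context(weights, chunk_states):
    """For each chunk, a similarity-weighted average of the other chunks' states."""
    dim = len(chunk_states[0])
    return [
        [sum(w * state[d] for w, state in zip(row, chunk_states)) for d in range(dim)]
        for row in weights
    ]


# Toy batch: chunks 0 and 1 point in similar directions, chunk 2 does not.
embeddings = [[1.0, 0.1], [0.9, 0.2], [-0.2, 1.0]]
weights = in_batch_weights(embeddings, temperature=0.1)
# The embeddings double as stand-in "hidden states" purely for illustration.
context = mix_context(weights, embeddings)
```

In this toy batch, chunk 0 places most of its weight on chunk 1, so the cross-document context it sees while predicting its next token would be dominated by the most similar chunk. That is the sense in which the language modeling loss can supply a training signal for the similarity scores.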
Brain-Computer Interface Industry: Recent Developments Briefing
2025-08-13 14:53
Summary of Brain-Computer Interface Industry Conference Call

Industry Overview
- The brain-computer interface (BCI) industry is accelerating commercialization with companies like Ladder Medical and Brain Supplement actively advancing clinical trials expected to complete within one to two years for medical device registration [1][3]
- Neuralink has completed multiple clinical trials and plans to expand into more countries [1][3]
- BCI technology focuses on motor cortex decoding, language decoding, and visual presentation, with significant advancements in helping paralyzed patients walk independently [1][3][4]

Key Points and Arguments

Clinical Trials and Regulatory Process
- Invasive BCI clinical trials require type testing, ethical review, and recruitment of 30 to 100 subjects, taking one to two years to validate platform stability and implantation processes before applying for registration [1][5]
- Non-invasive BCIs are widely used in consumer markets for brain state monitoring, closed-loop control, rehabilitation devices, and human-computer interaction, facing challenges like low signal-to-noise ratio [1][11][13]

Market Potential and Applications
- The most commercially viable applications are motor decoding and language modeling, with significant market demand and clear technological feasibility [6][7]
- Visual prosthetics and treatments for paralysis show different progress rates, with some patients regaining walking ability through advanced research [8][9]

Cost and Pricing Dynamics
- Invasive BCI devices are costly, with domestic prices dropping to 50,000 to 100,000 yuan, while non-invasive devices are cheaper and have higher profit margins [2][16]
- Non-invasive products are priced between 2,000 to 3,000 yuan, with production costs around 200 yuan, allowing for broader market access [16][17]

Competitive Landscape
- Companies with strong financing capabilities, like Ladder Medical and New Power Island, are likely to succeed in the invasive BCI market, while smaller teams leveraging academic resources may also find success [21][22]
- Domestic companies are closing the gap with international leaders like Neuralink, with advancements in multi-channel systems and chip technology [19][20]

Challenges and Considerations
- Major challenges include high R&D costs, regulatory hurdles, and the need for effective marketing strategies to ensure product acceptance [22][23]
- The low signal-to-noise ratio in non-invasive BCIs requires innovative solutions to enhance data quality and usability [24][26]
- Clinical trials in China face difficulties due to insufficient policy support and ethical review challenges [27]

Future Outlook
- The BCI industry is expected to see significant breakthroughs in the next few years, with advancements in technology and regulatory approvals paving the way for broader market adoption [3][4][21]