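Farewell to Transformer, reshaping the machine learning paradigm: Shanghai Jiao Tong University's first "human-brain-like" large model is born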
机器之心·2025-08-13 09:29

Core Viewpoint
- The article introduces BriLLM, a new language model inspired by mechanisms of the human brain. It aims to overcome the limitations of Transformer-based models: high computational demands, lack of interpretability, and context size restrictions [3][8].

Group 1: Limitations of Current Models
- Current Transformer-based models face three main issues: high computational requirements, black-box opacity, and context size limitations [6][8].
- The self-attention mechanism in Transformers has time and space complexity of O(n²) in the input length n, so computational cost grows quadratically as inputs get longer [7].
- The internal logic of Transformers lacks transparency, which makes the model's decision-making process difficult to understand [7][8].

Group 2: Innovations of BriLLM
- BriLLM introduces a new learning mechanism called SiFu (Signal Fully-connected Flowing), which replaces traditional prediction operations with signal transmission, mimicking the way neural signals travel in the brain [9][13].
- The model architecture is based on a directed graph in which every node is interpretable, unlike traditional models that offer only limited interpretability at the input and output layers [9][19].
- BriLLM is described as supporting unlimited context length without any increase in model parameters, which allows it to handle long sequences efficiently [15][16].

Group 3: Model Specifications
- BriLLM comes in two versions, BriLLM-Chinese and BriLLM-English. The non-sparse model has 16.90 billion parameters in each language [21].
- The sparse Chinese model has 2.19 billion parameters and the sparse English model has 0.96 billion, a parameter reduction of approximately 90% (about 87% and 94%, respectively) [21].
- The model's design allows multiple modalities to be integrated, so it can process visual and auditory inputs as well as language [25][26].
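
The "approximately 90%" figure in Group 3 can be checked directly from the quoted parameter counts. The short sketch below does only that arithmetic; the names `DENSE_PARAMS_B`, `SPARSE_PARAMS_B`, and `reduction_percent` are hypothetical and chosen for illustration, not taken from the BriLLM release.

```python
# Parameter counts in billions, as quoted in the article summary [21].
DENSE_PARAMS_B = 16.90
SPARSE_PARAMS_B = {"BriLLM-Chinese": 2.19, "BriLLM-English": 0.96}


def reduction_percent(dense: float, sparse: float) -> float:
    """Percentage of parameters removed when going from the dense to the sparse model."""
    return (1.0 - sparse / dense) * 100.0


# Hypothetical helper dict: reduction per model, rounded to one decimal place.
reductions = {
    name: round(reduction_percent(DENSE_PARAMS_B, sparse), 1)
    for name, sparse in SPARSE_PARAMS_B.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct}% fewer parameters than the dense model")
```

Under these figures the Chinese model sheds about 87% of its parameters and the English model about 94%, which averages out close to the 90% quoted.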

Group 4: Future Prospects
- The team aims to develop a multimodal, brain-inspired AGI framework that integrates perception and motion [27].
- BriLLM has been selected for funding under Shanghai Jiao Tong University's "SJTU 2030" plan, which supports groundbreaking research projects [27].
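
As a rough illustration of the O(n²) self-attention cost noted under Group 1, the sketch below counts the entries of the attention score matrix for a single head in a single layer. It is a back-of-the-envelope calculation, not a measurement of any particular model; the function name `attention_score_entries` and the 4-byte float32 assumption are introduced here for illustration only.

```python
BYTES_PER_FLOAT32 = 4  # assumption: scores stored as 32-bit floats


def attention_score_entries(seq_len: int) -> int:
    """Entries in the seq_len x seq_len score matrix (one head, one layer)."""
    return seq_len * seq_len


for n in (1_000, 2_000, 4_000):
    entries = attention_score_entries(n)
    mebibytes = entries * BYTES_PER_FLOAT32 / 2**20
    print(f"n={n}: {entries} score entries, about {mebibytes:.1f} MiB")
```

Each doubling of the input length quadruples the number of score entries, which is the scaling behavior the article contrasts with BriLLM's claim of context handling that does not grow the model.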