BERT

X @THE HUNTER ✴️
GEM HUNTER 💎· 2025-09-23 16:57
Trending crypto today: $DOG $TOSHI $ASTER $APEX $MOMO $TRUMP $WLFI $PUMP $SUN $UFD $TROLL $BERT $NMR $BITCOIN $BLESS I'm still missing some, let me know which ones ...
Zhang Xiaojun in Conversation with OpenAI's Yao Shunyu: Systems That Generate New Worlds
Founder Park· 2025-09-15 05:59
Core Insights
- The article discusses the evolution of AI, particularly focusing on the transition to the "second half" of AI development, emphasizing the importance of language and reasoning in creating more generalizable AI systems [4][62].

Group 1: AI Evolution and Language
- The concept of AI has evolved from rule-based systems to deep reinforcement learning, and now to language models that can reason and generalize across tasks [41][43].
- Language is highlighted as a fundamental tool for generalization, allowing AI to tackle a variety of tasks by leveraging reasoning capabilities [77][79].

Group 2: Agent Systems
- The definition of an "Agent" has expanded to include systems that can interact with their environment and make decisions based on reasoning, rather than just following predefined rules [33][36].
- The development of language agents represents a significant shift, as they can perform tasks in more complex environments, such as coding and internet navigation, which were previously challenging for AI [43][54].

Group 3: Task Design and Reward Mechanisms
- The article emphasizes the importance of defining effective tasks and environments for AI training, suggesting that the current bottleneck lies in task design rather than model training [62][64].
- A focus on intrinsic rewards, which are based on outcomes rather than processes, is proposed as a key factor for successful reinforcement learning applications (see the sketch after this summary) [88][66].

Group 4: Future Directions
- The future of AI development is seen as a combination of enhancing agent capabilities through better memory systems and intrinsic rewards, as well as exploring multi-agent systems [88][89].
- The potential for AI to generalize across various tasks is highlighted, with coding and mathematical tasks serving as prime examples of areas where AI can excel [80][82].
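The outcome-based reward idea from Group 3 can be made concrete with a toy coding task: the environment ignores how the agent reasoned and only checks whether the final program passes the task's tests. The sketch below is a minimal illustration under that assumption; the function name and the task are hypothetical and not taken from the interview.

```python
import os
import subprocess
import sys
import tempfile

def outcome_reward(candidate_code: str, test_code: str) -> float:
    """Score only the outcome: 1.0 if the generated code passes the task's
    tests, 0.0 otherwise. The intermediate reasoning is not scored."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "solution.py")
        with open(path, "w") as f:
            f.write(candidate_code + "\n" + test_code + "\n")
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            timeout=10,  # guard against non-terminating candidate programs
        )
        return 1.0 if result.returncode == 0 else 0.0

# Hypothetical task: the agent proposes an implementation and the environment
# checks it against assertions that define the desired outcome.
candidate = "def add(a, b):\n    return a + b"
tests = "assert add(2, 3) == 5"
print(outcome_reward(candidate, tests))  # 1.0 if the assertions pass
```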
LeCun's Team Reveals the Nature of LLM Semantic Compression: Extreme Statistical Compression Sacrifices Detail
量子位· 2025-07-04 01:42
Core Viewpoint
- The article discusses the differences in semantic compression strategies between large language models (LLMs) and human cognition, highlighting that LLMs focus on statistical compression while humans prioritize detail and context [4][17].

Group 1: Semantic Compression
- Semantic compression allows efficient organization of knowledge and quick categorization of the world [3].
- A new information-theoretic framework was proposed to compare the strategies of humans and LLMs in semantic compression (a toy illustration follows this summary) [4].
- The study reveals fundamental differences in compression efficiency and semantic fidelity between LLMs and humans, with LLMs leaning towards extreme statistical compression [5][17].

Group 2: Research Methodology
- The research team established a robust human concept classification benchmark based on classic cognitive science studies, covering 1,049 items across 34 semantic categories [5][6].
- The dataset provides category membership information and human "typicality" ratings, reflecting deep structures in human cognition [6][7].
- Over 30 LLMs were selected for evaluation, with parameter counts ranging from 300 million to 72 billion, ensuring a fair comparison with human cognitive benchmarks [8].

Group 3: Findings and Implications
- The study found that LLMs' concept classification results align significantly better with human semantic classification than random levels, validating LLMs' basic capabilities in semantic organization [10][11].
- However, LLMs struggle with fine-grained semantic differences, indicating a mismatch between their internal concept structures and human intuitive category assignments [14][16].
- The research highlights that LLMs prioritize reducing redundant information, while humans emphasize adaptability and richness, maintaining context integrity [17].

Group 4: Research Contributors
- The research was conducted collaboratively by Stanford University and New York University, with Chen Shani as the lead author [19][20].
- Yann LeCun, a prominent figure in AI and a co-author of the study, has significantly influenced the evolution of AI technologies [24][25][29].
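The human-versus-LLM comparison can be illustrated with a toy version of the benchmark: given item representations and human category labels, group the items and measure how well the model-side grouping matches the human grouping. This is only a sketch, assuming scikit-learn and random placeholder embeddings standing in for real LLM representations, and using adjusted mutual information as a simple alignment score; the actual study applies a full information-theoretic framework over 1,049 items and 30+ models.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_mutual_info_score

rng = np.random.default_rng(0)

# Placeholder data: 60 items in 4 human-defined semantic categories, with
# hypothetical 32-dimensional embeddings standing in for LLM representations.
human_labels = np.repeat(np.arange(4), 15)
embeddings = rng.normal(size=(60, 32)) + human_labels[:, None] * 0.5

# Model-side "compression": group the items into the same number of categories.
model_labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embeddings)

# Alignment between the model's grouping and the human grouping. High values mean
# the compressed representation preserves human category structure; the study finds
# LLMs clear this bar but miss finer typicality distinctions.
print(adjusted_mutual_info_score(human_labels, model_labels))
```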
A Look Back at the Important Papers in the LLM Field Since the 2017 Transformer
机器之心· 2025-06-29 04:23
Core Insights
- The article discusses Andrej Karpathy's concept of "Software 3.0," where natural language becomes the new programming interface and AI models execute specific tasks [1][2].
- It emphasizes the transformative impact of this shift on developers, users, and software design paradigms, indicating a new computational framework is being constructed [2].

Development of LLMs
- The evolution of Large Language Models (LLMs) has accelerated since the introduction of the Transformer architecture in 2017, leading to significant advancements in the GPT series and multimodal capabilities [3][5].
- Key foundational papers that established today's AI capabilities are reviewed, highlighting the transition from traditional programming to natural language interaction [5][6].

Foundational Theories
- The paper "Attention Is All You Need" (2017) introduced the Transformer architecture, which relies solely on self-attention mechanisms, revolutionizing natural language processing and computer vision (a minimal attention sketch follows this summary) [10][11].
- "Language Models are Few-Shot Learners" (2020) demonstrated the capabilities of GPT-3, establishing the "large model + large data" scaling law as a pathway to more general artificial intelligence [13][18].
- "Deep Reinforcement Learning from Human Preferences" (2017) laid the groundwork for reinforcement learning from human feedback (RLHF), crucial for aligning AI outputs with human values [15][18].

Milestone Breakthroughs
- The "GPT-4 Technical Report" (2023) details a large-scale, multimodal language model that exhibits human-level performance across various benchmarks, emphasizing the importance of AI safety and alignment [26][27].
- The release of the LLaMA models (2023) demonstrated that smaller models trained on extensive datasets could outperform larger models, promoting a new approach to model efficiency [27][30].

Emerging Techniques
- The "Chain-of-Thought Prompting" technique enhances reasoning in LLMs by guiding them to articulate their thought processes before arriving at conclusions [32][33].
- "Direct Preference Optimization" (2023) simplifies the alignment process of language models by directly utilizing human preference data, making it a widely adopted method in the industry [34][35].

Important Optimizations
- The "PagedAttention" mechanism improves memory management for LLMs, significantly enhancing throughput and reducing memory usage during inference [51][52].
- The "Mistral 7B" model showcases how smaller models can achieve high performance through innovative architecture, influencing the development of efficient AI applications [55][56].
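Several of the papers listed above build on the self-attention mechanism introduced in "Attention Is All You Need." The following is a minimal NumPy sketch of single-head scaled dot-product attention, with no masking and no learned projection matrices, intended only to make the recurring term concrete rather than to reproduce any paper's implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: each query position mixes the value vectors,
    weighted by softmax(Q K^T / sqrt(d_k))."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (seq_q, seq_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # (seq_q, d_v) mixed values

# Toy example: 4 tokens with 8-dimensional representations attending to themselves.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```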
AI Research Under the ESG Framework (Part 1): Multi-dimensional Gains in Investment Efficiency While Guarding Against Ethical Risks
ZHESHANG SECURITIES· 2025-06-05 14:23
Group 1: AI and ESG Investment Infrastructure
- AI is expected to significantly enhance ESG investment infrastructure by addressing challenges such as high compliance costs and difficulties in data acquisition and analysis [2]
- AI can help regulatory bodies reduce tracking costs and improve the implementation of ESG policies through dynamic monitoring and cross-validation systems [2]
- Companies can utilize AI tools like knowledge graphs to analyze policies and automate compliance reporting, thereby lowering compliance costs and encouraging ESG practices [2]

Group 2: AI's Role in Investment Strategy and Marketing
- Traditional ESG data faces issues like low update frequency and high processing costs; AI can streamline data collection and analysis, providing timely insights for investors [3]
- Machine learning algorithms can assist in constructing and selecting factor strategies, optimizing risk-return profiles for investors [3]
- Generative AI can significantly reduce marketing costs by generating marketing strategies and content, enhancing investor engagement [3]

Group 3: Responsible AI and Ethical Risk Management
- The integration of responsible AI principles with ESG frameworks can help identify companies with ethical risks associated with AI, aiding investors in risk management [4]
- AI's dual impact on environmental, social, and governance aspects necessitates a robust ethical risk analysis framework to mitigate potential negative consequences [4]
- Investors can leverage communication with companies to gather information on AI governance measures, enhancing their understanding of associated risks [4]

Group 4: Risk Considerations
- Potential risks include slower-than-expected economic recovery, instability of AI models, and fluctuations in market sentiment and preferences [5]
AI Wave Chronicles | Wang Sheng: Seize the Window of Opportunity; AI Startups Should Not Fight Giants for Territory
Bei Ke Cai Jing· 2025-05-30 02:59
Core Insights
- Beijing is emerging as a strategic hub in the AI large model sector, driven by technological innovation and a supportive ecosystem for breakthroughs [1]
- The role of angel investors is crucial in the AI industry, providing essential support to startups and helping them take their first steps [4]
- The AI large model wave has gained momentum globally since 2023, with early investments in generative models proving to be prescient [5][6]

Group 1: AI Development and Investment Trends
- The AI large model trend is characterized by a shift from previous waves focused on computer vision and autonomous driving to the current emphasis on AI agents and embodied intelligence [5][6]
- Investors are increasingly favoring experienced founders with strong academic and research backgrounds, as seen in the case of companies like DeepMind and the Tsinghua NLP team [12][16]
- The emergence of open-source models like Llama has accelerated competition among AI companies, allowing them to shorten development timelines [13]

Group 2: Investment Strategies and Market Dynamics
- Angel investors are focusing on a select number of projects, often operating in a "water under the bridge" manner, avoiding fully marketized projects [14][15]
- The investment landscape is divided between long-term oriented funds that prioritize innovation and those focused on immediate revenue generation [21][22]
- The success of companies like DeepSeek highlights the challenges faced by startups in competing with established giants, as the consensus around large models has solidified post-ChatGPT [26][27]

Group 3: Entrepreneurial Characteristics and Market Challenges
- Current AI entrepreneurs are predominantly scientists or technical experts, forming a close-knit community that is easier to identify and engage with [18][19]
- The academic foundation of AI startups is critical, as many successful ventures are built on decades of research and development from their respective institutions [16][20]
- The market is witnessing a shift where the ability to innovate is becoming more important than merely having financial resources, as the previous model of "buying capability" is no longer sustainable [27][28]
DeepSeek Technology Origins and Frontier Exploration Report
Zhejiang University· 2025-05-22 01:20
Zhejiang University DS Series Special Topic: DeepSeek Technology Origins and Frontier Exploration
Speaker: Zhu Qiang, College of Computer Science and Technology, Zhejiang University; Ministry-Province Co-built Collaborative Innovation Center for Artificial Intelligence (Zhejiang University)
https://person.zju.edu.cn/zhuq

Outline
1. Language Models
2. Transformer
3. ChatGPT
4. DeepSeek
5. The New Generation of Agents

Language Models: The Ultimate Goal
- Language modeling: for any sequence of words, compute the probability that the sequence forms a sentence.
- We interact with language models every day, for example when ranking candidate word sequences such as "I saw a cat", "I saw a cat on the chair", "I saw a cat running after a dog", "I saw a ca car", "I saw a cat in my dream".

Language Models: The Basic Task
- Encoding: getting computers to understand human language.
- One-hot Encoding for "She is my mom": each word is a vector with exactly one 1 and zeros everywhere else, e.g. She = 1 0 0 0, is = 0 1 0 0, my = 0 0 1 0, mom = 0 0 0 1.
- What are the drawbacks of One-hot Encoding? (see the sketch after this entry)
- Word Embedding: A bottle of tez ...
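The slide's one-hot question has a short answer that is easy to see in code: every pair of one-hot vectors is orthogonal, so the encoding says nothing about which words are related, which is the gap word embeddings fill. Below is a minimal sketch using the slide's four-word example; the dense embedding table is random, standing in for a learned one.

```python
import numpy as np

vocab = ["She", "is", "my", "mom"]
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word: str) -> np.ndarray:
    """Exactly one 1 and zeros everywhere else, as on the slide."""
    v = np.zeros(len(vocab))
    v[index[word]] = 1.0
    return v

print(one_hot("She"))                    # [1. 0. 0. 0.]
print(one_hot("She") @ one_hot("mom"))   # 0.0: one-hot vectors carry no similarity

# A dense word embedding assigns each word a low-dimensional real-valued vector,
# where related words can end up close together after training.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), 3))
print(embedding_table[index["mom"]])     # a 3-dimensional dense vector
```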
10 Key Moments in AI History, Explained in One Article!
机器人圈· 2025-05-06 12:30
Core Viewpoint
- By 2025, artificial intelligence (AI) has transitioned from a buzzword in tech circles to an integral part of daily life, impacting various industries through applications like image generation, coding, autonomous driving, and medical diagnosis. The evolution of AI is marked by significant breakthroughs and challenges, tracing back to the Dartmouth Conference in 1956 and leading to the current technological wave driven by large models [1].

Group 1: Historical Milestones
- The Dartmouth Conference in 1956 is recognized as the birth of AI, where pioneers gathered to explore machine intelligence, laying the foundation for AI as a formal discipline [2][3].
- In 1957, Frank Rosenblatt developed the Perceptron, an early artificial neural network that introduced the concept of optimizing models using training data, which became central to machine learning and deep learning [4][6].
- ELIZA, created in 1966 by Joseph Weizenbaum, was the first widely recognized chatbot, demonstrating the potential of AI in natural language processing by simulating human-like conversation [7][8].
- The rise of expert systems in the 1970s, such as Dendral and MYCIN, showcased AI's ability to perform specialized tasks in fields like chemistry and medical diagnosis, establishing its application in professional domains [9][11].
- IBM's Deep Blue defeated world chess champion Garry Kasparov in 1997, marking a significant milestone in AI's capability to outperform humans in strategic decision-making [12][14].
- The 1990s to 2000s saw a shift towards data-driven algorithms in AI, emphasizing the importance of machine learning [15].
- The emergence of deep learning in 2012, particularly through the work of Geoffrey Hinton, revolutionized AI by utilizing multi-layer neural networks and backpropagation techniques, leading to significant advancements in model training [17][18].
- The introduction of Generative Adversarial Networks (GANs) in 2014 by Ian Goodfellow transformed the field of generative models, enabling the creation of realistic synthetic data [20].
- AlphaGo's victory over Lee Sedol in 2016 highlighted AI's potential in complex games requiring intuition and strategic thinking, further pushing the boundaries of AI capabilities [22].
- The development of large language models began with the introduction of the Transformer architecture in 2017, leading to models like GPT-3, which demonstrated emergent abilities and set the stage for the current AI landscape [24][26].