Fine-tuning
Official layoffs of 30,000 people, with N+4 severance!
菜鸟教程· 2026-01-06 03:30
This also means that headcount fell by nearly 24,940 over the past year.

| Date | Headcount | QoQ decrease |
| --- | --- | --- |
| As of Dec 31, 2021 | 259,316 | - |
| As of Mar 31, 2022 | 254,941 | -4,375 |
| As of Jun 30, 2022 | 245,700 | -9,241 |
| As of Sep 30, 2022 | 243,903 | -1,797 |
| As of Dec 31, 2022 | 239,740 | -4,163 |
| As of Mar 31, 2023 | 235,216 | -4,524 |
| As of Jun 30, 2023 | 228,675 | -6,541 |
| As of Sep 30, 2023 | 224,955 | -3,720 |
| As of Dec 31, 2023 | 219,260 | -5,695 |
| As of Mar 31, 2024 | 204,891 | -14,369 |
| As of 20 ...
Even with $30 billion you might not "recreate GPT-4"? NUS's You Yang (尤洋) exposes the truth behind the AI growth bottleneck in his latest long-form essay
量子位· 2025-12-31 03:37
Core Viewpoint
- The article discusses the growing anxiety surrounding the "AI bottleneck" as the third anniversary of ChatGPT approaches, questioning whether current technological paradigms can effectively convert increased computational power into models significantly stronger than GPT-4 [1][2].

Group 1: Nature of Intelligence and Its Measurement
- Intelligence is fundamentally about energy conversion: over the past decade AI has transformed electricity into reusable intelligence, but the efficiency of that conversion is now under scrutiny [6].
- The essence of intelligence is not explanation but prediction, characterized by the ability to forecast future states and bear the consequences of those predictions [7][10].
- Current models derive their intelligence primarily from the pre-training phase, which consumes the most energy and computation, raising questions about whether intelligence growth remains stable under continued computational investment [15][20].

Group 2: Computational Paradigms and Their Limitations
- The real bottleneck is not that computational growth has stopped, but that returns are diminishing in the relationship between computational power and intelligence growth [22][27] (illustrated in the sketch after this summary).
- The article challenges the mainstream narrative by arguing that pre-training, fine-tuning, and reinforcement learning are all fundamentally gradient computation and parameter updates, rather than distinct methodologies [12][11].
- The success of the Transformer architecture is attributed to its compatibility with GPU systems, which enabled a stable feedback loop between computational growth, model scaling, and capability enhancement [16][18].

Group 3: Future Directions and Exploration
- Future AI infrastructure should focus on the overall scalability of parallel computing systems rather than single-chip performance, with an emphasis on maintaining or improving the ratio of computation cost to communication cost [24][25].
- Proposed exploration directions include higher precision, better optimizers, and more scalable architectures or loss functions, all aimed at ensuring that increased computational investment yields proportional intelligence gains [25][26].
- The article concludes that as long as more efficient ways of organizing computation can be found, the upper limits of intelligence are far from being reached [27].
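To make Group 2's diminishing-returns claim concrete, here is a minimal sketch assuming a Chinchilla-style saturating power law, loss(C) = L_inf + A * C^(-alpha); the constants are illustrative stand-ins, not values from You Yang's essay.

```python
# Illustrative saturating power law: loss approaches L_INF as compute grows.
# L_INF, A, and ALPHA are made-up constants for demonstration, not fitted values.
L_INF, A, ALPHA = 1.7, 30.0, 0.05

def loss(compute_flops: float) -> float:
    """Predicted pre-training loss for a given compute budget (FLOPs)."""
    return L_INF + A * compute_flops ** -ALPHA

# Each further doubling of compute buys a strictly smaller loss reduction.
for exp in range(20, 26):
    c = 10.0 ** exp
    gain = loss(c) - loss(2 * c)  # improvement from one more doubling
    print(f"C=1e{exp}: loss={loss(c):.3f}, gain from doubling={gain:.4f}")
```

Under any law of this shape, each additional doubling of compute yields a smaller loss improvement than the last, which is the "compute keeps growing, intelligence growth slows" pattern the essay describes.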
Absurd! Layoffs have hit a new high...
猿大侠· 2025-12-05 04:11
At 1 a.m., a follower in the group chat posted this message: "Just got the notice: the entire backend team has been 'optimized' away; even the veteran with ten years of experience wasn't kept." ✅ Fine-tuning: optimizing a model for specific tasks so it fits the business. Don't worry! I've invited Chen Yang (陈旸), an instructor with deep industry experience, to design the course "Practical Training in Large Model Application Development" specifically for developers. [Job-board screenshot: large-model application development openings in Beijing (Chaoyang and Haidian districts), monthly salaries from 20-70K on 15-16 salary months, posted by recruiters at Tongcheng Travel (同程旅行) and Baidu (百度); listed skills include RAG, agents, MySQL, Redis, Java, deep learning, and large-model algorithms.] ...
Layoffs confirmed, and it's serious. Everyone, be prepared!
菜鸟教程· 2025-12-04 03:30
Almost at the same time, a headhunter friend of mine posted on WeChat Moments: "Urgently hiring AI large-model engineers, annual salary from 1.2 million yuan! Three months of searching and still no suitable candidate. Referrals welcome..." "Just got the notice: the entire backend team has been 'optimized' away; even the veteran with ten years of experience wasn't kept." A follower in the group chat posted that message at 1 a.m. [Same job-board screenshot of Beijing large-model openings as above.] ...
Absurd! Layoffs have hit a new high...
程序员的那些事· 2025-11-17 03:59
Core Insights
- The article highlights the stark contrast in the job market, where traditional tech positions are being eliminated while demand for AI model engineers is surging, with salaries starting at 1.2 million yuan per year [2][11]
- It emphasizes the urgent need for professionals skilled in three core technologies: RAG (Retrieval-Augmented Generation), AI agents, and model fine-tuning, which are essential for developing AI applications [2][8]

Group 1: Job Market Dynamics
- Traditional tech roles are rapidly being phased out, with even experienced professionals being let go, indicating a shift in industry demands [1]
- There is a significant shortage of qualified AI model engineers, as evidenced by the difficulty of filling positions despite high salaries [2][11]
- The article suggests that the current AI wave presents a critical opportunity for tech professionals to pivot their careers [2][11]

Group 2: Required Skills and Training
- Companies are looking for engineers who can integrate external information into models (RAG; see the sketch after this summary), enable AI to perform tasks autonomously (AI agents), and optimize models for specific tasks (fine-tuning) [2][8]
- A training course titled "Practical Training in AI Model Application Development" is being offered to help professionals acquire these essential skills [3][14]
- The course includes live sessions that combine theoretical knowledge with practical projects, focusing on RAG, AI agents, and fine-tuning [3][6]

Group 3: Career Advancement Opportunities
- Participants in the training will receive a job-seeking package that includes interview question banks and insights into high-paying job opportunities [3][9]
- The course aims to help individuals build a competitive edge in the job market, particularly those concerned about job security as they age [11][16]
- Successful completion of the course is expected to strengthen participants' resumes and improve their chances of securing high-paying positions [13][16]
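The summary names RAG only as a job-skill keyword; for readers new to the term, here is a minimal, self-contained sketch of the retrieve-then-generate pattern. The bag-of-words retriever and the `call_llm` stub are toy placeholders, not material from the course or any specific library's API.

```python
from collections import Counter
import math

DOCS = [
    "RAG augments a model's prompt with passages retrieved from external documents.",
    "Fine-tuning updates model weights to adapt a pretrained model to a specific task.",
    "An agent loops between model decisions and tool calls until the task is done.",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real systems use a neural encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank the corpus by similarity to the query and keep the top k passages."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def call_llm(prompt: str) -> str:
    """Stub standing in for a real model call."""
    return f"(model answer grounded in {prompt.count('[doc]')} retrieved passages)"

def rag_answer(question: str) -> str:
    # Retrieval step: pull external context, then ground the generation on it.
    context = "\n".join(f"[doc] {d}" for d in retrieve(question))
    prompt = f"Answer using only the context below.\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

print(rag_answer("What does RAG add to a model prompt?"))
```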
It's serious now. Everyone, don't quit your jobs lightly...
菜鸟教程· 2025-10-10 03:30
Core Insights
- The biggest opportunity in the AI industry by 2025 lies in the application layer, with companies like ByteDance rapidly expanding their AI teams and job postings for AI-related positions surging [1][3]
- There is a significant demand for large model application development engineers, with over 60% of enterprises pushing for AI product implementation, yet these skilled professionals are extremely scarce [1][3]
- The average monthly salary for AI positions is 78,000 yuan, with internships offering daily wages as high as 4,000 yuan, indicating the high value of AI skills in the job market [1][3]

Group 1
- Companies are increasingly focusing on three core capabilities for AI applications: RAG (Retrieval-Augmented Generation), agent intelligence, and fine-tuning for specific tasks [1][3] (a minimal agent loop is sketched after this summary)
- The rapid growth in job postings for large-model positions, with over 1,000 companies hiring, highlights the urgent need for skilled professionals in the AI sector [1][3]
- The transition to AI roles is lucrative, with some individuals already earning annual salaries exceeding one million yuan after shifting to AI-focused positions [1][3]

Group 2
- A specialized course titled "Large Model Application Development Practical Training" is being offered to help developers master essential AI skills, including RAG, agents, and fine-tuning [3][5]
- The course includes live sessions that combine theoretical knowledge with practical project demonstrations, aiming to equip participants with the skills needed for enterprise-level projects [3][5]
- Participants will receive a job-seeking package that includes interview question banks and insights into high-paying job opportunities [3][5]

Group 3
- The course has already served over 20,000 students, receiving positive feedback for its effectiveness in improving learning outcomes and job placement [8]
- The training program emphasizes building a technical moat to stand out in the competitive job market and avoid potential layoffs [10][11]
- The course also offers opportunities for direct referrals and job placements, increasing the chances of securing high-paying positions in the AI field [13][17]
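"Agent intelligence" is likewise described only at the capability level; the sketch below shows the bare decide-act-observe loop that such capabilities build on. The tool registry and the `plan_next_step` stub are hypothetical stand-ins for a real model call, not anything from the course.

```python
# Hypothetical tool registry; a real agent would wire these to live systems.
TOOLS = {
    "search": lambda q: f"3 results for '{q}'",
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def plan_next_step(goal: str, history: list) -> dict:
    """Stub for the LLM's decision; a real agent parses a model completion here."""
    if not history:
        return {"tool": "calculator", "input": "120 * 10000"}
    return {"final_answer": f"Done: {history[-1][1]}"}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        step = plan_next_step(goal, history)
        if "final_answer" in step:  # the model decided it is finished
            return step["final_answer"]
        result = TOOLS[step["tool"]](step["input"])  # act, then observe
        history.append((step["tool"], result))
    return "stopped: step budget exhausted"

print(run_agent("Convert an annual salary of 120 wan yuan into yuan"))
```

The loop terminates either when the model emits a final answer or when the step budget runs out; bounding the number of steps is the usual guard against an agent that never converges.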
How can the scaling law continue into the post-training era? This is the LLM post-training survey you should read
机器之心· 2025-05-01 02:11
Core Insights
- The article discusses the significance of post-training techniques such as fine-tuning and reinforcement learning (RL) in enhancing the capabilities of large language models (LLMs) [1][2][5].

Summary by Sections

Overview of LLM Post-Training
- A recent review report on LLM post-training has gained positive feedback, compiling a resource library of related papers and tools that has received over 700 stars [2].
- The review includes contributions from institutions such as the UAE University of Artificial Intelligence, the University of Central Florida, Google DeepMind, and the University of Oxford, covering techniques to enhance LLMs through RL, supervised fine-tuning, and evaluation benchmarks [2].

Challenges in LLMs
- Despite advancements, LLMs face issues such as generating misleading content ("hallucinations") and maintaining logical consistency over longer conversations [5].
- The reasoning capabilities of LLMs are debated, as they operate on implicit statistical patterns rather than explicit logical reasoning, which can cause failures even on simple logical tasks [5].

Training Phases of LLMs
- The training process of LLMs is divided into two main phases: pre-training and post-training. Pre-training focuses on next-token prediction over large datasets, while post-training involves multiple rounds of fine-tuning and alignment to improve model behavior and reduce biases [6].

Fine-Tuning Techniques
- Fine-tuning is essential for adapting pre-trained LLMs to specific tasks, enhancing their performance in areas like sentiment analysis and medical diagnosis, but it carries risks of overfitting and high computational cost [7][10].
- Parameter-efficient techniques like Low-Rank Adaptation (LoRA) and adapters reduce computational overhead while still allowing models to specialize in specific tasks (a minimal LoRA sketch follows this summary) [10].

Reinforcement Learning in LLMs
- RL is introduced to improve LLM adaptability through dynamic feedback and optimization of sequential decisions. This differs from traditional RL settings, as LLMs select tokens from a vast vocabulary rather than from a small action space [9][11].
- Feedback in language-based RL is often sparse and subjective, relying on heuristic evaluations rather than clear performance metrics [13].

Scaling Techniques
- Scaling is crucial for enhancing LLM performance and efficiency, though it presents significant computational challenges. Techniques like Chain-of-Thought (CoT) reasoning and search-based methods help improve multi-step reasoning and factual accuracy [14][15].
- Despite advancements, challenges such as diminishing returns and increased inference time remain, necessitating targeted strategies for efficient deployment [15].

Evaluation Benchmarks
- Various benchmarks have been proposed to assess the performance of LLM post-training, ensuring a comprehensive understanding of strengths and limitations across different tasks [46].
- These benchmarks play a vital role in improving response accuracy, robustness, and ethical compliance during the post-processing phase [46].

Future Directions
- The article highlights the growing interest since 2020 in RL for optimizing LLMs, emphasizing the need for interactive methods and robust reward modeling to address challenges like reward hacking [52].
- Key areas for future research include personalized and adaptive LLMs, process- versus outcome-reward optimization, and the integration of dynamic reasoning frameworks to enhance model performance on complex queries [53].
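Since the survey singles out Low-Rank Adaptation (LoRA) as the main parameter-efficient technique, here is a minimal numpy sketch of the core idea, with illustrative shapes and constants rather than anything taken from the survey: the pretrained weight W stays frozen, and only the low-rank factors A and B are trained.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, alpha = 64, 64, 8, 16

# Frozen pretrained weight: never updated during fine-tuning.
W = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors; B starts at zero so training begins exactly at W.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))

def lora_forward(x: np.ndarray) -> np.ndarray:
    """y = Wx + (alpha/rank) * B(Ax); only A and B would receive gradients."""
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
assert np.allclose(lora_forward(x), W @ x)  # identical to the base model at init
print("trainable params:", A.size + B.size, "vs full layer:", W.size)
```

Because B is initialized to zero, the adapted layer reproduces the pretrained layer exactly at the start of training, and the trainable parameter count scales with the rank rather than with the full weight matrix, which is where the computational savings come from.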