Load Balancing
Tencent Applies for a Patent on a Load Balancing Method, Device, Equipment, and Storage Medium, Targeting Multi-Availability-Zone Balance and Multi-Cluster Capacity Balance for Services Deployed Across Multiple Availability Zones and Clusters
Jin Rong Jie · 2025-07-15 04:41
Group 1
- Tencent Technology (Shenzhen) Co., Ltd. has applied for a patent titled "A Load Balancing Method, Device, Equipment, and Storage Medium," with publication number CN120315850A and an application date of January 2024 [1]
- The patent abstract describes a method that determines whether the number of replicas of a first business module deployed in a first availability zone is balanced and, if not, calculates the number of replicas to migrate [1]
- The method aims to achieve balanced deployment across multiple availability zones and clusters, ensuring both multi-availability-zone balance and multi-cluster capacity balance [1]

Group 2
- Tencent Technology (Shenzhen) Co., Ltd. was established in 2000 and is primarily engaged in software and information technology services, with a registered capital of 2 million USD [2]
- The company has invested in 15 enterprises, participated in 260 bidding projects, and holds 5,000 trademark and patent records along with 472 administrative licenses [2]
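The migration step the abstract describes can be sketched as a small rebalancing computation. This is an illustrative sketch, not the patent's method; the function name, the dict-based input, and the greedy pairing of surplus and deficit zones are all assumptions:

```python
# Hypothetical sketch of the balancing idea in the abstract: given
# per-availability-zone replica counts for a business module, decide
# whether they are balanced and, if not, compute the migrations that
# even them out. Names and structure are illustrative only.

def plan_migrations(replicas: dict[str, int]) -> list[tuple[str, str, int]]:
    """Return (source_az, target_az, count) moves that even out replicas."""
    total = sum(replicas.values())
    n = len(replicas)
    base, extra = divmod(total, n)
    # Target: every AZ gets `base` replicas; the `extra` largest get one more.
    azs = sorted(replicas, key=replicas.get, reverse=True)
    targets = {az: base + (1 if i < extra else 0) for i, az in enumerate(azs)}
    surplus = [(az, replicas[az] - targets[az]) for az in azs if replicas[az] > targets[az]]
    deficit = [(az, targets[az] - replicas[az]) for az in azs if replicas[az] < targets[az]]
    moves = []
    for src, s in surplus:
        for i, (dst, d) in enumerate(deficit):
            if s == 0:
                break
            k = min(s, d)
            if k:
                moves.append((src, dst, k))
                s -= k
                deficit[i] = (dst, d - k)
    return moves
```

For example, `plan_migrations({"az1": 7, "az2": 2, "az3": 3})` moves three replicas out of the overloaded zone so every zone ends at four; an already balanced input returns no moves.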
3 Chinese Programmers vs. 3 American Programmers: You Have to Admit, the Gap Is Huge!
猿大侠 · 2025-06-27 14:57
Core Insights
- The article reflects on the author's missed opportunity to create a photo-sharing platform similar to Instagram, highlighting the importance of execution and timing in the tech industry [1][4][47]

Technical Architecture
- Instagram's initial architecture was deliberately simple, avoiding reinventing the wheel and relying on proven, reliable technologies [9][7]
- The application ran on Amazon EC2 and Ubuntu Linux, with a focus on scalability and performance [6][7]

User Session Management
- A user session begins when the Instagram app is opened, sending requests to a load balancer that distributes traffic across application servers [10][14]
- Instagram initially used two Nginx servers for load balancing, later upgrading to Amazon's Elastic Load Balancer for better reliability [15]

Data Storage and Management
- Instagram used PostgreSQL to store user and photo metadata, implementing sharding to manage the large volume of data generated by user activity [21][23]
- Photos were stored in Amazon S3 and served through CloudFront, enabling efficient global distribution of images [28]

Caching and Performance Optimization
- Redis was initially used to map photo IDs to user IDs, with optimizations that significantly reduced memory usage [30]
- Memcached was employed for session caching, ensuring quick access to frequently used data [31]

Monitoring and Error Handling
- Instagram implemented Sentry for real-time error monitoring and used Munin for tracking system metrics, allowing proactive issue resolution [39][40]
- External service monitoring was handled by Pingdom, with PagerDuty managing incident notifications [41]

Reflection on Market Timing
- The article emphasizes that a lack of experience with the then-modern technologies and cloud services hindered the author's ability to capitalize on the emerging market [43][46]
- It concludes that many opportunities are missed due to a lack of insider knowledge and market readiness [49]
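The ID-based sharding mentioned above can be illustrated with a toy routing function. This is not Instagram's actual code; the shard count, database names, and range-based mapping are assumptions chosen for illustration:

```python
# Illustrative sketch of ID-based sharding: map a user ID to one of
# many logical shards, then map logical shards onto a small number of
# physical databases. All constants below are assumed values.

N_LOGICAL_SHARDS = 4096
PHYSICAL_DBS = ["db0", "db1", "db2", "db3"]

def logical_shard(user_id: int) -> int:
    # A stable function of the ID, so a user's data always lands on
    # the same logical shard.
    return user_id % N_LOGICAL_SHARDS

def physical_db(user_id: int) -> str:
    # Each physical database hosts a contiguous range of logical shards,
    # so shards can later be moved to new machines without rehashing IDs.
    shard = logical_shard(user_id)
    shards_per_db = N_LOGICAL_SHARDS // len(PHYSICAL_DBS)
    return PHYSICAL_DBS[shard // shards_per_db]
```

The two-level scheme is the usual reason for "many logical shards, few physical hosts": growth is handled by remapping shard ranges to new databases rather than by changing the hash of every ID.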
Experts Idling Half the Time? Adaptive Pipe & EDPB Boost Ascend MoE Training Efficiency by 70%
雷峰网 · 2025-06-03 07:17
Core Viewpoint
- The article discusses challenges and solutions for Mixture of Experts (MoE) training efficiency, noting that more than half of training time can be wasted on waiting due to communication and load-imbalance issues [2][3][4]

Group 1: MoE Model Training Challenges
- MoE training clusters face two main efficiency challenges: communication waiting caused by expert parallelism and computation waiting caused by load imbalance [4]
- Communication waiting arises because splitting experts across devices requires All-to-All communication, leaving computation units idle [4]
- Load imbalance occurs because some experts are called frequently while others remain underutilized, exacerbated by varying training-sample lengths and differing computational loads across model layers [4]

Group 2: Solutions Implemented
- Huawei developed the Adaptive Pipe and EDPB optimizations to improve MoE training efficiency, likening the system to a smart traffic hub that eliminates waiting [5][22]
- The AutoDeploy simulation platform enables rapid analysis and optimization of training loads, finding optimal strategies for given hardware specifications with 90% accuracy [8][22]
- The Adaptive Pipe communication framework masks over 98% of communication, allowing computation to proceed without waiting on communication [10][11]

Group 3: Performance Improvements
- The EDPB global load-balancing technique improves throughput by 25.5% by keeping expert scheduling balanced during training [14]
- End-to-end training throughput increased by 72.6% in the Pangu Ultra MoE 718B model training, demonstrating significant performance gains [22][23]
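The balanced expert scheduling that EDPB aims at can be illustrated with a classic greedy heuristic. This is a minimal sketch, not Huawei's implementation; it assumes per-expert load estimates are available and uses a longest-processing-time placement of experts onto devices:

```python
import heapq

# Minimal sketch of global expert load balancing (not the EDPB
# implementation): place experts on devices so per-device load is as
# even as possible. Greedy LPT heuristic: hand the next-heaviest
# expert to the currently lightest device.

def place_experts(expert_loads: list[float], n_devices: int) -> list[list[int]]:
    """Return, per device, the list of expert indices assigned to it."""
    # Min-heap of (current_load, device_index).
    heap = [(0.0, d) for d in range(n_devices)]
    heapq.heapify(heap)
    placement = [[] for _ in range(n_devices)]
    for expert in sorted(range(len(expert_loads)),
                         key=lambda e: expert_loads[e], reverse=True):
        load, dev = heapq.heappop(heap)
        placement[dev].append(expert)
        heapq.heappush(heap, (load + expert_loads[expert], dev))
    return placement
```

With loads `[8, 7, 6, 5, 4]` on two devices this yields per-device totals of 17 and 13 instead of the worst-case 30/0, which is the "no expert sits idle while another is swamped" property the article describes.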
Guodian Tong Applies for a Patent on a Load-Balancing-Based Device for Unified Data Interaction with External Systems, Improving System Resource Utilization
Jin Rong Jie · 2025-04-29 03:13
Group 1
- Beijing Guodian Tong Network Technology Co., Ltd. applied for a patent titled "A Device for Unified Data Interaction with External Systems Based on Load Balancing," published as CN119892740A, with an application date of December 2024 [1]
- The patent describes a device comprising a data interaction service management module, a timer module, a business priority adjustment module, a data processing module, and a data reset module, which together improve resource utilization, task-processing efficiency, and stability in complex business scenarios [1]
- The company was established in 2000, is located in Beijing, primarily engages in professional technical services, and has a registered capital of 73 million RMB [1]

Group 2
- State Grid Information Communication Industry Group Co., Ltd. was founded in 2015, is also located in Beijing, focuses on software and information technology services, and has a registered capital of approximately 1.5 billion RMB [2]
- The company has invested in 41 enterprises, participated in 5,000 bidding projects, and holds 311 trademark records and 4,572 patent records [2]
- It also holds 7 administrative licenses [2]
How Was the DeepSeek-V3/R1 545% Profit Margin Calculated?
小熊跑的快 · 2025-03-02 06:45
Core Insights
- The DeepSeek V3/R1 inference system shows a theoretical daily income of $562,027 against a daily cost of $87,072, implying a profit margin of 545% [1]
- Actual profit margins are expected to be significantly lower due to V3's lower pricing, limited monetization of services, and discounts during off-peak hours [2]

Profitability Analysis
- The theoretical calculation assumes full-load operation at R1 pricing, which real-world conditions are unlikely to sustain [3]
- Daily token volume averages 608 billion input tokens and 168 billion output tokens, for a total of 776 billion tokens called per day [3]
- Estimated daily API income is roughly 665,600 yuan from V3 and 1,996,800 yuan from R1, about 2,662,400 yuan in total [3]

Technological Advancements
- DeepSeek employs a Mixture of Experts (MoE) model to optimize throughput and reduce latency, utilizing parallel processing across multiple GPUs [5]
- A dual-batch overlapping strategy minimizes communication costs and enhances overall throughput [6]
- Load-balancing mechanisms distribute computational tasks evenly across GPUs, preventing bottlenecks [7]

Infrastructure and Resource Management
- A distributed file system (3FS) transfers data between machines without CPU intervention, raising throughput and reducing latency [8]
- DualPipe fully overlaps the forward and backward computation-communication phases, minimizing pipeline stalls [8]
- Redundant experts in the expert-parallel load balancer dynamically route input to less-loaded expert replicas during inference [8]

Market Implications
- DeepSeek's open-source approach is seen as a significant opportunity for domestic cloud and AI applications, reducing reliance on GPUs and breaking monopolies in the industry [4]
- These advances are expected to create favorable conditions for large cloud providers and applications in the domestic market [4]
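The headline 545% follows directly from the income and cost figures above: the margin is theoretical daily income minus daily cost, divided by daily cost. A few lines reproduce the arithmetic:

```python
# Reproducing the headline arithmetic from DeepSeek's disclosure.
daily_income = 562_027   # USD, theoretical, all tokens billed at R1 prices
daily_cost = 87_072      # USD, daily operating (GPU) cost

# Cost-profit margin: profit relative to cost, not to revenue.
margin = (daily_income - daily_cost) / daily_cost
print(f"{margin:.0%}")   # → 545%
```

Note this is a margin on cost; the same numbers expressed as profit over revenue would give about 84%, which is one reason headline comparisons with conventional gross margins can mislead.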