Observability
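A Conversation with Veteran Architect Christian Ciceri: Overturning Conventional Wisdom, Top Architects Say Technical Skill Is Not What Sets Your Career Ceiling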
36Kr · 2025-11-12 07:48
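InfoQ recently held an exclusive interview with Christian Ciceri, co-founder of Apiumhub and a well-known software architecture expert. Drawing on his hands-on experience as a practicing architect, the interview explores the idea of "measurable, evolvable architecture" and how AI and modern software engineering tools are affecting architectural practice. It traces Ciceri's own career and offers practical advice on maintaining architecture quality and team adaptability in a fast-changing technical environment.
Ciceri's career path is fairly representative: working in front-line software development and architecture design, he saw firsthand the problems common in large enterprises, namely inflexibility, long delivery cycles, and inefficient processes. In 2014 he co-founded Apiumhub in Barcelona with Evgeny Predein, aiming to bring agile methodology and software architecture together at the core of business operations. Over years of practice Ciceri developed the idea of "measurable, evolvable architecture" and distilled it in his book Software Architecture Metrics. He stresses that building robust, adaptable systems not only improves software delivery quality but also lets a system grow in step with business needs.
In the interview, Ciceri discussed observability and architecture governance in depth. He pointed out that the system's runtime ...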
Multi-Dimensional and Boundless, Observability with Method | Bonree ONE 2025 Fall Edition Launches Globally!
Jing Ji Guan Cha Bao· 2025-10-29 10:07
Core Insights
- Bonree Data launched the Bonree ONE 2025 Fall Edition, an integrated intelligent observability platform aimed at helping enterprises navigate complex digital systems more effectively [1]
- The company emphasizes that observability is a strategic cornerstone for businesses, especially in the context of AI-driven industrial transformation [1]
Group 1: AI Deep Integration
- The platform features a multi-dimensional intelligent module collaboration framework that integrates observability with AI, enabling autonomous operational decision-making and precise root cause analysis [3][4]
- The "Xiao Rui Assistant" serves as a unified interaction entry point, offering intelligent Q&A, navigation guidance, and AI writing capabilities to enhance user experience [3]
Group 2: Comprehensive Multi-Dimensional Observability
- The observability capability is centered around business forms, organizing IT operations data for layered and categorized presentation, allowing for quick emergency recovery and business continuity [5]
- Users can customize key path views around core business processes, enabling a holistic view of system architecture and operational status [5]
Group 3: Architecture Breakthrough and Upgrades
- The core ETL engine, Ingester, has been restructured to reduce resource consumption by 65% and achieve millisecond-level data access, enhancing query efficiency [6][8]
- The QueryService has significantly improved compatibility with PromQL, increasing query convenience and capability [6][9]
Group 4: AI Service and Intelligent Capabilities
- The AI Service is built around large model technology, featuring intelligent modules for smart Q&A, next-generation root cause analysis, and natural language-driven intelligent retrieval [10]
- The platform supports flexible scheduling and closed-loop service capabilities, facilitating comprehensive coverage of AI technology from generation to implementation [10]
Group 5: Industry Recognition and Application
- Bonree ONE has gained recognition from over a hundred leading clients across key sectors such as finance, internet, energy, and manufacturing [11]
- Guotou Securities has implemented Bonree ONE to enhance its end-to-end observability system, improving collaboration efficiency across various operational scenarios [11]
Group 6: Future Outlook
- Bonree Data plans to increase overseas investments, focusing on deepening its presence in Southeast Asia and expanding its global business footprint [16]
- The company aims to become a top-tier high-tech firm in the enterprise service sector, committed to building smarter and more reliable observability capabilities for global clients [16]
Observability in the AI Era: Intelligent Transformation and Intelligent Control | Livestream Preview
AI前线· 2025-10-12 05:32
Core Viewpoint
- The article discusses a live event featuring experts from Alibaba Cloud, ByteDance, and Xiaohongshu, focusing on the theme of observability in the AI era, highlighting the transformation and control of intelligence in this context [2][3].
Group 1: Event Details
- The live event is scheduled for October 15, from 20:00 to 21:30, and will be hosted by Zhang Cheng, a senior technical expert from Alibaba Cloud [2].
- The guest speakers include Dr. Li Ye, an algorithm expert from Alibaba Cloud, Dr. Dong Shandong, the algorithm lead for ByteDance's Dev-Infra observability platform, and Wang Yap, the head of the observability team at Xiaohongshu [3].
Group 2: Discussion Topics
- The event will address the "route dispute" regarding whether the implementation of large models should prioritize intelligent governance or algorithms [3].
- It will also cover the efficiency revolution, specifically how SRE Agents can reduce noise and improve efficiency [6].
Group 3: Live Event Benefits
- Attendees will receive an AI observability resource package, which includes insights on building a general intelligent closed loop of "observability - analysis - action" [6].
- The package will provide foundational principles for observability metrics attribution and share experiences with eBPF in large-scale operations [6].
- A new attribution platform is highlighted, which can locate 80% of online faults within minutes, providing essential support for mobile fault mitigation [6].
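Low-Quality AI Code Is Everywhere and the API Economy Is Booming: How Is Veteran Tech Vendor F5 Tackling the "Aftereffects" of Large-Model Applications?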
AI前线· 2025-09-10 13:01
Core Insights
- The article discusses the significant impact of AI programming tools on development efficiency while highlighting new challenges such as security vulnerabilities, low-quality code, and the complexity of debugging AI-generated code [2][4].
Group 1: AI Tools and Challenges
- AI programming tools have been reported to significantly enhance development efficiency, but they also introduce new security vulnerabilities and low-quality code issues [2].
- The increase in API numbers due to AI tools has led to a heavier operational burden for enterprises [2].
- The "black box" issue complicates the understanding of AI-generated code, making debugging and security checks more time-consuming [2].
Group 2: Security and Performance
- Performance is crucial for user experience, and balancing security with user-friendly authentication processes is a pressing challenge [4].
- Over 91% of users have implemented WAAP (Web Application and API Protection) to secure AI and machine learning models [5].
Group 3: AI in Operations
- A significant percentage of operational staff are utilizing AI to streamline processes: 57% use AI for script generation, 56% for custom policy creation, and 55% for executing scripts [7].
- Observability is key for AI-driven automation, with 65% of respondents leveraging it for this purpose [7].
Group 4: Application Trends
- The proportion of modern applications is expected to surpass traditional applications by 2025, with modern applications rising from 29% in 2020 to 53% [7].
- By 2025, 54% of application and API performance analysis will be based on large models [7].
Group 5: AI Implementation Challenges
- Complex IT architectures, unique security needs, and cost control are identified as major challenges for enterprises adopting AI applications [9].
- By 2028, 80% of enterprises are expected to embed AI capabilities, with 94% of AI applications deployed in hybrid cloud environments [12].
Group 6: F5's Response
- F5 has transitioned to an Application Delivery and Security Platform (ADSP) to meet the growing demand for integrated performance and security solutions [11].
- The ADSP platform aims to provide seamless operation across various environments, addressing the complexities of modern application security [14].
Group 7: AI Gateway and Security
- F5 has introduced the AI Gateway, which offers capabilities for routing based on large language models and provides protection against prompt injection and PII data leakage [16].
- The AI Gateway enhances GPU utilization rates by 30-60% while improving service success rates by at least 8% in specific applications [16].
Group 8: Comprehensive Services
- F5 offers comprehensive application delivery and security services, including load balancing, DNS, CDN, and API gateways, adaptable to various deployment environments [17].
- The platform integrates capabilities across NetOps, SecOps, and DevOps, providing unified policy management and deep security analysis [17].
Group 9: AI Assistant
- F5 has launched an AI assistant that enhances the platform's intelligence, capable of explanation, generation, and optimization across all F5 products [19].
Stability Assurance for Securities Firms' Information Systems Enters the Standardization Stage
Zheng Quan Ri Bao· 2025-08-07 16:42
Core Viewpoint
- The Securities Association of China (SAC) is developing a standard for the stability assurance system of information systems in the securities industry to enhance the stability of capital markets and address existing pain points in system management [1][2][3]
Group 1: Industry Challenges
- The industry faces four main challenges: lack of resilience design in system development, high operational risk prevention costs, reliance on expert experience for emergency response, and insufficient application of intelligent technologies [2][3]
- Current operational risk perception is primarily reactive, lacking proactive data-driven risk detection capabilities [2]
- Emergency response efficiency is hindered by dependence on individual expert knowledge rather than data-driven collaborative capabilities [2]
Group 2: Standard Development Principles
- The standard is based on four principles: compliance, controllability, closed-loop processes, and data-driven approaches [2]
- It aims to provide technical support for securities firms to meet regulatory compliance requirements while being adaptable to different institutional sizes [2][3]
Group 3: Stability Assurance Framework
- The standard proposes a "three-in-one" stability assurance framework, which includes organizational support, institutional support, and process support [3]
- Organizational support defines the structure and personnel competency requirements for stability assurance [3]
- Institutional support encompasses regulations, technical standards, and operational procedures to ensure traceability and implementation [3]
Group 4: Innovative Approaches
- The standard integrates advanced technologies such as AI algorithms and big data analysis into stability management processes [3][4]
- It establishes measurable stability evaluation metrics, including fault monitoring discovery rates and recovery capability standards [4]
- A continuous improvement mechanism is proposed, focusing on monitoring, evaluation, and optimization [4]
Trading-System Stability at Securities Firms Is at Stake: The Securities Association of China Steps In
券商中国· 2025-08-07 09:17
Core Viewpoint
- The Securities Association of China is seeking industry feedback on the draft standard for the stability assurance system of securities industry information systems, aiming to enhance the security and stability of network and information systems in the capital market [1][2].
Summary by Sections
Current Issues in System Operation
- The securities market requires high transaction continuity, and any anomalies in trading systems can directly impact investor rights and market order. The complexity of system architecture has increased significantly due to the widespread adoption of technologies like cloud computing and distributed architecture, making traditional operation and maintenance models inadequate [3].
- Current practices in stability management include change control, emergency response, and monitoring mechanisms, but the deep application of distributed architecture and microservices has led to exponential complexity, necessitating a proactive and intelligent stability assurance system [3].
- There is a lack of embedded resilience design in system development, insufficient capabilities in monitoring and automation, and a predominant reactive approach to risk perception, which hinders the ability to preemptively address potential issues [3].
Proposed "Three-in-One" Assurance System
- The draft standard aims to integrate best practices from leading securities firms to provide a practical framework for stability assurance, promoting the digital, standardized, and collaborative development of technical capabilities across the industry [4].
- The standard focuses on the actual needs of the securities industry, extracting replicable technical solutions and management processes while allowing flexibility for different-sized institutions. It incorporates advanced technologies like AI algorithms and big data analysis into stability management processes [4].
- The "Three-in-One" framework includes organizational assurance, institutional assurance, and process assurance, detailing the organizational structure, personnel competency requirements, and management goals [4].
Process Assurance Focus
- The standard emphasizes ten core processes related to stability architecture management, observability management, monitoring and alerting, and fault management, each with mechanisms, key activities, and evaluation elements [5].
- The content was developed with input from nearly 20 industry experts, focusing on the core value of stability assurance and guiding the industry to enhance operational resilience through digital means [5].
- Measurable stability evaluation elements such as "fault monitoring discovery rate" and "automation release rate" are established, with a continuous assessment and review mechanism to form a closed-loop improvement process [5].
2025 Industry Development Research Report: Observability Practices and Trend Insights in Financial Digital Transformation
Sou Hu Cai Jing· 2025-07-20 02:07
Core Insights
- The report highlights the significance of observability practices in enhancing operational efficiency and service quality during the digital transformation of the financial industry, particularly in banking, securities, and insurance sectors [1][2][8].
Group 1: Industry Overview
- The financial industry's digital transformation is transitioning from basic informatization to intelligent systems, with global digital transformation spending expected to approach $4 trillion by 2027, and China's financial IT spending projected to reach 335.936 billion yuan by 2025 [8][12].
- Observability is increasingly recognized as a critical support for digital transformation, with its market experiencing explosive growth driven by policy and demand [12][14].
Group 2: Technological Trends
- Real-time data collection and analysis technologies are evolving from mere support tools to core systems, enabling real-time decision-making and agile responses [20][21].
- Artificial intelligence (AI) is becoming a key driver in reshaping observability capabilities, enhancing risk prediction, root cause diagnosis, and user experience optimization [24][25].
- Distributed system monitoring is essential for ensuring business continuity and system reliability, with innovations in monitoring solutions addressing the unique challenges of the financial sector [26][27].
Group 3: Sector-Specific Practices
- In banking, full-link monitoring has reduced fault localization time by 80%, significantly enhancing system stability [34][36].
- The securities sector has optimized trading system performance, achieving response times under 300 milliseconds, crucial for high-frequency trading scenarios [34][40].
- The insurance industry has improved underwriting efficiency by 35% through data visualization and real-time monitoring tools, enhancing process optimization and risk management capabilities [34][46].
Group 4: Challenges and Future Directions
- The financial industry faces challenges in optimizing data flow and enhancing monitoring comprehensiveness and accuracy due to increasing business complexity and system scale [2][12].
- Future developments in observability will focus on establishing industry standards, building intelligent operation ecosystems, and adapting traditional architectures to new observability frameworks [8][12][30].
Datadog: Using AI Capabilities to Unlock Core Infrastructure Possibilities
美股研究社· 2025-07-01 12:19
Core Viewpoint
- Datadog is focusing on enhancing its AI capabilities and monitoring solutions for AI workloads, with a strong buy rating and a target price of $145 per share [1][12].
Group 1: AI Capabilities and Product Offerings
- Datadog showcased new AI features for its infrastructure monitoring platform at the DASH 2025 event, emphasizing observability for AI workloads [1].
- The platform offers GPU optimization and troubleshooting capabilities, allowing real-time monitoring of AI cluster performance [3].
- Datadog launched AI agents for event response, product development, and security training, which integrate into its core observability platform [3].
- The introduction of Code Security tools aims to assist developers in identifying and prioritizing vulnerabilities [3].
Group 2: Financial Performance
- In Q1 2025, Datadog reported a revenue growth of 24.6% and a 1.2% increase in adjusted operating income [4].
- The number of customers with annual recurring revenue (ARR) exceeding $100,000 grew to 3,770, reflecting a year-over-year growth of 12.9% [6].
- The percentage of customers using multiple products increased, with 13% using eight or more products, indicating a high product attach rate [5].
Group 3: Future Projections
- Datadog expects a revenue growth of approximately 20% for FY 2025, while adjusted operating income is projected to decline by 6.5% [7].
- Analysts predict a 360 basis point increase in annual profit margins driven by improved product attach rates and operational leverage [8].
- The overall observability market is expected to grow at a compound annual growth rate (CAGR) of 10.5% from 2024 to 2032, with Datadog anticipated to outpace this growth [8].
Group 4: Valuation and Market Position
- The fair value of Datadog is calculated at $145 per share based on discounted cash flow (DCF) analysis [12].
- Datadog's competitive position is challenged by ServiceNow, which has a strong observability platform and extensive data integration capabilities [13].
Without RAG as a Foundation, It's All Just Slideware: 10 Key Lessons from RAG Author Douwe Kiela
Hu Xiu· 2025-07-01 04:09
Core Insights
- The article discusses the challenges faced by companies in implementing AI, particularly in achieving human-like conversation and high accuracy in AI systems. It highlights the need for effective engineering and project management in AI projects [1][15][18].
Group 1: AI Challenges
- AI often struggles with human-like conversation, leading to stiff interactions even when using RAG or knowledge bases [1].
- The accuracy of AI systems is often insufficient, with a typical business requirement being 95% accuracy, while AI may only cover 80% of scenarios [1].
- The Context Paradox suggests that tasks perceived as easy for humans are often harder for AI, while complex tasks can be easier for AI to handle [3][12].
Group 2: Engineering and Project Management
- Engineering capabilities are more critical than model complexity in AI projects, as many projects fail due to inadequate engineering and project management [15][18].
- A typical AI project may require extensive documentation, with one SOP potentially needing 5,000 to 10,000 words of prompts, leading to a total of 250,000 to 500,000 words for complex projects [17].
- The majority of challenges in AI projects stem from data engineering, which constitutes about 80% of the difficulty [19].
Group 3: Specialization and Data
- Specialized AI solutions tailored to specific industries outperform general-purpose AI assistants, as they can better understand industry-specific language and needs [20][22].
- Data is becoming a crucial competitive advantage, as technical barriers diminish; companies must focus on leveraging unique data to create a moat [26][28].
- Companies should prioritize making AI capable of handling large volumes of noisy, real-world data rather than spending excessive time on data cleaning [26].
Group 4: Production Challenges
- Transitioning from pilot projects to production environments is significantly more challenging, requiring careful design from the outset [29][31].
- Speed in deployment is more important than perfection; early user feedback is essential for iterative improvement [33][36].
- Companies must be cautious about the asymmetry in AI projects, where initial successes in demos may not translate to production success [30].
Group 5: Accuracy and Observability
- Achieving 100% accuracy in AI is nearly impossible; companies should focus on managing inaccuracies and establishing robust monitoring systems [46][50].
- Observability and the ability to trace errors back to their sources are critical for continuous improvement in AI systems [47][50].
- Companies should develop a feedback loop to ensure that inaccuracies are addressed and corrected in future iterations [51][52].
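Bonree Data: Announcement of the Company's Reply to the Shanghai Stock Exchange's "Regulatory Inquiry Letter on Information Disclosure in the 2024 Annual Report of Beijing Bonree Macro Data Technology Co., Ltd."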
Zheng Quan Zhi Xing· 2025-06-23 17:07
Core Viewpoint
- The company, Beijing Bonree Macro Data Technology Co., Ltd., reported a revenue of 141 million yuan in 2024, a year-on-year increase of 16.42%, but a net profit loss of 115 million yuan, a decrease of 8.02% compared to the previous year. The company has faced continuous losses since its listing in 2020, with revenue consistently below 150 million yuan and an increasing trend in losses [1][2].
Group 1: Financial Performance
- In 2024, the company achieved a revenue of 141 million yuan, up 16.42% year-on-year, but recorded a net profit loss of 115 million yuan, down 8.02% year-on-year [1].
- Since its listing in 2020, the company's revenue has remained below 150 million yuan, and it has been operating at a loss since 2021, with losses expanding over time [1][2].
- The concentration of sales to the top five customers has been decreasing, with the sales amount to the top five customers in 2020 being 28.203 million yuan, accounting for 20.06% of total sales [1].
Group 2: Customer and Revenue Analysis
- The company categorized its main business into monitoring services, software sales, technical development services, and system integration, with a total revenue of 140.525 million yuan from 2022 to 2024 [2].
- The revenue from the internet and software information industry showed a compound annual growth rate (CAGR) of 48.72% from 2022 to 2024, driven by the launch of the Bonree ONE product [2][3].
- The financial industry revenue increased from 5.4205 million yuan in 2022 to 20.4844 million yuan in 2024, a CAGR of 94.40%, attributed to the expansion of financial industry clients [2][3].
Group 3: Market and Competitive Landscape
- The APMO market in China is projected to reach 3.41 billion yuan in 2024, with a total market size for IT infrastructure management and application performance management estimated at 2.62 billion yuan, reflecting a year-on-year growth of 2.8% [4][5].
- The company identified three development stages for its products: proactive product introduction, passive tool exploration, and passive platform transformation, with the current focus on the latter [6][7].
- The company is experiencing a transitional phase where the demand for IT operations products is diversifying, particularly in the financial and manufacturing sectors, which are expected to drive future growth [6][7].
Group 4: Customer Acquisition and Strategy
- The company has made significant strides in customer acquisition, with 12 new banking clients, 9 manufacturing clients, and 4 insurance clients in 2024, indicating improved penetration in key industries [3].
- The average contract amount has increased from 387,100 yuan in 2022 to 555,500 yuan in 2024, with a CAGR of 19.79%, reflecting a growing demand for the company's products [8].
- The company plans to implement a distribution model to further drive revenue growth and leverage its cloud ecosystem for stable income growth [8].
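The compound annual growth rates quoted in this summary can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper name `cagr` is hypothetical, and it assumes the standard definition (end / start) ** (1 / periods) - 1 with 2022 to 2024 counted as two yearly intervals. The start and end figures are taken in the units printed in the summary; because only their ratio matters, the result does not depend on the unit.

```python
def cagr(start: float, end: float, periods: int) -> float:
    """Compound annual growth rate over `periods` yearly intervals.

    Hypothetical helper for illustration; assumes the standard CAGR definition.
    """
    if start <= 0 or periods <= 0:
        raise ValueError("start and periods must be positive")
    return (end / start) ** (1 / periods) - 1


# 2022 -> 2024 spans two yearly intervals.
# Financial-industry revenue, start and end values as quoted in the summary.
financial = cagr(542.05, 2048.44, 2)
# Average contract amount in yuan, as quoted in the summary.
contract = cagr(387_100, 555_500, 2)

print(f"financial-industry revenue CAGR: {financial:.2%}")
print(f"average contract amount CAGR:   {contract:.2%}")
```

Under these assumptions both results match the quoted 94.40% and 19.79% to two decimal places, which suggests the summary's growth figures are internally consistent.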