Machine Learning
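Nature Biotechnology | A Generational Leap in Virus Classification Tools: How Does vConTACT3 Surpass Its Predecessors and Reshape the Standard for Metagenomic Analysis?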
Xin Lang Cai Jing· 2025-12-24 09:40
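No form of life on this planet rivals viruses in number or diversity. An estimated 10^31 virus particles exist on Earth. What does that mean? Lined up end to end, they would stretch farther than the diameter of the Milky Way. Yet against such an enormous number, what humanity currently knows looks negligible.

Even today, with genome sequencing technology advancing by leaps and bounds, the largest viral genome database, IMG/VR (Integrated Microbial Genome/Virus Resource), contains only about 15.3 million viral genome fragments. Compared with the real world, that is not even a drop in the ocean. More worrying still, less than 0.01% of even that drop has been formally classified and named by the International Committee on Taxonomy of Viruses (ICTV).

This is a huge asymmetry: our sequencing capacity is growing exponentially, and every drop of seawater and every gram of soil points to thousands of new viruses waiting to be discovered, yet our classification system is like an old typewriter trying to keep pace with a supercomputer's output. Traditional classification relies on manual curation by experts, which is rigorous but stretched thin by the sheer volume of metagenomics data.

On December 19, Nature Biotechnol ...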
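Interview with Lu Peilong of Westlake University: AI Protein Design Does Not Yet Need Strict Regulation, Which Could Otherwise Slow Scientific Progress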
生物世界· 2025-12-24 08:00
Core Insights
- The article discusses the transformative impact of machine learning-based AI tools on protein structure research, highlighting advancements in computational tools for predicting protein structures and properties, and their applications in protein design [1][2][3].

Group 1: Advances in Protein Modeling
- Machine learning tools are addressing challenges in understanding macromolecular dynamics and functions, moving beyond single structure limitations [2].
- AlphaFold has revolutionized protein structure biology, making it a common perspective in experimental design [2].
- Recent advancements, particularly AlphaFold3 and RoseTTAFold All-Atom, have significantly improved the accuracy and scope of protein structure and interaction predictions [3].

Group 2: Challenges in Predicting Complex Structures
- While tools like AlphaFold-Multimer can accurately predict many tightly interacting complexes, challenges remain in simulating large, dynamic, or transient complexes [4].
- Membrane proteins and intrinsically disordered proteins present significant challenges due to the lack of high-resolution experimental data [4][5].
- The main obstacles to improving predictions for these complex structures include the scarcity of high-resolution experimental data and the need for new deep learning methods [4][5].

Group 3: Dynamic Structures and Environmental Conditions
- Current tools do not predict folding pathways and have limitations in considering different solution conditions or temperatures [6][9].
- Integrating multiple data sources into machine learning models is crucial for describing dynamic structures and predicting conformational changes in response to environmental variations [8][10].

Group 4: Tools for Designing Functional Proteins
- Current tools struggle to capture the dynamic characteristics of protein functions, which depend on various interactions and require quantitative predictions of conformational changes [10][11].
- Progress has been made in designing tools for properties beyond structure, but challenges remain in accurately predicting binding affinities and controlling dynamic properties [11][12].

Group 5: Integration of Machine Learning with Other Methods
- Significant progress has been made in integrating machine learning with molecular dynamics and other computational methods to enhance protein design [17][18].
- AI2BMD exemplifies a hybrid system that combines machine learning with quantum mechanics to simulate large biomolecules with high accuracy [18].

Group 6: Future Directions and Data Needs
- Expanding datasets to include functional and biophysical measurements is essential for improving model predictions [15][16].
- The development of databases for dynamic and polymorphic conformations is critical for simulating proteins that rely on structural dynamics for functionality [15][16].

Group 7: Responsible Use of AI in Protein Design
- Concerns about the responsible use of AI in protein design include the potential for creating harmful proteins and the energy consumption of AI models [19][20].
- Continuous improvement of detection algorithms and collaboration between academia and industry are necessary to ensure safety while promoting scientific advancement [20][21].
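Swiss ICT Company ZYTLYN Technologies Develops a Predictive Analytics Agent for Travel, Giving the Travel Industry Accurate Pricing Strategies | Switzerland's Top 100 Innovators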
36Ke· 2025-12-23 04:00
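Image source: ZYTLYN Technologies

Founded in 2021, the Swiss information and communications technology company ZYTLYN Technologies (hereafter ZYTLYN) develops a predictive analytics agent for travel. The agent analyzes the full range of factors affecting the travel industry, uses machine learning to forecast demand trends, and gives travel companies decision support for pricing, marketing, and more.

ZYTLYN's predictive analytics agent builds on the company's proprietary machine learning technology. It analyzes data from across travel-related domains to reflect and predict travelers' destinations, travel dates, and potential willingness to pay, giving travel companies fine-grained, actionable forecasts of demand patterns for pricing, revenue management, sales, and other uses.

The agent uses machine learning to process and transform industry data, uncover the latest trends, influences, correlations, and behavioral patterns, and generate forecasts, giving users context-based recommendations. ZYTLYN has its own global travel data sources as well as external sources covering risk alerts, unexpected events, weather conditions, macroeconomic indicators, and other areas. The agent extracts data from these streams, cleans, normalizes, and aggregates it, and transforms the aggregated dataset into a training dataset. It then develops and optimizes machine learning models, including feature engineering and hyperparameter optimization, and delivers the optimized models to the application scenario ...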
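The data-preparation steps named in the excerpt (clean, normalize, aggregate, then transform into a training set) can be sketched roughly as below. This is only an illustration under stated assumptions: the record fields (`dest`, `week`, `searches`, `avg_fare`), the sample values, and the choice of next week's search volume as the prediction target are all hypothetical and do not describe ZYTLYN's actual schema or models.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records; the field names are illustrative, not ZYTLYN's schema.
RAW = [
    {"dest": "ZRH", "week": "1", "searches": "120", "avg_fare": "310.0"},
    {"dest": "ZRH", "week": "1", "searches": "80", "avg_fare": None},  # missing fare
    {"dest": "ZRH", "week": "2", "searches": "150", "avg_fare": "330.0"},
    {"dest": "GVA", "week": "1", "searches": "60", "avg_fare": "200.0"},
    {"dest": "GVA", "week": "2", "searches": "90", "avg_fare": "220.0"},
]

def clean(records):
    """Drop records that have any missing value."""
    return [r for r in records if all(v is not None for v in r.values())]

def normalize(records):
    """Cast string fields to consistent numeric types."""
    return [{"dest": r["dest"], "week": int(r["week"]),
             "searches": float(r["searches"]), "avg_fare": float(r["avg_fare"])}
            for r in records]

def aggregate(records):
    """Collapse to one row per (destination, week)."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["dest"], r["week"])].append(r)
    return {key: {"searches": sum(r["searches"] for r in rows),
                  "avg_fare": mean(r["avg_fare"] for r in rows)}
            for key, rows in groups.items()}

def min_max(agg, field):
    """Scale one field to [0, 1] across all aggregated rows."""
    values = [row[field] for row in agg.values()]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return {key: (row[field] - lo) / span for key, row in agg.items()}

def transform(agg):
    """Pair this week's scaled features with next week's searches as the target."""
    demand, fare = min_max(agg, "searches"), min_max(agg, "avg_fare")
    rows = []
    for dest, week in sorted(agg):
        nxt = agg.get((dest, week + 1))
        if nxt is not None:
            rows.append(((demand[(dest, week)], fare[(dest, week)]), nxt["searches"]))
    return rows

training_set = transform(aggregate(normalize(clean(RAW))))
for features, target in training_set:
    print(features, target)
```

Each stage is a plain function, so feature engineering or hyperparameter search could be added as further stages without changing the earlier ones.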
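Juzi Shuke (桔子数科) Wins "2025 Intelligent Risk Control Technology Innovation Application Model Case" Award, Empowering High-Quality Financial Development with Technology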
Xin Lang Cai Jing· 2025-12-20 06:44
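Source: Lu'an News Network

Vitality: continuously optimizing risk control models through technological innovation to strengthen financial institutions' ability to expand their business.
Resilience: strengthening risk identification and response to make the financial system more resilient to risk.
Innovation: exploring deeper applications of AI, big data, and other new technologies in risk control to open new tracks in fintech.

On December 12, 2025, the 19th Huaxia Institutional Investor Annual Conference and Huaxia Financial (Insurance) Technology Forum, hosted by the conference's organizing committee, was held in Beijing. Themed "Vitality and Resilience, Innovation and Empowerment," the forum brought together well-known scholars, financial institution leaders, and industry experts to discuss the future direction of fintech. At the event, Juzi Shuke (桔子数科) won the "2025 Intelligent Risk Control Technology Innovation Application Model Case" award for its self-developed intelligent risk control system, Judun (桔盾), underscoring the company's innovation strength and industry influence in fintech.

Empowering with technology to set a new benchmark for financial risk control

The global economic environment is currently complex and volatile, and financial institutions undergoing digital transformation must both keep their growth momentum and strengthen their resistance to risk. Juzi Shuke has consistently driven risk control innovation through technology, using artificial intelligence, big data, machine learning, and other frontier technologies to build an intelligent risk control system covering all business scenarios, helping financial institutions manage risk more efficiently and improve user experience.

The award-winning Judun intelligent risk control system is one of Juzi Shuke's core achievements in fintech. The system ...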
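2025 Assessment of China's Communications Platform as a Service (CPaaS) Industry Chain, Market Size, Competitive Landscape, and Future Trends: Industry Size Surpasses 47 Billion Yuan, Competition Is Fragmented, and Concentration Is Expected to Rise [Figures]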
Chan Ye Xin Xi Wang· 2025-12-19 01:22
Core Insights
- The article discusses the growth and competitive landscape of the Communication Platform as a Service (CPaaS) market in China, highlighting its increasing importance in enterprise IT strategies and digital transformation efforts [1][10].

Group 1: Overview of CPaaS
- CPaaS is a cloud-based communication platform that simplifies the integration of communication features such as voice, video conferencing, and messaging into applications without the need for backend infrastructure [1][2].
- Key services offered by CPaaS include messaging services, voice services, mobile traffic services, IoT SIM cards and data, virtual goods recharge, and RCS messaging services [1][2].

Group 2: Industry Chain
- The CPaaS market's industry chain consists of upstream telecom operators and cloud infrastructure providers, midstream CPaaS service providers, and downstream customers including enterprise software developers and SaaS providers [1][7].
- Telecom operators provide essential telecom resources and channels, while cloud infrastructure suppliers offer the necessary computing, storage, and operational support [1][7].

Group 3: Market Structure
- As of 2024, there are approximately 400 CPaaS service providers in China, indicating a highly competitive market with a low concentration of market share among the top players, who collectively hold only 20.6% of the market [10][12].
- Tencent and WuTong Holdings have a higher market share, each exceeding 7% [10][12].

Group 4: Market Size and Growth
- The CPaaS market in China is projected to reach 44.8 billion yuan in 2024, reflecting a year-on-year growth of 3.0%, with further growth expected to 47.2 billion yuan in 2025, marking a 5.4% increase [10].
- By 2029, the market size is anticipated to reach 65 billion yuan, driven by digital transformation and the expansion of innovative CPaaS services [10].

Group 5: Competitive Landscape
- The competitive landscape includes major global players such as Twilio, Infobip, and Sinch, with Tencent Cloud being the only Chinese company recognized in the Gartner Magic Quadrant for CPaaS [10][11].
- The market is characterized by a significant number of small to medium-sized enterprises, which face increasing competitive pressure as larger providers consolidate their market positions through innovation and resource integration [10][12].

Group 6: Future Trends
- The CPaaS market is expected to continue its strong growth trajectory, with advancements in AI technologies enhancing service capabilities and efficiency [10].
- The increasing maturity of the CPaaS market in China will likely lead to higher industry concentration, intensifying competition for smaller players [10].
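The market-size figures in Group 4 can be checked with a few lines of arithmetic. The sketch below recomputes the 2025 year-on-year growth from the quoted sizes and derives the compound annual growth rate implied by the 2029 projection; the implied rate is a derived number, not one stated in the summary, and the helper names are illustrative.

```python
def yoy_growth(previous, current):
    """Year-on-year growth as a fraction."""
    return current / previous - 1.0

def cagr(start, end, years):
    """Compound annual growth rate between two values."""
    return (end / start) ** (1.0 / years) - 1.0

# Market sizes in billion yuan, as quoted in the summary.
size_2024, size_2025, size_2029 = 44.8, 47.2, 65.0

growth_2025 = yoy_growth(size_2024, size_2025)
implied_cagr = cagr(size_2025, size_2029, years=4)

print(f"2025 growth: {growth_2025:.1%}")                 # 5.4%, matching the summary
print(f"Implied 2025-2029 CAGR: {implied_cagr:.1%}")     # about 8.3%
```

The 5.4% figure is consistent with the quoted sizes, while reaching 65 billion yuan by 2029 would require growth to accelerate to roughly 8% a year.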
Featured on the Cover of a Cell Press Journal: Shandong University Uses AI to Reveal Enzyme Diversity in Fermented Food Microbiomes
生物世界· 2025-12-17 08:30
Core Viewpoint
- The research highlights the hidden enzyme diversity and distribution within the microbiome of fermented foods, emphasizing the untapped potential for enzyme resource development in food research [2][3][8].

Group 1: Research Findings
- The study utilized AI-assisted functional annotation to uncover enzyme diversity in the fermented food microbiome, providing valuable insights for future microbial function exploration in food research [3][10].
- The research team explored 10,202 metagenomic assembled genomes from global fermented foods, identifying over 5 million enzyme sequences categorized into 98,693 homologous clusters, representing more than 3,000 enzyme types [6].
- Functional analysis revealed that 84.4% of these clusters are unannotated in current databases, with terpenoid and polyketide metabolic enzymes showing high novelty [6].

Group 2: Environmental Adaptability
- Peptidases exhibited broad environmental adaptability based on predicted optimal temperature and pH, with 31.3% of enzyme clusters demonstrating food type specificity [6].
- A machine learning model was developed to classify the source of fermented foods based on enzyme clusters, highlighting the potential for targeted optimization in food production [6].

Group 3: Related Commentary
- A commentary article published in the same journal emphasizes that AI-assisted functional annotation reveals hidden microbial enzyme diversity and distribution, providing clues for elucidating ecological roles and biotechnological potential [9][10].
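The summary does not say which model the authors used to classify food sources from enzyme clusters. As a toy illustration of the general idea only, the sketch below trains a nearest-centroid classifier on presence/absence profiles; the cluster IDs, food labels, and the nearest-centroid technique itself are assumptions made for the example, not the study's method.

```python
from collections import defaultdict

# Toy samples: (food type, set of enzyme-cluster IDs present). All IDs are made up.
TRAIN = [
    ("cheese", {"C1", "C2", "C3"}),
    ("cheese", {"C1", "C2"}),
    ("kimchi", {"C4", "C5"}),
    ("kimchi", {"C3", "C4", "C5"}),
]

def fit_centroids(samples):
    """Per food type, the fraction of samples in which each cluster appears."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for label, clusters in samples:
        totals[label] += 1
        for cluster in clusters:
            counts[label][cluster] += 1
    return {label: {c: n / totals[label] for c, n in counts[label].items()}
            for label in totals}

def predict(centroids, clusters):
    """Return the food type whose centroid has the smallest squared distance."""
    def distance(profile):
        features = set(profile) | clusters
        return sum((profile.get(c, 0.0) - (1.0 if c in clusters else 0.0)) ** 2
                   for c in features)
    return min(centroids, key=lambda label: distance(centroids[label]))

centroids = fit_centroids(TRAIN)
prediction = predict(centroids, {"C1", "C3"})
print(prediction)  # cheese
```

Clusters that occur in only one food type (here `C1`, `C2`, `C4`, `C5`) drive the decision, which mirrors the summary's point that a share of enzyme clusters is food-type specific.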
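Commercial Cleaning Robot Company Qingyue Intelligent Completes 300 Million Yuan Series B+ Round, Accelerating the Global Rollout of Embodied-Intelligence Cleaning Robots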
机器人圈· 2025-12-16 09:55
Core Insights
- Qingyue Intelligent, a commercial cleaning robot company, has completed a 300 million yuan Series B+ financing round, led by Guotou Innovation, with participation from Shenzhen Capital Group and Sequoia China [1]
- The funds will be used to expand global sales networks, build a comprehensive service system, and enhance research on embodied intelligent algorithms to solidify the company's leading position in the commercial cleaning robot sector [1][5]

Industry Overview
- The global commercial service robot industry is experiencing rapid growth, driven by rising labor costs, an aging population, and the demand for improved service quality [3]
- It is projected that by 2030, the global commercial service robot market will exceed $10 billion, with a compound annual growth rate (CAGR) of over 20%, with commercial cleaning and delivery robots being key growth drivers [3]

Company Development
- Established just over three years ago, Qingyue Intelligent has completed five rounds of financing, demonstrating strong business growth and technological barriers that attract top-tier capital [4]
- The company focuses on the autonomous research and global layout of commercial cleaning robots, aiming to transition the industry from automation to intelligence [4]

Product Innovation
- Qingyue Intelligent has developed a diverse product matrix, including flagship products like the SP50 commercial sweeping and suction robot, and the L50 and L4 commercial floor washing robots, designed for large commercial spaces [4]
- The newly launched L3 commercial floor washing robot features advanced capabilities such as automatic map updates and intelligent path decision-making, catering to small and medium-sized commercial environments [4][5]

Future Plans
- Following the recent financing, Qingyue Intelligent plans to increase R&D investment, accelerate the iteration of new technologies and products, and expand its overseas market presence [5]
- The company aims to drive technological innovation to facilitate the intelligent transformation of the commercial cleaning robot industry, enhancing cost efficiency and quality in global commercial services [5]
How to Plan an Enterprise Data Lake to Successfully Realize Data Value
36Ke· 2025-12-15 06:16
Core Insights
- The implementation of data lakes addresses the limitations of traditional databases in handling the explosive growth of data volume and complexity, providing a unified and scalable infrastructure for storing structured, semi-structured, and unstructured data [2][7]
- Data lakes serve as the foundation for modern analytics and artificial intelligence, enabling real-time insights, self-service business intelligence, and predictive modeling [2][6]

Group 1: Definition and Importance of Data Lakes
- A data lake is a centralized storage system that allows organizations to store all types of data in its raw format until needed for analysis, contrasting with traditional data warehouses that require data to be structured before storage [6][7]
- The construction of a data lake is crucial for organizational success, as it provides a flexible, cost-effective, and future-proof solution for data storage and analysis [7][10]
- Data lakes enable organizations to combine historical and real-time data, supporting advanced use cases such as predictive analytics and fraud detection [6][10]

Group 2: Core Architecture of Data Lakes
- Data lakes are organized into multiple layers that work together to transform raw information into valuable business insights, including ingestion, storage, processing, governance, and consumption layers [11][20]
- The ingestion layer brings data from various sources into the data lake, preserving its original format for later analysis [12]
- The storage layer holds raw data in scalable and cost-effective repositories, supporting all data types [13][14]
- The processing layer cleans, validates, and enriches data, organizing it into different zones for business analysis [15]
- The governance layer ensures data remains trustworthy, secure, and compliant throughout its lifecycle [16]
- The consumption layer provides tools for users to extract value from data, enabling self-service analytics while maintaining governance controls [17]

Group 3: Implementation Steps and Best Practices
- The first step in implementing a data lake is to clarify objectives and identify key use cases, translating them into key performance indicators (KPIs) [23]
- Selecting the appropriate cloud platform is crucial, with options like AWS, Azure, and GCP offering various tools for storage, analysis, and governance [24][26]
- Designing a layered architecture helps maintain data organization and trustworthiness, with clear definitions for raw, refined, and business-ready data [27][28][29]
- Implementing governance and security measures from the outset is essential, including data ownership, access controls, and compliance tracking [31]
- Continuous monitoring, optimization, and documentation of data processes are necessary to ensure the data lake remains scalable and efficient [33][42]

Group 4: Real-World Case Studies
- Shell Energy built a data lake on Microsoft Azure to integrate IoT, operations, and energy management data, reducing data preparation time by 60% and enhancing collaboration between data scientists and business teams [55]
- Comcast utilized a Databricks data lake to integrate customer interaction, billing, and service data, enabling near-real-time analysis and improving customer retention rates [56]
- HSBC adopted a cloud-based data lake to upgrade its risk management and compliance framework, enhancing the accuracy and transparency of regulatory reporting [57]
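The layered flow described in Group 2 and the raw / refined / business-ready zones mentioned in Group 3 can be illustrated with a small in-memory model. This is a sketch only: the `DataLake` class, its method names, and the sample payloads are hypothetical, and a real deployment would sit on object storage with a governance layer that this example leaves out.

```python
import json
from collections import defaultdict

class DataLake:
    """In-memory stand-in for raw, refined, and business-ready zones."""

    def __init__(self):
        self.zones = {"raw": [], "refined": [], "business": {}}

    def ingest(self, source, payload):
        # Ingestion layer: store the payload exactly as received.
        self.zones["raw"].append({"source": source, "payload": payload})

    def process(self):
        # Processing layer: parse and validate; records that fail are skipped,
        # but they remain available in the raw zone for later inspection.
        refined = []
        for item in self.zones["raw"]:
            try:
                record = json.loads(item["payload"])
                amount = float(record["amount"])
            except (ValueError, KeyError, TypeError):
                continue
            refined.append({"source": item["source"],
                            "customer": record.get("customer", "unknown"),
                            "amount": amount})
        self.zones["refined"] = refined

    def publish(self):
        # Consumption layer: aggregate refined records into a report-ready view.
        totals = defaultdict(float)
        for record in self.zones["refined"]:
            totals[record["customer"]] += record["amount"]
        self.zones["business"] = dict(totals)
        return self.zones["business"]

lake = DataLake()
lake.ingest("billing", '{"customer": "a", "amount": "10.5"}')
lake.ingest("billing", '{"customer": "a", "amount": 4}')
lake.ingest("iot", "not json")             # malformed payload
lake.ingest("crm", '{"customer": "b"}')    # missing amount
lake.process()
report = lake.publish()
print(report)  # {'a': 14.5}
```

Keeping the raw zone untouched means a stricter or looser validation rule can be applied later by re-running `process` without re-ingesting anything.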
Unlocking New Frontiers in Quantitative Investing with Machine Learning
Qi Huo Ri Bao Wang· 2025-12-10 01:33
Core Insights
- The article highlights the successful implementation of a day trading strategy using machine learning and pressure factors to identify trading opportunities in the futures market [2][4].

Strategy Overview
- The day trading strategy, initiated in February 2023, employs a machine learning framework to analyze market data and predict daily returns for selected futures [2].
- The strategy focuses on 40 to 50 mainstream commodity futures but trades only the top five predicted by the model, optimizing for a higher Sharpe ratio during backtesting [2].
- Position allocation follows an "equal market value" principle, which has shown comparable performance to a "strong signal high position" model while simplifying operations [2].

Data Utilization
- The strategy captures short-term fluctuations using 1-minute K-line data while also considering daily data for long-term trends, generating signals twice a day [3].
- The approach avoids frequent predictions to reduce model complexity, especially in the absence of significant incremental information during trading hours [3].

Risk Management
- A multi-layered risk control framework is established, including mandatory position closing before market close to avoid overnight risks and immediate liquidation in case of market reversals [4].
- The strategy has demonstrated strong drawdown control, with a real trading drawdown rate of 5% to 6% and a maximum drawdown of 28.95% during a specific competition [4].
- The strategy is best suited for volatile markets, relying on a reversal effect, and incorporates traditional trend sub-strategies to mitigate risks [4].

Future Plans
- The company plans to expand the trading universe to 10 to 15 products, which will enhance capital capacity while maintaining profitability through diversified order placements [5].
- A new product based on the day trading strategy has been registered, indicating a move towards a broader asset management market [5].
- The focus will remain on the commodity futures CTA sector, with ongoing investments in factor exploration and model optimization to ensure robust performance for clients [5].
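The selection and sizing rules in the strategy overview (trade only the top five predictions, allocate equal market value) and the two metrics the summary cites (Sharpe ratio, maximum drawdown) can be sketched as follows. The symbols, predicted returns, capital figure, and equity curve are invented for illustration, the zero risk-free rate is an assumption, and the trade direction is left out because the summary does not specify it.

```python
from statistics import mean, stdev

def select_top(predictions, k=5):
    """Keep the k contracts with the highest predicted daily return."""
    ranked = sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)
    return [symbol for symbol, _ in ranked[:k]]

def equal_value_weights(symbols, capital):
    """Allocate the same market value to each selected contract."""
    per_contract = capital / len(symbols)
    return {symbol: per_contract for symbol in symbols}

def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a fraction."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return mean(daily_returns) / stdev(daily_returns) * periods_per_year ** 0.5

# Hypothetical model output for seven contracts (symbols are placeholders).
predicted = {"A": 0.004, "B": 0.012, "C": -0.003, "D": 0.007,
             "E": 0.001, "F": 0.015, "G": 0.002}

picks = select_top(predicted)
allocation = equal_value_weights(picks, capital=1_000_000)
drawdown = max_drawdown([100, 110, 99, 105, 120])

print(picks)               # ['F', 'B', 'D', 'A', 'G']
print(allocation["F"])     # 200000.0
print(round(drawdown, 3))  # 0.1
```

Equal-value sizing ignores signal strength by design, which matches the summary's remark that it performed comparably to a signal-weighted scheme while being simpler to operate.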
Chinese Researchers Develop a Circular RNA Model to Predict Immunotherapy Response in Lung Cancer Patients
生物世界· 2025-12-09 00:05
Core Insights
- Lung cancer is the most common malignant tumor globally and the leading cause of cancer-related deaths, with non-small cell lung cancer (NSCLC) accounting for over 85% of cases. Despite advancements in clinical management, the 5-year overall survival rate for NSCLC patients has only increased from 15% to 25% [2]
- Immune checkpoint inhibitors (ICIs), such as PD-1 and PD-L1 inhibitors, have transformed the treatment landscape for NSCLC. However, the objective response rate (ORR) for unselected NSCLC patients receiving ICI treatment is only 10%-30%, with some patients experiencing accelerated disease progression or early death [2]
- A new study identified a circRNA signature (circRNA-Sig) consisting of 11 circRNAs that can predict the response to immunotherapy in advanced NSCLC, potentially guiding clinical treatment [3][8]

Summary by Sections

CircRNA and Cancer
- CircRNA is associated with dysregulated RNA expression in cancer and has potential as a biomarker for predicting responses to ICIs [3]

Research Findings
- The research team analyzed circRNA expression profiles from 891 advanced NSCLC patients in the OAK and POPLAR clinical trials, identifying significantly differentially expressed circRNAs [4]
- A predictive model was constructed using machine learning, which was validated and revealed key circRNAs that may influence the efficacy of NSCLC immunotherapy [4]

CircRNA-Sig Model
- The circRNA-Sig model demonstrated an area under the curve (AUC) of 0.71 in the OAK trial and 0.67 in the POPLAR trial for predicting the efficacy of atezolizumab [5]
- Survival analysis indicated that patients with low circRNA-Sig scores benefited significantly more from ICI treatment compared to chemotherapy (HR=1.347), while high-score patients showed no significant difference [5]
- Enrichment analysis suggested that low-score patients exhibited an activated tumor immune microenvironment, indicating a mechanistic link between circRNA and ICI treatment sensitivity [5]

Clinical Application
- The circRNA-Sig model, validated across two large clinical trial cohorts, offers a new stratification tool for NSCLC patients undergoing atezolizumab treatment, enhancing personalized treatment strategies [8]
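The AUC values quoted for the circRNA-Sig model (0.71 and 0.67) measure how well a score ranks responders above non-responders. A minimal sketch of that computation is shown below using the pairwise (Mann-Whitney) definition; the labels and scores are invented toy data, not values from the OAK or POPLAR cohorts, and the function is a generic AUC routine rather than the study's pipeline.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy data: 1 = responder, 0 = non-responder; scores are hypothetical model outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.889
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values near 0.7 indicate a useful but imperfect stratifier.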