Better Serving the "Three Approaches to Pollution Control" with High-Level Monitoring
Core Viewpoint
- The article emphasizes the importance of ecological environment monitoring as a foundation for ecological protection and pollution prevention, advocating for improved monitoring data quality and the adoption of advanced technologies to strengthen monitoring capabilities [1][2][3].

Group 1: Improving Monitoring Data Quality
- The article suggests enhancing the accuracy, comprehensiveness, and timeliness of monitoring data to support precise pollution control, highlighting the need for widespread application of Laboratory Information Management Systems (LIMS) and unified regulatory frameworks for monitoring institutions [1].
- It calls for monitoring personnel to shift their focus from merely ensuring data quality to also emphasizing the application of monitoring data, thereby strengthening its role in precise pollution control [1].

Group 2: Accelerating Digital Transformation of Monitoring Systems
- The article advocates the digital transformation of ecological environment monitoring systems, leveraging technologies such as artificial intelligence and cloud platforms to modernize monitoring capabilities [2].
- It emphasizes the need to develop monitoring technologies with independent intellectual property rights and to raise the automation and intelligence of monitoring processes [2].
- It recommends establishing a comprehensive ecological environment smart monitoring system to improve pollution source tracing and environmental quality forecasting [2].

Group 3: Strengthening Legal and Regulatory Frameworks
- The article stresses the necessity of a solid legal foundation for ecological environment monitoring, particularly in clarifying the legal status of automatic monitoring data from polluting entities [3].
- It points out that currently only data from waste incineration power plants can be used directly for administrative enforcement, indicating a need for broader legal recognition of monitoring data [3].
- It highlights the role of social monitoring institutions, calling for clear legal definitions on the use of their data in environmental enforcement to enhance their contribution to ecological management [3].
OpenAI researcher: "Intro to AI" courses are stuck 15 years in the past; undergraduates' first choice should be "Intro to Machine Learning"
机器之心· 2025-09-01 08:46
Core Viewpoint
- The article emphasizes the importance of choosing the right introductory artificial intelligence (AI) course, arguing that "Introduction to Machine Learning" should take priority over "Introduction to AI" because the latter's content is outdated [2][3].

Group 1: Course Recommendations
- Noam Brown, a researcher at OpenAI, advises undergraduates interested in AI to be cautious and not to choose "Introduction to AI" as their first course [2].
- Many universities' "Introduction to AI" courses have changed little over the past 15 years and often lack comprehensive coverage of machine learning topics [3].
- A well-structured introductory course should cover topics such as linear regression, gradient descent, backpropagation, and reinforcement learning [3].

Group 2: Course Content Comparison
- "Introduction to AI" typically covers traditional topics such as rule-based systems and expert systems, while "Introduction to Machine Learning" focuses on modern AI techniques, including linear regression, neural networks, and deep learning [6].
- Stanford's renowned "CS229: Machine Learning," taught by Andrew Ng, covers supervised learning, unsupervised learning, generative models, and foundational deep learning concepts [6].

Group 3: Industry Relevance
- Most breakthroughs in AI today stem from machine learning and deep learning rather than the older topics covered in traditional AI courses [11].
- There is growing sentiment that students should focus on practical skills such as prompt engineering and programming to navigate the evolving AI landscape [11].
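The first two topics named above, linear regression and gradient descent, can be made concrete in a few lines; below is a minimal sketch of linear regression fit by batch gradient descent (the data and hyperparameters are illustrative, not from any course):

```python
# Minimal linear regression via batch gradient descent (illustrative toy data).
def fit_linear(xs, ys, lr=0.05, epochs=500):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # generated by y = 2x + 1
w, b = fit_linear(xs, ys)
```

With this toy data drawn from y = 2x + 1, the loop recovers w ≈ 2 and b ≈ 1; backpropagation generalizes the same gradient bookkeeping to multi-layer networks.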
Sun Yat-sen University publishes new Science paper
生物世界· 2025-09-01 00:00
Core Viewpoint
- The article emphasizes the urgent need for global carbon dioxide reduction and for enhancing ecosystems' carbon absorption, highlighting afforestation as a cost-effective natural climate solution [4].

Group 1: Research Findings
- A study published in the journal Science quantifies the soil carbon sequestration potential of global forest restoration, integrating ecological, climatic, and policy factors to redefine afforestation's role in climate change mitigation [4][6].
- The researchers developed a machine learning model to quantify soil carbon changes after afforestation, revealing that carbon gains and losses coexist, primarily in surface soil (0-30 cm) [6].
- If afforestation is limited to areas that avoid unintended warming effects and safeguard water resources and biodiversity, approximately 389 million hectares could sequester 39.9 Pg of carbon by 2050, significantly less than previous estimates [6].

Group 2: Policy Implications
- If land is further restricted to existing policy commitments (120 million hectares), the sequestration potential drops to 12.5 Pg [6].
- To achieve larger-scale climate mitigation, the study urges expanding dedicated afforestation areas and strengthening commitments from countries with significant untapped potential [6][8].
- The findings provide actionable insights for optimizing land use policies and afforestation strategies to maximize climate benefits [8].
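As a back-of-the-envelope check of the figures above (this per-hectare arithmetic is our derivation, not a number reported by the study), both scenarios imply a similar average sequestration rate per hectare:

```python
# Implied average soil carbon sequestration per hectare (derived arithmetic).
PG_TO_TONNES = 1e9   # 1 Pg (petagram) = 1 billion tonnes

scenarios = {
    "safeguard-constrained": (39.9, 389e6),   # (Pg C by 2050, hectares)
    "policy-committed": (12.5, 120e6),
}
tonnes_per_ha = {
    name: pg_c * PG_TO_TONNES / hectares
    for name, (pg_c, hectares) in scenarios.items()
}
for name, t in tonnes_per_ha.items():
    print(f"{name}: ~{t:.0f} t C/ha")
```

Both work out to roughly 100 t C/ha, so the large gap between 39.9 Pg and 12.5 Pg comes almost entirely from the area restriction, not from differing per-hectare rates.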
Machine Learning Factor Stock Selection Monthly Report (September 2025) - 20250831
Southwest Securities· 2025-08-31 04:12
Quantitative Models and Construction Methods

- **Model Name**: GAN_GRU
- **Model Construction Idea**: The GAN_GRU model combines a Generative Adversarial Network (GAN) for processing volume-price time-series features with a Gated Recurrent Unit (GRU) network that encodes those time-series features into a stock selection factor [4][13][41]
- **Model Construction Process**:

  **GRU Component**:
  - Input features: 18 volume-price features, such as closing price, opening price, turnover, and turnover rate [14][17][19]
  - Training data: the past 400 days of these features, sampled every 5 trading days to form a 40x18 matrix, used to predict cumulative returns over the next 20 trading days [18]
  - Preprocessing: outlier removal and normalization at both the time-series and cross-sectional levels [18]
  - Architecture: two GRU layers (128, 128) followed by an MLP (256, 64, 64); the final output is the predicted return (pRet), which serves as the stock selection factor [22]
  - Training schedule: semi-annual rolling training, conducted on June 30 and December 31 each year [18]
  - Optimization: Adam optimizer, learning rate 1e-4, IC loss function, early stopping after 10 epochs, maximum of 50 training epochs [18]
  **GAN Component**:
  - The GAN consists of a generator (G) and a discriminator (D) [23]
  - Generator: uses an LSTM to preserve the time-series nature of the input features, transforming random noise into realistic data samples [33][37]
  - Generator loss function:
    $$ L_{G} = -\mathbb{E}_{z\sim P_{z}(z)}[\log(D(G(z)))] $$
    where \( z \) is random noise, \( G(z) \) the generated data, and \( D(G(z)) \) the discriminator's output probability [24][25]
  - Discriminator: uses a CNN to process the two-dimensional volume-price time-series features, distinguishing real from generated data [33][37]
  - Discriminator loss function:
    $$ L_{D} = -\mathbb{E}_{x\sim P_{data}(x)}[\log D(x)] - \mathbb{E}_{z\sim P_{z}(z)}[\log(1-D(G(z)))] $$
    where \( x \) is real data, \( D(x) \) the discriminator's output on real data, and \( D(G(z)) \) its output on generated data [27][29]
  - Training: alternating updates of the generator and discriminator parameters until convergence [30]

- **Model Evaluation**: The GAN_GRU model effectively captures both time-series and cross-sectional features, leveraging the complementary strengths of the GAN and GRU for stock selection [4][13][41]

---

Model Backtesting Results

- **GAN_GRU Model**:
  - IC Mean: 11.36% [41][42]
  - ICIR (non-annualized): 0.88 [42]
  - Turnover Rate: 0.83 [42]
  - Recent IC: -2.56% [41][42]
  - 1-Year IC Mean: 8.94% [41][42]
  - Annualized Return: 38.09% [42]
  - Annualized Volatility: 23.68% [42]
  - IR: 1.61 [42]
  - Maximum Drawdown: 27.29% [42]
  - Annualized Excess Return: 23.52% [41][42]

---

Quantitative Factors and Construction Methods

- **Factor Name**: GAN_GRU Factor
- **Factor Construction Idea**: Derived from the GAN_GRU model, this factor encodes volume-price time-series features to predict stock returns [4][13][41]
- **Factor Construction Process**:
  - The factor is generated from the output of the GAN_GRU model, combining GAN-based feature generation with GRU-based time-series encoding [4][13][41]
  - The factor undergoes industry and market capitalization neutralization, as well as standardization, before being used for testing [22]
- **Factor Evaluation**: The GAN_GRU factor demonstrates strong predictive power across industries, with consistent outperformance in recent years [4][13][41]

---

Factor Backtesting Results

- **GAN_GRU Factor**:
  - IC Mean: 11.36% [41][42]
  - ICIR (non-annualized): 0.88 [42]
  - Turnover Rate: 0.83 [42]
  - Recent IC: -2.56% [41][42]
  - 1-Year IC Mean: 8.94% [41][42]
  - Annualized Return: 38.09% [42]
  - Annualized Volatility: 23.68% [42]
  - IR: 1.61 [42]
  - Maximum Drawdown: 27.29% [42]
  - Annualized Excess Return: 23.52% [41][42]
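The generator and discriminator losses defined in the construction process above can be evaluated directly from discriminator output probabilities; a minimal numeric sketch (the probability values are illustrative, not model outputs):

```python
import math

def generator_loss(d_fake):
    """L_G = -E[log D(G(z))]: small when the discriminator is fooled."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

def discriminator_loss(d_real, d_fake):
    """L_D = -E[log D(x)] - E[log(1 - D(G(z)))]."""
    return (-sum(math.log(p) for p in d_real) / len(d_real)
            - sum(math.log(1.0 - p) for p in d_fake) / len(d_fake))

# Illustrative discriminator probabilities on real vs. generated samples.
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
l_g = generator_loss(d_fake)
l_d = discriminator_loss(d_real, d_fake)
```

Here the discriminator confidently separates real from fake, so L_D is small and L_G is large; alternating updates push the two losses toward balance.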
NETZSCH Germany: Online curing monitoring and intelligent production control for resin-based composites
DT新材料· 2025-08-27 16:04
Core Viewpoint
- The article highlights NETZSCH's innovative solutions for the polymer and polymer-based composite processing industry, particularly the application of its sensXPERT technology in Airbus's manufacturing processes [2][6].

Group 1: Industry Challenges
- Industries from automotive to aerospace face similar challenges: shortening production cycles, raising yield, and dynamically controlling each product [3].
- The growing use of thermosetting plastics and composites in high-performance parts requires customized resins and formulations, which introduces significant production challenges [3].
- A critical issue in composite manufacturing is the lack of "data transparency," particularly around real-time curing process data, which hinders process optimization and efficiency improvements [3].

Group 2: NETZSCH Solutions
- Airbus has selected NETZSCH to provide intelligent sensor solutions, integrating innovative sensors with advanced analytics and machine learning to enhance polymer and composite manufacturing methods [6].
- Sensors installed in molds measure key material properties in real time, such as degree of cure and glass transition temperature, improving production efficiency [6].
- Combining material science with real-time data from the manufacturing environment brings AI to the production floor, creating dynamic processes based on historical and new data [6].

Group 3: Benefits of sensXPERT
- The sensXPERT solution aims to reduce scrap rates and achieve operational excellence by optimizing processes in real time [10].
- It provides maximum equipment efficiency and manufacturing transparency through customizable dashboards that support product traceability [10].
- The solution also accounts for batch-to-batch variation caused by transportation, storage, and environmental conditions, ensuring a reliable manufacturing process [6].

Group 4: Upcoming Events
- The 2025 Polymer Industry Annual Conference will take place September 10-12 in Hefei, where industry leaders will discuss new opportunities in emerging industries, including AI and aerospace [12][18].
- Zeng Zhiqiang, Vice President of Market and Applications at NETZSCH, will present "Online Curing Monitoring and Intelligent Production Control of Resin-Based Composites" at the conference [8][20].
ByteDance loses another key figure: Feng Jiashi, head of visual research for the Doubao large model, departs
Sou Hu Cai Jing· 2025-08-27 05:06
Core Insights
- ByteDance has lost a significant figure in AI: Feng Jiashi, who led the Doubao large model visual research team, raising concerns in the industry [1][3]
- His departure follows rumors from June that ByteDance initially denied; the exit is now confirmed [1][3]

Group 1: Impact of Departure
- Feng Jiashi's exit is expected to affect ByteDance, as he brought extensive academic and practical experience, having previously served as an assistant professor at the National University of Singapore [3][11]
- He has published over 400 papers in deep learning and related fields, with more than 69,000 citations on Google Scholar, reflecting significant contributions to AI research [3][11]

Group 2: Talent Loss Context
- His departure is part of a broader wave of talent loss at ByteDance, with several key figures leaving since December, including leaders of various product lines [13]
- Despite these challenges, ByteDance is recruiting globally to fill the gaps, having previously hired key members from Alibaba and Google DeepMind [13][19]

Group 3: Competitive Landscape
- Competition for AI talent is intensifying, and ByteDance is striving to maintain its leading position in the industry despite the ongoing exodus [19]
Seven years in the making: Li Hang releases "Machine Learning Methods (2nd Edition)" with new reinforcement learning coverage; 20 copies to be given away
机器之心· 2025-08-27 03:18
Core Viewpoint
- The article discusses the release of the second edition of "Machine Learning Methods" by Li Hang, which expands beyond traditional machine learning to include deep learning and reinforcement learning, addressing growing interest in these areas within the AI community [4][5][22].

Summary by Sections

Overview of the Book
- The new edition includes significant updates and additions, particularly in reinforcement learning, which has been gaining attention in AI applications [4][5].
- The book is structured into four main parts: supervised learning, unsupervised learning, deep learning, and reinforcement learning, giving readers a comprehensive framework [5][22].

Supervised Learning
- The first part covers key supervised learning methods: linear regression, the perceptron, support vector machines, maximum entropy models, logistic regression, boosting methods, hidden Markov models, and conditional random fields [7].

Unsupervised Learning
- The second part covers unsupervised learning techniques, including clustering, singular value decomposition, principal component analysis, Markov chain Monte Carlo methods, the EM algorithm, latent semantic analysis, and latent Dirichlet allocation [8].

Deep Learning
- The third part introduces major deep learning methods: feedforward neural networks, convolutional neural networks, recurrent neural networks, Transformers, diffusion models, and generative adversarial networks [9].

Reinforcement Learning
- The fourth part details reinforcement learning methods, including Markov decision processes, multi-armed bandit problems, proximal policy optimization, and deep Q-networks [10].
- The book aims to provide a systematic introduction to reinforcement learning, which previous textbooks have covered less thoroughly [4][10].
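Of the reinforcement learning topics listed above, the multi-armed bandit is the simplest to sketch; below is a minimal epsilon-greedy solver (the arm probabilities, step count, and epsilon are illustrative choices, not taken from the book):

```python
import random

def epsilon_greedy(arm_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: explore with probability epsilon, else pull the arm
    with the highest estimated mean reward (Bernoulli rewards here)."""
    rng = random.Random(seed)
    counts = [0] * len(arm_probs)
    values = [0.0] * len(arm_probs)   # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(arm_probs))                 # explore
        else:
            arm = max(range(len(arm_probs)), key=lambda i: values[i])  # exploit
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]     # incremental mean
    return counts, values

counts, values = epsilon_greedy([0.2, 0.5, 0.8])
```

After enough steps the best arm (payout probability 0.8) dominates the pull counts, and its estimated value converges toward 0.8.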
Learning Approach
- Each chapter presents one or two machine learning methods, explaining models, strategies, and algorithms clearly, supported by mathematical derivations to deepen understanding [12][19].
- The book is designed for university students and professionals, assuming a background in calculus, linear algebra, probability and statistics, and computer science [22].

Author Background
- Li Hang is a recognized expert in natural language processing, information retrieval, machine learning, and data mining [24].
A conversation with elite investment advisors: He Jiawen, lead creator of the "Smart Multi-Asset ETF" strategy
Core Viewpoint
- The article emphasizes the advantages of ETFs as a diversified investment tool in a complex market environment, highlighting their performance relative to individual stocks over various time frames [2][4].

Performance Analysis
- As of August 13, 2023, 83% of individual stocks had risen, while nearly 95% of ETFs/LOFs had gained in value [2][3].
- Across most periods, a higher share of ETFs rose than individual stocks (the one-year window is the exception):
  - 6 months: 77% of stocks vs. 90% of ETFs
  - 1 year: 93% of stocks vs. 81% of ETFs
  - 2 years: 65% of stocks vs. 74% of ETFs
  - 3 years: 59% of stocks vs. 66% of ETFs [3]

Investment Philosophy
- The philosophy centers on "risk diversification and long-term stability," suiting investors who seek asset preservation and have moderate risk tolerance [6].
- It encourages systematic allocation to mitigate market volatility and stresses the use of professional quantitative tools [8][9].

Investment Strategy
- The strategy employs a systematic framework combining a data engine, neural-network risk prediction, and risk parity for diversified, smooth returns [12].
- The model quantifies safety margins using "downside volatility," adjusting asset allocation based on historical data rather than traditional valuation methods [13].

Selection Criteria
- ETF selection evaluates three key indicators: shrinking ETF size, declining liquidity, and widening tracking error [15].
- The strategy holds a limited number of positions, with no single position exceeding 20% of the portfolio [16].

Market Approach
- The approach is primarily top-down: macroeconomic risk analysis determines asset-class allocations before specific ETFs are selected [17].
- The model incorporates dynamic risk-management thresholds that trigger automatic adjustments based on volatility predictions [18].

Client Engagement
- The service targets patient investors who understand the value of diversified investing, while emphasizing the importance of risk management [22].
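The "downside volatility" measure used above to quantify safety margins is commonly computed as the semideviation, the root mean square of returns below a threshold; a minimal sketch under that assumption (the strategy's exact definition is not given in the article, and the return series is illustrative):

```python
import math

def downside_volatility(returns, threshold=0.0):
    """Semideviation: root mean square of shortfalls below the threshold.
    Gains are ignored, so only losses contribute to the risk estimate."""
    shortfalls = [min(r - threshold, 0.0) for r in returns]
    return math.sqrt(sum(s * s for s in shortfalls) / len(returns))

daily = [0.01, -0.02, 0.005, -0.01, 0.015]   # illustrative daily returns
dv = downside_volatility(daily)
```

Because upside moves are excluded, two assets with equal total volatility can have very different downside volatility, which is why it is preferred here over symmetric measures for sizing safety margins.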
Media spotlight on Tianzheng | The "digital doctors" of a national-level smart factory
Sou Hu Cai Jing· 2025-08-26 11:44
Core Insights
- The article highlights the emergence of industrial internet operation engineers as essential professionals in the digital transformation of traditional industries, acting as "digital doctors" for smart factories [3][4]
- Zhejiang Tianzheng Electric Co., Ltd. has built a smart circuit breaker factory that uses a digital twin platform to monitor production conditions and data in real time [3]
- Demand for digital transformation in traditional industries is rising, and Tianzheng's smart factory has been recognized among the first batch of excellent smart factories in China [4]

Industry Overview
- Industrial internet development is characterized by deep integration, intelligence, and ecosystem features, with wide application across manufacturing, energy, transportation, and healthcare [3]
- Industrial internet operation engineers require a comprehensive skill set spanning traditional internet knowledge, big data analysis, artificial intelligence, and machine learning [3]

Company Insights
- Tianzheng's information department employs nearly 30 industrial internet operation engineers who work closely with upstream enterprises and frontline workers to address digital transformation challenges [4]
- The company's smart factory was recognized as the only excellent smart factory in Wenzhou, underscoring its industry leadership [4]
50x faster inference: MIT team proposes FASTSOLV model for predicting small-molecule solubility at any temperature
36Kr· 2025-08-26 07:23
Core Insights
- An MIT research team has developed an improved model for predicting organic solubility, built on a new organic solubility database, BigSolDB, that enhances both the accuracy and the speed of solubility predictions [1][2][22]
- The new model, FASTSOLV, reduces RMSE by 2-3x relative to existing state-of-the-art (SOTA) models and runs up to 50x faster [2][14][22]

Group 1: Model Development and Performance
- FASTSOLV integrates solute and solvent molecular structures along with temperature and directly regresses logS, improving on traditional methods that are time-consuming and less accurate [2][11]
- In strict solute extrapolation scenarios, the optimized model's RMSE is significantly lower than the Vermeire model's, demonstrating superior performance [14][22]
- Training and evaluation used a rigorous protocol that ensures independence and reliability and avoids data overlap [6][9][13]

Group 2: Data Utilization and Methodology
- BigSolDB serves as the core data source, systematically collecting solubility data across solvents and temperatures, which is crucial for training generalizable predictive models [6][11]
- The work emphasizes a well-structured training and evaluation protocol to achieve reliable extrapolation without prior conditions [6][9]
- Further gains require high-quality organic solvent datasets; simply adding training data may not overcome current performance limits [22][25]

Group 3: Industry Implications and Applications
- Advances in solubility prediction address industry pain points such as long experiment times and high R&D costs [24][25]
- Pharmaceutical companies are particularly interested in high-throughput, low-cost solubility assessment, which can significantly improve efficiency in drug development [25]
- Academic research models are being integrated into industrial applications, with companies using data-driven models to optimize production processes and improve product quality [25][26]
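RMSE, the accuracy metric cited throughout the comparison above, is straightforward to compute; a minimal sketch with illustrative logS values (hypothetical numbers, not drawn from BigSolDB or either model):

```python
import math

def rmse(predicted, observed):
    """Root mean squared error between predicted and measured values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                     / len(observed))

# Illustrative logS values (log10 of mol/L solubility).
measured = [-2.0, -3.5, -1.2, -4.1]
model_a  = [-2.2, -3.3, -1.1, -4.4]   # tighter fit
model_b  = [-1.2, -4.3, -0.4, -5.0]   # looser fit
rmse_a = rmse(model_a, measured)
rmse_b = rmse(model_b, measured)
```

Here model_a's RMSE is several times smaller than model_b's, the same kind of gap the study reports between FASTSOLV and prior SOTA models.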