LSTM
Qixing Communication Granted Patent for an LSTM-Based Satellite Antenna Beam Optimization Control Method
Sou Hu Cai Jing· 2026-01-24 03:11
Disclaimer: Markets carry risk; invest with caution. This article was generated by AI from third-party data, is for reference only, and does not constitute individual investment advice.

Source: Market News. According to Tianyancha, Qixing Communication Technology (Beijing) Co., Ltd., founded in 2016 and based in Beijing, is primarily engaged in technology promotion and application services, with registered capital of RMB 10 million. Tianyancha big-data analysis shows that the company has invested in 3 external enterprises, participated in 17 bidding projects, and holds 5 trademarks, 51 patents, and 4 administrative licenses. Qixing Communication Technology (Anhui) Co., Ltd., founded in 2022 and based in Wuhu, is primarily engaged in the manufacture of computers, communication equipment, and other electronic devices, with registered capital of RMB 5.95238 million; it holds 26 patents and 3 administrative licenses. Information from the China National Intellectual Property Administration shows that the two companies have been granted a patent titled "An LSTM-Based Satellite Antenna Beam Optimization Control Method," authorization publication number CN120528497B, filed in May 2025. ...
Yang Zhilin Reveals Kimi's Pre-training Strategy: Improving Token Efficiency to Achieve Long Context
Xin Lang Cai Jing· 2026-01-10 12:09
Core Insights
- The article focuses on strategies for pre-training AI models, emphasizing Token Efficiency and Long Context as critical components for enhancing performance on complex tasks [2][6].

Group 1: Token Efficiency
- Token Efficiency is crucial because agent reasoning and training are fundamentally a search process; better pre-training shrinks the search space and strengthens prior knowledge [3][7].
- Its importance is underscored by the need for AI to build complex systems, such as an operating system, without enumerating every possible token combination, most of which would be meaningless or incorrect [7].

Group 2: Long Context
- The Transformer architecture shows significant advantages in long-context scenarios: experiments indicate that LSTM performance falls behind the Transformer once context length exceeds roughly 1000 tokens, underscoring the importance of context length in model design [2][6].
- In the current Agentic era, many tasks require long contexts to execute complex instructions, making architectures with lower per-position loss more technically capable [2][6].

Group 3: Aesthetic Considerations in AI
- Developing AI models is not only a technical challenge but also an aesthetic one: a model embodies a worldview and values, akin to the notion of "Taste" articulated by influential figures such as Steve Jobs [3][7].
- Each model generates unique, non-interchangeable tokens; intelligence produced by different roles (e.g., a CEO vs. a designer) varies significantly, so the space of possible "Tastes" grows exponentially [4][8].
Rejection ≠ Failure! These High-Impact Papers Were All Rejected by Top Conferences
具身智能之心 (Heart of Embodied Intelligence) · 2025-12-12 01:22
Core Insights
- Waymo has released a detailed blog on its AI strategy, centered on its foundation model and emphasizing the use of distillation to create high-efficiency models for onboard operation [1][2]
- Jeff Dean highlighted the significance of knowledge distillation, comparing it to the creation of the Gemini Flash model and showcasing distillation's importance for AI model efficiency [1][2]

Historical Context of Rejected Papers
- Many foundational AI technologies, such as optimizers for large models and computer vision techniques, were initially rejected by top conferences, showing a historical pattern of failing to recognize groundbreaking innovation [6]
- Notable figures in AI, including Geoffrey Hinton and Yann LeCun, have had pioneering work rejected that was later recognized as transformative [6]

Case Studies of Rejected Innovations
- LSTM, a milestone for sequence-data processing, was rejected by NIPS in 1996 but later became crucial to speech recognition and machine translation, highlighting the delayed recognition of its value [7][10]
- SIFT, a dominant algorithm in computer vision, was rejected by ICCV and CVPR for perceived complexity, yet proved vital in real-world image processing [11][13]
- Dropout, a key regularization method for deep neural networks, was initially rejected as too radical but later became essential for training deep networks effectively [17][19]
- Word2Vec, despite rejection at ICLR, became a cornerstone of NLP thanks to its efficiency and practical utility, eventually receiving recognition for its impact [20][24]
- YOLO transformed object detection by prioritizing speed over precision; it was rejected for its perceived shortcomings but later became a widely adopted framework in industry [28][30]

Reflection on Peer Review Limitations
- The peer review system often struggles to recognize disruptive innovation, producing a systematic cognitive lag in evaluating groundbreaking research [40][41]
- The tendency to equate mathematical complexity with research contribution can hinder acceptance of simpler yet effective methods [41]
- Historical examples show that a work's true impact is determined not by initial peer-review outcomes but by its long-term relevance and problem-solving power [43][47]
AI-Empowered Asset Allocation (29): A Guide to AI Stock Price Prediction, with TrendIQ as an Example
Guoxin Securities· 2025-12-03 13:18
Core Insights
- The report emphasizes the growing importance of AI in asset allocation, particularly in stock price prediction, highlighting how AI models such as TrendIQ address the limitations of traditional machine-learning approaches [3][4][10].

Group 1: AI in Stock Price Prediction
- The introduction of large AI models has significantly improved stock price prediction by effectively collecting and analyzing unstructured information that traditional models struggled with [3][4].
- TrendIQ is presented as a mature financial asset price prediction platform offering both local and web-based deployment to suit different user needs [4][10].
- The report traces the evolution of predictive models from LSTM to more advanced architectures such as the Transformer, which handle complex financial data better and improve predictive accuracy [5][10].

Group 2: Model Mechanisms and Limitations
- LSTM has been the preferred model for stock price prediction because it handles non-linear, time-series data, but it suffers from limitations such as single modality and weak interpretability [6][7].
- The report outlines combining LSTM with other models such as XGBoost and deep reinforcement learning to enhance predictive capability and address some of LSTM's shortcomings [6][10].
- The Transformer architecture is noted for its global context awareness and its ability to perform zero-shot and few-shot learning, which broadens its applicability in financial prediction [8][10].

Group 3: TrendIQ Implementation
- The report details the implementation of TrendIQ, including a complete framework for data preparation, model training, and user interaction through a web application [12][20].
- The training process involves collecting historical stock data, preprocessing it, and training the LSTM model so that users can make predictions through a user-friendly interface [12][20].
- The app integrates components such as real-time data fetching and prediction functionality, allowing users to engage interactively with the predictive model [20][28].

Group 4: Future Directions
- The report anticipates that future AI stock prediction will focus on multi-modal integration, combining visual data from candlestick charts, textual analysis of financial news, and numerical price sequences [39][40].
- It highlights the potential for real-time knowledge integration, suggesting that future AI models will adapt to new information dynamically, improving robustness and accuracy [40][41].
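The report does not publish TrendIQ's code, but the data-preparation step it describes (collect historical prices, preprocess, feed the LSTM) is conventionally done with min-max scaling and a sliding window. The sketch below illustrates that step only; the window length of 30 and the scaling scheme are illustrative assumptions, not details from the report.

```python
import numpy as np

def make_lstm_windows(prices, window=30):
    """Turn a 1-D price series into (X, y) supervised pairs for an LSTM:
    each sample is `window` consecutive scaled prices and the target is
    the next scaled price.  Returns the scaler bounds so predictions can
    be mapped back to price space."""
    prices = np.asarray(prices, dtype=float)
    lo, hi = prices.min(), prices.max()
    scaled = (prices - lo) / (hi - lo)           # min-max scale to [0, 1]
    X = np.stack([scaled[i : i + window] for i in range(len(scaled) - window)])
    y = scaled[window:]                          # next-step targets
    return X[..., None], y, (lo, hi)             # feature axis: (N, window, 1)

# Example: 100 days of synthetic prices -> 70 training windows of length 30.
prices = 100 + np.cumsum(np.random.default_rng(0).normal(0, 1, 100))
X, y, (lo, hi) = make_lstm_windows(prices, window=30)
print(X.shape, y.shape)   # (70, 30, 1) (70,)
```

The `(N, window, 1)` shape matches the `(batch, time, features)` layout that common LSTM implementations expect; the saved `(lo, hi)` bounds invert the scaling after prediction.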
Treasury Bond Futures Series Report: Applying Multi-Channel Deep Learning Models to Factor Timing in Treasury Bond Futures
Guo Tai Jun An Qi Huo· 2025-08-28 08:42
1. Report Industry Investment Rating
No relevant content provided.

2. Core Viewpoints of the Report
- The report proposes a dual-channel deep-learning model (LSTM and GRU) that integrates daily-frequency and minute-frequency data. It effectively captures market information on different time scales, significantly improves out-of-sample prediction accuracy and strategy stability (especially during market downturns), and offers a well-generalizing new approach to rebuilding the bond market's quantitative timing system [2].
- The dual-channel model shows excellent generalization and robustness in out-of-sample tests and maintains a high win rate in bear markets, compensating for the failure of traditional factors during downturns [3].
- Within the multi-factor timing framework, the weight of deep-learning factors should be kept relatively low, with machine-learning factors playing a supplementary role, to balance interpretability and performance improvement [43][44].

3. Summary by Relevant Catalogs
3.1 Deep-Learning Model Introduction
- Traditional quantitative factors in the bond market have weakened in recent years, so bond-market factors need to be rebuilt and re-mined. Deep learning can uncover complex relationships in the data, and RNN, LSTM, and GRU are considered suitable for the Treasury bond futures timing task [7][8].
- RNN can process time-series data but suffers from vanishing gradients on long sequences [9].
- LSTM solves the vanishing-gradient problem with a cell state and three gating units, enabling it to learn long-range dependencies in sequences [15].
- GRU simplifies the LSTM structure, reducing the number of learnable parameters, and offers high parameter efficiency and fast training [19].
- A dual-channel model processes daily-frequency and minute-frequency data simultaneously to extract features on different time scales and predict daily-frequency Treasury bond futures returns, which also reduces overfitting risk [22].

3.2 Treasury Bond Futures Timing Test
3.2.1 Back-testing Settings
- The target variable is the open-to-open return of the 10-year Treasury bond futures contract; the back-test runs from January 2016 to August 2025 with daily rebalancing, 100% margin, 1x leverage, and a round-trip fee of 0.01% [25][26][27].
3.2.2 Daily-frequency Channel Model
- The single daily-frequency channel model performs well in-sample but poorly out-of-sample, with clear overfitting [33].
3.2.3 Dual-channel Model
- The dual-channel model fuses multi-frequency time-series information. Adding minute-frequency information significantly improves out-of-sample prediction, enhances generalization and stability, and maintains a relatively high win rate on both long and short positions [40][41][42].

3.3 Deep-Learning Allocation in the Multi-factor Framework
- Deep-learning factors deliver strong performance in the multi-factor timing framework but carry overfitting risk and lack interpretability, so their weight should be kept relatively low, with machine-learning factors in a supplementary role [43][44].

3.4 Conclusion
- The report explores deep-learning models for Treasury bond futures quantitative timing and proposes a dual-channel deep-learning framework based on multi-frequency data fusion that effectively improves multi-factor strategy performance [45].
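The report does not disclose its architecture details, but the dual-channel idea it describes can be sketched as two recurrent encoders (an LSTM for the daily channel, a GRU for the minute channel) whose final hidden states are concatenated and fed to a linear head. The NumPy version below is a minimal, untrained forward pass: the dimensions, random initialization, and gate ordering are illustrative assumptions; a real implementation would use a trained framework model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_last_hidden(xs, Wx, Wh, b, hidden):
    """Run a single-layer LSTM over xs (T, d_in) and return the final
    hidden state.  Weights stack the four gates (i, f, g, o) row-wise:
    Wx: (4h, d_in), Wh: (4h, h), b: (4h,)."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in xs:
        z = Wx @ x + Wh @ h + b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)      # cell state carries long-range memory
        h = o * np.tanh(c)
    return h

def gru_last_hidden(xs, Wx, Wh, b, hidden):
    """Run a single-layer GRU over xs (T, d_in) and return the final
    hidden state.  Weights stack the gates (z, r, n) row-wise."""
    h = np.zeros(hidden)
    for x in xs:
        zx, rx, nx = np.split(Wx @ x + b, 3)
        zh, rh, nh = np.split(Wh @ h, 3)
        zt = sigmoid(zx + zh)           # update gate
        rt = sigmoid(rx + rh)           # reset gate
        nt = np.tanh(nx + rt * nh)      # candidate state
        h = (1 - zt) * nt + zt * h
    return h

def dual_channel_predict(daily, minute, params):
    """Fuse the two channels: concatenate the daily-channel LSTM state
    and the minute-channel GRU state, then apply a linear head to
    predict the next daily-frequency return."""
    h_day = lstm_last_hidden(daily, *params["lstm"])
    h_min = gru_last_hidden(minute, *params["gru"])
    fused = np.concatenate([h_day, h_min])
    return float(params["w_out"] @ fused + params["b_out"])

# Toy dimensions: 20 daily bars x 5 features, 240 minute bars x 3 features.
rng = np.random.default_rng(0)
H = 8
params = {
    "lstm": (rng.normal(0, 0.1, (4 * H, 5)), rng.normal(0, 0.1, (4 * H, H)),
             np.zeros(4 * H), H),
    "gru": (rng.normal(0, 0.1, (3 * H, 3)), rng.normal(0, 0.1, (3 * H, H)),
            np.zeros(3 * H), H),
    "w_out": rng.normal(0, 0.1, 2 * H),
    "b_out": 0.0,
}
pred = dual_channel_predict(rng.normal(size=(20, 5)), rng.normal(size=(240, 3)), params)
print(np.isfinite(pred))
```

The design point the report makes is visible here: each channel keeps its native sampling frequency and only the learned summaries are fused, so minute-level microstructure can inform a daily-frequency prediction without forcing both series onto one time grid.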
MicroCloud Hologram (NASDAQ: HOLO) Proposes LSTM-Based Cryptocurrency Price Prediction Technology: An Intelligent Engine for Investment Decisions
Cai Fu Zai Xian· 2025-08-06 03:01
Core Insights
- The rise of blockchain technology has made cryptocurrencies an important part of the financial sector, but the market's lack of regulation and high volatility pose significant risks for investors [1]
- Traditional financial forecasting methods struggle with the non-linear, complex nature of cryptocurrency price data, underscoring the need for more advanced predictive techniques [1]
- Deep-learning algorithms, particularly Long Short-Term Memory (LSTM) networks, offer a promising approach to cryptocurrency price prediction by effectively capturing long-term dependencies and complex patterns [1]

Data Collection and Processing
- The company gathered extensive historical trading data from multiple authoritative sources, covering various time periods, trading platforms, and cryptocurrency types, including price, volume, and market depth [2]
- Data quality and reliability were ensured through rigorous cleaning and preprocessing: removing duplicates, errors, and outliers, and normalizing and standardizing the data for model input [2]

Model Development
- A specialized LSTM neural network was constructed to address the weaknesses of traditional RNNs, using gating mechanisms to mitigate vanishing and exploding gradients [2]
- The model's architecture and parameters were tailored to the specific requirements of cryptocurrency price prediction, and various optimization algorithms were used to minimize prediction error during training [2]

Performance Evaluation and Optimization
- Model performance was assessed with multiple metrics, including Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE), leading to further optimization and tuning to improve prediction accuracy [2]
- Regularization and dropout were used to prevent overfitting and ensure the model's robustness [2]

Future Directions
- The company plans to explore and integrate new technologies and algorithms, such as reinforcement learning and Generative Adversarial Networks (GANs), to further improve the accuracy and generalization of its price prediction models [4]
- It also emphasizes strengthening data processing and analysis via big data, cloud computing, and IoT technologies, providing stronger technical support for price forecasting [4]
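The article names MSE and MAPE as evaluation metrics without giving formulas; the standard definitions are sketched below on toy data (the price values are illustrative, not from the company's evaluation).

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average squared deviation, in squared price units."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent.  Undefined when any
    true value is zero, which rarely happens for prices."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

actual    = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 102.0, 104.0]
print(mse(actual, predicted))            # -> 1.0
print(round(mape(actual, predicted), 3)) # -> 0.981
```

MAPE's scale-free percentage form is why it is commonly paired with MSE for prices: MSE is sensitive to large absolute errors on expensive assets, while MAPE stays comparable across coins trading at very different price levels.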