Model Training
Alarum Technologies (ALAR) - 2025 Q2 - Earnings Call Transcript
2025-08-28 13:30
Financial Data and Key Metrics Changes
- The company reported second-quarter revenue of $8.8 million, a slight decrease from $8.9 million in the same period last year, attributed to a shift in customer mix toward the AI segment [16][19]
- Non-IFRS gross margin for Q2 2025 was 63%, down from 78% in Q2 2024, reflecting the impact of strategic investments and lower margins from new projects [17]
- Non-IFRS net profit was $300,000 in Q2 2025, compared with a non-IFRS net loss of $400,000 in Q2 2024 [19]
- Adjusted EBITDA for Q2 2025 was $1 million, down from $3.4 million in Q2 2024 [19]

Business Line Data and Key Metrics Changes
- The company is seeing significant growth in the AI segment, which is replacing customers from other segments, resulting in a net retention rate (NRR) of 0.98 [16]
- New projects with major AI and e-commerce platforms have launched, marking a shift toward larger deal sizes and greater revenue potential [7][8]

Market Data and Key Metrics Changes
- Demand for data collection services is rising, driven by the need for AI training data, positioning the company favorably in the evolving market landscape [6][9]
- The customer base now includes major tech giants and emerging startups, indicating a broadening market reach [7]

Company Strategy and Development Direction
- The company is reinvesting earnings into scaling operations, expanding infrastructure, and broadening its IP proxy network to capture long-term value from major AI-driven customers [16][13]
- The focus is on building a robust talent pool and developing a suite of data collection products designed for the AI era, with the aim of cross-selling to existing customers [11][13]

Management's Comments on Operating Environment and Future Outlook
- Management highlighted the dynamic and unpredictable nature of the AI market, urging investors to evaluate the company's performance over multiple quarters rather than quarter by quarter [12]
- The company anticipates revenue for 2025 of at least $12.8 million, representing a 78% year-over-year increase, with adjusted EBITDA expected to be around $1.1 million [22]

Other Important Information
- The company has a strong balance sheet, with cash and liquid investments of approximately $25 million, allowing strategic investments while maintaining a focus on sustainable value creation [14][21]
- The company is in a transition phase, with operating expenses rising to $5.4 million due to higher employee-related costs, particularly in R&D [18]

Q&A Session Summary
Question: Clarification on the large customer ramp in Q3
- Management explained that lower margins stem from the new product's infrastructure costs, which are currently high given the scale of the project [27][28]
Question: Infrastructure costs and margin recovery
- Management indicated that significant volume increases would be needed to recover margins, with cost-structure improvements expected as the project scales [31][32]
Question: Broader customer base usage trends
- Management noted a significant increase in demand from AI and data-driven customers, with a strong pipeline of new logos expected [36][38]
Question: Customer lifetime value and stability
- Management expressed optimism that the new AI-driven customer base could deliver higher customer lifetime value and greater stability over time [42][47]
Question: Contribution of the large customer to Q2 results
- Management confirmed that the large customer has been ramping up and is already contributing a respectable amount to revenue [51]
Question: Visibility into projected revenues
- Management expressed confidence in the projected $3 million of revenue for Q3, with ongoing demand expected [56][57]
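The 0.98 net retention rate cited above follows from the standard NRR formula: revenue retained from the existing customer base divided by that base's starting revenue. A minimal sketch, with hypothetical revenue components chosen only to land near 0.98 (the call does not disclose the underlying figures):

```python
def net_retention_rate(start_arr: float, expansion: float,
                       contraction: float, churn: float) -> float:
    """NRR = (starting recurring revenue + expansion
              - contraction - churn) / starting recurring revenue."""
    return (start_arr + expansion - contraction - churn) / start_arr

# Hypothetical figures (in $M) chosen only to illustrate an NRR near 0.98:
nrr = net_retention_rate(start_arr=8.9, expansion=1.0,
                         contraction=0.4, churn=0.78)
print(round(nrr, 2))  # 0.98
```

An NRR just below 1.0 is consistent with the narrative above: AI-segment growth is nearly, but not quite, offsetting revenue lost from other segments.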
Hot Topic! A Mysterious Bug Appears in DeepSeek V3.1: Is the Model Malfunctioning?
程序员的那些事· 2025-08-26 12:35
Core Viewpoint
- The release of DeepSeek V3.1 brings significant improvements in reasoning efficiency and memory usage, but it also exhibits an unexpected bug: tokens such as "极" and "extreme" appear at random during text generation [1][2][25]

Group 1: Version Improvements
- DeepSeek V3.1 features a hybrid reasoning architecture that improves reasoning efficiency by 20%-50% and supports 128K long-context processing [1]
- The update adopts the UE8M0 FP8 parameter precision format, cutting memory usage by 75% [1]
- The model is now compatible with domestic next-generation chips, reducing reliance on imported GPUs [1]

Group 2: User Feedback and Issues
- Users report that the V3.1 model inserts unexpected tokens such as "极" and "extreme" at random during text generation [2][12]
- The issue has been observed across platforms, including third-party APIs such as VolcEngine and even the DeepSeek official website, with a higher occurrence rate on third-party platforms [12][15]
- Developers are puzzled because the model fails to fix the token insertions even when explicitly prompted to do so [3][12]

Group 3: Technical Analysis
- Some analysts suggest the appearance of the token "极" (token ID: 2577) may be residue from the training data, pointing to a possible flaw in data cleaning [25][26]
- The model may have learned to treat "极" as a semantic boundary marker because of its presence in the training data, leading to its random appearance in outputs [25][26]
- The issue reflects a broader concern that large models may not genuinely understand language, but instead learn statistical patterns from data [27][28]
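One practical way to quantify this kind of misbehavior is to scan a batch of model outputs for the suspect tokens and measure how often they appear. The sketch below is illustrative only; the sample strings are hypothetical, not actual DeepSeek outputs:

```python
from collections import Counter

def anomalous_token_rate(outputs, suspects=("极", "extreme")):
    """Return the fraction of outputs containing any suspect token,
    plus per-token occurrence counts across the batch."""
    hits = Counter()
    affected = 0
    for text in outputs:
        found = [s for s in suspects if s in text]
        if found:
            affected += 1
            hits.update(found)
    return affected / len(outputs), dict(hits)

# Hypothetical outputs mimicking the reported bug:
samples = [
    "for i in range(10):极 print(i)",
    "The result is 42.",
    "def f(x): return x * extreme 2",
]
rate, counts = anomalous_token_rate(samples)
print(counts)  # {'极': 1, 'extreme': 1}
```

Running a check like this against the same prompts on different serving platforms would make the reported official-site vs. third-party discrepancy directly measurable.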
GPT-oss Goes Off the Rails: It Invents a Programming Problem Unprompted, Then Solves It 5,000 Times
量子位· 2025-08-11 08:32
Core Viewpoint
- The article discusses the peculiar behaviors and hallucinations exhibited by the GPT-oss model, particularly in its problem-solving and language processing, suggesting it may have been over-optimized for specific reasoning tasks at the cost of natural outputs [1][33]

Group 1: Model Behavior and Performance
- GPT-oss generated a complex programming problem about domino placement on a grid without any prompt, consuming over 30,000 tokens in the process [2][17]
- The model repeated this problem-solving behavior over 5,000 times, indicating the task is deeply bound to its training objectives and that training may have skewed toward specific reasoning tasks [19]
- The model's outputs lean strongly toward mathematics and coding rather than natural language or casual conversation, suggesting it was not designed for everyday dialogue [13][11]

Group 2: Training Data and Language Processing
- Analysis of the training data shows broad coverage of programming languages, with a notably high representation of Perl, though the author questioned the reported proportions of Java and Kotlin [7][9]
- The model frequently switches between multiple languages mid-reasoning, sometimes drifting into a unique internal idiom dubbed "Neuralese", indicating complex internal processing mechanisms [21][23]
- Anomalies in the model's outputs, such as unusual symbols and references, may stem from OCR processing of training data, introducing errors or misinterpretations [25][27]

Group 3: Hallucination Rates and Limitations
- Hallucination rates are notably high: the 20-billion-parameter model showed a 91.4% hallucination rate in certain evaluations [34]
- The model has generated non-existent theories, such as a "quantum gravity wave theory", highlighting its limitations outside mathematical or programming contexts [36][37]
- Performance on everyday tasks is inconsistent, often failing at casual conversation or producing irrelevant outputs [37]
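A hallucination rate like the 91.4% figure above is, at bottom, a fraction of graded responses. A minimal sketch of that bookkeeping, using a hypothetical grading run of 500 responses sized to reproduce the cited figure (the actual evaluation set and grader are not described in the article):

```python
def hallucination_rate(judgments):
    """Fraction of graded responses judged hallucinated.
    `judgments` is a list of booleans from a (human or automated) grader:
    True = the response contained a hallucination."""
    if not judgments:
        raise ValueError("no graded responses")
    return sum(judgments) / len(judgments)

# Hypothetical run: 457 of 500 responses judged hallucinated -> 91.4%
print(hallucination_rate([True] * 457 + [False] * 43))  # 0.914
```

The hard part in practice is the grader itself, not the arithmetic: deciding what counts as a hallucination dominates the error bars on any such percentage.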
Tencent Files Patent on Model Training and Information Delivery to Improve the Accuracy of Delivery Prediction Models
Jin Rong Jie· 2025-08-07 03:21
Core Insights
- Tencent Technology (Shenzhen) Co., Ltd. has applied for a patent titled "Model Training Method, Information Delivery Method, Device, Equipment, and Medium", publication number CN120430833A, filed in February 2024 [1]
- The patent describes a method that obtains positive samples, negative samples, and unlabeled samples, which are used to train a label prediction model and a delivery prediction model [1]

Company Overview
- Tencent Technology (Shenzhen) Co., Ltd. was established in 2000, is located in Shenzhen, and is primarily engaged in software and information technology services [1]
- The company has a registered capital of 2 million USD, has invested in 15 enterprises, participated in 263 bidding projects, and holds 5000 trademark records and 5000 patent records [1]
- The company also holds 527 administrative licenses [1]
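The abstract's combination of positive, negative, and unlabeled samples resembles a pseudo-labeling setup: a label model trained on the labeled samples tags the unlabeled pool, and the enlarged set then trains the delivery model. The sketch below uses toy nearest-centroid classifiers for both models; the feature vectors and classifiers are invented for illustration and are not disclosed in the patent:

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    dims = len(rows[0])
    return [sum(r[d] for r in rows) / len(rows) for d in range(dims)]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pseudo_label(pos, neg, unlabeled):
    """Stage 1: 'label prediction model' = nearest-centroid classifier
    assigning each unlabeled sample a True (positive) / False label."""
    cp, cn = centroid(pos), centroid(neg)
    return [(x, dist2(x, cp) < dist2(x, cn)) for x in unlabeled]

def train_delivery_model(pos, neg, unlabeled):
    """Stage 2: fold the pseudo-labeled samples into the training set,
    then fit the 'delivery prediction model' (again a centroid pair)."""
    extra = pseudo_label(pos, neg, unlabeled)
    all_pos = pos + [x for x, is_pos in extra if is_pos]
    all_neg = neg + [x for x, is_pos in extra if not is_pos]
    cp, cn = centroid(all_pos), centroid(all_neg)
    return lambda x: dist2(x, cp) < dist2(x, cn)  # True = deliver

model = train_delivery_model(
    pos=[(0.9, 0.8), (0.8, 0.9)],
    neg=[(0.1, 0.2), (0.2, 0.1)],
    unlabeled=[(0.7, 0.7), (0.3, 0.2)],
)
print(model((0.85, 0.9)))  # True
```

The point of the two-stage split is that unlabeled delivery data is usually far more plentiful than labeled outcomes, so pseudo-labeling lets the final model train on much more data than the labels alone provide.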
Tencent Files Patent on a Model Training Method, Device, Electronic Equipment, and Storage Medium to Improve Model Inference Accuracy
Jin Rong Jie· 2025-08-05 13:22
Group 1
- Tencent Technology (Shenzhen) Co., Ltd. has applied for a patent titled "Model Training Method, Device, Electronic Equipment, and Storage Medium", publication number CN120431962A, filed in June 2025 [1]
- The patent describes obtaining sample data sets for multiple training stages, sorted by training difficulty from easy to hard, and training an initial model on these data sets stage by stage [1]
- The multi-stage process aims to improve inference accuracy by optimizing the model based on the correctness of its inference results and applying reinforcement learning to reach a target model [1]

Group 2
- Tencent Technology (Shenzhen) Co., Ltd. was established in 2000 and is primarily engaged in software and information technology services, with a registered capital of 2 million USD [2]
- The company has invested in 15 enterprises and participated in 263 bidding projects, holding 5000 trademark records and 5000 patent records, along with 527 administrative licenses [2]
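The easy-to-hard staging the abstract describes is essentially curriculum learning with a reward signal on inference correctness. A hedged sketch of that loop, where `ToyModel`, the difficulty scores, and the update rule are placeholders rather than anything the patent specifies:

```python
class ToyModel:
    """One-parameter threshold classifier standing in for the model."""
    def __init__(self):
        self.t = 0.0
    def infer(self, x):
        return x > self.t
    def update(self, sample, reward):
        if reward < 0:
            # nudge the threshold toward classifying this sample correctly
            self.t += 0.1 if sample["y"] is False else -0.1

def staged_training(model, samples, n_stages=3, epochs_per_stage=2):
    """Sort samples by difficulty (easy -> hard), split into stages,
    and train on each stage in order, rewarding correct inferences."""
    ordered = sorted(samples, key=lambda s: s["difficulty"])
    stage_size = len(ordered) // n_stages
    for stage in range(n_stages):
        batch = ordered[stage * stage_size:(stage + 1) * stage_size]
        for _ in range(epochs_per_stage):
            for s in batch:
                correct = model.infer(s["x"]) == s["y"]
                # stand-in for the reinforcement-learning step
                model.update(s, reward=1.0 if correct else -1.0)
    return model

# Samples whose difficulty grows as they approach the 0.5 boundary:
samples = [
    {"x": 0.9,  "y": True,  "difficulty": 1},
    {"x": 0.1,  "y": False, "difficulty": 1},
    {"x": 0.6,  "y": True,  "difficulty": 2},
    {"x": 0.4,  "y": False, "difficulty": 2},
    {"x": 0.55, "y": True,  "difficulty": 3},
    {"x": 0.45, "y": False, "difficulty": 3},
]
trained = staged_training(ToyModel(), samples)
print(trained.infer(0.9), trained.infer(0.1))  # True False
```

The intuition behind the staging is the same as in curriculum learning generally: coarse decisions are locked in on easy examples before the hard, near-boundary examples fine-tune them.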
Tencent Files Patent on a Model Training Method to Keep the Target Model's Iteration Direction Correct
Jin Rong Jie· 2025-08-05 07:19
Group 1
- Tencent Technology (Shenzhen) Co., Ltd. applied for a patent titled "Model Training Method, Device, Equipment, Storage Medium, and Computer Program Product", publication number CN120409606A, filed in April 2025 [1]
- The application describes determining a target model's first preference difference on training samples and a reference model's second preference difference on the same samples [1]
- Total loss values are computed from the preference differences and the model parameters updated accordingly, ensuring the target model iterates in the correct direction [1]

Group 2
- Tencent Technology (Shenzhen) Co., Ltd. was established in 2000 and is primarily engaged in software and information technology services, with a registered capital of 2 million USD [2]
- The company has invested in 15 enterprises and participated in 263 bidding projects, holding 5000 trademark records and 5000 patent records [2]
- The company also holds 527 administrative licenses [2]
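The pairing of a target model's preference difference with a reference model's preference difference closely resembles the pairwise loss popularized by DPO (Direct Preference Optimization). The sketch below shows that style of loss as an assumption about what the patent's "total loss" might look like, not as the patent's actual formula:

```python
import math

def preference_loss(target_logp_chosen, target_logp_rejected,
                    ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO-style pairwise loss: the target model's preference margin
    ("first difference") is anchored against the reference model's
    margin ("second difference"), which keeps each update from
    drifting away from the reference, i.e. iterating in the wrong
    direction."""
    target_diff = target_logp_chosen - target_logp_rejected
    ref_diff = ref_logp_chosen - ref_logp_rejected
    margin = beta * (target_diff - ref_diff)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid

# A target that prefers the chosen sample more strongly than the
# reference does earns a loss below log(2):
print(preference_loss(-1.0, -3.0, -1.5, -2.5) < math.log(2))  # True
```

Averaging this loss over a batch of preference pairs and backpropagating gives exactly the shape of update the abstract describes: the total loss shrinks only when the target's preferences move in the direction the labeled pairs endorse.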
Zhou Hongyi: 360 Has Recently Been Buying Huawei Chips; Domestic Chips Offer Strong Value for Money
Nan Fang Du Shi Bao· 2025-07-23 14:03
Group 1
- Zhou Hongyi acknowledged the gap between domestic chips and Nvidia's, but stressed that domestic products must be used for them to improve [1]
- 360 Group has recently procured Huawei chip products, signaling a shift toward domestic technology [1]
- Nvidia's H20 chip has been approved for sale to China; it is better suited to model inference, which still leaves opportunities for domestic AI chips [2]

Group 2
- DeepSeek contributed significantly to the popularity of reasoning models, though it recently saw a decline in monthly active users [2]
- The decline in DeepSeek's application traffic is not purely negative, as many cloud vendors still rely on DeepSeek's model services [2]
- The performance gains of open-source models have laid the foundation for this year's boom in AI agents, which are seen as key to putting AI into practice [3]

Group 3
- AI coding has emerged as a hot vertical for AI agents, with a focus on engineering capabilities such as context engineering and prompt engineering [3]
- Developing specialized AI agents tailored to different industries is recommended as a way to build unique technical barriers [3]
- The potentially disruptive future of AI agents has driven significant changes in companies' operational strategies, with a push for efficiency through AI [3]
China Mobile's Shandong Subsidiary and Parent Company File Patent on a Model Training and Question-Answering Method That Yields a Fully Trained QA Model
Jin Rong Jie· 2025-05-24 04:49
Group 1
- China Mobile Communication Group Shandong Co., Ltd. applied for a patent titled "Model Training Method and Question-Answer Method", publication number CN120030353A, filed in March 2025 [1]
- The patent describes determining the output result's second modality from modal parameters and a preset question posed in a first modality, where the modality represents the form of the question [1]
- The method generates a target answer for the preset question and adjusts the question-answer model's modal parameters until the training-completion criteria are met [1]

Group 2
- China Mobile Communication Group Shandong Co., Ltd. was established in 2000 in Jinan, with a registered capital of 6341.85 million RMB [2]
- The Shandong company has made one external investment, participated in 5000 bidding projects, and holds 617 patent records [2]
- China Mobile Communication Group Co., Ltd. was founded in 1999 in Beijing, with a registered capital of 30000 million RMB [2]
- The parent company has made 51 external investments, participated in 5000 bidding projects, and holds 5000 patent records [2]
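The loop in the abstract, picking an output modality from modal parameters and the question's input modality, then adjusting those parameters until answers match their targets, can be caricatured with a learnable score table. Everything here (the modality names, the update rule, the completion criterion) is invented for illustration; the patent does not disclose these details:

```python
def train_modal_params(examples, modalities=("text", "image", "audio"),
                       max_rounds=100):
    """Learn 'modal parameters': a score per (input modality,
    output modality) pair. Training stops when the selected output
    modality matches the target for every example (the stand-in
    'training-completion criterion')."""
    params = {(i, o): 0.0 for i in modalities for o in modalities}
    for _ in range(max_rounds):
        wrong = 0
        for q_modality, target_modality in examples:
            scores = {o: params[(q_modality, o)] for o in modalities}
            predicted = max(scores, key=scores.get)  # pick output modality
            if predicted != target_modality:
                wrong += 1
                params[(q_modality, target_modality)] += 1.0  # reinforce
        if wrong == 0:
            return params
    return params

# Hypothetical supervision: text questions should get image answers,
# audio questions should get text answers.
examples = [("text", "image"), ("audio", "text")]
params = train_modal_params(examples)
print(params[("text", "image")] > params[("text", "text")])  # True
```

In the real system the answer content would be generated by the QA model itself; this table only captures the modality-selection part of the abstract's loop.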