OpenFold3
Tencent Research Institute AI Express 20251031
Tencent Research Institute · 2025-10-30 16:06
Group 1: OpenAI Developments
- OpenAI has open-sourced the gpt-oss-safeguard safety classification models in 120-billion and 20-billion-parameter versions, which can interpret policy documents directly to classify content without retraining [1]
- The models outperform GPT-5-thinking on multiple benchmarks, achieving industry-best cost-effectiveness on content moderation evaluation sets and the ToxicChat dataset [1]
- OpenAI already uses this technology internally (the Safety Reasoner prototype) for image generation and products such as Sora 2, with safety reasoning accounting for 16% of compute [1]

Group 2: Cursor 2.0 Update
- Cursor has released version 2.0, introducing its first self-developed coding model, Composer, which generates 250 tokens per second, four times faster than comparable leading systems [2]
- Composer uses a mixture-of-experts (MoE) architecture optimized for software engineering through reinforcement learning, and posts frontier-level results on Cursor Bench evaluations [2]
- The new interface supports parallel multi-agent collaboration, letting different models work on the same task simultaneously in git worktrees or on remote machines, and includes native browser tools for testing iterations [2]

Group 3: Sora New Features
- Sora has launched the Character Cameo feature, which keeps non-human cameo characters consistent and allows virtual characters to be extracted from generated videos for reuse [3]
- New video stitching functionality and community leaderboards have been added, ranking the most-used cameo characters and the most-remixed videos [3]
- Sora has temporarily lifted the invitation code requirement, allowing direct registration in the US, Canada, Japan, and South Korea, coinciding with the launch of its Android version to capture the Android market [3]

Group 4: MiniMax Speech 2.6 Update
- MiniMax Speech 2.6 has achieved an end-to-end latency of under 250 milliseconds, an industry-leading level, and has become the underlying technology engine for global voice platforms such as LiveKit and Pipecat [4]
- The new version directly converts non-standard text formats such as URLs, emails, phone numbers, dates, and amounts, without cumbersome text preprocessing [4]
- The Fluent LoRA feature generates fluent, natural speech even from source recordings with accents or non-native fluency, and supports over 40 languages [4]

Group 5: Emu3.5 Launch
- Beijing Zhiyuan (the Beijing Academy of Artificial Intelligence, BAAI) has released the Emu3.5 multimodal world model, a 34-billion-parameter dense transformer pre-trained on over 10 trillion tokens (approximately 790 years of video), revealing the "multimodal scaling paradigm" for the first time [5]
- It is trained with a "next state prediction" objective to achieve visual narrative and visual guidance capabilities, and matches Gemini-2.5-Flash-Image on image editing tasks [5]

Group 6: OpenAI IPO Plans
- OpenAI plans to file for an IPO as early as the second half of 2026, aiming to raise at least $60 billion at a valuation that could reach $1 trillion, which would make it the largest IPO in history [6]
- Following a restructuring, the non-profit organization will hold 26% of the newly formed OpenAI Group, while Microsoft will relinquish its priority as exclusive cloud provider but will receive an additional $250 billion Azure procurement commitment [6]
- The new agreement stipulates that any declaration of AGI must be verified by independent experts, and extends Microsoft's rights to use OpenAI technology until 2032 while allowing Microsoft to pursue AGI research independently or with third parties [6]

Group 7: OpenFold3 Release
- The OpenFold Consortium has released a preview of OpenFold3, trained on over 300,000 experimental structures and 13 million synthetic structures, which can predict interactions between proteins and small-molecule ligands as well as nucleic acids [7]
- In single-stranded RNA structure prediction, its performance rivals that of AlphaFold3, and its modular design lets users adapt the model to their own data [7]
- All components are licensed under Apache 2.0, which permits commercial use; companies including Novo Nordisk, Outpace Bio, and Bayer plan to use the model to accelerate research [7]

Group 8: Anthropic Research Findings
- Anthropic's latest research shows that Claude can detect and report concepts that researchers inject into its internal activations, with the strongest models succeeding about 20% of the time [8]
- When a concept was injected retroactively to falsify the model's internal state, the model defended its "errors" and fabricated reasons for them [8]
- Experiments indicate that the models have some deliberate control over their internal representations, marking the emergence of "access consciousness," though this remains far from subjective experience or "phenomenal consciousness" [8]

Group 9: Grokking Research Insights
- Former Meta FAIR head Tian Yuandong published research on grokking, proving mathematically that models need only O(M log M) samples to generalize, significantly fewer than the traditional M² requirement [9]
- He showed that the essence of this sudden "insight" is a multi-peak non-convex optimization process: as data increases, the "generalization peak" rises above the "memorization peak," and the model shifts from memorization to generalization [9]
- Tian emphasized that representation learning is the foundation of all intelligent capabilities; the loss function is merely a proxy signal for optimization, and true breakthroughs come from changes in how things are represented [9]
Bristol-Myers Squibb, Takeda, Astex Join AI Consortium to Train OpenFold3 for Accelerated Drug Discovery
Yahoo Finance· 2025-10-03 09:33
Group 1
- Bristol-Myers Squibb, Takeda Pharmaceuticals, and Astex Pharmaceuticals have joined an AI consortium that pools proprietary data for drug discovery [1][3]
- The companies will contribute several thousand experimentally determined protein-small molecule structures to train an AI model named OpenFold3 [2][3]
- The initiative uses a federated data sharing model provided by Apheris, which allows diverse datasets to be aggregated securely while sensitive data stays in its original location [2][3]

Group 2
- The goal of the initiative is to improve OpenFold3's accuracy in predicting interactions between proteins and small molecules, which is vital for drug discovery and development [3]
- The OpenFold3 effort is part of the AI Structural Biology Network and is conducted in collaboration with the AlQuraishi Lab at Columbia University [3]