Matthew Berman
Forward Future Live | 10/10/25
Matthew Berman· 2025-10-10 16:24
Download Humanities Last Prompt Engineering Guide (free) 👇🏼 https://bit.ly/4kFhajz Download The Matthew Berman Vibe Coding Playbook (free) 👇🏼 https://bit.ly/3I2J0YQ Join My Newsletter for Regular AI Updates 👇🏼 https://forwardfuture.ai Discover The Best AI Tools👇🏼 https://tools.forwardfuture.ai My Links 🔗 👉🏻 X: https://x.com/matthewberman 👉🏻 Forward Future X: https://x.com/forward_future_ 👉🏻 Instagram: https://www.instagram.com/matthewberman_ai 👉🏻 Discord: https://discord.gg/xxysSXBxFW 👉🏻 TikTok: https://www ...
This Tiny Model is Insane... (7m Parameters)
Matthew Berman· 2025-10-10 16:05
Model Performance & Innovation
- A 7-million-parameter model, TRM (Tiny Recursive Model), outperforms much larger frontier models on reasoning benchmarks [1][2]
- TRM achieves 45% test accuracy on ARC-AGI-1 and 8% on ARC-AGI-2 with less than 0.01% of the parameters of the models it surpasses [2]
- The core innovation is recursive reasoning with a tiny network, rather than simply predicting the next token [6][23]
- Deep supervision doubles accuracy compared with single-step supervision (from 19% to 39%), while recursive hierarchical reasoning provides only incremental improvements [16]
- TRM significantly improves accuracy on tasks such as Sudoku (55% to 87%) and Maze (75% to 85%) [18]
Technical Approach & Implications
- TRM uses a single tiny network with two layers and relies on recursion as a "virtual depth" to improve reasoning [23][27][28]
- The model keeps two memories, its current guess and its reasoning trace, and updates both with each recursion [25]
- The approach simplifies hierarchical reasoning and drops the complex mathematical theorems and biological arguments used to justify it [22][23]
- Recursion may represent a new scaling law, potentially enabling powerful models to run on devices such as computers and phones [34]
Comparison with Existing Models
- Traditional LLMs struggle with hard reasoning problems because of auto-regressive generation and their reliance on techniques such as chain of thought and pass@K [3][5][6]
- HRM (Hierarchical Reasoning Model), a previous approach, uses two networks operating at different hierarchies, but the source of its benefits is not well understood [9][20][21]
- TRM outperforms HRM by simplifying the approach and focusing on recursion, achieving greater improvements with less depth [30]
- Models such as Grok 4 Thinking perform better on some benchmarks, but they require far more parameters (over a trillion) than TRM's 7 million [32]
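The two-memory recursion described in the TRM summary above can be illustrated with a toy sketch. This is not the authors' implementation: the weights are random and untrained, and `TinyNet`, `recursive_refine`, `virtual_depth`, the width `DIM`, and the step counts `n_latent` and `n_cycles` are all hypothetical names and values chosen for illustration. It only shows the shape of the loop: one small two-layer network is reused many times, repeatedly updating a reasoning trace and then the current guess.

```python
import math
import random

DIM = 8  # hypothetical embedding width, chosen only to keep the demo small


def make_layer(n_in, n_out, rng):
    """Random weights standing in for trained parameters."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]


def apply_layer(weights, vec):
    """Dense layer followed by a tanh non-linearity."""
    return [math.tanh(sum(w * v for w, v in zip(row, vec))) for row in weights]


class TinyNet:
    """A single two-layer network that is reused at every recursion step."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.w1 = make_layer(3 * dim, dim, rng)
        self.w2 = make_layer(dim, dim, rng)

    def __call__(self, x, y, z):
        hidden = apply_layer(self.w1, x + y + z)  # list concatenation
        return apply_layer(self.w2, hidden)


def recursive_refine(net, x, n_latent=6, n_cycles=3):
    """Refine a guess y using a reasoning trace z (step counts are assumptions)."""
    dim = len(x)
    blank = [0.0] * dim
    y = list(blank)  # memory 1: the current guess
    z = list(blank)  # memory 2: the reasoning trace
    for _ in range(n_cycles):
        for _ in range(n_latent):
            z = net(x, y, z)      # update the trace from question, guess, trace
        # Update the guess from the trace; blanking the question input here
        # is a simplification made for this sketch.
        y = net(blank, y, z)
    return y, z


def virtual_depth(layers=2, n_latent=6, n_cycles=3):
    """Number of layer applications performed by recursive_refine."""
    return layers * (n_latent + 1) * n_cycles


if __name__ == "__main__":
    net = TinyNet(DIM)
    question = [0.1 * i for i in range(DIM)]
    guess, trace = recursive_refine(net, question)
    print(len(guess), len(trace), virtual_depth())
```

The point of `virtual_depth` is that a two-layer network applied recursively performs many more layer applications than its parameter count suggests, which is the sense in which recursion acts as depth.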
Greg Brockman: AGI, Sora 2, Bottlenecks, White Collar, Proactive AI, and more!
Matthew Berman· 2025-10-08 18:48
AI Trends & Future Predictions
- Discussion of scaling Sora, indicating the industry's focus on improving AI model capabilities [1]
- Exploration of the future relevance of transformer models in AI development [1]
- Consideration of proactive AI and compressing intelligence as key areas of advancement [1]
- Speculation on the potential of fully generated software and its implications [1]
- Examination of the Agentic Commerce Protocol, suggesting a move towards AI-driven commercial interactions [1]
- Predictions for 2026, including the possibility of Artificial General Intelligence (AGI) [1]
Technology & Infrastructure
- Analysis of building with AMD and other kinds of compute, highlighting the importance of hardware infrastructure [1]
- Identification of bottlenecks in AI development, suggesting areas needing improvement [1]
- Discussion of the decoupling of the internet, potentially related to data sovereignty or decentralized technologies [1]
Job Market & Industry Impact
- Addressing concerns about job security in the face of AI advancements [1]
- Exploration of building on top of OpenAI, indicating the platform's significance in the AI ecosystem [1]
- Consideration of the role of humans in the loop, emphasizing the importance of human-AI collaboration [1]
Forward Future Live | 8/3/25
Matthew Berman· 2025-10-03 16:20
Enter to win a Sora 2 code! https://gleam.io/FC9uI/win-instant-access-to-openais-sora-app
Sora 2 is unbelievable...
Matthew Berman· 2025-10-02 19:16
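Sora 2 Features and Capabilities
- Sora 2 can generate videos in a wide range of styles, including celebrity brawls, game scenes, and film clips [1][4][5][8][9]
- Sora 2 excels at face scanning and reproduction, accurately capturing a person's likeness [11][12]
- Sora 2 can perform style transfer, generating videos in artistic styles such as watercolor, Pixar style, and claymation [60][61]
- Sora 2 has some physics-simulation ability and performs well on effects such as liquids, smoke, and fire [38][39][40][42][64]
- Sora 2 handles camera control well, producing complex camera effects such as pans, zooms, and focus shifts [34][36][37]
Sora 2 Limitations
- Sora 2 still struggles with fine motor actions and object manipulation, such as fingers interacting with a keyboard or shuffling cards [27][28][29]
- In scenes with multiple people, Sora 2 is prone to distorted figures and bodies clipping through each other [18][22][58][59]
- Sora 2 is weak at rendering text and tends to produce misspelled words and inaccurate dates [50][51]
- Videos generated by Sora 2 may have low resolution, which hurts the viewing experience [37]
- Sora 2 is controversial with respect to copyright and may carry infringement risk [1][3][65]
Lindy Sponsorship and Use
- Lindy is a low-code platform for quickly building applications such as online education platforms and deploying them within 5 minutes [13][14][15]
- Lindy has a built-in QA process that checks the quality and reliability of the code [14][15]
- Lindy offers users $20 in free credits [15]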
Claude is BACK! (30 Hours of Thinking!)
Matthew Berman· 2025-10-01 18:08
Model Performance & Benchmarks
- Claude Sonnet 4.5 is considered the best coding model, demonstrating a significant advance in coding ability [1]
- On the SWE-bench Verified evaluation, Claude Sonnet 4.5 outperforms Opus 4.1 by a substantial margin and leads GPT-4 Code Interpreter and Gemini 1.5 Pro by almost 20 percentage points [1]
- The model achieves top scores on Terminal Bench (50%), agentic tool use, and computer use benchmarks, and scores 100% on high-school competition math (AIME 2025 with Python) [1]
Long Horizon Tasks & Efficiency
- AI's ability to complete long-horizon tasks is increasing exponentially, with the task duration AI can handle doubling every 7 months [1]
- Claude Sonnet 4.5 can think independently for over 30 hours, indicating its suitability for agentic applications [1]
- The industry is shifting towards measuring AI intelligence per watt, emphasizing the importance of task and token efficiency [2]
Future Applications & Industry Impact
- Anthropic is showcasing a vision of the future of software with "Claude Imagine," demonstrating the ability to generate applications on the fly within a desktop environment [1][2]
- Claude is increasingly used to write its own code, with Anthropic's CEO stating that it writes the majority of the code for Claude [9][10]
- Box tested Claude Sonnet 4.5 for data extraction accuracy with Box AI on 40,000 fields across 1,500+ documents, and the model performed four percentage points better than Sonnet 4 [3][4]
Pricing & Availability
- Claude Sonnet 4.5 is priced at $3 per million input tokens and $15 per million output tokens, the same as Sonnet 4 [11]
- Anthropic recommends upgrading to Claude Sonnet 4.5 immediately for all use cases [11]
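Two figures in the Claude Sonnet 4.5 summary above lend themselves to quick arithmetic: the per-token pricing and the 7-month doubling time for task duration. The sketch below assumes only the numbers quoted in the summary; the function names `request_cost_usd` and `horizon_multiplier` are hypothetical helpers, not part of any Anthropic SDK, and the doubling trend is treated as a simple exponential for illustration.

```python
# Figures quoted in the summary; everything else here is illustrative.
INPUT_USD_PER_MILLION = 3.0    # USD per million input tokens
OUTPUT_USD_PER_MILLION = 15.0  # USD per million output tokens
DOUBLING_MONTHS = 7.0          # task duration doubles every 7 months


def request_cost_usd(input_tokens, output_tokens):
    """Estimated cost of one request at the quoted per-token prices."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def horizon_multiplier(months):
    """Growth factor in manageable task duration after `months` months,
    assuming the doubling trend continues unchanged."""
    return 2.0 ** (months / DOUBLING_MONTHS)


if __name__ == "__main__":
    # A hypothetical agentic run: 200k tokens read, 50k tokens written.
    print(f"cost: ${request_cost_usd(200_000, 50_000):.2f}")
    print(f"growth over 14 months: {horizon_multiplier(14):.1f}x")
```

Because output tokens cost five times as much as input tokens at these prices, long agentic runs that generate a lot of text are dominated by the output term.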
OpenAI just dropped Sora 2... And it's SCARY GOOD
Matthew Berman· 2025-09-30 22:44
Everything you're about to see in this clip was created by Sora. Remember that. Check it out. One year ago, Sora 1 redefined what was possible with moving images. Today, we're announcing the Sora app, powered by the all-new Sora 2. [Music] It's the most powerful imagination engine ever built, and it's packed with new features. I'll pass it to Bill for more details. Now every video comes with sound. [Music] Sora 2 is also the state-of-the-art for motion physics IQ and body mechanics, marking a giant leap forwar ...
21 MORE Things You Should Be Using AI For...
Matthew Berman· 2025-09-30 15:12
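Creativity and Design
- AI tools can be used for interior design, editing room photos with color-coded annotations to preview different furniture and styles [1][2]
- AI can reconstruct historical sites, such as the Temple of Jupiter and the Petra Treasury, generating realistic 4K reconstructions from images [3]
- AI can assist with product design: from a simple hand-drawn sketch it can generate a 3D product design and show the product in use [3][4]
- AI can easily create visually striking 3D animations suitable for DJ sets and concert backdrops [5]
- AI can convert images between formats, for example from JPEG to SVG [6]
Business Applications
- AI can detect phishing emails and scams by analyzing email content, voicemail screenshots, or text messages [6]
- AI can assess the feasibility of a solar installation by analyzing Google Maps imagery of a house, estimating installation cost and energy savings, and providing a financial analysis [6][7]
- AI can act as a personal assistant, using scheduled tasks to fetch news summaries periodically and send them to a specified email address [7]
- AI can help schedule meetings by analyzing team members' calendars, finding a suitable time, and drafting the invitation [7]
- AI can digitize handwritten notes automatically and then expand and analyze them, for example converting chemical structure diagrams into a reference table with links to the structures [8]
- AI can quickly write SQL queries: the user uploads a dataset and describes the filter conditions, and the corresponding query is generated [8]
- AI can run in-depth sentiment analysis on customer calls, producing charts and natural-language explanations that help businesses understand customer sentiment and improve service [8][9]
Professional and Financial Use Cases
- AI can help with understanding research papers by summarizing the content, explaining the methods and findings, and highlighting key information in the PDF [10][11][12]
- AI can analyze the real estate market in depth, providing information on prices, market competition, and carrying costs to support home-buying decisions [13][14]
- AI can act as a salary negotiation assistant, researching pay for comparable roles, suggesting negotiation strategies, and drafting negotiation emails [15][16][17][18]
- AI can polish email wording, making messages more professional and friendlier and less likely to offend [19][20][21]
- AI can simplify the review of insurance information by summarizing policy terms, explaining benefit options, and recommending a suitable plan for an individual's circumstances [22][23][24]
- AI can perform detailed financial analysis of a company to inform investment decisions [25][26]
- AI can use LM Studio to create study guides from any document, including flashcards, video overviews, and quizzes [26][27][28]
- AI can assist with GeoGuessr by analyzing street-view images to help players pinpoint a location quickly [29][30][31]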
Who's Going to Power Microsoft's New AI Models?
Matthew Berman· 2025-09-26 21:40
Strategic Partnership
- Microsoft signed a 5-year deal with Nebius worth $17.4 billion [1]
- The deal focuses on powering Microsoft's AI projects with GPUs [1]
- Nebius provides AI infrastructure, including pre-training, inference, and post-training capabilities [1][2]
Nebius AI Studio
- Nebius AI Studio supports open-source models for inference [2]
- Nebius offers services such as inference, LoRA, and fine-tuning [2]
- A promotional code "AI Studio 20" provides $20 in credits [2]
Forward Future Live | 9/26/25
Matthew Berman· 2025-09-26 16:39