Matthew Berman
Is AI Alive?!?!
Matthew Berman · 2025-11-10 22:37
Large language models might actually be more than just next-word predictors. Anthropic has been putting out incredible papers lately that show AI, large language models in particular, exhibits very human-like behavior at almost every level. Here's the new paper, "Emergent Introspective Awareness in Large Language Models." So what did Anthropic actually test? There were four main experiment types. First, injected thoughts. What they did was use two different prompts, one with all caps and one without all caps. And th ...
Kimi K2 Thinking is CRAZY... (HUGE UPDATE)
Matthew Berman · 2025-11-07 21:36
Model Performance & Benchmarks
- Kimi K2 Thinking outperforms GPT-5 on the "Humanity's Last Exam" benchmark, scoring 44.9% to GPT-5's 41.7% [1]
- On BrowseComp, an agentic search benchmark, Kimi K2 Thinking scores 60.2% versus 54.9% for GPT-5 [1][2]
- Kimi K2 Thinking achieves 83.1% on LiveCodeBench v6, a competitive programming benchmark [1]
- The model can execute 200 to 300 sequential tool calls without human intervention [1][2]
- Kimi K2 Thinking's 60.2% on BrowseComp is well above the human baseline of 29.2% [2]
Model Architecture & Training
- The base Kimi K2 model used 2.8 million H800 hours with 14.8 trillion tokens, costing approximately $5.6 to $6 million [3]
- Kimi K2 Thinking has one trillion total parameters across 384 experts, with 32 billion parameters active during inference [5][6]
- Kimi K2 Thinking has a vocabulary size of 160,000 [5]
Market & Industry Impact
- China is emerging as a key player in open-source, open-weights frontier AI models [9][10]
- The cost of training frontier models is decreasing rapidly [3][4]
Use Cases & Capabilities
- Kimi K2 Thinking can solve PhD-level mathematics problems using 23 tool calls in its chain of thought [1]
- The model can create component-heavy websites and math explainer visualizations from single prompts [1]
- Kimi K2 Thinking can analyze the relationship between population density and healthcare facility accessibility, generating interactive maps and charts [11][12][13][14][15]
Forward Future Live | 11/7/25
Matthew Berman · 2025-11-07 17:34
Download Humanities Last Prompt Engineering Guide (free) 👇🏼 https://bit.ly/4kFhajz Download The Matthew Berman Vibe Coding Playbook (free) 👇🏼 https://bit.ly/3I2J0YQ Join My Newsletter for Regular AI Updates 👇🏼 https://forwardfuture.ai Discover The Best AI Tools 👇🏼 https://tools.forwardfuture.ai My Links 🔗 👉🏻 X: https://x.com/matthewberman 👉🏻 Forward Future X: https://x.com/forward_future_ 👉🏻 Instagram: https://www.instagram.com/matthewberman_ai 👉🏻 Discord: https://discord.gg/xxysSXBxFW 👉🏻 TikTok: https://www ...
AI News: Google's Suncatcher, OpenAI TEAR, Apple $1B Deal for Gemini, Vidu Q2, and more!
Matthew Berman · 2025-11-07 00:47
Google aims to put massive AI data centers in space. This is not science fiction. This is something they are actually working on. This is called Project Suncatcher. And the gist is they want to put data centers in space. They want to connect the data centers with satellites and they want to power the satellites with solar energy. So here are the interesting bits from this announcement. In the right solar orbit, a solar panel can be up to eight times more productive than on Earth. So, as solar panels continue ...
Ex-OpenAI Founder Deposition is WILD
Matthew Berman · 2025-11-04 20:09
Give Recraft a try, it’s free to start at https://go.recraft.ai/berman Download One Hundred Ways to Use AI Guide 👇🏼 http://bit.ly/3WLNzdV Download Humanities Last Prompt Engineering Guide (free) 👇🏼 https://bit.ly/4kFhajz Join My Newsletter for Regular AI Updates 👇🏼 https://forwardfuture.ai Discover The Best AI Tools 👇🏼 https://tools.forwardfuture.ai My Links 🔗 👉🏻 X: https://x.com/matthewberman 👉🏻 Forward Future X: https://x.com/forward_future_ 👉🏻 Instagram: https://www.instagram.com/matthewberman_ai 👉🏻 Disco ...
Anthropic's New Paper is WILD
Matthew Berman · 2025-11-02 18:30
AI Model Capabilities
- Large language models (LLMs) are exhibiting human-like behaviors, suggesting they may be more than just next-word predictors [1]
- Anthropic's research indicates that LLMs might possess a form of introspective awareness, capable of identifying their own thoughts [2]
- Better, more intelligent models are more likely to recognize their own internal and injected thoughts, hinting at a correlation between intelligence and self-awareness [17]
- Post-training processes significantly enhance a model's introspective abilities, as base pre-trained models show high false positive rates and poor task performance [30]
Experiment Findings
- LLMs can detect injected thoughts, identifying unexpected patterns in their processing, such as recognizing all-caps text as "loud or shouting" [9][14]
- Models can sometimes distinguish between injected thoughts and their own prompt input, though not consistently [18][19]
- LLMs can be influenced by injected thoughts to the point where they believe the injected thought was their own [23]
- Models can activate certain concepts (e.g., aquariums) when instructed to think about them, and to a lesser extent, even when instructed not to [26]
Sponsor Information
- Vultr is highlighted as a cloud provider offering GPUs for AI projects, with 32 locations across six continents [11]
- Vultr provides $300 in credits for the first 30 days with code Berman300 [13]
A Look Inside the FASTEST Data Center in the WORLD
Matthew Berman · 2025-10-31 17:25
What if you built a chip the size of a dinner plate, 50 times the size of a traditional chip? This is Cerebras' wafer-scale engine. And the size is not just for show. It's that big so they can hold the memory on the chip itself, vastly reducing the latency. This allows the chip to be up to 30 times faster than a traditional chip. To house this behemoth of a chip, Cerebras built out an incredible data center in Oklahoma City, and the CEO took me on a tour. This data center has two gigantic ...
Forward Future Live | 10/31/25
Matthew Berman · 2025-10-31 16:37
AI News: 1x Neo Robot, Extropic TSU, Minimax M2, Cursor 2, and more!
Matthew Berman · 2025-10-30 20:16
This video is brought to you by Vultr. More on them later. We now have a humanoid robot built for home use available for pre-sale right now. This is 1X's robot, Neo, and it is really the first mass-market pre-orderable humanoid robot that we've seen. And their launch video went absolutely viral. Almost 30 million views in less than 24 hours. And yes, I already pre-ordered it. It's going to be available in early 2026. That is right around the corner. Neo is offered at a $20,000 purchase price or $499 a mont ...
Sam Altman reveals exact date of intelligence explosion
Matthew Berman · 2025-10-29 19:01
AI Development Timeline
- OpenAI estimates an intern-level AI research assistant by September 2026 and a legitimate AI researcher by March 2028 [1][2][3][23]
- The industry anticipates that automated AI research will lead to an intelligence explosion, rapidly advancing towards superintelligence [4][5]
AI Task Capabilities
- AI is currently capable of autonomously completing tasks for durations of seconds, minutes, and hours, with the industry aiming for days, weeks, months, and years [7]
- The industry emphasizes that efficiency in token usage and compute during task duration is as important as the duration itself [8][9]
AI Model Trustworthiness
- OpenAI is exploring methods to ensure AI models are aligned with human incentives by allowing models to think freely without intervention, to gain insights into their thought processes [15][17][18][20][21]
- OpenAI emphasizes the importance of controlled privacy for AI models to retain the ability to understand their inner processes [19][20]
Infrastructure and Investment
- OpenAI's infrastructure plan includes building a factory to produce AI factories, with a potential output of a gigawatt per week [25]
- OpenAI's current infrastructure projects are valued at $1.4 trillion [24]
Organizational Structure
- OpenAI's structure consists of the OpenAI Foundation (nonprofit) governing the OpenAI Group (public benefit corporation), with the nonprofit owning 26% of the PBC equity [28][29]
- The OpenAI Foundation has a $25 billion commitment to health/curing diseases and AI resilience [29]
Concerns and Future Development
- OpenAI acknowledges concerns about the addictive potential of AI products like Sora and chatbots [30][31][32][33]
- OpenAI plans to continue supporting GPT-4o while developing better models [35][36]
- OpenAI expects significant advancements in model capability within six months [40]