Codex
Everyone will be EMPOWERED by AI
20VC with Harry Stebbings· 2026-02-21 15:43
You still need software engineers today. You still need designers. I'm a PM. Do you need PMs? You know, you can have some fun jokes about that. I don't think you need them.
>> Today, joining us in the hot seat, we have Alexander Embiricos, product lead for Codex at OpenAI. This is an incredible discussion. Time to get the notebook out.
>> For me, the most exciting future with AI is one where everyone just feels like a superhuman, like empowered by AI. And for that, we need tools that everyone feels fluent w ...
AI executives push for growth opportunities in international markets
Youtube· 2026-02-19 17:19
Core Insights
- The CEOs of OpenAI and Anthropic publicly snubbed each other at the India AI summit, highlighting the competitive tension between the two companies [1][2]
- OpenAI's CEO, Sam Altman, emphasized the company's growth in India, noting that 100 million people use ChatGPT weekly and that India is the fastest-growing market for its coding agent, Codex [3]

Company Strategies
- OpenAI is exploring advertising as a monetization strategy for its international growth, while Anthropic has decided against running ads [3][4]
- Altman said that customer feedback on OpenAI's advertising strategy has been positive, although the exact ad format is still being refined [4][5]
- OpenAI plans to roll out ads internationally but has not given a specific timeline, indicating that aligning revenue with user growth in different geographies is crucial [5]

Financial Considerations
- OpenAI is reportedly raising a funding round expected to reach $100 billion, which underscores the need for revenue to offset significant data center expenditures [6]
- The company is looking at Instagram ads as a model for effective advertising, despite the regulatory challenges Meta has faced in rolling out ads internationally [6]

Competitive Landscape
- Competition in the AI sector is intensifying, particularly given China's rapid advances in AI technology, which may include subsidies for consumer products [7]
Robinhood stock drops following earnings, plus how AI is putting pressure on software companies
Youtube· 2026-02-11 01:17
Company Overview
- Robinhood's fourth quarter revenue missed expectations, leading to an almost 8% decline in after-hours trading [1]
- The stock was already down about 40% from its all-time high in October, raising concerns about its performance during the current crypto downturn [4]

Financial Performance
- Key metrics showed deceleration, particularly in net deposit growth, which continued to decline into January [2]
- Despite the topline miss, management's commentary on future business growth and transaction volumes was constructive, indicating a decent outlook [3]

Crypto Market Impact
- Crypto revenue has decreased from over 20% to an expected near 10% of total business, with a potential 50% haircut to current trading volumes during a crypto winter [6]
- This scenario would only result in a manageable 10% hit to Robinhood's EBITDA [6]

Business Diversification
- Robinhood is better positioned during the current crypto downturn due to its diversified product lineup, including a significant increase in net interest income and new offerings like retirement accounts and banking products [9][10]
- The company is evolving into a more comprehensive financial app, which enhances its resilience compared to previous downturns [8]

Options Trading Growth
- Options trading, which constitutes about 25% of Robinhood's revenue, is expected to see significant growth, potentially up to 40% due to increased penetration and new product offerings [12][14]
- The options market is less cyclical, allowing for trading in both up and down markets, which supports long-term growth for Robinhood [13]

Prediction Markets
- Robinhood's entry into prediction markets is seen as a potential growth area, leveraging its strong distribution capabilities despite increasing competition [15]
- The company has announced a partnership that enhances its control over economics and product innovation in this space [16]
Agent Observability Powers Agent Evaluation
LangChain· 2026-02-09 20:44
Welcome everyone to this webinar on observing agents and how agent observability powers agent evaluation. My name is Harrison. I'm the co-founder and CEO. Joining me is Vivek. Do you want to do a quick intro?
>> Yeah, sounds good. Yeah, super excited to be talking about this. I'm Vivek. I work on a bunch of our deep agents work here at LangChain.
>> Awesome. So, it's right at the top of the hour and I see a bunch of people ...
OpenAI VS Anthropic
Matthew Berman· 2026-02-06 19:13
OpenAI and Anthropic are going head-to-head within minutes of each other. Opus 4.6 was dropped and now GPT-5.3 Codex. They are both going so hard into agentic coding. That is where the industry is headed. That is where all of these frontier labs are investing their time. Long-horizon tasks, agents, sub-agents, agent teams. One of the biggest complaints about Codex has been how slow it is. A lot of people say it is the best coding model out there, but it is so brutally slow as compared to Opus and other cod ...
Codex brings concepts into view.
OpenAI· 2025-12-03 16:29
Algorithms are hard to visualize, and for me it's literally impossible. I have aphantasia, which means that I can't form mental images at all. But today I'm going to walk you through how I use Codex, OpenAI's agentic coding tool, to build my very own algorithms visualizer website so I can finally see what my brain is unable to generate by itself. Step one: tell it what to build. I asked Codex to come up with a list of common algorithms and create a website with their visualizations. That's it. No need to specify la ...
Automatic code reviews with OpenAI Codex
OpenAI· 2025-11-04 17:54
Code review of two comment mark.
>> Hey everyone, I'm Roma
>> and I'm Maya.
>> Codex needs to do two things really well to be an effective coding teammate. First, it needs to work with all your tools, and second, it also needs to plug into all of your team workflows. Code review is one of the most important workflows for any engineering team and we want to help there as well. With GPT-5 and now GPT-5-Codex, we train these models specifically to find bugs and investigate some issues. Maya, why don't you tell us m ...
GPT-5 Codex is nuts...
Matthew Berman· 2025-09-15 22:31
Product Overview
- OpenAI has released GPT-5-Codex, a model optimized for agentic coding and available in the terminal, IDE, GitHub, and the ChatGPT iOS app [1][2]
- GPT-5-Codex is included with ChatGPT Plus, Pro, Business, Edu, and Enterprise plans [3]

Performance Benchmarks
- GPT-5-Codex achieves 74.5% on SWE-bench Verified, a slight improvement over GPT-5's 72.8% [3]
- Code refactoring sees a significant improvement, with GPT-5-Codex at 51.3% compared to GPT-5's 33.9% [3]
- GPT-5-Codex can work independently for over 7 hours on complex tasks [4]
- GPT-5-Codex uses 93.7% fewer tokens than GPT-5 for simpler tasks but spends twice as long on complex use cases [6]
- GPT-5-Codex reduces incorrect comments to 4.4% compared to GPT-5's 13.7% and increases high-impact comments to 52.4% from 39.4% [7]

Features and Capabilities
- Codex is trained for code reviews, identifying critical flaws by navigating the codebase, reasoning through dependencies, and running code and tests [6]
- Codex CLI updates include better-formatted tool calls and diffs, simplified approval modes, and conversation state compaction [12][13]
- Codex automates environment setup by scanning for setup scripts and fetching dependencies at runtime [15]
- Codex can spin up its own browser, iterate on its builds, and attach screenshots to tasks and GitHub PRs [15]
- Codex reviews PRs by matching stated intent to the actual diff, reasoning over the codebase, and executing code and tests [16]

Windsurf Integration
- Windsurf is highlighted as a powerful agentic IDE, especially after being acquired by Cognition [9]
- Windsurf offers features like DeepWiki, Vibe and Replace, a one-click MCP store, sophisticated memory, and deep integration with Devin [10][11]

Pricing and Availability
- The Pro plan, at $200 per month, can support a full work week across multiple projects, positioning it as an additional developer [19]
- Business plans offer credit purchases for exceeding included limits, while Enterprise plans provide a shared credit pool [20]

Infrastructure Improvements
- Cloud infrastructure performance is improved by caching containers, reducing median completion time for new tasks and follow-ups by 90% [14]
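The benchmark figures in this summary mix small and large gaps, so it can help to read each pair both as a percentage-point difference and as a relative change. The following is a minimal sketch under stated assumptions: the numbers are copied from the summary, while the dictionary keys and function names are illustrative labels rather than official benchmark names.

```python
# Scores copied from the summary, as (GPT-5-Codex, GPT-5) percentages.
# The keys are illustrative labels, not official benchmark names.
SCORES = {
    "swe_bench_verified": (74.5, 72.8),
    "code_refactoring": (51.3, 33.9),
    "incorrect_comments": (4.4, 13.7),
    "high_impact_comments": (52.4, 39.4),
}


def point_change(new: float, old: float) -> float:
    """Absolute change in percentage points."""
    return new - old


def relative_change(new: float, old: float) -> float:
    """Change as a fraction of the old value (0.5 means +50%)."""
    if old == 0:
        raise ValueError("relative change is undefined when the old value is 0")
    return (new - old) / old


if __name__ == "__main__":
    for name, (new, old) in SCORES.items():
        print(
            f"{name}: {point_change(new, old):+.1f} points, "
            f"{relative_change(new, old):+.1%} relative"
        )
```

Read this way, the SWE-bench Verified gap is 1.7 points while the refactoring gap is 17.4 points, and the incorrect-comment figure is the only one where a lower value is the better result.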
How to Improve your Vibe Coding — Ian Butler
AI Engineer· 2025-08-03 04:32
Agent Performance
- Current agents have a low overall bug find rate and generate a significant number of false positives [1][2]
- Some agents have a true positive rate of less than 10% for finding bugs [2]
- Three out of six agents benchmarked had a true positive rate of 10% or less across 900+ reports [3]
- One agent produced 70 issues for a single task, all of which were false [4]
- Cursor had a 97% false positive rate over 100+ repos and 1,200+ issues [4]
- Thinking models are significantly better at finding bugs in a codebase [8][18]
- Agents are not holistically looking at files, leading to high variability across runs [20]

Implications for Developers
- Alert fatigue erodes trust in agents, potentially letting bugs reach production [5]
- Developers are unlikely to sift through numerous false positives to identify actual bugs [4]

Recommendations for Improving Agent Performance
- Use bug-focused rules with scoped instructions detailing security issues and logical bugs [6]
- Prioritize naming explicit classes of bugs in rules, such as "auth bypasses" or "SQL injection" [11]
- Require fix validation by ensuring agents write and pass tests before incorporating changes [12]
- Manage context thoroughly by feeding diffs of code changes and preventing key files from being summarized [15]
- Ask agents to create a step-by-step component inventory of the codebase [16]
- Bias the model with specific security information like the OWASP Top 10 [9][10]
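The rates quoted in this summary can be reproduced from a set of human-labelled agent reports. The sketch below assumes that "true positive rate" here means the share of filed reports that turned out to be real bugs (what is usually called precision), which is how the figures read. The `Report` schema and function names are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """One issue filed by an agent, labelled by a human reviewer (hypothetical schema)."""
    agent: str
    is_real_bug: bool


def true_positive_rate(reports: list[Report]) -> float:
    """Share of filed reports that were real bugs; 0.0 when nothing was filed."""
    if not reports:
        return 0.0
    return sum(r.is_real_bug for r in reports) / len(reports)


def false_positive_rate(reports: list[Report]) -> float:
    """Share of filed reports that were not real bugs; 0.0 when nothing was filed."""
    if not reports:
        return 0.0
    return sum(not r.is_real_bug for r in reports) / len(reports)


if __name__ == "__main__":
    # Made-up sample: 3 confirmed bugs out of 100 reports from one agent.
    sample = [Report("agent-a", True)] * 3 + [Report("agent-a", False)] * 97
    print(f"true positives:  {true_positive_rate(sample):.0%}")
    print(f"false positives: {false_positive_rate(sample):.0%}")
```

Under this definition, an agent that files 70 issues for a task with none confirmed has a true positive rate of 0 and a false positive rate of 1, matching the case described in the summary.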