LLMs
X @Avi Chawla
Avi Chawla· 2025-09-29 19:20
RT Avi Chawla (@_avichawla)
You're in a Research Scientist interview at OpenAI. The interviewer asks: "Our investors want us to contribute to open source. o3 crushed benchmarks. But we could lose our competitive edge by open-sourcing it. What do we do?"
You: "Release the research paper."
Interview over.
You forgot that LLMs don't just learn from raw text; they also learn from each other. For example:
- Llama 4 Scout & Maverick were trained using Llama 4 Behemoth.
- Gemma 2 and 3 were trained using Gemini.
Distillation helps ...
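The distillation mentioned above trains a small student model to match a large teacher's softened output distribution rather than only the raw labels. A minimal sketch of the standard soft-target loss (temperature-scaled KL divergence, as in Hinton et al.'s formulation; the array values here are illustrative, not from any real model):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T flattens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T*T factor keeps gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, T)  # teacher's soft targets
    q = softmax(student_logits, T)  # student's predictions
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))) * T * T)

# Illustrative logits for one token position
teacher = [4.0, 1.0, 0.5]
student = [3.5, 1.2, 0.4]
loss = distillation_loss(student, teacher)
```

The soft targets carry more signal than a one-hot label: the teacher's relative probabilities over wrong answers ("dark knowledge") tell the student how classes relate.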
5 tech executive insights on the future of cloud
Yahoo Finance· 2025-09-12 07:00
"We're driving resources of this company to address the specific needs from, to be honest, a very narrow group of customers," he said Tuesday at the Goldman Sachs conference, according to a Seeking Alpha transcript. "What we're seeing for the next three years is accelerating demand for that compute capacity." The company's hardware division remains focused primarily on supplying cloud providers with the infrastructure needed to train LLMs, according to CEO Hock Tan. While the acquisition of VMware made Broa ...
Apple May Finally Catch Up In AI: Analyst Sees Big Upside From Perplexity Deal, Smarter Siri, Google Search Rivalry
Benzinga· 2025-06-24 16:47
Group 1
- BofA Securities analyst Wamsi Mohan maintained a Buy rating on Apple Inc with a price target of $235, reflecting strong capital returns, AI leadership, and optionality from new products or markets [1]
- Media articles suggest Apple plans to acquire or partner with Perplexity AI, which could enhance its AI capabilities and address its current perception as an AI laggard [2][3]
- A potential deal with Perplexity AI could improve Siri's functionality, provide access to the search advertising market, and offer strategic independence in AI [3]
Group 2
- Mohan projected fiscal 2025 sales for Apple at $407.69 billion and earnings per share (EPS) at $7.13 [3]
- AAPL stock was reported to be up 0.65% at $202.82 at the time of publication [4]
Day one of the YC AI Startup School: Andrej Karpathy's talk goes viral
Founder Park· 2025-06-18 14:28
Group 1
- The article emphasizes that we are in the decade of intelligent agents, not merely the year of intelligent agents, highlighting the evolution of software development skills required in the era of large language models (LLMs) [1][4]
- The concept of Software 3.0 is introduced, where prompt engineering is the new programming paradigm, succeeding traditional coding and neural networks [2][8]
- LLMs are described as a combination of high intelligence and cognitive deficiencies, likened to a human-like system with significant capabilities but unpredictable limitations [7][15]
Group 2
- The article discusses the importance of "memory capability" in LLMs, which should focus on general problem-solving knowledge rather than storing random facts about users [7][50]
- The "Autonomy Slider" concept is introduced, allowing users to adjust the level of autonomy in AI applications based on specific contexts [7][60]
- The evolution of software is outlined as transitioning from Software 1.0 (code programming) to Software 2.0 (neural networks) and now to Software 3.0 (prompt engineering), with all three coexisting [13][10]
Group 3
- LLMs are compared to public infrastructure, wafer fabs, and operating systems, emphasizing their role in providing intelligent services and the need for stable operational characteristics [20][26][32]
- The article highlights the dual nature of LLMs, which can perform complex tasks while failing at simpler ones, a phenomenon termed "jagged intelligence" [49][50]
- A new learning paradigm for LLMs is proposed, focusing on system prompt learning rather than traditional reinforcement learning [54][56]
Group 4
- The article discusses the gap between prototype demonstrations and reliable products, emphasizing the need for partial autonomy in AI systems to bridge this gap [73][74]
- Insights from various industry leaders are shared, including the importance of practical action, long-term vision, and the evolving landscape of AI applications [94][95][96]
- The article concludes with a call for more focus on building AI products that enhance human capabilities rather than merely automating tasks [141][142]
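The "Autonomy Slider" idea above can be sketched as a dispatch over discrete autonomy levels. This is a hypothetical illustration, not an API from Karpathy's talk; the names `run_agent`, `model_fn`, and `approve_fn` are assumptions for the sake of the example:

```python
from enum import IntEnum

class Autonomy(IntEnum):
    SUGGEST = 0      # model proposes, human applies the change
    EDIT = 1         # model edits, human reviews and approves
    AUTONOMOUS = 2   # model acts directly, human audits afterwards

def run_agent(task, level, model_fn, approve_fn):
    """Dispatch a task at the chosen autonomy level (hypothetical sketch).

    model_fn: callable producing a proposed change for the task.
    approve_fn: human-in-the-loop gate, consulted only at EDIT level.
    """
    proposal = model_fn(task)
    if level == Autonomy.SUGGEST:
        return {"applied": False, "proposal": proposal}
    if level == Autonomy.EDIT:
        return {"applied": bool(approve_fn(proposal)), "proposal": proposal}
    return {"applied": True, "proposal": proposal}

# Usage: the same task can run at different autonomy levels
result = run_agent("fix typo", Autonomy.EDIT,
                   lambda t: "patch for: " + t,
                   lambda p: True)
```

The point of the slider is that partial autonomy is a product decision per context, not a global property of the model.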
X @Avi Chawla
Avi Chawla· 2025-06-14 06:30
Model Architecture
- Mixture of Experts (MoE) models activate only a fraction of their parameters during inference, leading to faster inference [1]
- Mixtral 8x7B by MistralAI is a popular MoE-based Large Language Model (LLM) [1]
- Llama 4 is another popular MoE-based LLM [1]
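The sparse activation described above comes from a learned gate that routes each token to only the top-k experts. A minimal NumPy sketch of top-k routing (the dimensions, linear "experts", and gate weights are illustrative assumptions, not the actual Mixtral or Llama 4 architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, gate_w, experts, top_k=2):
    """Route token x to its top_k experts by gate score; mix their outputs."""
    scores = x @ gate_w                    # one gate score per expert
    top = np.argsort(scores)[-top_k:]      # indices of the top_k experts
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                           # softmax over selected experts only
    # Only top_k experts run; the rest are skipped entirely (sparse compute)
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

d, n_experts = 8, 4
gate_w = rng.normal(size=(d, n_experts))
# Each "expert" is just a linear map here; in real MoE LLMs it is an FFN block
mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, M=M: x @ M for M in mats]
x = rng.normal(size=d)
y = moe_forward(x, gate_w, experts)
```

With top_k=2 of 4 experts, only half the expert parameters are touched per token, which is why inference cost scales with active rather than total parameters.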
Alphabet Earnings: A Leading LLM Isn't Enough
Seeking Alpha· 2025-04-21 15:46
Alphabet (GOOG) (GOOGL) wasn't the first out of the gate when it came to LLMs or AI chatbots - that crown (at least on a major worldwide level) goes to OpenAI's ChatGPT. In fact, I said in ...