LLMs

X @Sam Altman
Sam Altman· 2025-08-05 17:27
Model Release
- Company releases two open-weight LLMs: gpt-oss-120b (120 billion parameters) and gpt-oss-20b (20 billion parameters) [1]
- The models demonstrate strong performance and agentic tool use [1]
Safety Analysis
- Company conducted a safety analysis by fine-tuning the models to maximize their bio and cyber capabilities [1]
SEMrush (SEMR) - 2025 Q2 - Earnings Call Transcript
2025-08-05 13:30
Financial Data and Key Metrics Changes
- Revenue for the quarter was $108.9 million, representing 20% year-over-year growth [4][13]
- Non-GAAP operating margin was 11%, down approximately 240 basis points year-over-year due to a weaker U.S. dollar [16][22]
- Annual recurring revenue (ARR) grew 15.3% year-over-year to $435.3 million, with average ARR per paying customer increasing to $3,756, over 15% growth versus the same quarter last year (see the arithmetic check after this summary) [17][18]
Business Line Data and Key Metrics Changes
- The Enterprise segment is now the largest contributor to overall company growth, with enterprise SEO solutions growing to 260 customers at an average ARR of approximately $60,000 [4][5]
- The AI Toolkit, launched at the end of Q1, became the fastest-growing product in the company's history, reaching $3 million in ARR within a few months [6][8]
- ARR from enterprise and AI products is expected to approach $50 million by the end of the year [8][19]
Market Data and Key Metrics Changes
- Approximately 116,000 paying customers were reported, down sequentially from the prior quarter, primarily due to softness among freelancers and less sophisticated customer segments [14]
- Dollar-based net revenue retention was 105%, with retention in the Enterprise segment consistently above 120% [14][19]
Company Strategy and Development Direction
- The company is focusing on high-growth areas, specifically enterprise and AI search, reallocating resources away from lower-value customer segments [9][20]
- Management decided not to increase marketing spend in response to rising customer acquisition costs at the lower end of the market, instead prioritizing investments in enterprise and AI products [9][20]
- The company announced a $150 million share repurchase program, reflecting confidence in its business and valuation [25]
Management's Comments on Operating Environment and Future Outlook
- Management expressed optimism about growth potential in the enterprise and AI segments, despite softness in the lower end of the market [10][12]
- The company believes the shift to AI and large language models (LLMs) presents significant growth opportunities [11][12]
- Management views current pressures at the lower end of the market as temporary and expects stabilization [36][64]
Other Important Information
- The company adjusted its full-year 2025 revenue guidance to a range of $443 million to $446 million, approximately 18% growth at the midpoint [21]
- Non-GAAP operating margin guidance remains at 12%, despite the reduced revenue outlook and foreign exchange headwinds [21][24]
Q&A Session Summary
Question: Pressures in the low-end customer segment
- Management indicated the pressures are fairly contained to freelancers and less sophisticated customers, primarily impacted by rising cost per click [28][29]
Question: Liquidity of the stock and buyback program
- The share repurchase program is seen as a way to express confidence in the company's future potential and momentum in enterprise and AI [30][32]
Question: Down-market weakness and macro factors
- Management believes the weakness is contained to the low-end segment and is not reflective of broader macroeconomic conditions [36][38]
Question: Customer acquisition costs and market dynamics
- The increase in customer acquisition costs is primarily affecting the low-end segment, while other segments continue to perform well [51][56]
Question: Future trajectory of the low-end customer base
- Management expects stabilization in the low-end segment, with ongoing strength in the SMB and enterprise segments [62][64]
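For readers checking the math, the average-ARR figure follows roughly from the other two reported metrics; the small gap versus the reported $3,756 comes from the ~116,000 customer count being approximate.

```python
# Rough sanity check on the reported SEMrush metrics (approximate inputs).
arr_total = 435_300_000      # annual recurring revenue, USD
paying_customers = 116_000   # approximate paying-customer count
arr_per_customer = arr_total / paying_customers
print(f"implied ARR per paying customer: ${arr_per_customer:,.0f}")  # ~ $3,753 vs reported $3,756
```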
X @Avi Chawla
Avi Chawla· 2025-08-05 06:35
LLM Evaluation
- The industry is focusing on evaluating conversational LLM applications like ChatGPT in a multi-turn context (a minimal sketch follows below) [1]
- Unlike single-turn tasks, conversations require LLMs to maintain consistency, compliance, and context-awareness across multiple messages [1]
Key Considerations
- LLM behavior should be consistent, compliant, and context-aware across turns, not just accurate in one-shot output [1]
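As a concrete companion to the points above, here is a minimal multi-turn evaluation sketch. The `chat_stub` function, the conversation script, and the budget-retention rule are all illustrative assumptions rather than anything from the original post; swap the stub for your provider's chat-completion call.

```python
# Minimal multi-turn consistency check (illustrative sketch).
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_stub(messages: List[Message]) -> str:
    """Placeholder model: always honors the budget. Replace with a real API call."""
    return "Given your $500 budget, I recommend a refurbished mid-range laptop."


def run_multi_turn_eval(chat: Callable[[List[Message]], str]) -> bool:
    """Replay a scripted conversation and check that turn-1 context
    (a $500 budget) still constrains the answer at the final turn."""
    history: List[Message] = [
        {"role": "user", "content": "My budget is $500. I need a laptop for note-taking."}
    ]
    history.append({"role": "assistant", "content": chat(history)})

    history.append({"role": "user", "content": "Actually, I also want it to be lightweight."})
    history.append({"role": "assistant", "content": chat(history)})

    history.append({"role": "user", "content": "So what exactly should I buy?"})
    final_answer = chat(history)

    # Context-awareness rule: the final recommendation must still honor the budget.
    return "500" in final_answer


if __name__ == "__main__":
    print("context retained:", run_multi_turn_eval(chat_stub))
```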
X @Demis Hassabis
Demis Hassabis· 2025-08-04 18:26
To kick off, @Kaggle is hosting a 3-day exhibition chess tournament with matches between some of the top LLMs - w/commentary from chess legends @MagnusCarlsen, @GMHikaru, @GothamChess. Tune in at 10:30am PT starting tmrw (Aug 5th), should be a lot of fun: https://t.co/PNTk1vLlp2 ...
X @Demis Hassabis
Demis Hassabis· 2025-08-04 18:26
Thrilled to announce the @Kaggle Game Arena, a new leaderboard testing how modern LLMs perform on games (spoiler: not very well atm!). AI systems play each other, making it an objective & evergreen benchmark that will scale in difficulty as they improve. https://t.co/0e2dF2pbtX ...
X @CoinGecko
CoinGecko· 2025-08-04 07:20
Product Features
- CoinGecko MCP enables LLMs to access real-time market data, including token prices, market capitalization, and trading volume (see the hedged sketch below) [1]
- The guide details the features of CoinGecko's MCP, setup instructions, and use cases for enhancing crypto research [1]
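To show the kind of data such a tool surfaces, below is a minimal sketch that queries CoinGecko's public REST API (`/simple/price`) for price, market cap, and 24-hour volume. This illustrates the underlying data, not the MCP server setup itself, which the guide covers; the coin id and parameters are just examples.

```python
# Sketch: fetch the kind of market data an LLM tool could surface,
# using CoinGecko's public /simple/price endpoint (rate limits apply).
import requests


def fetch_market_snapshot(coin_id: str = "bitcoin", vs: str = "usd") -> dict:
    resp = requests.get(
        "https://api.coingecko.com/api/v3/simple/price",
        params={
            "ids": coin_id,
            "vs_currencies": vs,
            "include_market_cap": "true",
            "include_24hr_vol": "true",
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()[coin_id]


if __name__ == "__main__":
    snapshot = fetch_market_snapshot()
    print(f"price: {snapshot['usd']:,} USD")
    print(f"market cap: {snapshot['usd_market_cap']:,.0f} USD")
    print(f"24h volume: {snapshot['usd_24h_vol']:,.0f} USD")
```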
Vision AI in 2025 — Peter Robicheaux, Roboflow
AI Engineer· 2025-08-03 17:45
AI Vision Challenges & Opportunities
- Computer vision lags behind human vision and language models in intelligence and in leveraging large-scale pre-training [3][8][11]
- Current vision evaluations like ImageNet and COCO are saturated and primarily measure pattern matching, hindering the development of true visual intelligence [5][22]
- Vision models struggle with tasks requiring visual understanding, such as determining the time on a watch or understanding spatial relationships in images [9][10]
- Vision-language pre-training, exemplified by CLIP, may fail to capture subtle visual details not explicitly included in image captions [14][15]
Roboflow's Solution & Innovation
- Roboflow introduces RF-DETR, a real-time object detection model leveraging the DINOv2 pre-trained backbone to address the underutilization of large pre-training in visual models [20]
- Roboflow created RF100-VL, a new benchmark comprising 100 diverse object detection datasets, to better measure the intelligence and domain adaptability of visual models [24][25]
- RF100-VL includes challenging domains like aerial imagery, microscopy, and X-rays, and incorporates visual-language tasks to assess contextual understanding [25][26][27][28][29]
- Roboflow's benchmark reveals that current vision-language models struggle to generalize in the visual domain compared to the linguistic domain [30]
- Fine-tuning a YOLOv8-nano model from scratch on 10-shot examples outperforms zero-shot Grounding DINO on RF100-VL, highlighting the need for improved visual generalization (see the sketch after this list) [30][36][37]
Industry Trends & Future Directions
- Transformers are proving more effective than convolutional models at leveraging large pre-training datasets for vision tasks [18]
- The scale of pre-training in the vision world is significantly smaller than in the language world, indicating room for growth [19]
- Roboflow makes its platform freely available to researchers, encouraging open-source data contributions to the community [33]
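To make the 10-shot comparison concrete, here is a hedged sketch of the fine-tuning side using the `ultralytics` package. The dataset YAML, hyperparameters, and starting checkpoint are placeholders for illustration; the talk's exact training setup (including whether the backbone starts from random weights) is not reproduced here.

```python
# Sketch: fine-tune a small YOLOv8 model on a few-shot detection dataset
# (paths and hyperparameters are illustrative placeholders, not the talk's setup).
from ultralytics import YOLO

# Start from the nano checkpoint; the talk contrasts a small model like this,
# trained on ~10 labeled examples per class, with zero-shot Grounding DINO.
model = YOLO("yolov8n.pt")

# `data` points to a standard YOLO dataset YAML (train/val paths + class names).
model.train(
    data="fewshot_dataset.yaml",  # hypothetical 10-shot dataset
    epochs=100,
    imgsz=640,
    batch=8,
)

# Evaluate on the held-out split to get mAP for the comparison.
metrics = model.val()
print(metrics.box.map)  # mAP50-95
```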
Using LLMs Instead of Government Consulting
Y Combinator· 2025-08-03 15:54
Government Consulting Market & Trends
- The US government spends hundreds of billions of dollars annually on consulting [1]
- Political pressure exists to cut wasteful consulting and spending [1]
- Government increasingly relies on software, often custom-built [1]
LLM Impact & Opportunities
- LLMs are capable of performing tasks currently done by consulting firms [2]
- Funding is being directed towards startups assisting with government sales approvals (FedRAMP) [2][3]
- Funding is also supporting companies using LLMs to improve government regulation and policy legality [3]
Investment Focus
- The company aims to fund startups developing LLM software for government consulting tasks [3]
Alphabet: Why An Antitrust Breakup Is Good
Seeking Alpha· 2025-08-02 14:21
Core Viewpoint
- Alphabet's defeat in antitrust court and the perceived threat from large language models (LLMs) to its search engine advertising revenue contribute to a narrative of an existential crisis for the company [1]
Group 1: Antitrust Issues
- Alphabet has faced a significant defeat in antitrust court, which raises concerns about its market position and regulatory challenges [1]
Group 2: Impact of LLMs
- The rise of LLMs is viewed as potentially positive for the industry, suggesting that these technologies could enhance overall market dynamics rather than pose a direct threat to Alphabet [1]
The 2025 AI Engineering Report — Barr Yaron, Amplify
AI Engineer· 2025-08-01 22:51
AI Engineering Landscape
- The AI engineering community is broad, technical, and growing, with the "AI Engineer" title expected to gain more ground [5]
- Many seasoned software developers are AI newcomers: nearly half of those with 10+ years of experience have worked with AI for three years or less [7]
LLM Usage and Customization
- Over half of respondents are using LLMs for both internal and external use cases, with OpenAI models dominating external, customer-facing applications [8]
- LLM users leverage them across multiple use cases, with 94% using them for at least two and 82% for at least three [9]
- Retrieval-Augmented Generation (RAG) is the most popular customization method, used by 70% of respondents (a minimal sketch follows at the end of this summary) [10]
- Parameter-efficient fine-tuning methods like LoRA/QLoRA are strongly preferred, mentioned by 40% of fine-tuners [12]
Model and Prompt Management
- Over 50% of respondents update their models at least monthly, with 17% doing so weekly [14]
- 70% of respondents update prompts at least monthly, and 10% do so daily [14]
- A significant 31% of respondents lack any system for managing their prompts [15]
Multimodal AI and Agents
- Image, video, and audio usage lag text usage significantly, indicating a "multimodal production gap" [16][17]
- Audio has the highest intent to adopt among those not currently using it, with 37% planning to adopt it eventually [18]
- While 80% of respondents say LLMs are working well, less than 20% say the same about agents [20]
Monitoring and Evaluation
- Most respondents use multiple methods to monitor their AI systems, with 60% using standard observability and over 50% relying on offline evaluation [22]
- Human review remains the most popular method for evaluating model and system accuracy and quality [23]
- 65% of respondents use a dedicated vector database [24]
Industry Outlook
- The mean guess for the percentage of the US Gen Z population that will have AI girlfriends/boyfriends is 26% [27]
- Evaluation is the number one most painful aspect of AI engineering today [28]
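Because RAG tops the customization list, a minimal sketch of the pattern is included below. The embedding model, in-memory document store, and prompt-assembly step are placeholder assumptions rather than tools named in the report; the assembled prompt would be passed to whichever LLM the team uses.

```python
# Minimal RAG sketch: embed documents, retrieve the closest one for a query,
# and prepend it to the prompt. The embedding model and generation step are
# illustrative placeholders, not tools named in the survey.
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Retrieval-Augmented Generation is the most popular LLM customization method.",
    "Parameter-efficient fine-tuning methods like LoRA are preferred by fine-tuners.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = encoder.encode(docs, normalize_embeddings=True)


def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # cosine similarity (vectors are normalized)
    return [docs[i] for i in np.argsort(-scores)[:k]]


def build_prompt(query: str) -> str:
    """Assemble the retrieval-augmented prompt; pass it to your LLM of choice."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


print(build_prompt("Which customization method do most respondents use?"))
```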