Latency
X @mert | helius.dev
mert | helius.dev· 2025-09-05 14:55
Custom Solutions
- A small percentage of hyper-latency-sensitive users could benefit from a custom solution [1]
- A dedicated node from Helius running Yellowstone (YS) gRPC streaming can be pointed at shreds for quicker data [1] (see the sketch below)
Performance Improvement
- Shaving off approximately 0.5 milliseconds of data-processing time is possible [1]
Risk Consideration
- The custom solution is not fault tolerant, unlike Laserstream [1]
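For orientation, here is a minimal sketch of what consuming such a stream can look like, assuming the open-source @triton-one/yellowstone-grpc TypeScript client; the endpoint and token are placeholders, and exact request fields vary across client versions:

```ts
import Client, { CommitmentLevel } from "@triton-one/yellowstone-grpc";

// Placeholders: point these at a dedicated node's Yellowstone gRPC endpoint.
const client = new Client("https://your-dedicated-node:443", "your-x-token", undefined);

async function streamTransactions(): Promise<void> {
  const stream = await client.subscribe();
  stream.on("data", (update) => {
    // Updates arrive as data propagates, ahead of full-block delivery.
    if (update.transaction) console.log("saw tx at", Date.now());
  });
  // Subscribe to all non-vote, non-failed transactions at processed commitment.
  stream.write({
    accounts: {},
    slots: {},
    transactions: {
      all: { vote: false, failed: false, accountInclude: [], accountExclude: [], accountRequired: [] },
    },
    transactionsStatus: {},
    blocks: {},
    blocksMeta: {},
    entry: {},
    accountsDataSlice: [],
    commitment: CommitmentLevel.PROCESSED,
  });
}
```

Laserstream wraps the same data path with fault tolerance; the custom setup above trades that away for the roughly 0.5 ms of processing time mentioned in the post.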
X @Starknet 🐺🐱
Starknet 🐺🐱· 2025-08-28 02:00
Technology & Scalability
- Extended chose Starknet because it meets its throughput and latency requirements [1]
- Starknet is the only L2 solution that meets Extended's requirements [1]
X @Starknet 🐺🐱
Starknet 🐺🐱· 2025-08-27 14:00
Technology Selection
- Extended chose Starknet for its superior throughput and latency compared to other Layer 2 solutions [1]
- Starknet is the only L2 that meets Extended's specific requirements [1]
X @mert | helius.dev
mert | helius.dev· 2025-08-26 23:07
Solana Technical Characteristics
- Solana's current block time is technically 400 milliseconds, and developers can achieve faster speeds [1]
- In preliminary testing, end-to-end latency from sending a transaction to observing it in Helius shreds was approximately 70 milliseconds [1] (see the measurement sketch below)
- The test used a preliminary script, so it could likely be faster; skip rate also needs to be considered [1]
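A rough sketch of how such a measurement script might be structured with @solana/web3.js; `observeSignature` is a hypothetical hook that resolves when the signature first appears in a shred stream (e.g., the Yellowstone subscription sketched earlier):

```ts
import { Connection, Keypair, SystemProgram, Transaction } from "@solana/web3.js";

// `observeSignature` is a hypothetical hook: it resolves when `sig` is first
// seen in the shred stream, standing in for the actual stream wiring.
async function measureEndToEnd(
  connection: Connection,
  payer: Keypair,
  observeSignature: (sig: string) => Promise<void>,
): Promise<number> {
  const tx = new Transaction().add(
    SystemProgram.transfer({
      fromPubkey: payer.publicKey,
      toPubkey: payer.publicKey, // self-transfer keeps the probe harmless
      lamports: 1,
    }),
  );
  tx.recentBlockhash = (await connection.getLatestBlockhash()).blockhash;
  tx.feePayer = payer.publicKey;
  tx.sign(payer);

  const t0 = performance.now();
  const sig = await connection.sendRawTransaction(tx.serialize(), { skipPreflight: true });
  await observeSignature(sig); // first sighting in the stream
  const elapsed = performance.now() - t0;
  // Slots can be skipped, so a real harness needs a timeout and retries to
  // account for skip rate, as the post notes.
  return elapsed;
}
```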
X @mert | helius.dev
mert | helius.dev· 2025-08-18 23:05
Decentralization & Performance
- End-to-end latency in decentralized systems, from a user sending a transaction to observing it in a shred, is comparable to centralized exchanges (CEXs) or Web2 platforms [1]
- Decentralization improves reliability by tolerating a greater number of faults [1]
- There is no inherent engineering reason for decentralized systems to be slower than their centralized counterparts [1]
X @mert | helius.dev
mert | helius.dev· 2025-08-18 22:32
Solana Network Performance
- Solana's architecture propagates shreds (block fragments) instead of entire blocks for faster data propagation [1]
- Validators with higher stake receive shreds sooner, giving them a latency advantage [1] (see the toy sketch below)
Helius Competitive Advantage
- Helius holds the highest stake on Solana, enabling its customers to receive data faster than competitors [1]
- Laserstream offers a latency edge for reading Solana data [1]
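The stake ordering comes from Turbine, Solana's propagation protocol: shreds fan out through a tree in which, broadly, higher-stake validators tend to sit nearer the leader, so each additional layer costs another network hop. A toy model of that effect (real Turbine shuffles the tree per shred; the stakes and fanout here are illustrative only):

```ts
// Toy model of Turbine-style, stake-weighted shred propagation. This only
// shows why higher stake tends to mean fewer hops from the leader.
type Validator = { id: string; stake: number };

function hopsFromLeader(validators: Validator[], id: string, fanout = 200): number {
  // Rank validators by stake, highest first.
  const rank = [...validators]
    .sort((a, b) => b.stake - a.stake)
    .findIndex((v) => v.id === id);
  if (rank < 0) throw new Error(`unknown validator: ${id}`);
  // Layer 0 (ranks 0..fanout-1) is one hop from the leader; each later
  // layer is `fanout` times wider and one hop further away.
  let hops = 1;
  let reachable = fanout;
  let layerSize = fanout;
  while (rank >= reachable) {
    layerSize *= fanout;
    reachable += layerSize;
    hops += 1;
  }
  return hops;
}

// Example with made-up stakes: 500 validators with strictly decreasing stake.
const nodes: Validator[] = Array.from({ length: 500 }, (_, i) => ({
  id: `v${i}`,
  stake: 1_000_000 - i * 1_000,
}));
console.log(hopsFromLeader(nodes, "v0"));   // 1 (top-200 by stake)
console.log(hopsFromLeader(nodes, "v400")); // 2 (next layer out)
```

With a fanout of 200, a validator ranked in the top 200 by stake sees a shred one hop from the leader; lower-ranked validators wait at least one more hop, which is the latency edge a high-stake operator can pass on to its stream consumers.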
X @mert | helius.dev
mert | helius.dev· 2025-08-18 13:33
Network Performance
- Latency increases [1]
- Bandwidth decreases [1]
Serving Voice AI at $1/hr: Open-source, LoRAs, Latency, Load Balancing - Neil+Jack Dwyer, Gabber
AI Engineer· 2025-07-31 13:45
Technology & Product Development
- Deployment experience with Orpheus (emotive, realtime TTS), including latency and optimization [1]
- High-fidelity voice cloning, with examples [1]
- Load balancing across multiple GPUs and multiple LoRAs [1] (see the sketch below)
Company & Industry Focus
- Gabber focuses on simplifying and reducing the cost of developing realtime, multimodal consumer applications [1]
- Speaker Neil Dwyer built realtime streaming + computer vision pipelines at Bebo and worked on the Agents platform at LiveKit [1]
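On the multi-GPU, multi-LoRA point: a common pattern is adapter-affinity routing, where a request is sent to a worker that already has the requested LoRA resident, avoiding a load/swap penalty. A hypothetical sketch of that routing rule (all names are invented, not Gabber's actual code):

```ts
// Hypothetical LoRA-aware router: prefer the least-loaded GPU worker that
// already has the requested adapter loaded; otherwise pick the least-loaded
// worker overall and accept the one-time adapter-load penalty.
type Worker = { id: string; loadedAdapters: Set<string>; activeRequests: number };

function route(workers: Worker[], adapter: string): Worker {
  const byLoad = (a: Worker, b: Worker) => a.activeRequests - b.activeRequests;
  const warm = workers.filter((w) => w.loadedAdapters.has(adapter)).sort(byLoad);
  return warm[0] ?? [...workers].sort(byLoad)[0];
}
```

Falling back to the least-loaded cold worker trades one adapter-load delay for keeping queue depth even across GPUs.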
The Rise of Open Models in the Enterprise — Amir Haghighat, Baseten
AI Engineer· 2025-07-24 15:30
AI Adoption in Enterprises
- Enterprise adoption of AI is crucial for realizing AI's full potential and impact [2]
- Enterprises initially experiment with OpenAI and Anthropic models, often deploying them on Azure or AWS for security and privacy [7]
- In 2023, enterprises were "toying around" with AI, but by 2024, 40-50% had production use cases built on closed models [9][10]
Challenges with Closed Models
- Vendor lock-in is not a primary concern for enterprises, given the growing number of interoperable models [12][13]
- Ballooning costs are a significant concern, especially with agentic use cases that can involve 50 inference calls per user action [20]
- Enterprises want differentiation at the AI level, not just the workflow or application level, which pushes them toward in-house solutions [21]
Reasons for Open-Source Model Adoption
- Frontier models may not be the right tool for specific use cases, such as medical document extraction, where enterprises can leverage their labeled data to build better models [16][17]
- Generic API-based models may not suffice for low-latency tasks such as AI voices or AI phone calls [18]
- Enterprises aim to cut costs and improve unit economics by running models themselves and controlling pricing [20][21]
Inference Infrastructure Challenges
- Optimizing for latency requires both model-level and infrastructure-level work, such as speculative decoding techniques like EAGLE-3 (sketched below) [23][24][25][26]
- Guaranteeing high availability (four nines) for mission-critical inference requires robust infrastructure to handle hardware failures and inference-server (e.g., vLLM) crashes [27][28]
- Scaling up quickly for traffic bursts is hard; some enterprises see delays of up to eight minutes to bring up a new replica of a model [29]
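For context on the speculative-decoding point: the generic technique has a cheap draft propose several tokens, which the target model verifies in a single forward pass, accepting the longest agreeing prefix. Below is a minimal greedy-acceptance sketch with hypothetical `draft` and `verify` callables; EAGLE-3 itself replaces the separate draft model with a small head trained on the target model's internal features, which this sketch does not show:

```ts
// Minimal greedy speculative-decoding step (illustrative, not EAGLE-3).
// `draft` cheaply proposes k tokens; `verify` runs the target model once
// over the proposal and returns its own choice at each of the k positions,
// plus one extra token in case the whole proposal is accepted.
type TokenId = number;

async function speculativeStep(
  prefix: TokenId[],
  k: number,
  draft: (prefix: TokenId[], k: number) => Promise<TokenId[]>,
  verify: (prefix: TokenId[], proposal: TokenId[]) => Promise<TokenId[]>,
): Promise<TokenId[]> {
  const proposal = await draft(prefix, k);               // k cheap draft tokens
  const targetChoices = await verify(prefix, proposal);  // one target pass, k+1 outputs
  const accepted: TokenId[] = [];
  for (let i = 0; i < proposal.length; i++) {
    if (proposal[i] !== targetChoices[i]) break;         // first disagreement stops acceptance
    accepted.push(proposal[i]);
  }
  // Always gain at least one token: the target's own choice at the break point.
  accepted.push(targetChoices[accepted.length]);
  return [...prefix, ...accepted];
}
```

Each step costs one target-model pass but can emit several tokens, which is where the latency win comes from.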
X @mert | helius.dev
mert | helius.dev· 2025-07-22 17:43
Marketing & Service Offering
- Helius.dev promotes its services for lowering latency [1]