NVIDIA DGX Spark Arrives for World's AI Developers
Globenewswire· 2025-10-13 23:39
Core Viewpoint
- NVIDIA has announced the launch of DGX Spark, the world's smallest AI supercomputer, designed to meet the growing demands of AI workloads that exceed the capabilities of traditional PCs and workstations [2][3].

Product Overview
- DGX Spark delivers 1 petaflop of AI performance and features 128GB of unified memory, enabling developers to run inference on AI models with up to 200 billion parameters and fine-tune models of up to 70 billion parameters locally [3][6].
- The system is compact, measuring 150 mm x 150 mm x 50.5 mm and weighing 1.2 kg, making it suitable for lab or office environments [5].
- Priced at $3,999, DGX Spark integrates NVIDIA's full AI platform, including GPUs, CPUs, networking, and software, into a powerful desktop solution [5][7].

Historical Context
- The launch of DGX Spark continues NVIDIA's mission that began with the DGX-1 in 2016, which was pivotal in the development of AI technologies, including ChatGPT [4][9].

Technical Specifications
- Compared with the DGX-1, DGX Spark raises performance from 170 TFLOPS (FP16) to 1 PFLOP (FP4) while cutting system power consumption from 3,200 W to 240 W [5].
- The system is built on NVIDIA's Grace Blackwell architecture and includes advanced networking with NVIDIA ConnectX®-7 technology, providing 5x the bandwidth of fifth-generation PCIe [6][7].

Market Impact
- Early adopters of DGX Spark include major companies and research organizations such as Google, Microsoft, and NYU Global Frontier Lab, indicating strong interest and potential for widespread application in AI development [10][11].
- Availability of DGX Spark is set to expand through partnerships with various technology companies, enhancing access to powerful AI computing solutions [7][11].
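A quick back-of-envelope check (not from the article) shows why 128GB of unified memory is enough to hold a 200-billion-parameter model at the FP4 precision the spec quotes: 4-bit weights cost half a byte per parameter, so the weights alone come to roughly 100 GB, leaving headroom for KV cache and activations.

```python
# Approximate weight footprint of a model at a given precision.
# Ignores KV cache and activation overhead, which also consume memory.

def model_weight_gb(params: float, bits_per_param: float) -> float:
    """Weight footprint in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9

fp4_footprint = model_weight_gb(200e9, 4)  # 200B params, 4-bit weights
print(f"200B params @ FP4 ~= {fp4_footprint:.0f} GB")  # ~= 100 GB

# Fits within DGX Spark's 128 GB of unified memory.
assert fp4_footprint < 128
```

The same arithmetic explains the 70B fine-tuning ceiling: training state (higher-precision weights, gradients, optimizer state) costs several times the inference footprint per parameter.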
NVIDIA Launches Family of Open Reasoning AI Models for Developers and Enterprises to Build Agentic AI Platforms
Globenewswire· 2025-03-18 19:10
Core Insights
- NVIDIA has launched the Llama Nemotron family of models, designed to provide advanced AI reasoning capabilities for developers and enterprises [1][4]
- The new models enhance multistep math, coding, reasoning, and complex decision-making through extensive post-training, improving accuracy by up to 20% and optimizing inference speed by 5x compared with other leading models [2][3]

Model Features
- The Llama Nemotron model family is available in three sizes, each tailored for different deployment needs: the Nano model is optimized for PCs and edge devices, the Super model for single-GPU throughput, and the Ultra model for multi-GPU servers [5]
- The models are built on high-quality curated synthetic data and additional datasets co-created by NVIDIA, giving enterprises the flexibility to develop custom reasoning models [6]

Industry Collaboration
- Major industry players such as Microsoft, SAP, and Accenture are collaborating with NVIDIA to integrate Llama Nemotron models into their platforms, enhancing AI capabilities across various applications [4][7][8][10]
- Microsoft is incorporating these models into Azure AI Foundry, while SAP is using them to improve its Business AI solutions and its AI copilot, Joule [7][8]

Deployment and Accessibility
- The Llama Nemotron models and NIM microservices are available as hosted APIs, with free access for NVIDIA Developer Program members for development, testing, and research [12]
- Enterprises can run these models in production using NVIDIA AI Enterprise on accelerated data center and cloud infrastructure, with additional tools and software to facilitate advanced reasoning in collaborative AI systems [16]
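As a sketch of what "available as hosted APIs" looks like in practice: NIM microservices expose an OpenAI-compatible chat-completions interface. The endpoint host and model identifier below are illustrative assumptions; check the NVIDIA API catalog for the exact values for a given Nemotron model.

```python
# Assemble a request for an OpenAI-compatible /v1/chat/completions call
# against a hosted NIM endpoint. The URL and model id are assumptions,
# not taken from the article.
import json

def build_chat_request(model: str, prompt: str, api_key: str) -> dict:
    """Return the URL, headers, and JSON body for a NIM chat call."""
    return {
        "url": "https://integrate.api.nvidia.com/v1/chat/completions",  # assumed host
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,  # e.g. a Llama Nemotron model id (assumption)
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.6,
        }),
    }

req = build_chat_request("nvidia/llama-nemotron-super", "Plan a 3-step task.", "NVAPI-KEY")
print(req["url"])
```

Because the interface is OpenAI-compatible, existing OpenAI client libraries can typically be pointed at the NIM base URL without code changes beyond the key and model name.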
NVIDIA Blackwell RTX PRO Comes to Workstations and Servers for Designers, Developers, Data Scientists and Creatives to Build and Collaborate With Agentic AI
GlobeNewswire News Room· 2025-03-18 19:01
Core Insights
- NVIDIA has launched the RTX PRO™ Blackwell series, a new generation of workstation and server GPUs aimed at enhancing workflows for AI, technical, creative, engineering, and design professionals through advanced computing technologies [1][3]

Product Overview
- The RTX PRO Blackwell series includes the following GPU configurations [4][2]:
  - Data center GPU: NVIDIA RTX PRO 6000 Blackwell Server Edition
  - Desktop GPUs: NVIDIA RTX PRO 6000, 5000, 4500, and 4000 Blackwell editions
  - Laptop GPUs: NVIDIA RTX PRO 5000, 4000, 3000, 2000, 1000, and 500 Blackwell editions

Performance Enhancements
- The new GPUs deliver significant performance improvements [5][6][7]:
  - The NVIDIA Streaming Multiprocessor offers up to 1.5x faster throughput
  - Fourth-generation RT Cores provide up to 2x performance for photorealistic rendering
  - Fifth-generation Tensor Cores deliver up to 4,000 trillion AI operations per second (TOPS)

Memory and Bandwidth
- The GPUs support larger and faster GDDR7 memory, with up to 96GB for workstations and servers, enhancing the ability to handle complex datasets [5][6]
- Fifth-generation PCIe support doubles the bandwidth of the previous generation, improving data transfer speeds [5]

Multi-Instance GPU Technology
- The RTX PRO 6000 and 5000 series GPUs feature Multi-Instance GPU (MIG) technology, allowing secure partitioning of a single GPU into multiple instances for efficient resource allocation [6]

Industry Applications
- The RTX PRO Blackwell GPUs are designed for industries including healthcare, manufacturing, retail, and media, providing powerful performance for AI, scientific, and visual computing applications [8][9]

Customer Feedback
- Early evaluations indicate significant performance improvements, such as a 5x speedup in rendering for Foster + Partners and up to a 2x improvement in GPU processing time for GE HealthCare [10][11]

Availability
- The RTX PRO 6000 Blackwell Server Edition will be available from major data center system partners and cloud service providers later this year [14][15]
- The workstation and laptop editions will be available through global distribution partners starting in April and later this year, respectively [16][17]
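The "doubles the bandwidth" claim for fifth-generation PCIe follows directly from the spec: each PCIe generation doubles the per-lane transfer rate. A minimal sketch (not from the article), using the 128b/130b encoding shared by Gen3 through Gen5:

```python
# Approximate one-directional bandwidth of a PCIe x16 link per generation.
# Per-lane transfer rates double each generation; 128b/130b encoding
# applies to Gen3-Gen5.

def pcie_bandwidth_gbs(gen: int, lanes: int = 16) -> float:
    """One-directional link bandwidth in GB/s for PCIe Gen3-Gen5."""
    gt_per_s = {3: 8, 4: 16, 5: 32}[gen]      # transfer rate per lane (GT/s)
    return gt_per_s * lanes * (128 / 130) / 8  # GT/s -> GB/s after encoding

gen4 = pcie_bandwidth_gbs(4)
gen5 = pcie_bandwidth_gbs(5)
print(f"Gen4 x16 ~= {gen4:.1f} GB/s, Gen5 x16 ~= {gen5:.1f} GB/s")
```

So a Gen5 x16 link delivers roughly 63 GB/s per direction versus roughly 31.5 GB/s for Gen4, which is what matters when streaming large datasets or model weights into the 96GB of GDDR7 on the workstation cards.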