Parallel Computing
NVIDIA Corporation (NVDA) Presents at Bank of America Global Technology Conference Transcript
Seeking Alpha· 2025-06-04 18:45
Company Overview - NVIDIA Corporation is a leader in GPU computing. Ian Buck heads its Accelerated Computing Business Unit, overseeing hardware and software product lines, third-party enablement, and marketing activities [2][4].
Keynote Highlights - Ian Buck described the current era as the "AI time," signaling a transformative phase for the industry [4].
Conference Context - The presentation took place at the BofA Securities Global Technology Conference, which features insights from industry leaders [1][3].
Just Now! DeepSeek Drops a Hardcore Release!
券商中国· 2025-02-27 03:35
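DeepSeek has made another big move! On day three of its Open Source Week, DeepSeek announced that it is open-sourcing Optimized Parallelism Strategies.
Optimized Parallelism Strategies are parallel computing schemes designed to improve computational efficiency, reduce wasted resources, and maximize system performance. They achieve efficient parallel execution on multi-core, distributed, or heterogeneous systems by distributing tasks sensibly, coordinating resource use, and reducing communication overhead.
By applying TensorRT DeepSeek optimizations on the Blackwell architecture, NVIDIA enabled a model at production-grade FP4 precision to reach 99.8% of the FP8 model's performance on the MMLU general-intelligence benchmark. NVIDIA's FP4-optimized DeepSeek-R1 checkpoint is now open-sourced on Hugging Face, and the model can be accessed via the following link: DeepSeek-R1-FP4.
For post-training quantization, the model quantizes the weights and activations of the linear operators inside the Transformer blocks to FP4, making it suitable for TensorRT-LLM inference. This optimization reduces the number of bits per parameter from 8 to 4, cutting disk space and GPU memory requirements by about 1.6x. Deploying the quantized FP4 weight files with TensorRT-LLM can provide ...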
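The size figures quoted above (8 bits down to 4 bits per parameter, yet a reduction of about 1.6x rather than 2x) can be checked with simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that only a fraction of the parameters (the linear operators inside the Transformer blocks) drops to 4 bits while everything else stays at 8 bits. It ignores scale factors and any tensors kept at other precisions, and the source does not state the actual fraction.

```python
def footprint_bits(n_params: int, quantized_fraction: float,
                   quantized_bits: float, other_bits: float) -> float:
    """Total storage in bits when only part of the parameters is quantized."""
    if not 0.0 <= quantized_fraction <= 1.0:
        raise ValueError("quantized_fraction must be in [0, 1]")
    quantized = n_params * quantized_fraction * quantized_bits
    other = n_params * (1.0 - quantized_fraction) * other_bits
    return quantized + other


def reduction_factor(quantized_fraction: float,
                     bits_before: float = 8.0,
                     bits_after: float = 4.0) -> float:
    """Ratio of checkpoint size before vs. after quantizing a fraction of parameters."""
    n = 1_000_000  # arbitrary parameter count; the ratio does not depend on it
    before = footprint_bits(n, quantized_fraction, bits_before, bits_before)
    after = footprint_bits(n, quantized_fraction, bits_after, bits_before)
    return before / after


if __name__ == "__main__":
    # Hypothetical fractions, chosen only to show how the ratio moves.
    for fraction in (1.0, 0.9, 0.75):
        print(f"{fraction:.2f} of parameters at 4 bits -> "
              f"{reduction_factor(fraction):.2f}x smaller")
```

Under this simplification, quantizing every parameter would halve the footprint, while quantizing three quarters of them gives the 1.6x figure. That is one possible reading of the quoted number, not a breakdown confirmed by the source.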