NanoGPT
AI Data Centers Head to Space: More Publicity Stunt Than Breakthrough Tech
36Ke· 2025-12-17 12:39
Core Viewpoint
- Starcloud, a space computing startup backed by Nvidia, has trained and operated an AI model in space for the first time, a significant milestone for space-based AI [1][3].

Group 1: Company Developments
- Starcloud's satellite, Starcloud-1, ran Google's open-source model Gemma and trained NanoGPT on the complete works of Shakespeare, sending a Shakespearean-style message back to Earth [3].
- Starcloud aims for a tenfold reduction in energy costs for orbital data centers relative to ground-based facilities, and the mission is meant to validate the feasibility of building space data centers that require large computing clusters [3].

Group 2: Industry Trends
- Google plans to begin building space AI data centers by early 2027, aiming to exploit solar energy in space, which is far more abundant than on Earth [5][6].
- The push toward space AI data centers is largely driven by the energy shortages facing U.S. tech giants, where insufficient power infrastructure has become a critical bottleneck [9].
- Energy demand from AI data centers is projected to reach 347 GW by 2030, underscoring the urgency of alternative energy solutions [9].

Group 3: Technical Challenges
- Space AI data centers face significant unresolved challenges, chiefly heat dissipation and radiation protection [11][15].
- Although the ambient temperature in low Earth orbit averages about -120°C, heat management remains difficult because heat in vacuum can only be shed by radiation [13].
- High-energy particles can cause single-event upsets in electronic components, producing computational errors, which is why space applications typically rely on older, more radiation-tolerant chip processes [13][15].
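To see why radiative cooling is the bottleneck the article points to, here is a back-of-envelope estimate (an illustration added here, not from the article) using the Stefan-Boltzmann law for an ideal one-sided radiator near room temperature, neglecting incoming solar and Earth heat loads:

$$\frac{P}{A} = \varepsilon\,\sigma\,T^{4} \approx 0.9 \times 5.67\times10^{-8}\,\mathrm{W\,m^{-2}\,K^{-4}} \times (300\,\mathrm{K})^{4} \approx 4.1\times10^{2}\ \mathrm{W/m^{2}}$$

At roughly 410 W/m², shedding even 1 MW of waste heat takes on the order of 2,400 m² of radiator area, so a gigawatt-class orbital facility would need radiators measured in square kilometers.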
Nvidia GPUs Sent to Space by SpaceX, Training Karpathy's NanoGPT in Orbit
36Ke· 2025-12-11 07:32
Core Insights
- The article covers the first successful training and operation of AI models in space, a milestone in the integration of artificial intelligence and space technology. Key players include Nvidia, SpaceX, Google, and Andrej Karpathy's NanoGPT [1][5][10].

Group 1: AI in Space
- Gemma became the first AI model trained and operated in space, running on an Nvidia H100 chip aboard the Starcloud-1 satellite launched by SpaceX [5][10].
- Gemma greeted Earth with a message, demonstrating its ability to analyze and provide insights [7].
- Another model, NanoGPT, was trained on the H100 using the complete works of Shakespeare [3].

Group 2: Future Plans and Infrastructure
- Starcloud aims to build a solar-powered orbital data center with 5 GW of capacity, expected to cost significantly less than terrestrial counterparts [8][10].
- Starcloud CEO Philip Johnston emphasized that space can overcome the energy limits faced on Earth, allowing larger AI models to be trained without land and cooling constraints [10].
- Starcloud plans to launch more Nvidia H100 chips and the Blackwell platform on a satellite mission scheduled for October 2026 [9].

Group 3: Global Developments in Space Computing
- Chinese research institutions have explored space-based intelligent computing since 2019, focusing on key enabling technologies [12][13].
- The "Three-Body Computing Constellation" of 12 satellites, launched by Guoxing Aerospace and Zhijiang Laboratory, reached commercial operation in September [14].
- The "Tiansuan Plan," announced by the Tiansuan team, aims to establish an intelligent computing cluster with 10 EOPS of computing power in near-Earth orbit [15].
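For readers unfamiliar with the "train NanoGPT on Shakespeare" exercise these articles reference, here is a minimal character-level sketch in PyTorch. It is a toy bigram model in the spirit of Karpathy's public tutorial, not Starcloud's on-orbit code; the filename shakespeare.txt and all hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

text = open("shakespeare.txt").read()          # hypothetical local corpus
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
data = torch.tensor([stoi[ch] for ch in text], dtype=torch.long)

block_size, batch_size, vocab = 64, 32, len(chars)

def get_batch():
    # sample random contiguous chunks; targets are the inputs shifted by one
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix])
    return x, y

class BigramLM(nn.Module):
    # each current character directly indexes a row of next-character logits
    def __init__(self, vocab_size):
        super().__init__()
        self.table = nn.Embedding(vocab_size, vocab_size)
    def forward(self, idx):
        return self.table(idx)

model = BigramLM(vocab)
opt = torch.optim.AdamW(model.parameters(), lr=1e-2)
for step in range(1000):
    x, y = get_batch()
    logits = model(x)                          # (batch, block, vocab)
    loss = F.cross_entropy(logits.view(-1, vocab), y.view(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"final training loss: {loss.item():.3f}")
```

The full NanoGPT replaces the embedding table with a small Transformer, but the training loop has the same shape: sample character chunks, predict the next character, backpropagate.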
Nvidia GPUs Sent to Space by SpaceX! Training Karpathy's NanoGPT in Orbit
量子位· 2025-12-11 06:54
Core Viewpoint
- The article discusses the groundbreaking first training and running of AI models in space, highlighting collaboration among companies like Nvidia, SpaceX, and Google, as well as the involvement of OpenAI co-founder Andrej Karpathy's NanoGPT [2][3][4].

Group 1: Space AI Training
- The first AI model training in space was successfully conducted on an Nvidia H100 chip aboard the Starcloud-1 satellite, launched by SpaceX [6][7].
- Gemma, Google's open-source large model, was run in space and greeted Earth with a message [9].
- NanoGPT, developed by Andrej Karpathy, was also trained directly in space, a significant milestone for AI development [9].

Group 2: Future Plans and Infrastructure
- Starcloud aims to build a solar-powered 5 GW orbital data center, expected to have lower construction and operating costs than terrestrial counterparts [10].
- The company plans to launch more Nvidia H100 chips and the Blackwell platform on a satellite mission scheduled for October 2026 [11].
- Starcloud's CEO emphasized that space can overcome the energy limits faced on Earth, suggesting AI operations can run more efficiently in a low-Earth-orbit environment [12].

Group 3: Global Developments in Space Computing
- Chinese research institutions have explored space-based intelligent computing since 2019, focusing on key enabling technologies [17].
- The China National Space Administration has launched the world's first space computing constellation, which has reached regular commercial operation [18].
- The TianSuan Plan aims to establish an intelligent computing cluster in near-Earth orbit with 10 EOPS of computing power, while addressing radiation and heat-dissipation challenges [19].
Nvidia-Backed Starcloud Trains First LLM In Space Amid Orbital Datacenter Buzz — CEO Calls It 'Significant' First Step - NVIDIA (NASDAQ:NVDA)
Benzinga· 2025-12-11 06:18
Core Insights
- Starcloud, a startup backed by Nvidia, has successfully trained AI models from space using the Nvidia H100 GPU, which offers 100 times the computational power of chips previously launched into orbit [1][2].

Group 1: Company Developments
- Starcloud trained Google's open LLM Gemma from its Starcloud-1 satellite [2].
- The company also trained an LLM called NanoGPT, developed by former Tesla AI lead Andrej Karpathy, on Shakespeare's works [3].
- Starcloud aims to build large, scalable, cost-competitive data centers in space, with plans for an orbital data center 2.4 miles tall and 2.4 miles wide, fitted with solar and cooling panels [4].

Group 2: Industry Context
- Starcloud CEO Philip Johnston described the achievement as a significant step toward relocating computing resources to space and reducing the strain on Earth's resources [5].
- Elon Musk has reiterated his vision for solar-powered AI satellites, targeting a production rate of 1 megaton per year, which would provide 100 GW of AI capacity in space [6].
- Musk also mentioned plans for factories on the lunar surface to support launching AI satellites into deep space [6].

Group 3: Market Reactions
- Nvidia's stock declined 1.92% in after-hours trading, closing at $180.25 [9].
'Greetings, earthlings': Nvidia-backed Starcloud trains first AI model in space as orbital data center race heats up
CNBC· 2025-12-10 14:05
Core Insights
- The launch of the Starcloud-1 satellite marks the first time an AI model, Gemma, has been trained and operated in space, on an Nvidia H100 GPU that is 100 times more powerful than previous space-flown GPUs [2][3].
- Starcloud aims to establish orbital data centers to relieve Earth's growing digital-infrastructure crisis, driven by rising energy consumption and environmental concerns [4][5].

Company Overview
- Starcloud, co-founded in 2024 and backed by Nvidia, has demonstrated AI models operating in space, indicating the feasibility of space-based data centers [5][6].
- The company plans a 5-gigawatt orbital data center equipped with solar and cooling panels, which it expects to be more efficient and cost-effective than terrestrial facilities [8].

Technological Innovations
- The satellite's AI model, Gemma, is capable of sophisticated responses similar to those of Earth-based systems, showcasing the potential of space-based AI applications [7].
- Starcloud also trained a second model, NanoGPT, on the complete works of Shakespeare, demonstrating the versatility of its technology [7].

Environmental Impact
- Orbital data centers are projected to have energy costs 10 times lower than terrestrial data centers, easing the energy-consumption constraints on Earth [5].
- Space-based facilities can harness near-constant solar energy, unaffected by terrestrial weather and day-night cycles, contributing to environmental sustainability [9][12].

Applications and Use Cases
- Starcloud's orbital data centers have potential commercial and military applications, such as real-time intelligence for disaster response, including wildfire detection [10].
- The company is working to run advanced AI workloads from space, expanding capabilities for a range of industries [11].
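As a rough plausibility check on the 5 GW figure (arithmetic added here, not from CNBC), assume a solar constant of about 1.36 kW/m² in low Earth orbit and roughly 20% panel efficiency:

$$A \approx \frac{P}{S\,\eta} = \frac{5\times10^{9}\ \mathrm{W}}{1.36\times10^{3}\ \mathrm{W/m^{2}} \times 0.2} \approx 1.8\times10^{7}\ \mathrm{m^{2}} \approx 18\ \mathrm{km^{2}}$$

That is broadly consistent with the roughly 2.4-mile-per-side structure (about 4 km × 4 km, or 15 km²) described in the Benzinga summary above.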
X @Dash
Dash· 2025-10-03 01:10
@NanoGPTcom @TheDesertLynx Try out NanoGPT https://t.co/iHqVKow5Qj ...
X @Dash
Dash· 2025-09-07 17:44
Product Announcement - NanoGPTcom is available for trial [1]
X @Dash
Dash· 2025-09-07 17:39
Everyone in the Dash community should be using @NanoGPTcom for all their AI needs. Bas (@247bas): How to use NanoGPT using a sign-in token and pay for it in a private way using Dash, in under 2 minutes! @NanoGPTcom @Dashpay https://t.co/ED917LmBrf ...
The Author of Muon Caught OpenAI's Eye with Just One Blog Post
机器之心· 2025-06-16 04:04
Core Insights
- Publishing papers is no longer the only path for researchers: Keller Jordan parlayed a single blog post into a position at OpenAI [2][8].
- His case illustrates that hiring at top AI research institutions like OpenAI prioritizes demonstrated capability over traditional academic metrics [8].

Summary by Sections

Blog Post Overview
- Keller Jordan's post, "Muon: An optimizer for hidden layers in neural networks," published on December 8, 2024, introduced a new optimizer that significantly speeds up neural-network training while preserving accuracy [4][6].
- The post tracks NanoGPT training-speed records; the most recent, 2.979 minutes, was set on May 25, 2025 [9].

Muon Optimizer Design and Results
- Muon targets the hidden layers of neural networks and set a CIFAR-10 training-speed record of 2.6 seconds while maintaining 94% accuracy [22].
- On competitive speedrun tasks, Muon trained 1.35 times faster than previous methods [22].
- Its core design applies Newton-Schulz iterations to approximate orthogonal updates, which diversifies update directions and improves learning; a sketch of this idea follows below [29][30].

Performance and Efficiency
- Muon adds minimal computational overhead, with a worst-case FLOP cost under 1% in typical language-model training scenarios [58][59].
- It has outperformed traditional optimizers such as AdamW when training large models, including a 1.5-billion-parameter Transformer [22][66].

Comparison with Other Optimizers
- The article reviews the limitations of alternatives such as Shampoo and Orthogonal-SGDM, which Muon outperforms in both efficiency and effectiveness [61][64].
- It also stresses the importance of properly tuned baselines in optimizer research, so that new optimizers are shown to be genuinely effective [72].

Future Research Directions
- Ongoing work is exploring Muon's scalability and its application to a broader range of training scenarios, reflecting growing interest in its potential [79][81].
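To make "Newton-Schulz iterations to approximate orthogonal updates" concrete, here is a short PyTorch sketch of the idea. The quintic coefficients follow the publicly posted Muon reference code; the learning rate, momentum, and tensor shapes are illustrative assumptions, so treat this as a sketch of the technique rather than the blog's exact implementation.

```python
import torch

def newton_schulz5(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately map G toward U @ V.T (where G = U S V.T is its SVD)
    using a quintic Newton-Schulz iteration."""
    a, b, c = 3.4445, -4.7750, 2.0315   # coefficients from the public reference code
    X = G / (G.norm() + 1e-7)           # Frobenius norm bounds the spectral norm by 1
    transposed = X.size(0) > X.size(1)
    if transposed:                      # iterate on the wide orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

def muon_step(param, grad, momentum_buf, lr=0.02, beta=0.95):
    """One illustrative Muon update for a 2D hidden-layer weight matrix:
    SGD with momentum, but the update is orthogonalized before being applied."""
    momentum_buf.mul_(beta).add_(grad)
    param.add_(newton_schulz5(momentum_buf), alpha=-lr)

# usage sketch with a stand-in gradient
W = 0.02 * torch.randn(256, 128)
buf = torch.zeros_like(W)
g = torch.randn_like(W)
muon_step(W, g, buf)
```

The orthogonalization is what distinguishes Muon from plain momentum SGD: by flattening the singular values of the update matrix, it keeps rare but informative update directions from being drowned out by a few dominant ones, which is the "diversifying update directions" effect the summary describes.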