Large Visual Generation Models
Shanghai Jiao Tong University Publishes New Science Paper
Sheng Wu Shi Jie (生物世界)· 2025-12-19 00:45
Core Viewpoint
- The article discusses LightGen, a novel all-optical chip that addresses the significant shortage of computational power facing large-scale generative artificial intelligence, particularly in visual generation tasks [1][4].

Group 1: Research Development
- A research team led by Assistant Professor Chen Yitong of Shanghai Jiao Tong University published a paper in the journal Science on December 18, 2025, detailing their work on the LightGen chip [2].
- The LightGen chip is designed for large-scale intelligent semantic vision generation. It integrates millions of photonic neurons to achieve high-resolution image generation, denoising, style transfer, and 3D generation and manipulation [3][4].

Group 2: Performance and Efficiency
- Experimental results indicate that the LightGen chip's end-to-end computational speed and energy efficiency exceed those of the most advanced electronic chips by more than two orders of magnitude, paving the way for the advancement of large visual generative models [4].
Keling AI Launches Ketu 2.1 Model: Capabilities Leap Across Multiple Dimensions, Free for Members for a Limited 7 Days
Cai Fu Zai Xian· 2025-07-10 09:24
Core Insights
- Keling AI's launch of the Ketu 2.1 model significantly enhances image generation capabilities, including improved instruction adherence, portrait aesthetics, and cinematic quality [1][11].
- The model is free to all member users for a limited time, allowing creators to explore its features [11].

Image Generation Capabilities
- Ketu 2.1 excels at following complex instructions, accurately capturing multiple elements and details in prompts and producing high-quality, imaginative images [1][3].
- The model shows a notable improvement in image quality, including clarity, richness of elements, and realism, particularly in portrait aesthetics [3][5].

Artistic and Cinematic Quality
- The model can generate images with a cinematic feel, recreating scenes with distinctive aesthetic tones and advanced composition [6].
- It supports over 180 styles, letting creators choose from a range of artistic expressions, from vintage photography to futuristic digital art [10].

Text Generation Features
- Ketu 2.1 also enhances text generation, producing clear, design-oriented text in both Chinese and English and making it easier to integrate text and images in marketing and creative projects [8].

User Engagement and Growth
- Keling AI has achieved significant user engagement, with a total of 344 million images and 168 million videos generated since its launch, showcasing its strength in the image generation sector [11].
China's First Mobile Visual Generation Model, "Juzhou" V1 Edge Version, Launches in Changsha
news flash· 2025-05-21 03:08
Core Insights
- The "Juzhou" V1 edge version, the first domestic visual foundation model built on indigenous computing power, was officially launched in Changsha on May 21 [1].
- The model can generate 1024×1024 images in seconds on mobile devices, and is low-cost, high-quality, fast, lightweight, and usable offline [1].
- Developed by Hunan Huishiwei Intelligent Technology Co., Ltd., the model completed training on nearly 40 million images in a short period, making it the first visual foundation model in China to complete both training and inference on domestic computing power and to be deployed on mobile devices [1].
Runs Smoothly on a Phone: How Hardcore Is "Juzhou"?
Chang Sha Wan Bao· 2025-05-21 00:20
Core Viewpoint
- The article highlights the launch of "Juzhou," a domestically developed visual foundation model from Hunan Huishiwei Intelligent Technology Co., Ltd. It is designed for mobile deployment and can generate images in seconds, marking a significant advance in AI technology for smartphones [1][12].

Group 1: Product Features
- "Juzhou" is a lightweight visual foundation model that can generate 1024×1024 images in seconds on mobile devices, addressing the limitations of traditional cloud-based models [1][8].
- The model is designed to run efficiently on mobile devices, significantly reducing computational costs and improving the user experience by allowing offline use [3][8].
- Compared with mainstream foreign open-source models, "Juzhou" achieves similar image quality with only 1/20 of the size and time, while keeping data private and secure [8][12].

Group 2: Technical Innovations
- The development of "Juzhou" used nearly 70 petaflops of purely domestic computing power, a significant step in the localization of AI technology [12][14].
- The model employs techniques such as cross-model structure extreme distillation to maintain high image generation quality while minimizing performance loss [14].
- Training was accelerated to just 20 hours, and the model was compressed to 1/50 the size of cloud-based models [14].

Group 3: Market Positioning and Future Goals
- "Juzhou" aims to serve as a foundation model for B-end developers, enabling them to build their own mobile AI applications and expanding its market reach [9][10].
- The company plans to iterate on the model monthly and open-source the corresponding inference models to foster a collaborative ecosystem [10].
- The vision for "Juzhou" is to bring AI capabilities to a range of industries, targeting a trillion-level market in the next three years [14].