Seedream 4.0
16 AIs Battle It Out in a Poster Pitch for the Jinqiu CEO Conference: Who Gets the Design Fee?
锦秋集· 2025-11-01 00:06
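Jinqiu AI Lab is a column dedicated to exploring and evaluating how AI products perform in real-world scenarios. We are using AI to unlock 100 efficiency scenarios. What will the next one be?

This year, Jinqiu Capital (锦秋基金) will hold its first annual CEO conference under the theme "Experience with AI". It is the first time Jinqiu has staged its annual gathering in a format where technology and ideas intertwine: what we want to explore is not just AI itself, but how technology, capital, and creativity meet again in the AI era. We hope the event will be more than a conversation about AI and will serve as a real setting in which AI is truly understood, used, and experienced.

During the preparations, a question suddenly came up: "What if the poster itself were generated by AI?" And so a hands-on evaluation spanning 16 AI tools began. For us, this was not a simple product review but a real experiment in Chinese-language context, brand aesthetics, and visual creativity.

01 Choosing the products
To make the experiment as comprehensive as possible, we selected 16 AI text-to-image models, covering mainstream global products, representative Chinese products, and several emerging platforms. We wanted to see what different "visual answers" these AIs would give when faced with a Chinese-language context and brand expression. This is both a test of AI capability and an exploration of the future boundaries of brand visuals ...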
Would You Dare Read Your Child a Picture Book That AI Generated in Minutes?
创业邦· 2025-10-31 00:08
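The following article is from 刺猬公社 (ciweigongshe), which covers observation and research on the internet content industry; it is written by the 刺猬公社 editorial team.

This summer, Gemini launched the Nano banana model, whose strong character consistency prompted many users to explore and pay attention to AI image generation. Shortly before the model was revealed, Gemini AI had launched its Storybook story-creation feature: users only need to enter a few sentences describing the plot, and the AI automatically generates a 10-page illustrated e-book.

As image models have improved, people have gradually begun exploring what AI picture-book features can do. Short-video platforms host large numbers of AI picture-book videos, many with English subtitles and voice-over and tagged as English picture books, English listening practice, and the like. They average several thousand likes, and some videos' figures approach one million. The creators' profile pages often carry purchase links for early-learning picture books, English picture-book video collections, and similar products.

Are picture books generated by AI in one click actually readable? Why have these somewhat abstract, slideshow-like videos gone viral so quickly? To find out, I set about researching and trying the products.

Becoming a picture-book creator takes just one minute
To get a first-hand sense of what current AI picture-book technology can do, I ran a test in Google Gemini. On opening the AI picture-book feature, the interface showed several creative suggestions.

Source: 刺猬公社 (ciweigongshe) | Author: 白棉 | Editor: 园长

August 2024 ...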
The Viral AI Three-Panel Images Look More Like a Movie Than Our Lives Do.
数字生命卡兹克· 2025-10-24 01:32
Core Viewpoint
- The article discusses the recent trend of creating three-panel AI-generated images, highlighting the cultural significance and emotional resonance behind the phenomenon, which reflects a desire to narrate personal stories through a cinematic lens [46][49][55].

Group 1: Trend and Popularity
- The three-panel AI images have become hugely popular on platforms such as Douyin and Xiaohongshu, with posts drawing thousands of likes [3].
- Various user-generated content has emerged, including artistic and abstract interpretations, showcasing the versatility of the format [10][11][17].

Group 2: Creative Process
- Users can easily create these images with the Seedream 4.0 AI tool, which allows customization through prompts [32].
- A template for creating three-panel images is provided, emphasizing the importance of scene description, character details, and overall aesthetic [33][34].

Group 3: Cultural Reflection
- The article draws parallels between the current trend and past social media practices, noting that the desire to present life as a cinematic experience has remained consistent over the years [46][49].
- Using AI to generate idealized versions of oneself serves as a form of escapism and self-expression, allowing individuals to project their aspirations [55][56].
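The summary above mentions a prompt template built around scene description, character details, and overall aesthetic, but does not reproduce it. As a minimal sketch, assuming the template simply joins those three elements into one text prompt, it might look like the following. The function name, field labels, and wording are hypothetical illustrations; they are not the article's template and not a Seedream 4.0 API.

```python
def build_three_panel_prompt(scene: str, character: str, aesthetic: str) -> str:
    """Join the three elements into one prompt string (hypothetical wording)."""
    parts = [
        "Three-panel image.",
        f"Scene: {scene.strip()}.",
        f"Character: {character.strip()}.",
        f"Overall aesthetic: {aesthetic.strip()}.",
    ]
    return " ".join(parts)


# Illustrative values only; none of them come from the article.
prompt = build_three_panel_prompt(
    scene="a rainy street at night",
    character="a woman in a red coat holding an umbrella",
    aesthetic="cinematic film still with muted colors",
)
print(prompt)
```

The resulting string would then be pasted into whichever image tool is being used; keeping the three elements as separate arguments makes it easy to vary one while holding the others fixed.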
Zhang Yiming Speaks Publicly About "Overfitting" in AI Talent
Sou Hu Cai Jing· 2025-10-13 13:51
Core Insights
- Zhang Yiming, founder of ByteDance, highlighted the shortcomings in AI talent training during the opening ceremony of the Shanghai Xuhui Zhichun Innovation Center, emphasizing the issue of "overfitting" in talent capabilities [1][10].
- Demand for AI positions surged tenfold in the first seven months of 2025, with a significant shortage of algorithm-related talent, particularly in search algorithms, where the ratio is "5 positions for 2 candidates" [3][8].
- ByteDance's recruitment index for AI positions is the highest among the top 20 companies hiring for new AI roles, indicating a strong focus on AI talent acquisition [3][8].

Talent Strategy
- The establishment of the Shanghai Xuhui Zhichun Innovation Center aims to recruit young people interested in computer science and AI, reflecting ByteDance's commitment to nurturing innovative talent [3][9].
- Zhang Yiming's approach signifies a shift in how ByteDance views talent, treating it as a core parameter for algorithm evolution rather than a disposable resource [3][10].
- The center plans to cultivate talent through practical exploration, focusing on independent thinking and resilience [10][11].

AI Development Initiatives
- ByteDance has made significant advances in AI, launching various key products and models, including "Kouzi Space" for productivity enhancement and the "Doubao" general model [6][7].
- The company has been rapidly upgrading its models, with the "Doubao 1.6" version released in June, and has achieved top rankings in video generation tasks [7][8].
- ByteDance's recruitment plan for 2026 includes hiring over 5,000 fresh graduates, with a 23% increase in demand for R&D positions [8][9].

Industry Context
- The AI sector is at a critical juncture, transitioning from technology to industry application, with a pressing need for talent that can address complex real-world problems [10][12].
- Zhang Yiming's focus on fostering cross-disciplinary talent aims to overcome the limitations of traditional talent training, which often leads to a disconnect between technical skills and business challenges [11][12].
- The company is striving to create a closed-loop ecosystem for AI infrastructure, covering applications from foundational models to intelligent agents [12][14].
The Global Race to Industrialize Agents
CAITONG SECURITIES· 2025-10-12 06:42
Investment Rating
- The report maintains a "Positive" investment rating for the industry [2].

Core Insights
- Agent capabilities of large models are being industrialized at an accelerating pace worldwide, with the focus shifting from competition on parameter scale to embedding Agent capabilities into systems and core entry points [7][10].
- Large models are evolving from "single language interaction" to "multi-modal perception," enabling them to "see and do" while remaining controllable and manageable throughout the entire process [10].
- Domestic companies are collaborating around a "model-entry-computing power" framework, establishing a triangular industrial structure that is gradually closing the loop from "model → platform → entry/scenario → supply side" [7][10].

Summary by Sections

Global Large Model Agent Capability Industrialization
- Since September 2025, the focus has shifted from "parameter scale competition" to "Agent capability embedding," with significant advances in commercial viability from companies such as OpenAI, Anthropic, and Google [10].
- OpenAI's Sora 2 model and app have entered a commercial operational phase, integrating video generation technology with compliance management [12].
- Anthropic's Claude Sonnet 4.5 model enhances engineering capabilities for long-term tasks and tool operations, focusing on usability in production environments [13].
- Google has integrated Gemini into Chrome, enabling high-frequency scenarios and expanding capabilities from answering questions to executing tasks [18].

Content, Agent, and Entry Advancement: Paths of Overseas Leading Companies
- Overseas companies are using product forms and system interfaces to support Agents, moving from "can speak and answer" to "can see and do" [22].
- The focus is on thickening entry points (browsers/home) and toolchains (SDK/testing/security) to ease the transition from technical demonstrations to industrialization [22].

Model-Entry-Computing Power Convergence: The Chinese Path
- Alibaba's Qwen3-Max flagship model leads the "model-platform-entry" upgrade, establishing a comprehensive path from foundational models to enterprise tools and creative entry points [23].
- Tencent's Agent Development Platform 3.0 and mixed models have shown significant advances, with a focus on efficiency and global expansion [28].
- Baidu's Wenxin model X1.1 has significantly improved its performance metrics, enhancing its capabilities in complex writing and long-term tasks [30].

Domestic and International AI Upgrade Resonance
- The AI industry is entering a critical phase of large-scale implementation, with future competition focusing on the construction of an "engineering triangle" system [47].
- The core differences between domestic and international developments lie in pace and financial structure, with international firms accelerating exploration but facing higher risks [56].
From Studio to Prompt: Jinqiu Capital Used AI to Shoot Team Photos for Its Website
锦秋集· 2025-10-11 08:59
Core Insights
- The article discusses the use of AI technology to generate professional photos for a company, highlighting advances in AI models that can produce high-quality images suitable for corporate branding [3][36].

Group 1: AI Application in Professional Photography
- The company tested 10 of the latest AI image generation models, including Google's Nano-Banana and ByteDance's Seedream 4.0, and found that some models are approaching a "ready-to-use" standard in maintaining identity consistency [3][36].
- Because of the logistical challenges of gathering team members for a photoshoot, the company decided to use AI to generate the required professional images instead [4][5].

Group 2: Model Performance and Selection
- Seedream 4.0 was chosen for its superior performance in facial consistency, skin texture, and lighting details compared with Nano-Banana, making it the primary tool for generating the professional photos [20][24].
- The AI-generated images presented natural expressions and maintained a high level of detail, which is often difficult to achieve in traditional photography [24][30].

Group 3: Future Implications of AI in Corporate Identity
- The experiment indicates a shift in which AI-generated professional photos can become a sustainable asset for companies, allowing team images to be updated continuously rather than remaining static [36][38].
- AI technology enables a new approach to corporate branding, allowing personalized expression within a unified style and thus strengthening the relationship between companies and their visual assets [37][38].

Group 4: Challenges and Limitations
- Some team members expressed dissatisfaction with the AI-generated images, particularly regarding facial expressions, indicating that current models struggle with nuanced emotional representation [39][41].
- The article notes that while AI can generate high-quality images, achieving natural poses and expressions remains a challenge, suggesting a need for further refinement in AI capabilities [41].
Zhang Yiming Speaks Publicly About "Overfitting" in AI Talent, Revealing ByteDance's "Innovation Anxiety" and "AI Ambitions"
Mei Ri Jing Ji Xin Wen· 2025-10-10 14:45
Core Insights
- ByteDance founder Zhang Yiming emphasized the importance of cultivating innovative AI talent during the opening of the Shanghai Xuhui Zhichun Innovation Center, highlighting the issue of "overfitting" in talent development, where individuals may excel at known tasks but struggle with innovation [1][7][8].
- The company faces a significant shortage of AI talent, with demand for AI positions increasing tenfold in the first seven months of 2025, leading to a competitive hiring environment [1][2][6].
- ByteDance's recruitment index for AI positions is notably high at 29.83, indicating a strong focus on attracting talent in this area [1][6].

Talent Strategy
- The establishment of the Shanghai Xuhui Zhichun Innovation Center aims to recruit young people interested in computer science and AI, fostering a new generation of innovative talent through practical exploration [1][6].
- ByteDance plans to hire over 5,000 fresh graduates in its 2026 campus recruitment initiative, with a 23% increase in demand for R&D positions compared with previous years [6].
- Zhang Yiming's approach reflects a shift in talent strategy, viewing talent as a core parameter for algorithm evolution rather than a disposable resource [2][4].

AI Development Initiatives
- ByteDance has made significant advances in AI, launching various products and models, including the "Kouzi Space" agent product and the "Doubao" general model, with continuous upgrades since April 2023 [5][9].
- The company is active in multiple AI application areas, including video generation and embodied intelligence, aiming to create a comprehensive "AI infrastructure + ecosystem" [9].
- The collaboration with Shanghai Jiao Tong University's ACM class, known for producing top computer science talent, underscores ByteDance's commitment to enhancing its AI capabilities [4][8].
After Sora2, Another Brand-New Film-Grade AI Video Model Arrives, and Its Name Is GAGA.
数字生命卡兹克· 2025-10-10 01:33
Core Viewpoint
- The article discusses the launch of a new AI video model, GAGA-1, which is considered top-tier in character performance and audio-visual synchronization [3][19][20].

Group 1: Product Features
- GAGA-1 is designed for character performances with dialogue, reaching a level comparable to film quality and excelling particularly in short dramas and interactive gaming [20][21].
- The model generates video from a combination of images and text prompts, with specific recommendations on prompt length to optimize performance [22][28].
- GAGA-1 currently offers three functions: Gaga Actor, Gaga Avatar, and Library; the latest model is used through the Gaga Actor feature [16][18].

Group 2: Performance and Limitations
- The model has shown impressive results in generating videos with realistic expressions and emotions, although it struggles with complex movements and longer prompts [30][52].
- Performance varies with prompt complexity, and while the model supports multiple languages, output quality can differ significantly between them [53].

Group 3: Pricing and Accessibility
- GAGA-1 is currently free, with no indication of when or whether pricing will be introduced, although it is expected to be significantly cheaper than competitors such as Sora2 and Veo3 [55][57].
- The model aims to democratize video content creation, allowing more people to take part in the process [60][61].
Just One Week After Open-Sourcing, Tencent's Text-to-Image Model Surges to the Top, Beating Google's Nano-Banana
机器之心· 2025-10-05 06:42
Core Viewpoint
- The article highlights the rapid rise of Tencent's Hunyuan Image 3.0 model, which has topped the LMArena leaderboard, showcasing its advanced capabilities in text-to-image generation and its potential to rival top proprietary models in the industry [3][54].

Model Performance
- Hunyuan Image 3.0 has received significant attention in the creator community for its superior image quality, detail restoration, and understanding of composition and style consistency [4][39].
- The model has surpassed 1.7k stars on GitHub, indicating growing community interest and participation [6].
- It performs strongly in generating coherent narratives and detailed illustrations from user prompts, effectively combining knowledge, reasoning, and creativity [9][15].

Technical Specifications
- The model is built on the Hunyuan-A13B architecture, featuring 80 billion parameters, making it Tencent's largest and most powerful open-source text-to-image model to date [3][41].
- It employs a mixed discrete-continuous modeling strategy, allowing efficient collaboration between text understanding and visual generation [42][43].
- Training used a large dataset of nearly 5 billion images, ensuring high-quality and diverse training data [45].

Training and Development
- The training strategy included multiple progressive stages, focusing on enhancing multimodal modeling capabilities through various data types and resolutions [49][51].
- The model's architecture integrates language modeling, image understanding, and image generation into a unified framework, enhancing its overall performance [43][54].

Industry Context
- The emergence of models like Hunyuan Image 3.0 reflects a broader trend in the AIGC field, where models are evolving from mere generation capabilities to understanding, reasoning, and controlling content creation [55][56].
- Open-source initiatives are becoming a core driver of innovation, with companies like Tencent leading the way in developing and sharing advanced models to foster community collaboration [56].
Industry Watch | With Half the Token Market in Hand, What Is Volcano Engine's Play?
Sou Hu Cai Jing· 2025-09-22 15:16
Core Insights
- The article emphasizes that the volume of Tokens consumed reflects the actual load of large models in the AI cloud market more accurately than the scale of GPU computing power does [2][6][11].
- Volcano Engine has emerged as a significant player in the Chinese AI cloud market, with a revenue target exceeding 20 billion yuan for 2025, following revenue of over 11 billion yuan in 2024 [2][3][35].
- The focus on Token consumption indicates a shift in the cloud computing industry from selling computing power to selling Tokens, which could give Volcano Engine a competitive advantage [6][20][36].

Market Position
- According to IDC reports, Volcano Engine holds a 49.2% share of the large model public cloud service market for the first half of 2025, up from 46.4% in 2024 [3][6].
- In the AI infrastructure market, Volcano Engine ranks third with a 9% market share, and in the generative AI infrastructure market, it ranks second with a 14.2% market share [3].

Token Consumption Growth
- Token consumption in China is growing rapidly, with a reported increase of nearly 10 times from June to December 2024 [7][12].
- Total Token consumption in the Chinese large model public cloud service market reached 537 trillion Tokens in the first half of 2025 [7].
- Volcano Engine's Ark platform saw a year-on-year increase of 3.98 times in Token consumption [7].

Strategic Focus
- Volcano Engine prioritizes Token consumption over revenue from GPU computing, viewing it as a better indicator of AI industry health and customer engagement [6][9][10].
- The company aims to create a virtuous cycle in which stronger model capabilities lead to more AI applications and higher Token consumption [10][21].

Future Outlook
- Predictions suggest that by the end of 2027, the daily Token consumption of the Doubao model could exceed 100 trillion, marking growth of at least 100 times from 2024 [18].
- The shift from "selling computing power" to "selling Tokens" is seen as a significant evolution in cloud computing technology and business models [20][36].

Competitive Landscape
- Volcano Engine's strategy mirrors that of Google, which has successfully integrated its AI models with consumer applications to increase Token consumption and reduce computing costs [22][35].
- The company is positioned to leverage its extensive consumer application ecosystem, including Douyin and Doubao, to further increase its market share in Token consumption [34][35].
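The revenue and market-share figures quoted in this summary imply some simple growth numbers. As a rough worked example, assuming the "over 11 billion yuan" for 2024 and the "exceeding 20 billion yuan" target for 2025 are treated as exactly 11 and 20 (both are only bounds in the source, so the result is indicative rather than exact), the arithmetic can be sketched as follows. The helper name is hypothetical.

```python
def pct_growth(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# Revenue in billions of yuan; the source gives these only as bounds.
revenue_growth = pct_growth(11.0, 20.0)

# Market share in percent (2024 versus first half of 2025, per the IDC figures quoted).
share_change_points = 49.2 - 46.4

print(f"Implied revenue growth: about {revenue_growth:.1f}%")
print(f"Market share change: about {share_change_points:.1f} percentage points")
```

Note that the share change is expressed in percentage points, not as a relative percentage, since it is a difference between two shares.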