AI Image Generation
The viral "AI on-set visit" trend, taught to you step by step.
数字生命卡兹克· 2025-12-25 01:20
Core Viewpoint
- The article discusses the evolution of AI video technology, focusing on AI tools that create personalized video experiences with characters from popular media, and highlights how easy to use and creatively flexible these tools are [1][35].

Group 1: AI Tools and Techniques
- Creating AI-generated images and videos involves three main steps: generating images from prompts, creating videos from key frames, and editing the final product with software [4].
- The author emphasizes the simplicity of the process: users do not need to purchase prompts and can instead use AI tools like Gemini to generate effective prompts based on their needs [16][21].
- The article mentions the challenges of working with different AI models, particularly the inconsistency of the Nano Banana Pro model when generating Asian faces, which led to a switch to another model, Seedream 4.5, that performed better for this demographic [11][13].

Group 2: Creative Applications
- The author shares various creative applications of the AI tools, including images and videos featuring characters from popular franchises like "Stranger Things" and "Avatar," as well as nostalgic shows like "Wu Lin Wai Zhuan" [28][34].
- The article highlights the ability to create engaging content by combining AI-generated visuals with editing software, which allows effects and sound to be added to the final product [30].
- The narrative reflects on the emotional connection to childhood memories and the excitement of interacting with beloved characters through AI technology, showcasing the potential for personal storytelling [35].
You're still showing off AI images; others are already getting paid for "prompts"
36Kr · 2025-11-27 09:40
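One good prompt is worth ten filters

A while back, during James's China tour, I saw several friends on WeChat Moments posting photos with James. I wondered how they had all managed to skip work for the event; when I asked, it turned out the photos were all generated with Jimeng 4.0 to ride the hype. A working person's world doesn't have that many drop-everything-and-go moments.

Say no more: I went straight to the Jimeng website and made one myself:

● Celebrity photo generated with Jimeng 4.0

If you're interested, you can try this prompt too:

Please merge Image 1 and Image 2 into a two-person selfie shot from a high angle. The composition is tight, the two subjects are very close together, heads tilted slightly upward, eyes looking straight into the camera, creating a strong visual impact. The person on the left stands slightly forward; follow the subject's features from my Image 1, keep the styling unchanged, and preserve facial likeness. The person on the right follows the subject's features from Image 2, keeps the styling unchanged, preserves facial likeness, and turns the body slightly inward. The shot is taken from a high overhead angle so that the heads appear exaggeratedly large, matching the typical Japanese and Korean selfie style. The background is pure white, clean and simple, further emphasizing the subjects. The style leans toward Japanese visual kei, the overall image is sharp, shot as a selfie with an iPhone front camera, and the final result is a refined, fashionable, slightly [...] group photo. The people must blend seamlessly into the frame with natural visual transitions, and the overall lighting must be bright and even.

I didn't develop this prompt myself; I lifted it from the internet. These photo-editing instructions are everywhere online now: a quick search turns up plenty, any random post has several hundred likes, and there are even people who take this ...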
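The prompt quoted above is essentially a fixed set of instructions with a few adjustable slots (which reference image is which, the background, the style, the camera). A minimal sketch of that idea in Python follows; the class name, field names, and the shortened English wording are all assumptions made for illustration, not part of the article or of any Jimeng API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoPersonSelfiePrompt:
    """Adjustable slots of the selfie prompt (all field names are hypothetical)."""

    left_ref: str = "Image 1"
    right_ref: str = "Image 2"
    background: str = "pure white"
    style: str = "Japanese visual kei"
    camera: str = "iPhone front camera"

    def render(self) -> str:
        # Each sentence carries one instruction from the quoted prompt,
        # shortened and paraphrased for this sketch.
        sentences = [
            f"Merge {self.left_ref} and {self.right_ref} into a two-person selfie shot from a high angle.",
            "Keep the composition tight, with both subjects close together and looking straight into the camera.",
            f"The person on the left follows {self.left_ref}; keep the styling and preserve facial likeness.",
            f"The person on the right follows {self.right_ref}; keep the styling and preserve facial likeness.",
            f"Use a {self.background} background and a {self.style} look; camera: {self.camera}.",
            "Blend both people seamlessly into the frame with bright, even lighting.",
        ]
        return " ".join(sentences)


if __name__ == "__main__":
    # Change one slot and leave the rest of the instructions untouched.
    print(TwoPersonSelfiePrompt(background="light gray").render())
```

Separating the slots from the fixed instructions makes it easy to vary one element at a time and see which change actually affects the generated image.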
An open-source model takes on Nano Banana Pro! The original Stable Diffusion team is back
量子位· 2025-11-26 09:33
Core Insights
- The article discusses the launch of Flux.2, a new AI image generation model from Black Forest Labs, which aims to compete with Google's Nano Banana Pro by offering similar image quality at a lower cost [1][42].

Group 1: Product Features
- Flux.2 is designed as a productivity tool that extends what users can do when generating images [2].
- The model supports multiple reference images, enabling complex image generation tasks such as fashion editorial images with consistent characters [3].
- Flux.2 comes in several versions, including Flux.2 [pro], [flex], [dev], and an upcoming [klein], each tailored to different user needs and performance requirements [16][17].

Group 2: Performance Comparison
- Initial tests show that the [pro] version generates an image in under 10 seconds and can handle up to 10 reference images [17].
- While Flux.2 shows significant improvements in instruction adherence and fine control, it still lags behind Nano Banana Pro in overall image quality [39][40].
- Users have reported that Flux.2 performs well in tasks like photo restoration and image editing, often producing more natural results than Nano Banana 2 [46][48].

Group 3: Market Positioning
- Flux.2 is positioned as a cost-effective alternative to Google's models, providing high-quality outputs at a lower price point, which appeals to users who face high costs with Nano Banana Pro [42].
- The model supports high-resolution image editing up to 4MP, catering to users who want detailed outputs [44].
- The article gives the historical context of the Flux models, noting that Flux.1 was a benchmark in the AI image generation space before the introduction of Flux.2 [56][59].
Mind-blowing! The whole internet puts Nano Banana Pro to the test; netizens: what on earth is packed inside this model!
量子位· 2025-11-21 06:29
Core Insights
- Google has launched Nano Banana Pro, a powerful image generation model that has drawn significant attention and excitement across the internet [11][10].
- The model integrates the multimodal understanding capabilities of Gemini 3 Pro and Google's extensive knowledge base, allowing it to comprehend real-world semantics and physical logic [12].

Features and Capabilities
- Users can access Nano Banana Pro for free through the Gemini application, although free accounts have usage limits, while subscribers to Google AI Plus, Pro, and Ultra enjoy higher quotas [13].
- The model supports high-resolution outputs, including 2K and 4K, and can generate complex professional charts, which widens its range of applications [15][46].
- It has improved text rendering capabilities, with multi-language support and direct translation of text within images [15].

User Experience and Performance
- Initial tests demonstrated the model's ability to create detailed and aesthetically pleasing visual outputs, such as exploded views of bicycle components and scenes with dolls [14][20].
- The model's performance depends on the specificity of user prompts, with clearer instructions leading to better results [23].
- Users have reported a surge in creative applications of Nano Banana Pro, showcasing its versatility in generating illustrations, infographics, and even comic strips [28][34][42].

Industry Impact
- The launch of Nano Banana Pro is seen as a significant advancement in AI-generated imagery, pushing the boundaries of what is possible in this field [26].
- Google CEO Sundar Pichai has endorsed the model, highlighting its advanced image generation and editing capabilities, which are designed to meet the needs of professionals in various industries [46].
Investigation into AI misuse: "borderline" content has become a traffic magnet; platforms can block it but don't?
Hu Xiu· 2025-10-12 10:08
Group 1
- The article highlights the misuse of AI technology, particularly in creating inappropriate content, which raises significant concerns for both ordinary individuals and public figures [1][6][10].
- AI-generated content such as "AI dressing" and "AI borderline" images has surged on social media platforms, attracting large audiences and followers [2][10][11].
- The Central Cyberspace Affairs Commission has initiated actions to address the misuse of AI technology, focusing on seven key issues, including the production of pornographic content and impersonation [4][5].

Group 2
- Ordinary individuals and public figures alike are victims of AI misuse, with cases of identity theft and defamation emerging from AI-generated content [6][8][9].
- The prevalence of AI-generated "borderline" content on social media platforms raises concerns about copyright infringement and the potential for exploitation [10][12][22].
- Various tutorials and guides on social media instruct users on how to create and monetize AI-generated borderline content, indicating a growing trend in this area [13][16][22].

Group 3
- Testing of 12 popular AI applications revealed that 5 could easily perform "one-click dressing" on celebrity images, raising concerns about copyright infringement [31][32][39].
- Nine of the tested AI applications were capable of generating borderline images, and content restrictions could be bypassed through subtle wording changes [40][41][42].
- The article discusses the challenges platforms face in regulating AI-generated content, highlighting the need for improved detection and compliance measures [54][56][60].

Group 4
- The article emphasizes the need for clearer legal standards and increased penalties for violations related to AI-generated content to deter misuse [57][59][60].
- Recommendations for individuals facing AI-related infringements include documenting evidence and reporting to relevant authorities, underscoring the importance of legal recourse [61].
- The article concludes that addressing the misuse of AI technology requires a multifaceted approach, including technological improvements and regulatory clarity [62].
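To put the test counts reported in Group 3 into proportion, the short Python sketch below converts them into shares of the 12 applications tested. Only the three numbers come from the summary; the variable and function names are illustrative assumptions.

```python
from fractions import Fraction

# Counts taken from the summary above.
TESTED_APPS = 12
ONE_CLICK_DRESSING = 5   # apps that performed "one-click dressing" on celebrity images
BORDERLINE_IMAGES = 9    # apps that could generate borderline images


def share(count: int, total: int) -> Fraction:
    """Return count/total as an exact fraction."""
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("count must lie between 0 and total, and total must be positive")
    return Fraction(count, total)


def as_percent(value: Fraction) -> str:
    """Format a fraction as a percentage with one decimal place."""
    return f"{float(value) * 100:.1f}%"


if __name__ == "__main__":
    print(as_percent(share(ONE_CLICK_DRESSING, TESTED_APPS)))  # 41.7%
    print(as_percent(share(BORDERLINE_IMAGES, TESTED_APPS)))   # 75.0%
```

In other words, roughly two in five of the tested applications allowed the first behavior and three in four allowed the second.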
Topping Apple's app chart! How did Google's viral "Nano Banana" beat ChatGPT?
证券时报· 2025-09-16 07:51
Core Viewpoint
- Google's market capitalization has reached $3 trillion, and its AI application Gemini has surpassed ChatGPT to become the top app on the Apple App Store [1][2].

Group 1: Gemini's Performance
- Gemini has achieved over 2 million downloads in the US App Store, surpassing ChatGPT, and has also topped the charts in Canada, India, and Morocco [2].
- Gemini's success is attributed to the launch of the image editing product Nano Banana, which has significantly improved image quality and editing control [4].

Group 2: Nano Banana Features
- Nano Banana allows users to edit images using simple natural language commands, eliminating the need for traditional editing tools [4].
- The model maintains character consistency across different scenes and actions, which is crucial for brand character creation and script generation [4].
- It supports the fusion of multiple images and incorporates world knowledge to understand complex scenes for editing tasks [5].
- Nano Banana lowers the barrier to 3D modeling by generating 2D designs that include essential structural and material information [5].

Group 3: Market Impact and Competitors
- The popularity of Nano Banana has sparked competition in the image generation space, with other companies like ByteDance and Shengshu Technology launching similar models [10].
- Analysts believe that the native multimodal model architecture is gaining industry recognition, with OpenAI's and Google's models showing advantages in performance and deployment [10].
- Demand for computational power is expected to increase because native multimodal models have higher requirements than non-native ones [11].
The "AI image generation" test-taker contest: who won?
Core Viewpoint
- The rise of AI-generated figurine images has been driven largely by Google's recent release of the Gemini 2.5 Flash Image model, dubbed "Nano Banana," which has been praised for its user-friendly operation and high-quality output [2][5].

Group 1: AI Model Comparisons
- Following the launch of "Nano Banana," competitors such as ByteDance's Seedream 4.0 and Shengshu Technology's Vidu Q1 quickly entered the market, indicating a rapid escalation in the AI image generation sector [5][8].
- Seedream 4.0 has reportedly topped the rankings in the text-to-image and image editing categories, surpassing Google's Nano Banana in both [8].
- In a comparative test, Nano Banana produced a more realistic figurine image of a long-haired kitten, demonstrating a better understanding of figurine aesthetics than Seedream 4.0 and Vidu Q1, which struggled with material representation [11][14].

Group 2: Performance Insights
- Seedream 4.0 excelled at generating a stunning final image from a complex prompt involving a figurine in a realistic setting, while Nano Banana required additional prompts to improve its output [14].
- In a test involving family dynamics, Seedream 4.0 interpreted the prompt favorably, while Nano Banana added unexpected elements, showing differences in how the models understand user intent [18].
- All three AI models displayed distinct strengths and weaknesses: Nano Banana achieved extreme realism, Seedream 4.0 demonstrated good comprehension, and Vidu Q1 delivered balanced performance across tasks [20].

Group 3: Industry Implications
- The advances in these AI models represent a significant leap in capabilities, including improved understanding, faster output, and higher image quality, moving them closer to the ideal of a productivity tool [23].
X @0xLIZ
0xLIZ· 2025-08-28 01:35
Model Capabilities
- Google's Gemini 2.5 Flash Image model (formerly Nano Banana) is positioned as a significant advancement in AI image generation, with the potential to redefine image production scenarios [1].
- The model can perform image manipulation tasks such as changing clothing and adjusting poses while maintaining core facial features [1].
- The model has trouble accurately replicating specific hand gestures, suggesting room for improvement in understanding and executing complex instructions [1].
- The model's "high consistency" enables element-level "composability," so images can be freely "assembled" from their parts [1].
- The model allows users to manipulate objects within an image, rather than just pixels or layers [1].

Practical Application
- The model is currently accessible for free via a specified link, offering users the opportunity to test its capabilities without performance restrictions [1].
Qwen's new open-source release blows past the SOTA for text in AI image generation
量子位· 2025-08-05 01:40
Core Viewpoint
- The article discusses the release of Qwen-Image, a 20-billion-parameter image generation model that excels at complex text rendering and image editing [3][28].

Group 1: Model Features
- Qwen-Image is the first foundational image generation model in the Tongyi Qianwen series and uses the MMDiT architecture [4][3].
- It demonstrates exceptional performance in complex text rendering, supporting multi-line layouts and fine-grained detail in both English and Chinese [28][32].
- The model also offers consistent image editing capabilities, including style transfer, modifications, detail enhancement, text editing, and pose adjustments [27][28].

Group 2: Performance Evaluation
- Qwen-Image has achieved state-of-the-art (SOTA) performance across various public benchmarks, including GenEval, DPG, and OneIG-Bench for image generation, and GEdit, ImgEdit, and GSO for image editing [29][30].
- In particular, it has shown clear superiority in Chinese text rendering compared to existing advanced models [33].

Group 3: Training Strategy
- The model employs a progressive training strategy that transitions from non-text to text rendering and moves gradually from simple to complex text inputs, which strengthens its native text rendering capabilities [34].

Group 4: Practical Applications
- The article includes practical demonstrations of Qwen-Image's capabilities, such as generating illustrations, PPTs, and promotional images, showcasing its ability to accurately integrate text with visuals [11][21][24].
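The progressive training strategy described in Group 3 (from no text to text, then from simple to complex text) can be pictured as a curriculum schedule. The Python sketch below is only an illustration of that general idea: the stage names, the equal-length split, and the function `stage_for_step` are all assumptions, not details of how Qwen-Image was actually trained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    text_complexity: int  # 0 = images without text; larger = harder text


# Hypothetical stages, ordered from non-text to complex text rendering.
CURRICULUM = (
    Stage("no_text", 0),
    Stage("single_words", 1),
    Stage("short_phrases", 2),
    Stage("multi_line_paragraphs", 3),
)


def stage_for_step(step: int, total_steps: int, stages=CURRICULUM) -> Stage:
    """Map a training step to a stage by splitting training into equal spans."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")
    # Integer arithmetic keeps the index within 0 .. len(stages) - 1.
    return stages[step * len(stages) // total_steps]


if __name__ == "__main__":
    for step in (0, 30, 60, 99):
        print(step, stage_for_step(step, 100).name)
```

A real schedule would more likely mix data from several stages and use unequal stage lengths; the equal split here simply keeps the ordering idea visible.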
New Flux.1 model with "no AI feel" is now free to try
量子位· 2025-08-05 01:40
Core Viewpoint
- The article discusses the release of a new AI image generation model, FLUX.1 Krea [dev], which aims to produce more realistic and diverse images without the typical "AI feel" of generated images [1][3][70].

Model Performance
- The model is designed to avoid common issues in AI-generated images, such as overexposed highlights and unnatural textures, and focuses instead on natural details [3][5].
- FLUX.1 Krea [dev] outputs four images at once, allowing users to select the most realistic one [14][76].

Optical Realism
- The model's grasp of physical optical principles was tested by generating images from prompts involving different materials [11][12].
- While the model successfully added realistic features like rust to metal surfaces, it still produced some inexplicable structures [15][16].
- The model's understanding of water textures was found to be superficial, resulting in repetitive and distorted wave patterns [21].

Texture Continuity and Semantic Understanding
- The model was evaluated on its ability to generate complex textures and natural transitions, particularly in knitted fabrics and plants [22][23].
- Although it performed well on microstructure continuity, it struggled to accurately represent uneven textures and specific plant types [27][32].

Perspective and Motion Blur
- The model's ability to generate scenes with multiple objects was assessed to gauge its grasp of spatial relationships [34].
- It performed reasonably well at depth-of-field effects but had trouble accurately depicting motion and directional blur [38][43].

Adherence to Physical Rules
- The model was tested with prompts containing logical contradictions to see whether it would prioritize physical laws over data fitting [45].
- It kept shadows in the image even when instructed otherwise, indicating a strong adherence to physical realism [47].
- However, it failed to generate realistic images of scenarios that defy physical laws, such as fish swimming above a city [49][50].

Additional Features
- The model lets users experiment with different image styles and adjust existing images, although it struggled to accurately capture human features [51][56].
- Despite its limitations, FLUX.1 Krea [dev] is noted for its strong handling of light and material texture, making it a competitive option among AI image generation tools [65][71].