数字生命卡兹克
Now you can finally build your own AI knowledge base with Feishu.
数字生命卡兹克· 2025-05-22 17:09
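I've written about Feishu N times before. I've also recommended AI knowledge base products many times: in the early, chaotic days I taught everyone to build knowledge bases with dify and Coze (扣子), and later I wrote about Tencent ima. But I've always hoped Feishu would ship its own AI knowledge base product. The reason is simple. My company runs on Feishu, and I myself am a heavy Feishu user. Of all the places holding my work and knowledge data, Feishu is the only one whose total volume rivals WeChat.
I have no idea how much of my data is stored in Feishu by now; I only know that every day I work with a pile of messy documents. And I'm not really someone who likes to organize. The thing I do most often is create a new document in Feishu, write a bunch of information into it, share it with someone, and call it done. A while later, when I try to recall what that document was called, I can't find it at all, because the thing is named "Untitled document"... Not to mention all the untitled multidimensional tables.
I once tried importing some of my materials into NotebookLM as my knowledge base. Download the files, rename them, sort them into categories. After half an hour I gave up, because it was just too exhausting. I figured I'd just wait, since there was no way Feishu wouldn't release an AI knowledge base product; waiting was all I had to do. Because with the vast majority of AI knowledge base products, you set up the AI first and then figure out how to feed it knowledge. Whereas in Feishu, ...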
The agent race has gone crazy: an AI office agent is here too.
数字生命卡兹克· 2025-05-21 16:53
Core Viewpoint - The article discusses the emergence of specialized agents in various industries, highlighting the introduction of Skywork Super Agents by Kunlun Wanwei, a product designed specifically for office tasks [1][3][5].
Group 1: Product Overview
- Skywork Super Agents is a new product by Kunlun Wanwei aimed at enhancing office productivity [3][5].
- The product features distinct modes for document creation, PPT presentations, and spreadsheet management, catering to specific office scenarios [5][6][59].
- The platform offers both overseas and domestic versions, with dedicated websites for each [5][87].
Group 2: User Experience
- The author tested the product for five days, noting its comprehensive functionality and user-friendly interface [4][5].
- The agent lets users input themes and requirements for document and PPT creation, streamlining the process [8][9][18].
- A notable feature is the confirmation step before tasks are finalized, which gives users more control over the output [15][18][19].
Group 3: Features and Capabilities
- Skywork Super Agents includes specialized modes for creating documents, PPTs, and spreadsheets, with the ability to handle various types of content [6][59].
- Users can upload files or provide prompts, and the agent generates content based on the input; the generated text can be edited directly [27][30][63].
- The PPT generation process is highlighted for its aesthetic appeal and structured output, with options for users to confirm or modify the generated content [22][23][30].
Group 4: Pricing and Market Position
- The overseas version is priced in the mid-range compared with similar products, while the domestic version is significantly cheaper, at one-third of the overseas price [78][84].
- The product operates on a point system in which more complex tasks consume more points, reflecting the computational resources used [77][78].
Group 5: Company Insights
- Kunlun Wanwei is recognized for its commitment to improving AI usability, with recent initiatives including the open-sourcing of the DeepResearch Agent framework [86][90][92].
- The company aims to address everyday office challenges through engineering solutions, indicating a strong focus on user needs [93].
Google I/O 2025 developer conference in one article - the $250 Ultra membership, Veo3, Imagen4, and more, blooming across the board.
数字生命卡兹克· 2025-05-20 23:34
Core Insights - Google has made significant advancements in AI technology, showcasing a range of new products and features during the Google I/O developer conference, indicating a strategic shift towards integrated AI solutions [3][10][99]
Group 1: AI Models
- The introduction of the Google AI Ultra membership at $249.99 per month signifies a comprehensive strategy to unify various AI offerings under one subscription [6][10]
- Gemini 2.5 Pro emerged as a standout model, outperforming competitors in all LMArena categories, particularly excelling in language, reasoning, and coding tasks [15][21]
- Gemini 2.5 Flash is positioned as a speed-focused model, set to launch in June, with improvements across multiple dimensions [19][20]
- Gemini 2.5 Pro Deep Think enhances the capabilities of the Pro model, particularly in complex mathematical and programming benchmarks [21][24]
- Gemini Diffusion represents a cutting-edge research initiative, utilizing a novel approach to content generation that significantly reduces latency [26][28]
Group 2: Gemini Products
- Gemini Live integrates multimodal interaction, allowing users to engage with AI through visual inputs, with a new visual question-answering feature launching on Android and iOS [30][31]
- The Personal Context feature personalizes user interactions by accessing data from Google applications, enhancing the relevance of AI responses [34][36]
- DeepResearch and Canvas upgrades allow users to upload files for in-depth research and convert reports into various formats, including web pages and podcasts [38][39]
- Gemini's integration into Chrome enables real-time content understanding and summarization while browsing [41]
- The introduction of Agent Mode allows users to delegate tasks to AI, streamlining processes like house hunting [43][44]
Group 3: Visual Generation
- Flow, a new AI film production tool, combines capabilities from various Google models to create and edit videos based on user prompts [46][48]
- Veo 3 enhances video realism with native audio generation, allowing for synchronized sound effects and dialogue [53][55]
- Imagen 4, the latest text-to-image model, boasts significant improvements in image quality and detail, now available for general use [60][64]
Group 4: Google Search Enhancements
- AI Overviews have been adopted by over 1.5 billion users monthly, improving search result relevance and user engagement [67][68]
- AI Mode represents a transformative shift in search functionality, enabling complex queries and personalized results based on user data [70][72]
Group 5: Agent Systems
- Project Mariner, an AI-driven automation tool, has advanced to handle multiple tasks simultaneously and learn from user demonstrations [76][80]
- Jules, an AI programming agent, is currently in global testing, allowing users to automate code management tasks [81][82]
Group 6: Other Innovations
- The Project Moohan headset and Android XR smart glasses showcase advancements in augmented reality, enhancing user interaction with their environment [89][91]
- Google Beam technology enables realistic 3D video calls, enhancing remote communication experiences [93][95]
- The upgraded SynthID digital watermarking technology addresses challenges in identifying AI-generated content [98]
Models like DeepSeek are getting smarter, and also less obedient.
数字生命卡兹克· 2025-05-19 20:14
Core Viewpoint - The article discusses the paradox of advanced AI models, where increased reasoning capabilities lead to a decline in their ability to follow instructions accurately, as evidenced by recent research findings [1][3][10].
Group 1: Research Findings
- A study titled "When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs" reveals that when models engage in reasoning, they often fail to adhere to given instructions [2][3].
- The research team from Harvard, Amazon, and NYU conducted tests on 15 models, finding that 13 out of 14 models showed decreased accuracy when using Chain-of-Thought (CoT) reasoning in simple tasks [4][6].
- In complex tasks, all models tested exhibited a decline in performance when employing CoT reasoning [4][6].
Group 2: Performance Metrics
- In the IFEval test, models like GPT-4o-mini and Claude-3.5 experienced significant drops in accuracy when using CoT, with GPT-4o-mini's accuracy falling from 82.6% to 76.9% [5].
- The results from ComplexBench also indicated a consistent decline across all models when CoT was applied, highlighting the detrimental impact of reasoning on task execution [4][6].
Group 3: Observed Behavior Changes
- The models, while appearing smarter, became more prone to disregarding explicit instructions, often modifying or adding information that was not requested [9][10].
- This behavior is attributed to a decrease in "Constraint Attention," where models fail to focus on critical task constraints when reasoning is involved [10].
Group 4: Proposed Solutions
- The article outlines four potential methods to mitigate the decline in instruction-following accuracy:
1. **Few-Shot Learning**: Providing examples to the model, though this has limited effectiveness due to input length and bias [11][12].
2. **Self-Reflection**: Allowing models to review their outputs, which works well for larger models but poorly for smaller ones [13].
3. **Self-Selective Reasoning**: Enabling models to determine when reasoning is necessary, resulting in high recall but low precision [14].
4. **Classifier-Selective Reasoning**: Training a smaller model to decide when to use CoT, which has shown significant improvements in accuracy [15][17].
Group 5: Insights on Intelligence
- The article emphasizes that true intelligence lies in the ability to focus attention on critical aspects of a task rather than processing every detail [20][22].
- It suggests that AI should be designed to prioritize key elements of tasks, akin to how humans effectively manage their focus during critical moments [26][27].
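To make the classifier-selective idea concrete, here is a minimal Python sketch of the routing step only. Everything in it is an assumption for illustration: the keyword list stands in for the trained smaller model described above, and `GenerateFn` and `echo_model` are hypothetical stand-ins for a real model call. It is not the paper's implementation.

```python
from typing import Callable

# Hypothetical model interface: (instruction, use_cot) -> response text.
GenerateFn = Callable[[str, bool], str]

# Assumed keyword list standing in for a trained classifier's learned signal.
CONSTRAINT_HINTS = ("exactly", "at most", "no more than", "do not", "must", "only")


def should_use_cot(instruction: str) -> bool:
    """Toy classifier: skip CoT when the instruction carries explicit constraints."""
    text = instruction.lower()
    return not any(hint in text for hint in CONSTRAINT_HINTS)


def answer(instruction: str, generate: GenerateFn) -> str:
    """Route the request: reason step by step only when the classifier allows it."""
    return generate(instruction, should_use_cot(instruction))


def echo_model(instruction: str, use_cot: bool) -> str:
    """Stub model that only reports which mode was chosen."""
    return "cot" if use_cot else "direct"
```

With the stub, `answer("Reply in exactly three words.", echo_model)` takes the direct path, while an open question without constraint keywords is routed to CoT. A real router would replace `should_use_cot` with a trained model's prediction.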
HDRimg: generate eye-searingly bright HDR memes in 30 seconds with one click.
数字生命卡兹克· 2025-05-18 19:27
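HDR vs SDR technical comparison (plain-language version)

| Dimension | SDR (Standard Dynamic Range) | HDR (High Dynamic Range) |
| --- | --- | --- |
| Brightness range | Up to roughly 100~300 nits | Can reach 1000~2000+ nits or even higher |
| Color gamut | sRGB (standard red-green-blue) | DCI-P3 / BT.2020 (wider color) |
| Contrast ratio | About 1,000:1 | About 1,000,000:1 (some HDR devices) |
| Color depth | 8-bit (256 levels per color) | Typically 10-bit (1024 levels per color) |
| Detail rendering | Highlights tend to "smear"; shadows tend to "crush into a black blob" | Highlights don't blow out; shadows keep their detail |
| Visual impression | Flat picture, like a photograph | Clear picture with a sense of depth, like "being there" |
| Applications | Traditional TVs, web images, ordinary video | High-end phones, 4K TVs, streaming video, PS5 games, etc. |
| Encoding standards | Rec.709 | Rec.2020, HDR10, Dolby Vision, etc. |

And this time ...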
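The color depth figures in the comparison (256 levels for 8-bit, 1024 for 10-bit) follow directly from powers of two. A small Python sketch of that arithmetic, with function names chosen here purely for illustration:

```python
def levels_per_channel(bits: int) -> int:
    """Number of distinct intensity levels one color channel can encode."""
    return 2 ** bits


def total_colors(bits: int, channels: int = 3) -> int:
    """Distinct colors across all channels (three for R, G, B by default)."""
    return levels_per_channel(bits) ** channels


for depth in (8, 10):
    print(depth, levels_per_channel(depth), total_colors(depth))
```

Going from 8-bit to 10-bit quadruples the levels per channel, so the count of representable RGB colors grows by a factor of 64.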
This is the strongest AI voice model right now.
数字生命卡兹克· 2025-05-15 15:40
A few months ago I wrote a piece on MiniMax's AI voice model. I said it was the strongest Chinese AI audio at the time. The post's numbers did rather well, too. Since last December, nearly half a year on, I still don't think anything in AI voice models has surpassed MiniMax. Until yesterday, when I saw MiniMax post the technical report for their next-generation voice model on X: Speech-02 is here. It seems that breaking past Speech-01's ceiling takes MiniMax themselves.

MiniMax (official), @MiniMax AI

| Language | WER ↓ (MiniMax) | WER ↓ (11LABS) | SIM ↑ (MiniMax) | SIM ↑ (11LABS) |
| --- | --- | --- | --- | --- |
| Chinese | 2.252 | 16.026 | 0.780 | 0.677 |
| English | 2.164 | 2.339 | 0.756 | 0.613 |
| Cantonese | 34.111 | 51.513 | 0.778 | 0.670 |
| Japanese ...
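For readers unfamiliar with the WER column: word error rate is the minimum number of word substitutions, deletions, and insertions needed to turn the recognized transcript into the reference, divided by the reference length, so lower is better. The Python sketch below is a generic textbook version, not MiniMax's evaluation code; whitespace tokenization is an assumption here, and languages such as Chinese are usually scored per character instead.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

The function returns a fraction; multiply by 100 to express it as a percentage.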
Today I'm setting the record straight for AI, on coal's behalf...
数字生命卡兹克· 2025-05-14 20:05
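I'm beyond speechless. I never imagined that one day I'd have to step up and write an article setting the record straight for coal, and for AI.
Over the past couple of days, a screenshot has been going viral on X, Weibo, and Xiaohongshu. In it, a research report with a perfectly serious price tag of 8,200 yuan plainly contains this sentence: "Coal, long known as 'black gold,' is a renewable resource, harvested from coal ore and from killing Wither Skeletons, which yields 1~3 pieces."
A professional research report with a sky-high price, whose cover even flaunts the banner "2022-2029 Industry Development Trends and Prospects," is telling you that coal drops from monsters in a video game.
This shot straight to number one on Zhihu's trending list. Spin-off memes sprang up everywhere. For example: people from Shanxi kill the "Wither Drum Tower" (凋零鼓楼, a pun on Wither Skeleton, 凋灵骷髅) to get coal. And this one.
The report even pasted the drop data wrong: it's a 1/3 chance of dropping coal, not a drop of 1 to 3 pieces of coal...
In many group chats I saw friends mocking AI, saying "AI hallucination, hahahaha," or blaming an absurd AI search, with the human reviewers left holding the bag. It's truly absurd.
This is not a joke. It's an energy industry report actually sold to the public on an official website under a research institution's name. It's priced at 8,200 a copy, and even the electronic version isn't discounted.
In the past this would at most have been told as an industry joke. But today is different, because the way it spread has changed in nature.
Many people's first reaction to that sentence was: "AI-written research reports are ridiculous these days." But I want to say, you ...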
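How far apart are "a 1/3 chance of dropping coal" and "dropping 1 to 3 pieces"? A rough Python sketch of the expected values, under two assumptions that are not in the original: a successful drop yields exactly one piece, and "1~3" means each count is equally likely.

```python
from fractions import Fraction


def expected_drop(outcomes: dict) -> Fraction:
    """Expected number of pieces, given a mapping of drop count -> probability."""
    return sum(Fraction(count) * prob for count, prob in outcomes.items())


# Assumption: a successful drop yields exactly one piece of coal.
one_in_three_chance = {0: Fraction(2, 3), 1: Fraction(1, 3)}
# Assumption: "1~3 pieces" means 1, 2, or 3 with equal probability.
one_to_three_pieces = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

print(expected_drop(one_in_three_chance))
print(expected_drop(one_to_three_pieces))
```

Under those assumptions the report's wording overstates the average yield per kill by a factor of six.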
Tencent quietly released a plugin version of "Cursor," and it's hooked into WeChat mini-programs.
数字生命卡兹克· 2025-05-13 15:38
Core Viewpoint - Tencent Cloud's CodeBuddy 3.0 is a code assistant that integrates with WeChat's development tools, leveraging Tencent's extensive ecosystem to enhance programming workflows and facilitate the development of WeChat mini-programs [1][6][41].
Group 1: Product Overview
- CodeBuddy is a plugin rather than an IDE, allowing integration with various coding environments without the need for a separate software installation [1][2].
- The product supports mainstream features such as code completion and intelligent development modes, similar to other AI programming tools [4][5].
- CodeBuddy's integration with WeChat Developer Tools enables developers to create mini-programs efficiently, utilizing WeChat's development standards [6][7][41].
Group 2: Functionality and User Experience
- The plugin can be installed easily within existing development environments, making it accessible for developers using different IDEs like IDEA or Xcode [2][4].
- CodeBuddy features a Craft mode that helps clarify requirements through interactive prompts, enhancing the development process [15][27].
- The assistant can generate code based on user prompts, significantly reducing the time and effort required to develop applications [29][32].
Group 3: Ecosystem and Strategic Advantage
- The integration of CodeBuddy with WeChat's ecosystem provides a unique advantage, allowing for seamless access to various APIs and payment systems [40][41].
- Tencent's extensive ecosystem serves as a significant competitive moat, enabling developers to transform ideas into accessible applications quickly [39][46].
- The combination of AI capabilities with WeChat's functionalities creates new channels for creativity and distribution, enhancing user engagement [43][44].
A hands-on test of the world's first design agent, released late at night - Lovart.
数字生命卡兹克· 2025-05-12 19:08
Core Viewpoint - The article discusses the emergence and potential of Lovart, an AI design agent tool, highlighting its capabilities and the future of design workflows in the industry [1][64].
Group 1: Product Overview
- Lovart is an AI design agent tool that gained significant attention, particularly in overseas markets, and operates on an invitation-only basis for its beta testing [2][6].
- The interface of Lovart resembles an AI chat platform, providing a user-friendly experience for design requests [7][8].
- The tool emphasizes the importance of industry-specific knowledge, suggesting that understanding design requirements and context is crucial for effective AI application [8].
Group 2: Functionality and Features
- Users can input specific design requests, and Lovart processes these by first matching the required style before executing the task [11][17].
- The tool utilizes a LoRA model for style matching, which is essential for achieving the desired design outcome [17].
- Lovart can break down design tasks into detailed prompts, ensuring clarity and precision in the execution of design requests [19][23].
Group 3: Design Process and Output
- The article illustrates a practical example where Lovart generated a series of illustrations based on a detailed prompt, showcasing its efficiency and effectiveness [9][30].
- Lovart supports various design functionalities, including resizing images and separating text from backgrounds for easier editing [52][57].
- The tool can also generate video content based on design prompts, demonstrating its versatility in handling multimedia projects [58][61].
Group 4: Future Implications
- The author expresses optimism about the future of design workflows, suggesting that AI agents like Lovart could redefine the role of designers and the nature of design outputs [64].
- The potential for vertical agents in various industries is highlighted, indicating a trend towards specialized AI tools that cater to specific fields [64].
2025: once again, we made friends through AI.
数字生命卡兹克· 2025-05-11 09:37
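May 10, 2025. Hangzhou. Sunny. After several days of heavy rain, we got especially lucky: on the day of the event, the sky actually cleared. Hangzhou really is a city with a high concentration of AI.
This time, we once again made friends through AI. And this time, we finally brought it to Hangzhou. By West Lake, in early summer before midsummer arrived, we threw a thoroughly unserious AI gathering with 300 people who had traveled from all over the country.
The most fun part is that we found a very interesting venue: a basketball court. We chose it because the headcount was rather large, over 300 people this time, and we needed to set out tables banquet-style, so we found a huge basketball court inside the Alibaba campus.
A good friend asks you out and says: come on, let's have a drink and build something together.
"一起AI,交个朋友" (AI Together, Make a Friend) has been running for a year now. This is the sixth stop: from Beijing to Shanghai, from Shenzhen to Kunming, from the return to Beijing at the start of the year to Hangzhou now. That faint sense of idealism in my heart was rekindled.
It's just that this time we really weren't especially well prepared, and there were plenty of problems: overlapping audio during the livestream, all sorts of hallucinations during the AI-run prize draw, and a screen that kept flickering. Still, thank you all for your patience.
Also, many AI people really do overwork, and their health really is poor, so a reminder to everyone: exercise more... Health comes first.
And I feel it really wasn't like an event; it was more like a party, a reunion. We had everyone ...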