Something big is about to happen...

Core Insights
- Taalas has introduced a chip that builds a large model directly into the silicon, eliminating memory bandwidth limitations and reaching unprecedented performance levels [1][6]
- The chip runs Llama 3.1 at 17,000 tokens per second, far ahead of Nvidia's best chips, which are quoted at 230 and 2,000 tokens per second [1][3]

Performance Comparison
- Taalas's chip is described as 50 times faster than Nvidia's most advanced chip, with a projected speed of 22,000 tokens per second in the near future, which the source characterizes as surpassing human neural transmission speeds [5][6]
- Taalas's chip costs only one-twentieth as much as Nvidia's, making it an economically attractive option [5][7]

Power Efficiency
- The chip draws significantly less power, so it can be cooled with a fan rather than water cooling [5][7]
- This low power requirement lets the chip be used in a range of applications without bulky servers [7]

Software and Upgrade Considerations
- Taalas's approach eliminates the need for complex software coding, simplifying the deployment of AI models [5][6]
- However, the hardware must be replaced for each model upgrade, whereas a traditional GPU can move to a new model simply by swapping software [5][6]

Industry Implications
- Taalas's chip could disrupt an AI landscape currently dominated by Nvidia, because it offers a specialized solution that prioritizes cost and efficiency over generality [6][8]
- The design is particularly well suited to military applications, robotics, and autonomous driving, where speed and predictability are critical [6][7]

Conclusion
- The introduction of Taalas's chip marks a significant milestone in AI technology, suggesting that even without a revolutionary computing architecture, progress can still be made by rethinking existing paradigms [8]
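
As a rough cross-check of the throughput figures quoted under Core Insights and Performance Comparison, the ratios can be worked out directly. This is only an arithmetic sketch: the constants are the numbers from the summary, while the function names `speedup` and `seconds_for` and the 1,000-token response length are assumptions chosen for illustration, not anything from Taalas or Nvidia.

```python
# Throughput figures as quoted in the summary (tokens per second).
TAALAS_TPS = 17_000
TAALAS_PROJECTED_TPS = 22_000
NVIDIA_TPS = (230, 2_000)  # the two Nvidia figures, in the order quoted


def speedup(candidate_tps: float, baseline_tps: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return candidate_tps / baseline_tps


def seconds_for(tokens: int, tps: float) -> float:
    """Return the time needed to generate `tokens` at a steady rate `tps`."""
    return tokens / tps


if __name__ == "__main__":
    for baseline in NVIDIA_TPS:
        ratio = speedup(TAALAS_TPS, baseline)
        print(f"{TAALAS_TPS} vs {baseline} tok/s: {ratio:.1f}x")

    # Hypothetical 1,000-token response, used only to make the rates concrete.
    for tps in (*NVIDIA_TPS, TAALAS_TPS, TAALAS_PROJECTED_TPS):
        print(f"{tps:>6} tok/s -> {seconds_for(1_000, tps):.3f} s per 1,000 tokens")
```

Which Nvidia figure serves as the baseline changes the ratio considerably, so any single "times faster" number depends on the comparison being made.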