Your GPUs should be accelerating AI, not recomputing context 🔄
DDN·2026-04-28 23:02

Hi, I'm Barack Epstein, product manager at Google Cloud, working on Managed Lustre. In this demo we show how Managed Lustre can be used as a KV cache. The KV cache stores context during inference. In a standard setup, the KV cache is kept in GPU or TPU memory, but now you can save that context into Managed Lustre instead. The first time you save the context it may not matter much, but as you start to build context and you have additional ...
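The idea described above can be sketched minimally: content-address a prompt's prefill result and reuse it from a shared filesystem path instead of recomputing it. This is a toy illustration, not Google Cloud's implementation; `CACHE_DIR`, `compute_kv`, and `load_or_compute_kv` are hypothetical names, and a temporary directory stands in for a real Managed Lustre mount shared by GPU nodes.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a Managed Lustre mount point; in a real
# deployment this would be a Lustre path visible to every GPU node.
CACHE_DIR = Path(tempfile.mkdtemp())


def _key(prefix_tokens):
    """Content-address the cache entry by the token prefix it encodes."""
    return hashlib.sha256(repr(prefix_tokens).encode()).hexdigest()


def compute_kv(prefix_tokens):
    """Placeholder for the expensive prefill pass that builds KV tensors."""
    return {"kv": [t * 2 for t in prefix_tokens]}  # toy stand-in for tensors


def load_or_compute_kv(prefix_tokens):
    """Reuse a cached prefill from shared storage instead of recomputing it."""
    path = CACHE_DIR / _key(prefix_tokens)
    if path.exists():
        return pickle.loads(path.read_bytes()), True   # cache hit
    kv = compute_kv(prefix_tokens)
    path.write_bytes(pickle.dumps(kv))
    return kv, False                                   # miss: computed, stored


kv1, hit1 = load_or_compute_kv([1, 2, 3])  # first request: miss, prefill runs
kv2, hit2 = load_or_compute_kv([1, 2, 3])  # same prefix: hit, prefill skipped
```

The payoff matches the transcript's point: the first save costs you a write, but every later request that shares the prefix skips the prefill entirely, so accelerator time goes to generating tokens rather than recomputing context.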
