Envoy resilience got a serious glow-up in latency-critical, user-facing storage systems by hunting down those pesky “no such bucket” failures—turns out rate-limiter timeouts were to blame. The team tuned Envoy’s fail-open vs. fail-close behavior, co-located rate-limiters to slay cross-load-balancer lag, and swapped out K6 for Envoy’s own Nighthawk to stress-test and spot hidden bottlenecks.
They also leveled up observability by tagging requests with gRPC metadata—and rolled out everything without flinching production traffic. Catch the nitty-gritty and more at KubeCon + CloudNativeCon North America in Atlanta this November!
Watch on YouTube
Top comments (0)