Roblox’s cloud-native catastrophe: A post mortem

In late October, Roblox’s global online game network went down in an outage that lasted three days. The site is used by 50 million gamers daily. Figuring out and fixing the root causes of this disruption would take a massive effort by engineers at both Roblox and their main technology supplier, HashiCorp.

Roblox eventually provided a detailed analysis in a blog post at the end of January. As it turned out, Roblox was bitten by a strange coincidence of several events. The processes Roblox and HashiCorp went through to diagnose and ultimately fix things are instructive to any company running a large-scale infrastructure-as-code installation or making heavy use of containers and microservices across its infrastructure.

Techatty All-in-1 Publishing