Abstract
A single dramatic software failure can cost a company millions of dollars, but it can be avoided with simple changes to design and architecture.
This new edition of the best-selling industry standard shows you how to create systems that run longer, with fewer failures, and recover better when bad things happen.
New coverage includes DevOps, microservices, and cloud-native architecture.
Stability antipatterns have grown to include systemic problems in large-scale systems.
This is a must-have pragmatic guide to engineering for production systems.
This updated edition addresses today's production systems (larger, more complex, and heavily virtualized) and covers chaos engineering, the discipline of applying randomness and deliberate stress to reveal systemic problems.
Build systems that survive the real world, avoid downtime, implement zero-downtime upgrades and continuous delivery, and make cloud-native applications resilient.
Examine ways to architect, design, and build software, particularly distributed systems, that stands up to the typhoon winds of a flash mob, a Slashdotting, or a link on Reddit.
Take a hard look at software that failed the test and find ways to make sure your software survives.