Backstory

You may recognize many of us from FoundationDB, the pioneering ACID-compliant distributed database that we began building in 2009. At the time, distributed database experts thought it was impossible to build a performant, highly scalable database with strong data consistency guarantees and ACID transactions. They thought you had to choose between a scalable, distributed database with weak data consistency and a single-machine database (such as MySQL) with strong data consistency.

We realized that it was technically possible to build a distributed database with ACID guarantees, but we were afraid that the end result would be riddled with bugs because distributed systems are so incredibly complex. When hardware or network faults can happen at any time, ensuring reliability and data consistency seemed effectively impossible… at least, without a completely new approach to distributed systems testing.

So our team began to create testing technology that could deterministically simulate FoundationDB clusters of varying sizes and configurations, engage them with multiple types of workloads, inject a variety of hardware and network faults, and check the results at the end of each run to see if any of the system's guarantees were violated. Because our testing system was deterministic, any test run that found a correctness or reliability issue could be reproduced and analyzed (sometimes with additional logging added) so that the underlying bug(s) could be identified and fixed. Once a fix was applied, the test run (and variants of it) could be re-run to confirm that the fix had actually eliminated the bug.
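
To make the idea concrete, here is a toy sketch of a deterministic simulation test. It is not FoundationDB's simulator; every name in it (`run_simulation`, `find_failing_seeds`, the `retry` flag, the in-memory "replicas") is a hypothetical stand-in chosen for illustration. It assumes only that all randomness in a run, including the workload and the injected faults, comes from a single seeded generator, so that any failing run can be replayed exactly from its seed.

```python
import random


def run_simulation(seed, num_nodes=3, num_ops=50, drop_rate=0.2, retry=True):
    """Run one simulated workload and return (violations, trace).

    All names and parameters here are illustrative assumptions.
    """
    rng = random.Random(seed)  # the single source of randomness for this run
    replicas = [dict() for _ in range(num_nodes)]  # simulated cluster state
    expected = {}  # what every replica should hold at the end
    trace = []     # event log, useful when replaying a failing seed

    for step in range(num_ops):
        # Workload: write a random value to a random key.
        key = rng.choice("abc")
        value = rng.randrange(100)
        expected[key] = value

        for node_id, replica in enumerate(replicas):
            # Fault injection: the network may drop the write to this node.
            delivered = rng.random() >= drop_rate
            # The "fixed" system retries dropped writes; the buggy one does not.
            while retry and not delivered:
                delivered = rng.random() >= drop_rate
            if delivered:
                replica[key] = value
            trace.append((step, node_id, key, value, delivered))

    # Check the guarantee at the end of the run: replicas match the workload.
    violations = [
        (node_id, dict(replica))
        for node_id, replica in enumerate(replicas)
        if replica != expected
    ]
    return violations, trace


def find_failing_seeds(seeds, **kwargs):
    """Return the seeds whose runs violate the consistency check."""
    return [s for s in seeds if run_simulation(s, **kwargs)[0]]


if __name__ == "__main__":
    failing = find_failing_seeds(range(20), retry=False)
    print("failing seeds without retries:", failing)
    print("failing seeds with retries:", find_failing_seeds(range(20), retry=True))
```

Because each run depends only on its seed, a failing seed can be re-run as often as needed (for example, with extra logging added), and re-run again after a fix to confirm that the violation is gone.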

With our rigorous new testing approach, we found that virtually all the bugs discovered in a new cycle of test runs had been introduced since the last testing cycle. Once we could consistently identify bugs within a day of their introduction, fixing them became trivial: it was easy to tell which recent change had caused the bug. This dramatically increased the productivity of the FoundationDB engineering team, since we could add new functionality knowing that any bugs would be quickly found and easily fixed. With this newfound fearlessness, we completely rewrote FoundationDB's transaction processing engine (resulting in a 10x speed improvement) without needing to worry about introducing subtle instabilities.

In 2015, Apple acquired FoundationDB and began using it as the underpinning of its cloud infrastructure. One of FoundationDB's early customers, Snowflake, continues to use it as its primary metadata store. But as the FoundationDB team began to disperse to other big tech companies, our founders (Will Wilson and Dave Scherer) were shocked to find that even in the most sophisticated organizations, nothing like FoundationDB's deterministic testing existed. Changes to complex distributed systems were happening at a snail's pace, and months of valuable engineering time were consumed by diagnosing and fixing production bugs.

We founded Antithesis because we saw the need for powerful testing tools and all the reliability and productivity they bring. Antithesis has taken the rigorous testing approach we pioneered at FoundationDB, matured it, and made it available to everyone else.