
In a world driven by rapid development and continuous innovation, failure isnât always a setbackâin fact, it can be a winning strategy.
Letâs talk about Fail Fast, a fundamental methodology in software development that aims to quickly identify the limitations and critical issues of a solution.
What is Fail Fast?
Fail Fast is an approach with a single goal: to find a potential stopping point. If Descartes said Cogito ergo sum (I think, therefore I am), here we could say Deficio ergo sum (I fail, therefore I exist).
This approach is especially crucial in contexts like the IoT (Internet of Things), where the variables are numerous and often beyond our direct control. But trust me, this methodology can prove useful in any field.
Imagine working with a “black box,” an element over which we have little visibility or influence. The documentation is brief, with a list of expected behaviors, but few certainties. On the other hand, we have detailed user stories, reviewed together with UX and QA teams, ready to be tested against rigorous test plans.
So, why aim to fail?
Because discovering as early as possible that something isnât working allows us to make timely decisions, adjust the course, and, if necessary, escalate the issue.
Fail Fast is the opposite of âsandbagging,â which is the habit of postponing a problem, hoping it will be forgotten or classified as ânon-reproducible.â
By acting immediately, we can prevent problems from accumulating and becoming unmanageable.
How to Put it into Practice?
Fail Fast often finds its application through Proof of Concept (PoC).
A team, or even a single developer, tests a solution to quickly validate it. But itâs not just about confirming whether something works; the PoC should challenge the solution, testing it under conditions that could lead to failure.
Senior or Expert profiles are typically involved in these activities due to their experience in identifying critical points and âhittingâ where problems are most likely to arise. However, itâs not an absolute requirement: anyone with the right mindset and approach can contribute to the Fail Fast process.
A Practical Example: BLE Synchronization in Background on iOS
A context where itâs easy to apply this strategy is hardware, often characterized by “black boxes” over which we have limited control.
Many of you may have a smart band or a smart ring. These devices continuously collect data, and for you users, itâs convenient to open the application and find updated data without waiting for a long synchronization process, right?
Great, but if we consider BLE (Bluetooth Low Energy) and iOS background operations as potential pain points, weâre dealing with a powder keg ready to explode at any moment.
From experience, we know there are critical patterns involving BLE and iOS in different scenarios (we could talk about it for days…). A classic example of applying Fail Fast would be to immediately verify if the peripheral simply requires physical interaction to maintain or re-establish the connection periodically.
If we donât run this test right away, we risk discovering too late that background synchronizations donât work as expected. This could compromise the seamless user experience they anticipate, forcing us to review the entire software or hardware project.
Or, worse, see negative reviews on app stores increase steadily.
Fail Fast: From PxD (Physical x Digital) to D² (Digital x Digital)
In the Physical x Digital (PxD) context, as is often the case in IoT projects, Fail Fast is almost mandatory. But itâs also worth applying it in the Digital x Digital (D²) realmâpurely digital projects. If we receive documentation for a REST or GraphQL API, for example, why not test it immediately in a real-world scenario where it might fail? Better for us to find out than the client.
âBut arenât there already automated tests?â
Itâs true, many believe that a solution validated by unit tests and internal QA is safe. However, recent failures, even in fields like aerospace, show that even an apparently stable system can present surprises when a third party starts interacting with it. Testing with a Fail Fast approach, perhaps with a precise time-box, could make all the difference.
Iâm curious to hear your thoughts and whether youâve adopted this practice as well. Do you have any concrete examples? Let me know!