Key Takeaways
- Scaling a system is a hard problem to solve. Underinvesting in scalability leads to a shortened lifespan for the system, but overinvesting can kill the MVP business case because of cost.
- Teams often make guesses about scalability needs because the business sponsors have a hard time thinking about system usage growth.
- There really are scalability requirements, they are just hard to see. Every system has a business case, and the business case has implicit scalability requirements.
- Achieving scalability affordably involves delicate trade-offs. Most scaling problems result from some critical bottleneck in the system, usually caused by access to a shared resource.
- Architectural experimentation is a good antidote to overbuilding for scalability.
Introduction
Every MVP has an implicit scalability hypothesis hiding inside. The product has a business case that nearly always (if not always) depends on a certain degree of scalability for its success. In today’s world, every successful business idea has to reach a large number of people to achieve its financial goals, which makes scalability a paramount concern for every software system. It doesn’t matter how good an idea is if it can’t serve large numbers of people. As a result, the most interesting, and perhaps most difficult, architectural decisions are about scalability.
As we mentioned in an earlier article:
"Scalability is sometimes confused with performance, which, unlike scalability, is about the software system’s ability to meet its timing requirements and is easier to test than scalability. If the system’s performance is adequate in the initial release, the team may assume that the system would be able to cope with increased workloads. Unfortunately, that’s rarely the case if scalability wasn’t included as one of the top QARs during the architectural design".
The minimum scalability requirement for a system is the level of usage needed to satisfy the business case. More might be nice, but if the business case can’t be achieved, then the system is not worth building. As a result, the development team needs to evaluate the architecture of their system to ensure that this minimum level can be achieved.
A wise team will make sure that they have sufficient capacity to accommodate future usage growth without overinvesting to the point where they exceed the budget for their initiative. Deciding how much growth to anticipate is part of the art of architecting; it requires weighing the market size and growth potential of the solution the system supports.
Teams often make guesses about scalability needs
Teams often have few concrete requirements for scalability. The business may not be a reliable source of information but, as we noted above, it does have a business case with implicit scalability needs. It’s easy for teams to focus on functional needs early on and ignore these implicit scaling requirements. They may hope that scaling won’t be a problem, or that they can solve it by throwing more computing resources at it. Their concern about overbuilding and increasing costs is legitimate, but hoping that scaling problems won't happen is not a scaling strategy. Teams need to consider scaling from the start.
In an earlier article, we wrote:
"[The MVA] consists of a minimal set of technical decisions that are tested and evolved using empiricism over time. These decisions are complemented by a minimal set of architectural practices that help the team to keep the product architecturally viable while they evolve it".
Customer feedback is often the only reliable source of information on the viability of the MVP, but you can’t gather this feedback until you build something, and that thing may not be scalable. As a result, architecting for scalability has to be an iterative, experiment-driven process. Unfortunately, development teams are sometimes resistant to focusing on scalability in their early experiments because they want to see if the MVP is successful before they worry about whether their solution scales. There is some logic in this, but the problem is that the MVP must be scalable for it to be a success.
Achieving scalability affordably involves delicate trade-offs
Uncertainty about scalability needs creates a dilemma: if your solution is not sufficiently scalable it will fail in the marketplace (or with your users), but if you create a solution that is more scalable than you need, by overinvesting in scalability, you may fail financially.
Teams often underestimate their product’s scalability needs, for several reasons:
- They can be too optimistic about how well their system will scale because they don’t test their assumptions.
- They assume that technology (e.g. cloud) will solve whatever problems they run into. This is rarely true because the barriers to scalability are usually caused by decisions that create bottlenecks, not simply "not a big enough or fast enough engine".
- They are so worried about simply getting the initial release product into the hands of customers that they consider scalability as a "nice-to-have" afterthought.
Overinvesting in scaling, however, is also a problem because it can lead to cost and schedule overruns as well as bloated and poorly performing code. In other words, insufficient scalability may tank the MVP, but investing too much, too soon, uses up scarce funding to solve a problem you don’t yet (and may never) have. Overinvesting may even cause the project to be cancelled due to "sticker shock", so teams may feel more comfortable erring on the side of underestimating.
There really are scalability requirements, they are just hard to see
The MVP often has implicit scalability requirements, such as "in order for this idea to be successful we need to recruit ten thousand new customers". Asking the right questions and engaging in collaborative dialogue can often uncover these. Often these relate to success criteria for the MVP experiment.
For example, consider the business case: To get funding for an idea, the business sponsors often have adoption assumptions. The minimum assumptions that need to be achieved for the business case to be considered "successful" represent the minimum scalability requirements for the system. These assumptions may be optimistic to get the business case approved and still need to be validated through MVP experiments, but they represent a starting point for considering architectural trade-offs.
Sometimes the scalability assumed by the business case is impossible to achieve at an affordable cost (or at the cost stated in the business case). Finding this out quickly is valuable because it can lead to canceling a bad idea more quickly so that funds can be freed to work on more valuable ideas. As a result, teams should run early experiments to test whether this assumed scalability is achievable.
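One way to run such an early experiment is a small load probe that checks whether throughput keeps improving as concurrency rises. The sketch below is illustrative, not from the article; `handle_request` is a hypothetical stand-in for a call into the real system:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(order_id: int) -> str:
    # Hypothetical stand-in for the system under test;
    # replace with a real call in an actual experiment.
    time.sleep(0.001)  # simulate ~1 ms of service time
    return f"order-{order_id}"

def measure_throughput(total_requests: int, concurrency: int) -> float:
    """Return requests per second achieved at the given concurrency."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(handle_request, range(total_requests)))
    elapsed = time.perf_counter() - start
    assert len(results) == total_requests
    return total_requests / elapsed

# If throughput stops improving as concurrency rises, some
# bottleneck is limiting scalability and needs investigation.
low = measure_throughput(total_requests=200, concurrency=2)
high = measure_throughput(total_requests=200, concurrency=20)
print(f"2 workers: {low:.0f} req/s, 20 workers: {high:.0f} req/s")
```

A probe like this will not predict production behavior, but it can cheaply falsify the optimistic assumption that the system scales linearly, which is exactly the kind of evidence the business case discussion needs.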
Trade-Offs That Could Cause Scalability Issues
Most scaling problems result from some critical bottleneck in the system, usually caused by access to a shared resource. Some of the more critical issues we have encountered ("YMMV") include:
- As in real estate, location matters: where processes run and where data lives determine how much communication has to cross the network. Understanding the impact of distributing processes and data helps teams make better scalability decisions without over-investing in capabilities they don’t yet need.
- Unmanaged shared resource access, sometimes as simple as a sequence number generator, often used in everything from generating order numbers to customer numbers to product numbers. Shared resources include other kinds of common services used by many applications, such as email services, security token generators, encryption services, caching services, and even services that interface with AI components, such as LLMs. Design challenges associated with shared resource access include locking and concurrency control.
- Deciding to use a framework or package (especially open-source) that hides underlying decisions affecting shared resources: As we noted in an earlier article, frameworks and packages greatly amplify a team’s productivity, but the framework/package often has unknown or poorly understood scalability bottlenecks. In the absence of this information, teams will need to run experiments to discover where the framework/package breaks down when load increases.
- Delegating potential scalability problems to the cloud vendor. The cloud scalability dilemma: if your application is not scalable to start with, no amount of "cloud technology" is going to solve that problem. Consider for a moment what "the cloud" is. It’s an easy way to provision virtual computing environments. Whether your application can utilize multiple environments to scale depends on the decisions the development team makes. If there are critical bottlenecks in the design, an infinitely scalable environment won’t help. In addition, too much reliance on "canned" solutions offered by a cloud vendor, such as virtual machines, containers, and serverless functions may give a team the illusion that their MVA will be able to scale adequately in the future, even if it is poorly designed to do so.
At a more fundamental level, is "move to the cloud" an architectural decision? Maybe, but if the application does not change to take advantage of some unique cloud features, and if your decision is not made to solve some architectural problem, then it’s not; it’s simply a convenient way to host the system.
- Inappropriate synchronous and asynchronous processing strategies. Some people see asynchronous communication as another scaling panacea because it allows work to proceed independently of the task that initiated it. The theory is that the main task can do other things while work is happening in the background. So long as the initiating task does not, at some point, need the results of the asynchronous task to proceed (simple examples are sending a job to a printer or logging an event), asynchronous processing can help a system to scale. But there may be a sequencing issue: if the initiating task needs an answer back to proceed, it can end up waiting, and if it relies on many asynchronous tasks, it may also have to assemble their results in a particular order. Delays in asynchronous tasks can easily slow the system to a crawl.
- Excessive use of LLMs or "no code" app builders to develop MVPs. Using LLMs or "no code" app builders can speed up building (i.e., coding) the MVP. However, there are risks in that approach. As Mirza Masfiqur Rahman and Ashish Kundu note in "Code Hallucination", "Generative models such as large language models are extensively used as code copilots and for whole program generation. However, the programs they generate often have questionable correctness, authenticity, and reliability in terms of integration as they might not follow the user requirements, provide incorrect and/or nonsensical outputs, or even contain semantic/syntactic errors - overall known as LLM hallucination". Does the team fully understand the trade-offs they implicitly made, and how those trade-offs may impact scalability? Will their technical skills erode over time as they delegate MVP coding tasks to an LLM or a "no code" app builder?
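To make the shared sequence-number bottleneck described above concrete, here is a minimal Python sketch (all names and the block size are illustrative, not from the article). The naive generator serializes every caller on one lock; a block-allocating variant touches the shared lock only once per block of IDs, at the cost of gaps in the sequence if a worker discards unused IDs:

```python
import threading

class SequenceGenerator:
    """Naive shared sequence: every caller serializes on one lock."""
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._next = 0

    def next_id(self) -> int:
        with self._lock:  # contended on every single call
            self._next += 1
            return self._next

class BlockSequenceGenerator:
    """Each thread leases a block of IDs, so the shared lock is
    taken once per block_size calls instead of on every call."""
    def __init__(self, block_size: int = 100) -> None:
        self._lock = threading.Lock()
        self._next_block = 0
        self._block_size = block_size
        self._local = threading.local()

    def next_id(self) -> int:
        local = self._local
        if getattr(local, "remaining", 0) == 0:
            with self._lock:  # rare: once per leased block
                local.current = self._next_block
                self._next_block += self._block_size
            local.remaining = self._block_size
        local.remaining -= 1
        local.current += 1
        return local.current

gen = SequenceGenerator()
print([gen.next_id() for _ in range(3)])  # [1, 2, 3]
```

The same trade-off appears in database sequence caches: amortizing access to the shared resource buys concurrency, but strict gap-free ordering is given up.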
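The sequencing issue with asynchronous work can also be sketched in a few lines, using Python's `asyncio` as the illustration (the task names are hypothetical). Fire-and-forget work never blocks the initiator, but gathering ordered results ties the initiator's latency to the slowest task:

```python
import asyncio
import random
import time

async def background_work(i: int) -> str:
    """Simulated asynchronous task with variable latency."""
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return f"result-{i}"

async def main() -> list:
    # Fire-and-forget: the initiator does not need the result
    # (e.g. logging an event), so it never waits on this task.
    side_task = asyncio.create_task(background_work(-1))

    # Needing the answers back, in order, changes the picture:
    # the initiator blocks until the *slowest* task finishes.
    start = time.perf_counter()
    results = list(await asyncio.gather(
        *(background_work(i) for i in range(5))))
    waited = time.perf_counter() - start
    print(f"waited {waited * 1000:.0f} ms for {results}")

    await side_task  # tidy up before the event loop closes
    return results

ordered = asyncio.run(main())
```

Because `asyncio.gather` returns results in submission order, the assembly problem is hidden here; in a distributed system with message queues, the ordering and waiting costs reappear and must be designed for explicitly.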
Conclusion
At the point where you are making a decision that might affect the scalability of your MVA, you should decide whether you need to solve that issue now or whether it can wait. If the amount of rework needed to change your scalability decision in the future is very high, then you should solve the problem now; otherwise, document your "good enough" decision with a note saying that you have the option to make a different scalability decision later, if needed. Put another way, you are intentionally taking on technical debt that you may or may not have to repay.
Scaling a system is a hard problem to solve. Underinvesting in scalability leads to a shortened lifespan for the system, but overinvesting can kill the MVP business case because of cost. Thinking in terms of scalability options is important: so long as you can solve the problem later without changing the architecture, you can delay the decision; otherwise you need to solve it now.
Architectural experimentation is a good antidote to overbuilding for scalability. You don’t have to build the desired level of scalability before you need it, provided that you are confident that you can adapt the system at a predictable cost to increase the scalability when you need it. If your experiments show that a scaling decision can be deferred until some later time, then investing in scalability now is neither necessary nor desirable.