Avoid Trying to Predict the Future with Cloud Scaling

In software development, we’re often required to make predictions about the future with little information.

Sometimes, we need to predict future direction in order to make decisions about effort or maintainability. Is it likely that feature ‘x’ will ever be important? How flexible should the architecture be? Our understanding of the future has an impact on how we distribute our efforts today.

Have you ever initiated a new project and gone through the exercise of trying to predict future server load? This often happens before a single line of code is written and before a system can undergo any load or performance testing. Many enterprises make this type of decision based on an extremely early prediction of the infrastructure required for a solution. There is pressure to establish an estimate up front because acquiring and setting up physical infrastructure takes time. There might be an approval process, time spent waiting on delivery, and effort spent unboxing and setting up the servers, installing software, and configuring the network. There may be immovable deadlines or dates promised to customers. We sometimes have to be ready to go live after the first phase of the project.

It’s pretty difficult to go live without servers, so coordinating a release with the arrival of our infrastructure requires us to plan far in advance. As developers, we must do our best to predict what we might need.

An Everyday Example

Based on experience with similar applications, developers can make a best guess about the infrastructure needed for a project. We’ll probably account for load-balancing and fail-over. We expect a couple of reporting features that generate PDFs or Excel files and we expect those to require more resources than the typical page with a few simple database reads and writes. These slightly more intensive features will only be used occasionally and by a small percentage of the user base, so we’ll imagine that our proposed solution might need a minimum of three servers to handle the requested 5000 concurrent users during peak times.

To be safe, we’ll add a little bit of room for additional growth – say, four servers? We’ll do some load testing later, but we need to get this stuff through the approval process ASAP. We expect the ops team to take a few weeks to get these servers delivered, set up, and ready for production.
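The back-of-the-envelope reasoning above can be sketched in a few lines of Python. This is only an illustration: the function name and the per-server capacity of 2,000 concurrent users are assumptions made up for the example, which is exactly the kind of number we don't really know before load testing.

```python
import math


def servers_needed(peak_users, users_per_server, spare_servers=1):
    """Estimate a server count from a guessed per-server capacity.

    users_per_server is an assumption, not a measurement; before any
    load testing it is simply our best guess.
    """
    base = math.ceil(peak_users / users_per_server)  # minimum to carry peak load
    return base + spare_servers                      # plus room for growth


# 5,000 concurrent users at a guessed 2,000 users per server:
# three servers to carry the load, plus one spare.
estimate = servers_needed(5000, 2000)
print(estimate)
```

Change the guessed capacity to 1,000 users per server and the same formula asks for six servers instead of four, which shows how sensitive the purchase is to a number nobody has measured yet.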

Let’s hope we’re right. The decisions that we’re trying to make with limited information have real-world consequences. If we underestimate, we’re in trouble. Worst case, our system can’t handle the load and doesn’t work effectively for anyone. How long will it take to get another server in? We could expedite the process and be super fast if we have to be, maybe a day. But one day isn’t very fast for our users if they can’t get work done. A day of downtime is crippling for any business.

Our fear of underestimating might cause us to be overly careful. Better to have too much than too little, we think. We can’t afford any downtime, so we overestimate; we don’t have any performance metrics to go on yet and we want to be safe. The good news is that we can handle additional growth if we need to, but for the foreseeable future, we’ve accepted the cost of managing servers that we’re not fully utilizing. Either way, we lose: both overestimating and underestimating are suboptimal.

Cloud Scaling

If only we could predict the future precisely. Better yet, what if we could make our attempts at prediction completely unnecessary? We only really care about predicting the future because adjusting to changes in demand costs so much time.

The cloud makes the time cost of scaling infrastructure all but go away. For me, this is the single most exciting aspect of using the cloud. With physical infrastructure, changes are measured in days or weeks. With the cloud, they take seconds or minutes.

You’ve heard the buzzwords. Are you cloud-optimized? Cloud-ready? Cloud-speak is everywhere. It can be a bit overwhelming as we try to filter through all of the messages in the mad race for services. In getting more familiar with the cloud, it may be beneficial to start with understanding one of the most fundamental concepts, one that can benefit almost any new software project: scaling.

With cloud infrastructure, the provider manages all of the physical servers. For us, it’s abstracted and elastic. We can treat our infrastructure like configurable software, adding more servers or instances at the click of a button or the slide of a slider. We can scale up or down in seconds or minutes without committing to the ownership or the setup time required with physical devices. We can even get fancy and automate the addition of resources based on need. When our application isn’t busy, we scale down to save cost. When our system is busy, we scale up to meet demand. And we only pay for what we’re actually using. We can even cap our cloud infrastructure so that it automatically stays within our budget.

Services like Microsoft Azure can even manage this for us on website instances. Want your website to automatically add instances if CPU or memory usage stays above a certain threshold for a specified length of time? Want it to scale back down when usage subsides? You can actually just configure that in the portal.
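To make the idea of a threshold rule concrete, here is a minimal simulation in Python. It is not Azure's API or its actual algorithm; the class, the function, and every threshold value are hypothetical, chosen only to illustrate "scale out when CPU stays high for a while, scale back in when it stays low, and never exceed a budget cap."

```python
from dataclasses import dataclass


@dataclass
class AutoscaleRule:
    # All values are illustrative assumptions, not provider defaults.
    scale_out_cpu: float = 70.0   # add an instance above this CPU percentage
    scale_in_cpu: float = 30.0    # remove an instance below this CPU percentage
    sustained_samples: int = 3    # consecutive samples required before acting
    min_instances: int = 1
    max_instances: int = 6        # cap that keeps spend within budget


def simulate(rule, cpu_samples, instances=1):
    """Return the instance count after each CPU sample."""
    history = []
    high = low = 0  # lengths of the current high-CPU and low-CPU streaks
    for cpu in cpu_samples:
        high = high + 1 if cpu > rule.scale_out_cpu else 0
        low = low + 1 if cpu < rule.scale_in_cpu else 0
        if high >= rule.sustained_samples and instances < rule.max_instances:
            instances += 1
            high = 0  # start a fresh streak after scaling out
        elif low >= rule.sustained_samples and instances > rule.min_instances:
            instances -= 1
            low = 0   # start a fresh streak after scaling in
        history.append(instances)
    return history


history = simulate(AutoscaleRule(), [80, 85, 90, 95, 20, 10, 15, 10])
print(history)  # [1, 1, 2, 2, 2, 2, 1, 1]
```

The sketch takes the CPU readings as given; in a real system, adding an instance would itself lower the load on each server. The point is only that the rule is a small piece of configuration rather than a purchasing decision.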

How many servers do I need? Who cares! I’ll configure my subscription for as many instances as I need at the time and configure it to scale up and down as needed and within my budget.

Now, the cloud isn’t magic. We still need to make strong architecture choices to ensure that our application can scale out. But we can remove one of the decisions we were previously forced to make up front and defer it to a point in time where we actually know how many resources the solution requires. We’re no longer held to a premature plan, a best guess based on minimal information with serious consequences when we get it wrong. We can adapt so easily that a wrong first guess is quick and cheap to correct.

This cloud agility is incredibly valuable for new software projects. Because adapting to load becomes trivial, the question of how many servers we need is all but irrelevant at the outset of a project. We can spend more of our time focusing on our businesses and perfecting the applications that support them rather than on the overhead required to maintain the infrastructure that they run on.

We’re seeing more and more enterprises move to the cloud to increase agility and take advantage of its dynamic nature. With companies like Microsoft, Amazon, Google, Apple, and IBM investing so heavily in cloud infrastructure, the available management and automation tools keep getting better.

It’s an exciting time to build software.