Speculation is not just for banks; it can also help save energy.
May it be a little less?
When a customer rents a virtual machine (VM), the Cloud provider and the customer set up a QoS contract that specifies how “large” the VM should be. These contracts are called Service Level Agreements (SLAs). Violations of the terms of an SLA result in penalties the Cloud provider has to pay to the customer. Setting up the VM with the full agreed storage, memory, CPU cores, and bandwidth shares would therefore always guarantee the SLA. However, a VM hardly ever utilizes 100% of the resources that were agreed on. Thus, we might ask: may we allocate fewer resources than agreed, but more than are actually utilized at a given point in time, and still not violate the SLA?
The difference between agreeing, providing and utilizing
Let’s have a look at such a case and take storage as an example (Table 1). Say we agreed to provide (at least) 1000 GB when needed. If we now provide only 500 GB, but the customer utilizes only 400 GB, we are completely fine; the customer does not even notice that we provide less than agreed! However, once the customer wants to utilize more, say 510 GB, we have to react quickly and provide what is requested, or we run into an SLA violation. In the third case, when the customer would like to use 1010 GB, we could provide more storage (and sell it at an extra charge), but we are not obliged to do so. We are always fine when we provide the 1000 GB, so there is no violation in this case.
Table 1: Provided, agreed, utilized

  Agreed     Provided   Utilized   SLA violated?
  1000 GB    500 GB     400 GB     no (customer does not notice)
  1000 GB    500 GB     510 GB     yes, unless we react quickly
  1000 GB    1000 GB    1010 GB    no (not obliged to provide more)
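To make the three cases concrete, here is a minimal sketch of the violation check in Java. The class and method names are illustrative, not taken from the thesis:

```java
// Sketch of the SLA check behind Table 1 (names are illustrative).
public class SlaCheck {
    // A violation threatens only when the customer utilizes more than we
    // currently provide while still staying within the agreed amount.
    static boolean isViolated(int agreedGb, int providedGb, int utilizedGb) {
        return utilizedGb > providedGb && utilizedGb <= agreedGb;
    }

    public static void main(String[] args) {
        System.out.println(isViolated(1000, 500, 400));   // false: under-provisioning goes unnoticed
        System.out.println(isViolated(1000, 500, 510));   // true: we must react quickly
        System.out.println(isViolated(1000, 1000, 1010)); // false: beyond the agreement
    }
}
```

The third case is the interesting one: exceeding the agreed amount can never trigger a violation, which is exactly why over-demand is a sales opportunity rather than a risk.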
The heavier the resource consumption, the easier the problem
For our speculative approach, the hardest questions are “How much can we lower the provisioning of a resource without risking an SLA violation?” and “When should we do that?”. As we found out very quickly, the answers depend heavily on the characteristics of an application’s workload, especially its volatility. Volatility describes how fast and how much a workload changes. Whereas a stable, heavily resource-consuming workload is not a problem, unstable workloads that swing from low to medium or high resource consumption pose the real challenge.
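To give “volatility” a numerical feel, here is a tiny sketch that scores a workload by the mean absolute change between consecutive utilization samples. This is only one possible measure, chosen for illustration; the thesis may define volatility differently:

```java
// Illustrative volatility measure: mean absolute change between consecutive
// utilization samples. Higher score = more volatile workload.
public class Volatility {
    static double volatility(double[] samples) {
        if (samples.length < 2) return 0.0;
        double sum = 0.0;
        for (int i = 1; i < samples.length; i++) {
            sum += Math.abs(samples[i] - samples[i - 1]);
        }
        return sum / (samples.length - 1);
    }

    public static void main(String[] args) {
        double[] stable   = {900, 905, 898, 902}; // heavy but stable: easy case
        double[] unstable = {100, 600, 150, 800}; // low-to-high swings: hard case
        System.out.println(volatility(stable) < volatility(unstable)); // true
    }
}
```

Under this measure, the heavy-but-stable workload scores far lower than the swinging one, matching the heading: heavy consumption alone is not the problem.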
What to do with all the measurements?
In my thesis, I evaluated several methods for determining a reasonable value to which a provided resource can be lowered, while my colleague Vincent Emeakaroha studied the problem of how to gather the measurements in the first place. Millions of measurements are taken every second, so we had to find an efficient way to represent them and to analyze them quickly.
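One common way to represent a massive measurement stream compactly is a running summary that is updated in constant time per sample instead of storing every raw value. The following sketch is an assumption about how such a representation could look, not the one we actually used:

```java
// Sketch: instead of keeping millions of raw samples per resource, keep a
// compact running summary (count, min, max, mean) updated in O(1) per sample.
public class RunningSummary {
    long count;
    double min = Double.MAX_VALUE;
    double max = -Double.MAX_VALUE;
    double sum;

    void add(double value) {
        count++;
        sum += value;
        if (value < min) min = value;
        if (value > max) max = value;
    }

    double mean() {
        return count == 0 ? 0.0 : sum / count;
    }

    public static void main(String[] args) {
        RunningSummary storageGb = new RunningSummary();
        storageGb.add(400);
        storageGb.add(510);
        System.out.println(storageGb.mean()); // prints 455.0
    }
}
```

The max is what matters most for our question, since it bounds how far provisioning was ever pushed within the summarized window.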
Case Based Reasoning and Rules
One of the methodologies was Case Based Reasoning, a knowledge management technique that learns from past experience. It stores sample measurements together with prospective actions that improve the situation. When a new case comes along, the most similar case with the highest utility is retrieved, and the same action as in the former case is executed. This approach worked fairly well, but its knowledge base grew very large very quickly, and so did the time it took to retrieve similar cases. The most effective approach we eventually found was a rule-based knowledge base. For the implementation we used the Java-based rules engine Drools.
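The retrieval step of Case Based Reasoning can be sketched as a nearest-neighbour lookup with a utility tie-breaker. The Case fields and action strings below are invented for illustration; a real knowledge base would compare full measurement vectors, which is exactly why retrieval slows down as it grows:

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch of Case Based Reasoning retrieval: find the stored case most
// similar to the new measurement, breaking ties by the highest utility.
public class CbrRetrieval {
    static class Case {
        final double utilization; // observed utilization that triggered the case
        final String action;      // action that improved the situation back then
        final double utility;     // how well the action worked in the past
        Case(double utilization, String action, double utility) {
            this.utilization = utilization;
            this.action = action;
            this.utility = utility;
        }
    }

    static Case retrieve(List<Case> knowledgeBase, double newUtilization) {
        Case best = null;
        double bestDistance = Double.MAX_VALUE;
        for (Case c : knowledgeBase) {
            double distance = Math.abs(c.utilization - newUtilization);
            if (distance < bestDistance
                    || (distance == bestDistance && best != null && c.utility > best.utility)) {
                best = c;
                bestDistance = distance;
            }
        }
        return best; // the caller then executes best.action, as in the former case
    }

    public static void main(String[] args) {
        List<Case> kb = Arrays.asList(
                new Case(0.40, "lower provisioning", 0.9),
                new Case(0.95, "raise provisioning", 0.8));
        System.out.println(retrieve(kb, 0.50).action); // prints "lower provisioning"
    }
}
```

Every retrieval scans the whole knowledge base, so lookup time grows linearly with the number of stored cases; a rule-based approach avoids this by matching a fixed set of rules instead.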
After my holidays you will see how the rules work in more detail. Meanwhile, you can have a look at Drools, a really neat framework that also allows you to solve other optimization problems (e.g., with the additional OptaPlanner, formerly known as Drools Planner). At Zühlke, we have also been using Drools in a bunch of very complex real-life projects.