How Zypr's Derivative-Based Optimizer Works
Simulator Model
Zypr’s simulation model is based on a mixing process (see Inventory Policy Primer), cost functions, and mass balance principles, whereby resource state changes to a target environment (referred to as a resource "pool") are evaluated as a closed system, such that:
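In its standard form, and assuming nothing is generated or consumed inside the pool itself, a mass balance of this kind can be sketched as below. The term names are illustrative and are not necessarily Zypr's own notation.

```latex
\[
\text{Accumulation} = \text{Input (ingress)} - \text{Output (egress)}
\]
```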
Accumulation therefore represents pool resources that increase over time in response to service demand growth. The transactional processes that add more performant hardware and remove less performant hardware are governed by maximum and minimum pool Utilization rates. A pool capacity increase event is triggered at the maximum rate, whereas the minimum rate limits how much net Accumulation may be added.
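One way to read the role of the two Utilization rates is sketched below. The function name, its parameters, and the exact rule (utilization equals demand divided by capacity, and an increase may not push utilization below the minimum) are assumptions made for illustration, not Zypr's actual logic.

```python
def capacity_to_add(demand, capacity, u_max, u_min):
    """Return the largest capacity increase allowed, or 0.0 if none is triggered.

    Assumed rule: an increase is triggered once demand / capacity reaches
    u_max, and the added amount may not push utilization below u_min.
    """
    if not (0 < u_min < u_max <= 1):
        raise ValueError("expected 0 < u_min < u_max <= 1")
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    utilization = demand / capacity
    if utilization < u_max:
        return 0.0  # below the maximum rate: no capacity increase event
    # Largest capacity that still keeps utilization at or above u_min.
    max_capacity = demand / u_min
    return max_capacity - capacity


print(capacity_to_add(demand=90.0, capacity=100.0, u_max=0.8, u_min=0.5))
print(capacity_to_add(demand=50.0, capacity=100.0, u_max=0.8, u_min=0.5))
```

The first call triggers an increase because utilization (0.9) exceeds the maximum; the second does not.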
Zypr simulates the process of state changes, for a preferred forward-looking time period, using a piece-wise smooth dynamical system that supports injection of discrete (non-smooth) events, which are called Jump Events. The preferred forward-looking time period is partitioned into "smooth" sequential time-based intervals, whose start and end time correspond to Utilization boundaries. Jump Events enable arbitrary state changes to be injected into this evolution process at any arbitrary time.
Arbitrary does not mean these events are "unknown or unpredictable." In fact, most events are known and well-planned weeks, months and even years before they occur. Jump Events provide the means to solve the inherent complexity of various resource behaviors that are not "smooth" yet meaningfully impact a pool's state (e.g., a server configuration change).
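The idea of smooth evolution punctuated by Jump Events can be sketched in a few lines. The exponential growth law, the additive form of a jump, and all names below are assumptions chosen for illustration; Zypr's own state model is richer than a single number.

```python
import math


def evolve(demand0, growth_rate, horizon, jump_events):
    """Evolve a scalar state over [0, horizon] with jumps at given times.

    jump_events is a list of (time, delta) pairs. Between jumps the state
    grows smoothly (assumed exponential); at each jump, delta is added.
    """
    state, t = demand0, 0.0
    trajectory = [(t, state)]
    for when, delta in sorted(jump_events):
        if not 0.0 <= when <= horizon:
            continue  # ignore events outside the simulated period
        state *= math.exp(growth_rate * (when - t))  # smooth segment
        state += delta                               # discrete jump
        t = when
        trajectory.append((t, state))
    state *= math.exp(growth_rate * (horizon - t))   # final smooth segment
    trajectory.append((horizon, state))
    return trajectory


for point in evolve(100.0, 0.05, 10.0, [(5.0, 20.0)]):
    print(point)
```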
A variety of cost functions are used to evaluate resource consumption costs at each state partition of the dynamical system. A unique sequence, derived from these stateful partitions, represents a feasible solution.
Simulation Methods
Zypr enables two types of simulations:
- Fixed
- Optimal
The objective of the Fixed simulation type is to identify the evolution of server inventory transactions based upon a predetermined turnover time and rate (e.g., refresh servers every five years). Resource state evolution is therefore bound by this input value and results in a single feasible solution that is output. Most organizations use a predetermined refresh time to approximate lowest total cost.
The Optimal simulation type performs a mathematical optimization whose objective function is to minimize total resource cost. The term "optimal" refers to the rate of uptake of more performant hardware, with respect to the hardware performance currently available from a vendor, that yields the maximum rate of monetization and therefore the lowest total resource cost for a pool. An Optimal simulation therefore seeks to solve the following equation for the least cost:
For example, refreshing a pool's server inventory at a rate whereby the pool always contained only the highest performance servers available, would produce a performance capture yield of 100%. For most services, that is not the "optimal" performance yield (i.e., least-cost solution). An Optimal simulation answers what is.
In contrast to a Fixed simulation, an Optimal simulation generates a forecast that describes both the timing and uptake rates (ingress) of more performant servers and the timing and removal rates (egress) of less performant ones. This forecast provides the information necessary to evolve the pool's resource composition, over a preferred future time period, at the lowest total cost.
Optimization Engine
An Optimal simulation uses Zypr's optimization engine to further partition time-intervals in order to evaluate resource consumption, turnover, cost, and cost dependency derivatives with respect to time.
The following sections further describe how each might influence identifying a lowest total cost solution.
Cost Dependencies
Zypr uses cost functions that incorporate the rate of computation (i.e., performance) of hardware to normalize the relative cost of hardware and of other resources whose costs are correlated to hardware performance. Performance is represented by scalar values.
For example, performance per watt is a well-known metric that measures the rate of computation that is delivered by a hardware unit for every watt it consumes. Zypr applies the same principle to data center space resources and software, whose license or subscription fee correlates to a hardware unit's rate-of-computation (e.g., license per processor).
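As a minimal sketch of this normalization, the snippet below divides a resource cost by a scalar performance rating. The function name and the sample figures are hypothetical.

```python
def cost_per_performance(cost, performance):
    """Normalize a resource cost by a scalar hardware performance rating."""
    if performance <= 0:
        raise ValueError("performance must be positive")
    return cost / performance


# Same per-processor license fee, two hypothetical performance ratings.
older = cost_per_performance(cost=1000.0, performance=50.0)
newer = cost_per_performance(cost=1000.0, performance=100.0)
print(older, newer)
```

With an identical license fee, the unit with twice the performance rating carries half the license cost per unit of computation.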
Below is a closed-form formula that solves for the breakeven of server refresh as a continuous function.
Although this formula cannot identify an optimal solution, it does provide a succinct way to describe how the relative cost of resources (referred to as a pool's Kratio) combine with hardware performance improvement rates to produce different hardware refresh rates.
Specifically, the Kratio represents the cost of software, power and space resources for every dollar of hardware cost. Kratios typically range from 2 to 10 in most enterprise environments and reflect resource pools with different cost structures. Different cost structures are usually due to unique software stacks that enable specific services (e.g., databases, HA apps, message services, web, etc.). The key takeaway is:
The exact same increase in hardware performance can produce significantly different financial savings when assigned to resource pools with very different Kratios.
This creates both intra and inter pool hardware assignment (combinatorial) optimization opportunities.
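The takeaway can be made concrete with a small worked example. It rests on simplifying assumptions that are not from Zypr: dependent costs (software, power, space) scale with unit count, and hardware that is `perf_ratio` times faster needs `1 / perf_ratio` as many units. All names are hypothetical.

```python
def dependent_cost_savings(hardware_cost, kratio, perf_ratio):
    """Savings on software, power and space after moving to faster hardware.

    Assumes dependent costs equal kratio * hardware_cost and shrink in
    proportion to the number of units no longer needed.
    """
    if perf_ratio <= 0:
        raise ValueError("perf_ratio must be positive")
    dependent_cost = kratio * hardware_cost
    return dependent_cost * (1 - 1 / perf_ratio)


# Identical 2x performance gain applied to two pools with different Kratios.
for kratio in (2, 10):
    print(kratio, dependent_cost_savings(100.0, kratio, 2.0))
```

Under these assumptions the same doubling of performance saves five times as much in the pool whose Kratio is 10 as in the pool whose Kratio is 2.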
The illustrations below show that as a pool's Kratio increases, the breakeven time of server refresh decreases. Likewise, the higher the rate of performance improvement, the shorter the breakeven time of server refresh.
An important difference between the two different simulation types is: An Optimal simulation accounts for these differentials, whereas a Fixed simulation does not.
Specifically, an Optimal simulation evaluates two different methods for how pool capacity is increased, over time, in response to rising service demand.
- Increase pool capacity by adding incremental hardware units and corresponding software licenses, space, and power consumption.
- Increase pool capacity by "trading-up" to more performant hardware regardless of the chronological age of hardware being replaced.
The former method requires payment to directly acquire all incremental resources, whereas the latter is a form of hardware performance arbitrage in which software, power and space resources are effectively increased without directly purchasing additional software licenses, watts of electricity, and rack space.
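A rough comparison of the two methods is sketched below. It assumes per-unit dependent costs equal to the Kratio times the hardware price, one-for-one swaps that leave unit count (and therefore dependent costs) unchanged, and no residual value for removed hardware. The function names and figures are hypothetical.

```python
import math


def cost_add_units(capacity_gap, unit_perf, unit_hw_price, kratio):
    """Method 1: buy extra units plus their software, power and space."""
    units = math.ceil(capacity_gap / unit_perf)
    return units * unit_hw_price * (1 + kratio)


def cost_trade_up(capacity_gap, unit_perf, perf_ratio, new_hw_price):
    """Method 2: swap existing units one-for-one for faster ones.

    Unit count is unchanged, so only the new hardware is paid for.
    """
    if perf_ratio <= 1:
        raise ValueError("perf_ratio must exceed 1 for a trade-up")
    gain_per_swap = unit_perf * (perf_ratio - 1)
    swaps = math.ceil(capacity_gap / gain_per_swap)
    return swaps * new_hw_price


print(cost_add_units(40.0, 10.0, 5000.0, 4.0))  # 4 units * 5000 * (1 + 4)
print(cost_trade_up(40.0, 10.0, 2.0, 6000.0))   # 4 swaps * 6000
```

In this example the trade-up closes the same capacity gap for less, and the advantage widens as the Kratio rises.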
This concept underpins the policy of hardware refresh. However, an Optimal simulation doesn't predetermine when or at what rate hardware refresh should be executed. Instead, it relies on assignment (combinatorial) optimization to modify a pool's composition — its resource mix — in such a way that minimizes the pool's total resource cost.
The rate of assignment of more performant hardware into a pool therefore corresponds to a pool's Kratio and service demand growth rate.
Resource Consumption Cost Functions
Zypr provides a very flexible object model to define the cost dependencies between hardware and corresponding software, space and power consumption rates. Both simulation types use the same cost functions.
The Software Item object model enables complex license consumption and cost rules to be defined individually for each software item in a pool's software stack. As a simulation evolves, consumption rules feed a rules engine that determines the required licenses at a specific resource state and time. The required license quantity is then evaluated against a repository, which Zypr dynamically builds and maintains, that holds the then-current license position state.
Once a license position state is determined, Contract Terms are applied. The object model enables resource planners (particularly SAM resource planners) to specify software cost rules based on the unique contract calendar for that particular software item according to the settlement schedule with the software publisher. Several options are provided for how the settlement calendar is built by Zypr. The calendar defines vendor schedules for future prices, how prices are applied that create a cost liability, and when those costs are posted.
For example, a Cost Rule for an incremental increase in required licenses may specify that the incremental cost be based upon a fixed-duration charge of six months, regardless of whether actual use commenced eleven months or one month prior to the annual settlement date. Additionally, the object model enables incremental cost obligations to be posted (i.e., recognized) upon "first use", at an annual settlement date, or at a contract anniversary date.
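The fixed-duration rule and the posting options can be sketched as follows. The function names, the `post_on` values, and the prices are hypothetical and do not reflect Zypr's object model.

```python
from datetime import date


def incremental_license_cost(extra_licenses, annual_price, charge_months=6):
    """Fixed-duration charge: independent of when actual use commenced."""
    return extra_licenses * annual_price * charge_months / 12


def posting_date(first_use, settlement, anniversary, post_on="settlement"):
    """Pick the date on which the cost obligation is recognized."""
    options = {
        "first_use": first_use,
        "settlement": settlement,
        "anniversary": anniversary,
    }
    if post_on not in options:
        raise ValueError(f"unknown posting rule: {post_on}")
    return options[post_on]


cost = incremental_license_cost(extra_licenses=10, annual_price=1200.0)
posted = posting_date(
    first_use=date(2024, 2, 1),
    settlement=date(2024, 12, 31),
    anniversary=date(2025, 3, 15),
    post_on="first_use",
)
print(cost, posted)
```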
Zypr provides similar object models to describe server hardware, space, and power consumption rates, and their respective cost functions.
Hardware Queue
The hardware queue represents the sequential order in which hardware resources are removed from a pool. Fixed and Optimal simulations differ in how the order is determined.
The queue order of a Fixed simulation is based on the chronological age of pool hardware. Chronologically older hardware exits the pool before younger hardware.
The queue order of an Optimal simulation is based on the factors explained above and calculated on a per server, processor, or core basis. Zypr can automatically calculate and dynamically select the best queue order for the state evolution process (highly recommended). Alternatively, you may set the queue order for the entire state evolution process.
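The difference between the two orderings can be illustrated with a toy example. The age-based queue follows the description above; the cost-per-performance ranking used for the second queue is only an assumed stand-in for Zypr's actual calculation, and the `Server` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    install_year: int
    performance: float
    annual_dependent_cost: float  # software, power and space


def fixed_queue(servers):
    """Oldest hardware exits first."""
    return sorted(servers, key=lambda s: s.install_year)


def optimal_queue_sketch(servers):
    """Assumed ranking: highest dependent cost per unit of performance exits first."""
    return sorted(
        servers,
        key=lambda s: s.annual_dependent_cost / s.performance,
        reverse=True,
    )


pool = [
    Server("a", 2018, 50.0, 1000.0),
    Server("b", 2016, 80.0, 1200.0),
    Server("c", 2020, 100.0, 3000.0),
]
print([s.name for s in fixed_queue(pool)])
print([s.name for s in optimal_queue_sketch(pool)])
```

Here the youngest server, `c`, leaves first under the assumed cost-based ranking even though it would leave last under the age-based one.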
Jump Events
Jump Events provide the means to inject state changes — derivative functions introduced by non-continuous, discrete events.
In fact, discrete events describe so many types of pool resource behavior that Zypr will not execute a simulation if the Jump Events array object in a submitted Pool Resource Model is null or empty. Except for rare edge cases, Jump Event injection into the state evolution process is treated the same for Fixed and Optimal simulations.
Hardware Performance Rating
Zypr assumes infrastructure capacity planners and performance engineers have reliable practices to evaluate and identify the expected future gains in hardware performance.
The Performance Jump Event object provides two ways to describe expected gains in hardware performance.
- A change in the performance rating of a server that is intended to be added to a pool.
- A change in performance of hardware external to the pool that affects all existing servers in a pool.
The former applies to an individual server and typically accompanies a Configuration Jump Event to describe a change to a server's characteristics.
The latter applies to all existing servers in the pool and to future servers that will be added to the pool. The performance gain effectively serves to re-rate all pool server inventory. This use case typically arises from upgrades or the adoption of new architectures that measurably improve I/O throughput, reduce response latencies, or offload tasks.
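A pool-wide re-rating can be pictured as a uniform scaling of every server's performance rating. The multiplicative form and the function name below are assumptions for illustration only.

```python
def rerate_pool(ratings, gain):
    """Scale every existing server rating by (1 + gain)."""
    if gain <= -1:
        raise ValueError("gain must be greater than -1")
    return [rating * (1 + gain) for rating in ratings]


# A hypothetical 50% gain from an upgrade external to the pool.
print(rerate_pool([100.0, 200.0], 0.5))
```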
Hardware Utilization Rates
Zypr further assumes capacity planners and performance engineers have reliable practices to measure and forecast average pool utilization rates and correlate better hardware performance to future pool utilization.
Zypr incorporates these factors to govern the state evolution process. Future state is evolved using interval time partitions based on the following formula:
An interval time partition is usually further divided when a Jump Event is injected into the evolution process.
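The subdivision described above can be sketched as follows. The tuple representation of an interval and the function name are hypothetical.

```python
def split_intervals(intervals, jump_times):
    """Split each (start, end) interval at every jump time strictly inside it."""
    result = []
    for start, end in intervals:
        cuts = sorted(t for t in set(jump_times) if start < t < end)
        edges = [start, *cuts, end]
        result.extend(zip(edges[:-1], edges[1:]))
    return result


# Two intervals, with jumps at months 3, 6 and 9.
print(split_intervals([(0, 6), (6, 12)], [3, 6, 9]))
```

A jump that coincides with an existing boundary (month 6 here) does not create a new interval.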
Constraints
Constraints provide the means to limit the feasible solution space Zypr evaluates and thereby avoid solutions considered unacceptable. For example, it may be preferred to limit total server quantity for the first year of a simulation, after which additional power distribution will be available.
Constraints are only permitted for Optimal simulations. A Fixed simulation represents a single solution (i.e., a sequence of transactions), which cannot be constrained without producing a null solution. An Optimal simulation, by contrast, evaluates innumerable feasible solutions, and a constraint is intended to reduce that feasible solution space. Of course, if a constraint narrows the solution space too much, it too will produce a null solution.
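The effect of a constraint on the solution space can be sketched with a simple filter. Representing a solution as a dictionary, and the field names used, are assumptions for illustration rather than Zypr's data model.

```python
def least_cost(solutions, constraints):
    """Return the cheapest solution that satisfies every constraint, or None."""
    feasible = [s for s in solutions if all(check(s) for check in constraints)]
    if not feasible:
        return None  # the constraints eliminated every candidate
    return min(feasible, key=lambda s: s["total_cost"])


candidates = [
    {"name": "A", "year1_servers": 120, "total_cost": 900.0},
    {"name": "B", "year1_servers": 95, "total_cost": 950.0},
    {"name": "C", "year1_servers": 80, "total_cost": 1100.0},
]


def year1_cap(solution):
    """Hypothetical limit on first-year server quantity."""
    return solution["year1_servers"] <= 100


def too_tight(solution):
    """A cap that no candidate can meet."""
    return solution["year1_servers"] <= 10


print(least_cost(candidates, [year1_cap]))
print(least_cost(candidates, [too_tight]))
```

The first-year cap removes the cheapest candidate, so the next cheapest is chosen; the overly tight cap leaves nothing to choose from.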