Cloud infrastructure Economics: Calculating IaaS cost

In our previous post, we referenced a number of financial factors and operational parameters that we need to take into account in order to calculate some meaningful costs for IaaS services. Let’s see now how these are combined to produce IaaS cost items.

Your mileage may vary on IaaS; you may be renting datacenter space, leasing equipment, operating your own facilities or simply using public clouds to run your business. However, at the end of the day, you need to utilize an apples-to-apples metric to find out which strategy is the most cost-effective. For IaaS, the metric is the footprint of your infrastructure, expressed in terms of virtual computing units: The monthly cost of a single virtual server, broken down to virtual CPU/Memory and virtual storage resources.

You can check online the formulas in this sheet here. If you want to take a peek on how the formulas work, continue reading.

For simplicity, we will not dive into virtua machine OS licensing costs here – they are easy to find out, anyway (those familiar with Microsoft SPLA should have an idea). We will calculate only monthly running costs of IaaS, expressed in the following cost items:

  • SRVCOST: Individual virtual server monthly cost: Regardless of virtual machine configuration, the monthly cost of spinning up a virtual machine.
  • COMPUTECOST: Virtual computing unit cost: The cost of operating one virtual memory GB and assorted CPU resources per month.
  • DISKCOST: Virtual disk unit cost: The cost of operating one virtual storage GB per month.

Almost all IaaS public cloud providers format their pricelists according to these three cost items, or bundle them in prepackaged virtual server sizes. Calculating these can help immensely in finding good answers to the “build or buy” question if you plan to adopt IaaS for your organization, or determine the sell price and margins if you are a public cloud provider.

Let’s see now each cost item one by one.

Individual server cost (SRVCOST)

How much does it cost to spin up a single virtual machine per month? What do we need to take into account? Well, a virtual machine, regardless of its footprint, needs some grooming and the infrastructure it will run on. The assorted marginal costs (cost to add one more virtual machine to our IaaS infrastructure) are the following:

  • C_SRV: Cost of maintaining datacenter network infrastructure (LAN switching, routing, firewall, uplinks) and computing infrastructure software costs (support & maintenance). We do not include here hardware costs since these are related to the footprint of the virtual machine.
  • C_DCOPS: Cost of manhours required to keep the virtual machine and related infrastructure up and running (keep the lights on)
  • C_NWHW: Cost of network related hardware infrastructure required to sustain one virtual machine. These are pure hardware costs and reflect the investment in network infrastructure needed to keep adding virtual machines.

An essential unit used in most calculation is the cost of rack unit. Referring to our older post for the EPC variable, this is expressed as

C_RU=EPC/RU

This gives us an approximation of the cost of one rack unit per month in terms of monthly electiricy and hosting cost (EPC).

C_SRV is expressed as a function of NETSUPP (monthly network operating & support costs), RU_NET (total network infrastructure footprint), CALCLIC (virtualization/computing infrastructure maintenance & software costs) and SRV (total virtual servers running). The formula is:

C_SRV=( NETSUPP + CALCLIC + C_RU*RU_NET ) / SRV

C_RU*RU_NET is the hosting cost of the entire networking infrastructure (switches, patch panels, load balancers, firewalls etc).

C_DCOPS is straightforward to calculate:

C_DCOPS = DCOPS / SRV

And finally C_NWHW is the hardware cost needed to add one more virtual server. To calculate C_NWHW we take into account the current network infrastructure cost and then we calculate how much money we have to borrow to expand it in order to provision one more virtual server. The way we do this is to divide the total network infrastructure cost with the number of provisioned virtual machines and spread this cost over the lifecycle of the hardware (AMORT), augmented with a monthly interest rate (INTRST):

C_NWHW=(NETINFRA/SRV) * (1+INTRST) / AMORT

Computing cost (COMPUTECOST)

As a computing unit, for simplicity we define one GB of virtual RAM coupled with an amount of processing power (CPU). Finding the perfect analogy between memory and CPU power is tricky and there is no golden rule here, so we define the metric as the amount of virtual RAM. The exact CPU power assigned to each virtual RAM GB depends on the amount of physical RAM configured in each physical server (SRVRAM) and the number of physical CPU cores of each server. COMPUTECOST is broken down to two cost items:

  • C_MEM: It is the cost associated with operating the hardware infrastructure that provisions each virtual RAM GB.
  • C_SRVHW: It is the cost associated with purchasing the hardware infrastructure required to provide each virtual RAM GB.

C_MEM depends on running costs and is the cost of compute rack units divided by the total virtual RAM deployed in our cloud:

C_MEM = (RU_CALC * C_RU) / TOTALMEM

Note that in some cases (like VMware’s VSPP program) you may need to add up to the above cost software subscription/license costs, if your virtualization platform is licensed per virtual GB.

C_SRVHW is calculated in a more complex way. First, we need to find out the cost of hardware associated with each virtual GB of RAM. This is the cost of one physical server equipped with RAM, divided with the amount of physical RAM adjusted with the memory overprovisioning factor:

CAPEX_MEM = (SERVER + MEMORY * SRVRAM) / (SRVRAM * MEMOVERPROV)

In a similar way with C_NWHW, we calculate the acquisition cost spread over the period of infrastructure lifecycle, with the monthly interest rate:

C_SRVHW = CAPEX_MEM * (1 + INTRST) / AMORT

Virtual storage cost (DISKCOST)

Calculating DISKCOST is simpler. The two cost items, in a similar way to COMPUTECOST are:

  • C_STOR: It is the cost associated with operating the hardware infrastructure that provisions each virtual RAM GB.
  • C_STORHW: It is the cost associated with purchasing the hardware infrastructure required to provide each virtual disk GB.

C_STOR is based on the existing operating costs for running the storage infrastructure and is calculated proportionally to the provisioned disk capacity:

C_STOR = (STORLIC + RU_STOR * C_RU) / TOTALSTOR

C_STORHW is the cost of investment for each storage GB over the infrastructure lifecycle period:

C_STORHW = (STORINFRA/TOTALSTOR) * (1 + INTRST) / AMORT

 

One can elaborate on this model and add all sorts of costs and parameters, however, from our experience, this model is quite accurate for solving an IaaS financial exercise. What you need simple datacenter metrics and easily obtained costs.

Cloud infrastructure Economics: Cogs and operating costs

Perhaps the most important benefit of adopting cloud services (either from a public provider or internally from your organization) is that their cost can be quantified and attributed to organizational entities. If a cloud service cannot be metered and measured, then it should not be called a cloud service right?

So, whenever you need to purchase a cloud service or when you are called to develop one, you are presented with a service catalog and assorted pricelists, from where you can budget, plan and compare services. Understanding how the pricing has been formulated is not part of your business since you are on the consumer side. However, you should care: You need to get what you pay for. There must be a very good reason for a very expensive or a very cheap cloud service.

In the past, we have developed a few cloud services utilizing own resources and third party services. Each and every time, determining whether launching the service commercially would be a sound practice depended on two factors:

  • Would customers pay for the service? If yes, at what price?
  • If a similar service already was on the market, where would our competitors stand?
  • What is the operating cost of the service?

Answering the first two questions is straightforward: Visit a good number of trusted and loyal customers, talk to them, investigate competition. That’s a marketing and sales mini project. But answering the last question can be a hard thing to do.

Let us share some insight on the operating costs and cost-of-goods for a cloud service and in particular, infrastructure as a service (IaaS). Whether you already run IaaS for your organization or your customers, you are in one of the following states:

  1. Planning to launch IaaS
  2. Already running your datacenter

State (1) is where you have not yet invested anything. You need to work on implementation and operational scenarios (build or buy? Hire or rent?) and do a good part of marketing plans. State (2) is where you have already invested, you have people, processes and technology in place and are delivering services to your internal or external customers. In state (1) you need to develop a cost model, in state (2) you need to calculate your costs and discover your real operating cost.

In both cases, the first thing you need to do before you move on with cost calculation is to guesstimate (state 1) or calculate (state 2) the footprint of your investment and delivered services. From our experience, the following parameters are what you should absolutely take into account in order to properly find out how much your IaaS really costs.

Financial parameters (real money)

  • EPC: Electrical power and hosting cost. How much do (or would) you pay for electricity and hosting. This can be found from your electricity bill, your datacenter provider monthly invoice or from your financial controller (just make sure you ask them the right questions, unless you want to get billed with the entire company overhead costs). EPC is proportional to your infrastructure footprint (ie number of cabinets and hardware).
  • DCOPS: Payroll for the operations team. You need to calculate the total human resource costs here for the team that will operate IaaS services. You may include here also marketing & sales overhead costs.
  • CALCLIC: Software licensing and support costs for IaaS entire computing infrastructure layer. These are software costs associated with the infrastructure (eg, hypervisor licenses), not license costs for delivered services, eg Microsoft SPLA costs.
  • STORLIC: Software licensing and support costs for your entire storage infrastructure. Include here in their entirety also data backup software costs.
  • SERVER: Cost of a single computing server. It’s good to standardize on a particular server model (eg 2-way or 4-way, rackmount or blade). Here you should include the cost of a computing server, complete with processors but without RAM. RAM to CPU ratio is a resource that is adjusted according to your expected workloads and plays a substantial role in cost calculation. If you plan to use blade servers, you should factor here the blade chassis as well.
  • MEMORY: Average cost of 1 GB or RAM.
  • STORINFRA: Cost of your storage infrastructure, as is, or the storage infrastructure you plan to purchase. Storage costs are not that easy to calculate as a factor of 1 disk GB units, since you have to take into account SAN, backup infrastructure, array controllers, disk enclosures and single disks. Of course we assume you utilize a centralized storage infrastructure, pooled to your entire computing farm.
  • NETINFRA: Cost of data network. As above, include here datacenter LAN, load balancers, routers, even cabling.
  • NETSUPP: Cost of network support (monthly). Include here software licensing, antivirus subscriptions and datacenter network costs.

Operational parameters (Facts and figures)

  • RUAmount of available rack units in your datacenter. This is the RU number you can use to install equipment (protected with UPS, with dual power feeds etc).
  • RU_STOR: Rack units occupied by storage systems
  • RU_CALC: Rack units occupied by computing infrastructure (hypervisors)
  • RU_NET: Rack units occupied by network infrastructure
  • SRV: Virtual machines (already running or how many you plan to have within the next quarter)
  • INTRST: Interest rate (cost of money): Monthly interest rate of credit lines/business loans
  • TOTALMEM: Total amount of virtual memory your SRV occupy
  • TOTALSTOR: Total amount of virtual storage your SRV occupy
  • SRVRAM: Amount of physical memory for each physical server. This is the amount of RAM you install in each computing server. It is one of the most important factors, since it depends on your average workload. A rule of thumb is that for generic workloads, a hardware CPU thread can sustain up to 6 virtual computing cores (vcpu). For each vcpu, you need 4 GB of virtual RAM. So, for a 2-socket, 6-core server you need 2 (sockets) x 6 (cores) x 6 (vcpu) x 4 (GB RAM) = 288 GB RAM. For a 4-way, 8-core server beast with memory intensive workloads (say 8 GB per vcpu) you need 4 x 8 x 6 x 8 = 1536 GB RAM (1.5 TB).
  • MEMOVERPROV: Memory overprovisioning for virtual workloads. A factor that needs tuning from experience. If you plan conservatively, use a 1:1 overprovisioning factor (1 GB of physical RAM to 1 GB of virtual RAM). If you are more confident and plan to save costs, you can calculate an overprovisioning factor of up to 1.3. Do this if you trust your hypervisor technology and have homogenous workloads on your servers (for example, all-Windows ecosystem) so that your hypervisor can take advantage of copy-on-write algorithms and save physical memory.
  • AMORT: Amortization of your infrastructure. This is a logistics & accounting term, but here we mainly use this to calculate the lifespan of our infrastructure. It is expressed in months. A good value is 36 to 60 months (3 to 5 years), depending on your hardware warranty and support terms from your vendor.

If you can figure out the above factors, you can proceed with calculating your operating IaaS costs. Keep reading here!

Over The Top: Too high to reach?

Recently, Facebook got WhatsApp (a mobile multichannel messaging startup) for $19B. It’s still too early to comprehend the rationale of this move: Buying market share? A bubble effect? An ingenious strategy move that aims squarely at service providers? A stupid buyout that will sink FB stock? Time will tell.

However, we can draw some conclusions on what this means for telco service providers. Over the top services (as they usually call the services they cannot comprehend, and cannibalize their market share) have value. And it’s a train they have to catch. Let me tell a story here:

Some time ago, while working for a small systems integrator, we had been setting up a public IaaS cloud provider for the lcal market. The first approach, based on the free Cloudstack edition was a breeze to work with, very very cheap and quite reliable. The number of customers we attracted was… two. A couple of months later, we revisited our strategy – this time, the platform was based on VMware vCloud directory (the very first deployment in the regional market), targeting mostly enterprise customers (and our own customers as well). Things moved a bit better, slowly gaining sales traction. However, this was not enough. A plain IaaS offering would only sell as fast as customers would face the dilemma “buy or rent” and opted for renting infrastructure.

The way forward was to bundle applications with our cloud. We turned to software vendors, hoping to deliver preconfigured cloud servers bundled with software targeting small and medium businesses, with a SaaS licensing scheme. A good move, don’t you think? Guaranteed SLA 99,99%, with preinstalled popular CRM & ERP apps, licensed as a service, no upfront costs, no commitment. Customers would flock to our cloud. Eventually however, they did not. The problem was the total miss of customer addressable market.

Would you like fries with that?

Would you like fries with that?

Let me explain a few things: Building a cloud does not mean that it will sell by itself. We were a small integrator with a cloud service. Our customers were mostly medium enterprises buying services and infrastructure from us. We managed to upsell IaaS to these customers and attract a few new ones. However, no matter how attractive a SaaS offering would be, we had to address a market we did not have any access to. And that’s why we failed.

What does this have to do with telcos and OTT services? A lot. Telcos sell data & voice; they have been doing that forever. On top of their data services lives the entire IaaS, PaaS, SaaS, *aaS ecosystem. The revenue telcos see from the immense value of OTT services is the data rate, nothing more. They don’t capitalize on this market, even though:

  • They own the platform and medium
  • OTT services customers are the same customers they serve,

thigns we lacked when we launched our IaaS and SaaS cloud services. It seems that OTT services are at a cloud reach, don’t they? Every telco can setup an IaaS platform, add some sauce on top (applications, that is) and address their own customer base. We are not referring to telcos-gone-cloud (Verizon/Terremark), we are talking about service portfolios that address a very large part of any telco customer base.

My guess is that all it takes are good synergies here and exploitation of local markets: Upsell services to existing customer by letting other vendors cross-sell their products and services. There are lots of moving parts here, the most important challenge being to build an efficient ecosystem, reach & upsell to own subscribers and produce value to the end consumer. One needs to step into customer shoes, tick the right boxes from a desired service list, find the right partners and deliver these services from established sales channels.

 

The IT manager role (car paradigm)

The “pre-cloud” era (or, before virtualization and public cloud providers): Read instructions, assemble all parts, turn engine on, let everybody drive (and then pick up the pieces):

Some assembly required…

The “cloud” era (or, virtualized datacenter, services outsourced in the cloud): Get in, put company in the back seat, CEO sits in front, CIO drives: