
Cloud infrastructure Economics: Calculating IaaS cost

In our previous post, we referenced a number of financial factors and operational parameters that we need to take into account in order to calculate some meaningful costs for IaaS services. Let’s see now how these are combined to produce IaaS cost items.

Your mileage may vary on IaaS; you may be renting datacenter space, leasing equipment, operating your own facilities or simply using public clouds to run your business. However, at the end of the day, you need to utilize an apples-to-apples metric to find out which strategy is the most cost-effective. For IaaS, the metric is the footprint of your infrastructure, expressed in terms of virtual computing units: The monthly cost of a single virtual server, broken down to virtual CPU/Memory and virtual storage resources.

You can check the formulas online in this sheet here. If you want to take a peek at how the formulas work, continue reading.

For simplicity, we will not dive into virtual machine OS licensing costs here – they are easy to find out, anyway (those familiar with Microsoft SPLA should have an idea). We will calculate only the monthly running costs of IaaS, expressed in the following cost items:

  • SRVCOST: Individual virtual server monthly cost: Regardless of virtual machine configuration, the monthly cost of spinning up a virtual machine.
  • COMPUTECOST: Virtual computing unit cost: The cost of operating one virtual memory GB and assorted CPU resources per month.
  • DISKCOST: Virtual disk unit cost: The cost of operating one virtual storage GB per month.

Almost all IaaS public cloud providers format their pricelists according to these three cost items, or bundle them in prepackaged virtual server sizes. Calculating these can help immensely in finding good answers to the “build or buy” question if you plan to adopt IaaS for your organization, or determine the sell price and margins if you are a public cloud provider.

Let’s see now each cost item one by one.

Individual server cost (SRVCOST)

How much does it cost to spin up a single virtual machine per month? What do we need to take into account? Well, a virtual machine, regardless of its footprint, needs some grooming and the infrastructure it will run on. The assorted marginal costs (cost to add one more virtual machine to our IaaS infrastructure) are the following:

  • C_SRV: Cost of maintaining datacenter network infrastructure (LAN switching, routing, firewall, uplinks) and computing infrastructure software costs (support & maintenance). We do not include here hardware costs since these are related to the footprint of the virtual machine.
  • C_DCOPS: Cost of manhours required to keep the virtual machine and related infrastructure up and running (keep the lights on)
  • C_NWHW: Cost of network related hardware infrastructure required to sustain one virtual machine. These are pure hardware costs and reflect the investment in network infrastructure needed to keep adding virtual machines.

An essential unit used in most calculations is the cost of one rack unit. Referring to our older post for the EPC and RU variables, this is expressed as:

C_RU = EPC / RU

This gives us an approximation of the cost of one rack unit per month in terms of the monthly electricity and hosting cost (EPC).

C_SRV is expressed as a function of NETSUPP (monthly network operating & support costs), RU_NET (total network infrastructure footprint), CALCLIC (virtualization/computing infrastructure maintenance & software costs) and SRV (total virtual servers running). The formula is:

C_SRV = (NETSUPP + C_RU*RU_NET + CALCLIC) / SRV

C_RU*RU_NET is the hosting cost of the entire networking infrastructure (switches, patch panels, load balancers, firewalls etc).

C_DCOPS is straightforward to calculate:

C_DCOPS = DCOPS / SRV
And finally, C_NWHW is the hardware cost needed to add one more virtual server. To calculate C_NWHW we take into account the current network infrastructure cost and then calculate how much money we have to borrow to expand it in order to provision one more virtual server. The way we do this is to divide the total network infrastructure cost (NETINFRA) by the number of provisioned virtual machines and spread this cost over the lifecycle of the hardware (AMORT), augmented with a monthly interest rate (INTRST):

C_NWHW = (NETINFRA / SRV / AMORT) * (1 + INTRST)
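Putting the three components together, the per-VM monthly cost can be sketched in a few lines of Python. The function name and every input figure below are mine and purely illustrative; the arithmetic follows the C_SRV, C_DCOPS and C_NWHW definitions above:

```python
# Sketch of the SRVCOST calculation; all input figures are made-up examples.
def srvcost(epc, ru, netsupp, ru_net, calclic, dcops, netinfra, srv, amort, intrst):
    c_ru = epc / ru                                    # hosting cost of one rack unit per month
    c_srv = (netsupp + c_ru * ru_net + calclic) / srv  # network & software running cost per VM
    c_dcops = dcops / srv                              # operations payroll per VM
    c_nwhw = (netinfra / srv / amort) * (1 + intrst)   # amortized network hardware per VM
    return c_srv + c_dcops + c_nwhw

# Example: 400-RU datacenter, 500 VMs, 36-month amortization, 0.5% monthly interest
cost = srvcost(epc=20000, ru=400, netsupp=2000, ru_net=20, calclic=5000,
               dcops=15000, netinfra=120000, srv=500, amort=36, intrst=0.005)
print(round(cost, 2))  # 52.7 per VM per month with these inputs
```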
Computing cost (COMPUTECOST)

As a computing unit, for simplicity we define one GB of virtual RAM coupled with an amount of processing power (CPU). Finding the perfect analogy between memory and CPU power is tricky and there is no golden rule here, so we define the metric as the amount of virtual RAM. The exact CPU power assigned to each virtual RAM GB depends on the amount of physical RAM configured in each physical server (SRVRAM) and the number of physical CPU cores of each server. COMPUTECOST is broken down to two cost items:

  • C_MEM: It is the cost associated with operating the hardware infrastructure that provisions each virtual RAM GB.
  • C_SRVHW: It is the cost associated with purchasing the hardware infrastructure required to provide each virtual RAM GB.

C_MEM depends on running costs and is the cost of compute rack units divided by the total virtual RAM deployed in our cloud:

C_MEM = (C_RU * RU_CALC) / TOTALMEM
Note that in some cases (like VMware’s VSPP program) you may need to add up to the above cost software subscription/license costs, if your virtualization platform is licensed per virtual GB.

C_SRVHW is calculated in a more complex way. First, we need to find out the cost of hardware associated with each virtual GB of RAM. This is the cost of one physical server equipped with RAM, divided by the amount of physical RAM adjusted with the memory overprovisioning factor:

(SERVER + MEMORY * SRVRAM) / (SRVRAM * MEMOVERPROV)
In a similar way to C_NWHW, we spread this acquisition cost over the period of the infrastructure lifecycle, with the monthly interest rate:

C_SRVHW = ((SERVER + MEMORY * SRVRAM) / (SRVRAM * MEMOVERPROV)) / AMORT * (1 + INTRST)
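COMPUTECOST can be sketched the same way; again, the names and figures are mine and purely illustrative:

```python
# Sketch of COMPUTECOST per virtual RAM GB; all figures are illustrative.
def computecost(epc, ru, ru_calc, totalmem, server, memory,
                srvram, memoverprov, amort, intrst):
    c_ru = epc / ru                        # hosting cost of one rack unit per month
    c_mem = c_ru * ru_calc / totalmem      # operating cost per virtual GB (C_MEM)
    # hardware cost of one virtual GB: a RAM-equipped server over its usable RAM
    hw_gb = (server + memory * srvram) / (srvram * memoverprov)
    c_srvhw = (hw_gb / amort) * (1 + intrst)  # amortized acquisition cost per GB
    return c_mem + c_srvhw

# Example: 40 compute RUs, 4 TB of virtual RAM, 288 GB servers, 1.2 overprovisioning
print(round(computecost(epc=20000, ru=400, ru_calc=40, totalmem=4096,
                        server=8000, memory=10, srvram=288, memoverprov=1.2,
                        amort=36, intrst=0.005), 3))  # 1.367
```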
Virtual storage cost (DISKCOST)

Calculating DISKCOST is simpler. The two cost items, in a similar way to COMPUTECOST are:

  • C_STOR: It is the cost associated with operating the hardware infrastructure that provisions each virtual disk GB.
  • C_STORHW: It is the cost associated with purchasing the hardware infrastructure required to provide each virtual disk GB.

C_STOR is based on the existing operating costs for running the storage infrastructure (storage rack unit hosting plus STORLIC) and is calculated proportionally to the provisioned disk capacity:

C_STOR = (C_RU * RU_STOR + STORLIC) / TOTALSTOR
C_STORHW is the cost of investment for each storage GB over the infrastructure lifecycle period:

C_STORHW = (STORINFRA / TOTALSTOR / AMORT) * (1 + INTRST)
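And a matching sketch for DISKCOST, once more with made-up input figures:

```python
# Sketch of DISKCOST per virtual disk GB; all figures are illustrative.
def diskcost(epc, ru, ru_stor, storlic, storinfra, totalstor, amort, intrst):
    c_ru = epc / ru                                            # hosting cost of one rack unit
    c_stor = (c_ru * ru_stor + storlic) / totalstor            # operating cost per disk GB
    c_storhw = (storinfra / totalstor / amort) * (1 + intrst)  # amortized hardware per GB
    return c_stor + c_storhw

# Example: 30 storage RUs, 50 TB provisioned, 48-month amortization
print(round(diskcost(epc=20000, ru=400, ru_stor=30, storlic=3000,
                     storinfra=200000, totalstor=50000, amort=48, intrst=0.005), 5))
```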

One can elaborate on this model and add all sorts of costs and parameters; however, from our experience, this model is quite accurate for solving an IaaS financial exercise. All you need are simple datacenter metrics and easily obtained costs.


Cloud infrastructure Economics: Cogs and operating costs

Perhaps the most important benefit of adopting cloud services (either from a public provider or internally from your organization) is that their cost can be quantified and attributed to organizational entities. If a cloud service cannot be metered and measured, then it should not be called a cloud service, right?

So, whenever you need to purchase a cloud service or when you are called to develop one, you are presented with a service catalog and assorted pricelists, from where you can budget, plan and compare services. Understanding how the pricing has been formulated is not part of your business since you are on the consumer side. However, you should care: You need to get what you pay for. There must be a very good reason for a very expensive or a very cheap cloud service.

In the past, we have developed a few cloud services utilizing our own resources and third party services. Each and every time, determining whether launching the service commercially would be a sound practice depended on three factors:

  • Would customers pay for the service? If yes, at what price?
  • If a similar service already was on the market, where would our competitors stand?
  • What is the operating cost of the service?

Answering the first two questions is straightforward: Visit a good number of trusted and loyal customers, talk to them, investigate competition. That’s a marketing and sales mini project. But answering the last question can be a hard thing to do.

Let us share some insight on the operating costs and cost-of-goods for a cloud service and in particular, infrastructure as a service (IaaS). Whether you already run IaaS for your organization or your customers, you are in one of the following states:

  1. Planning to launch IaaS
  2. Already running your datacenter

State (1) is where you have not yet invested anything. You need to work on implementation and operational scenarios (build or buy? Hire or rent?) and do a good part of marketing plans. State (2) is where you have already invested, you have people, processes and technology in place and are delivering services to your internal or external customers. In state (1) you need to develop a cost model, in state (2) you need to calculate your costs and discover your real operating cost.

In both cases, the first thing you need to do before you move on with cost calculation is to guesstimate (state 1) or calculate (state 2) the footprint of your investment and delivered services. From our experience, the following parameters are what you should absolutely take into account in order to properly find out how much your IaaS really costs.

Financial parameters (real money)

  • EPC: Electrical power and hosting cost. How much do (or would) you pay for electricity and hosting. This can be found from your electricity bill, your datacenter provider monthly invoice or from your financial controller (just make sure you ask them the right questions, unless you want to get billed with the entire company overhead costs). EPC is proportional to your infrastructure footprint (ie number of cabinets and hardware).
  • DCOPS: Payroll for the operations team. You need to calculate the total human resource costs here for the team that will operate IaaS services. You may include here also marketing & sales overhead costs.
  • CALCLIC: Software licensing and support costs for IaaS entire computing infrastructure layer. These are software costs associated with the infrastructure (eg, hypervisor licenses), not license costs for delivered services, eg Microsoft SPLA costs.
  • STORLIC: Software licensing and support costs for your entire storage infrastructure. Include here in their entirety also data backup software costs.
  • SERVER: Cost of a single computing server. It’s good to standardize on a particular server model (eg 2-way or 4-way, rackmount or blade). Here you should include the cost of a computing server, complete with processors but without RAM. RAM to CPU ratio is a resource that is adjusted according to your expected workloads and plays a substantial role in cost calculation. If you plan to use blade servers, you should factor here the blade chassis as well.
  • MEMORY: Average cost of 1 GB of RAM.
  • STORINFRA: Cost of your storage infrastructure, as is, or the storage infrastructure you plan to purchase. Storage costs are not that easy to calculate as a factor of 1 disk GB units, since you have to take into account SAN, backup infrastructure, array controllers, disk enclosures and single disks. Of course we assume you utilize a centralized storage infrastructure, pooled to your entire computing farm.
  • NETINFRA: Cost of data network. As above, include here datacenter LAN, load balancers, routers, even cabling.
  • NETSUPP: Cost of network support (monthly). Include here software licensing, antivirus subscriptions and datacenter network costs.

Operational parameters (Facts and figures)

  • RU: Amount of available rack units in your datacenter. This is the RU number you can use to install equipment (protected with UPS, with dual power feeds etc).
  • RU_STOR: Rack units occupied by storage systems
  • RU_CALC: Rack units occupied by computing infrastructure (hypervisors)
  • RU_NET: Rack units occupied by network infrastructure
  • SRV: Virtual machines (already running or how many you plan to have within the next quarter)
  • INTRST: Interest rate (cost of money): Monthly interest rate of credit lines/business loans
  • TOTALMEM: Total amount of virtual memory your SRV occupy
  • TOTALSTOR: Total amount of virtual storage your SRV occupy
  • SRVRAM: Amount of physical memory for each physical server. This is the amount of RAM you install in each computing server. It is one of the most important factors, since it depends on your average workload. A rule of thumb is that for generic workloads, a hardware CPU thread can sustain up to 6 virtual computing cores (vcpu). For each vcpu, you need 4 GB of virtual RAM. So, for a 2-socket, 6-core server you need 2 (sockets) x 6 (cores) x 6 (vcpu) x 4 (GB RAM) = 288 GB RAM. For a 4-way, 8-core server beast with memory intensive workloads (say 8 GB per vcpu) you need 4 x 8 x 6 x 8 = 1536 GB RAM (1.5 TB).
  • MEMOVERPROV: Memory overprovisioning for virtual workloads. A factor that needs tuning from experience. If you plan conservatively, use a 1:1 overprovisioning factor (1 GB of physical RAM to 1 GB of virtual RAM). If you are more confident and plan to save costs, you can calculate an overprovisioning factor of up to 1.3. Do this if you trust your hypervisor technology and have homogenous workloads on your servers (for example, all-Windows ecosystem) so that your hypervisor can take advantage of copy-on-write algorithms and save physical memory.
  • AMORT: Amortization of your infrastructure. This is a logistics & accounting term, but here we mainly use this to calculate the lifespan of our infrastructure. It is expressed in months. A good value is 36 to 60 months (3 to 5 years), depending on your hardware warranty and support terms from your vendor.
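The SRVRAM rule of thumb above boils down to one multiplication. Here is a quick sketch that reproduces the two sizing examples (the function name and default values are mine, following the text's rule of thumb):

```python
# Rule-of-thumb server RAM sizing: up to 6 vCPUs per hardware CPU thread,
# times the GB of virtual RAM assigned per vCPU (4 GB for generic workloads).
def srvram_gb(sockets, cores_per_socket, vcpu_per_core=6, gb_per_vcpu=4):
    return sockets * cores_per_socket * vcpu_per_core * gb_per_vcpu

print(srvram_gb(2, 6))                 # 2-socket, 6-core, generic workloads: 288 GB
print(srvram_gb(4, 8, gb_per_vcpu=8))  # 4-way, 8-core, memory-intensive: 1536 GB
```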

If you can figure out the above factors, you can proceed with calculating your operating IaaS costs. Keep reading here!

A quick tour of Cloudstack

Cloud.com, now a part of Citrix, has developed a neat, compact, yet powerful platform for cloud management: enter Cloudstack, a provisioning, management and automation platform for KVM, Xen and VMware, already trusted for private and public cloud management by companies like Zynga (got Farmville?), Tata Communications (public IaaS) and KT (a major Korean service provider).

Recently I had the chance to give cloudstack a spin in a small lab installation with one NFS repository and two Xenservers. Interested in how it breathes and hums? Read on, then.

Cloudstack was installed in a little VM in our production vSphere environment. Although it does support vSphere 4.1, we decided to try it with Xen and keep it off the production ESX servers. Installation was completed in 5 minutes (including the provisioning of the Ubuntu 10.04 server from a ready VMware template) and Cloudstack came to life, waiting for us to log in:

The entire interface is AJAX – no local client. In fact, cloudstack can be deployed in a really small scale (a standalone server) or in a full-blown fashion, with redundant application and database servers to fulfill scalability and availability policies.

Configuring cloudstack is a somewhat more lengthy process and requires reading the admin guide. We decided to follow the simple networking paradigm, without VLANs and use NFS storage for simplicity. Then, it was time to define zones, pods and clusters, primary and secondary storage. In a nutshell:

  • A zone is a datacenter. A zone has a distinct secondary storage, used to store boot ISO images and preconfigured virtual machine templates.
  • A pod is servers and storage inside a zone, sharing the same network segments
  • A cluster is a group of servers with identical CPUs (to allow VM migration) inside a pod. Clusters share the same primary storage.

We created a single zone (test zone) with one pod and two clusters, each cluster consisting of a single PC (one CPU, 8 GB RAM) running Xenserver 5.6. Configuring two clusters was mandatory, since the two Xenservers were of different architectures (Core 2 and Xeon). After the configuration was finished, logging in to Cloudstack as administrator brings us to the dashboard.

In a neat window, the datacenter status is shown in clear, with events and status in the same frame. From here an administrator has full power over the entire deployment. This is a host (processing node in Openstack terms) view:

You can see the zone hierarchy in the left pane and the virtual machines (instances) running on the host shown in the pane on the right.

Pretty much, what an administrator can do is more or less what Xencenter and vCenter offer: create networks and virtual machine templates, configure hosts and so on. Let's see what the Cloudstack templates look like:

Cloudstack comes with some sample templates and internal system virtual machine templates. These are used internally, but more on them later. The administrator is free to upload templates for all three hypervisor clans (KVM, Xen and vCenter): qemu images for KVM, .ova for VMware and VHD for Xenserver. We created one Windows 2008 server template quite easily, by creating a new VM in Xencenter, installing Xentools and then uploading the VHD file in Cloudstack:

As soon as the VHD upload is finished, it is stored internally in the Zone secondary storage area and is ready to be used by users (or customers).

How does Cloudstack look from the user/customer side? We created a customer account (Innova) and delegated access to our test zone:

Customers (depending on their wallet…) have access to one or more pods and can create virtual machines freely, either from templates or from ISO boot images they have access to, without bringing Cloudstack administrators into the loop. Creating a new virtual machine (instance) is done through a wizard. First, select your favorite template:

Then, select a service offering from preconfigured sizes (looks similar to EC2?)

Then, select a virtual disk. A template comes with its own disk (in our case the VHD we uploaded earlier), but you can add more disks to your instances. This can also be done after the instance is deployed.

…and after configuring the network (step 4), you are good to go:

The template will be cloned to your new instance, boot up, and from this point on, you can log in through the web browser – no RDP or VNC client needed!

It looks like magic: all this could not possibly be done by a plain app server, right? Correct. Cloudstack silently and automagically deploys its own system VMs that take care of template deployment to computing nodes and storage. Three special kinds of VMs are used:

  • Console proxies that relay to a web browser VNC, KVM console or RDP sessions of instances. One console proxy runs in every zone.
  • Secondary storage VM, that takes care of template provisioning
  • Virtual router, one for every domain (that is, customer), which supplies instances with DNS services, DHCP addressing and firewalling.

Through the virtual router, users can add custom firewall rules, like this:

All these system virtual machines are managed directly from Cloudstack. Login is not permitted and they are restarted upon failure. This was demonstrated during an unexpected Xenserver crash, which brought down the zone secondary storage VM. After the Xenserver was booted up, the secondary storage VM was restarted automatically by Cloudstack and relevant messages showed up in the dashboard. Cool, huh?

Customers have full power over their instances, for example, they can directly interact with virtual disks (volumes), including creating snapshots:

In all, our little Cloudstack deployment really impressed us. The platform is very solid, all advertised features do work (VM provisioning, management, user creation and delegation, templates, ISO booting, VM consoles, networking) and the required resources are literally peanuts: it is open source and all you need are L2 switches (if you go with basic networking), servers and some NFS storage. Service providers investigating options for their production IaaS platform should definitely look into Cloudstack, which has been a part of Citrix since July 2011.

Of supermarkets and clouds

OK, no more cloud computing definitions for me. I’ve found the perfect metaphor to explain what cloud computing is: The supermarket.

Probably you don’t remember how your parents (or grandparents) did their shopping in ye olde days, when supermarkets did not exist. Well, I can still remember my grandmother; she took her shopping bag and went to the butcher around the corner, the fish market downtown, the grocery store across the street and so on. It was fun; each shop had its own smell, arrangement, window and a different face behind the bench. The whole process took hours but it sure was a pleasant thing to do. And you had to do that over and over again, at least 2-3 times a week.

Now, my grandmother has passed away and all these little shops are long gone. Behold the supermarket. Drive, park, grab a cart, cross all the aisles, fill the cart, push across the tellers, pay, load car, drive away, talk to nobody. You’re done in one hour tops. And you’ve got to do that only once per week (depending on the mouths you have to feed…)

What does this have to do with cloud computing? Think about it:

  • Cloud computing is about infrastructure uniformity. Like a supermarket, you have abundance of a limited number of the latest choices: Storage is massive, yet in two or three flavors (FC, NFS, iSCSI). Servers are Intel/AMD only with the same CPU stepping. Software stacks are canned – and everything must be kept at the same current revision, otherwise things will start breaking off. In contrast, a cluster of “legacy” HP superdomes or Sun E-series boxes, complete with their own SAN, backup TAN and a team of humans to manage them smells and feels like that old local shop around the corner: It has a little bit of everything. Complex, disparate, old software stacks. Dedicated storage. Cluster-specific network interconnects. Cryptic hardware. Exotic chips. Loyal admins. Human interaction. Everything.
  • Cloud computing is about making things easier. Service provisioning is a few clicks away. Hardware provisioning does not exist; everything is racked, cabled and powered once. System reconfiguration is almost automatic. In a legacy environment (well, in a non-cloud IT shop) trips to the computer room are frequent, CD/DVD swapping does happen, system provisioning is still a ritual ceremony, installing firmware, operating systems, service packs, patches and applications. Just like paying a visit to the grocery shop, then the bakery and the butcher, carrying those heavy shopping bags. Now, think how shopping is done in a supermarket and you get the picture.
  •  Supermarkets are big, neighborhood shops are small. Big size means cheap prices and countless shelves with goods. The same applies to cloud computing: Clouds are efficient in XXL sizes; that’s why cloud provider datacenters are massive. The downside? In a supermarket you can buy only what’s on the shelf and pay what the pricetag says. Unless you buy tons of stuff, you cannot ask the management to bring in a new product at a better price. In a small shop, if the owner knows your grandmother, well, you can ask for extra candies.
  • There is a supermarket in every town, meaning you can find your preferred brand of coffee (as long as it’s on the shelf) all over the country. If your local supermarket is blown to bits by a giant spider/tsunami/alien, drive to the next town. Cloud metaphor: More or less, all cloud service providers have redundant datacenters and data replication across them, so whenever a network outage or a natural disaster strikes, it’s likely that your services will survive.

Of datacenter transformation and clouds

Looking back a few years, when virtualization was still “under evaluation” and Facebook a fancy new thing, the term “cloud” did not exist. It’s not a coincidence that we started talking about cloud computing no sooner than Internet industries (Salesforce, Google, Amazon et al) reached the critical mass to offer services attractive to enterprise IT, and at the same time, the enterprise IT emerged as a business enabler.

What happened? Software and infrastructure as a service, a game well understood by cloud providers, came within the reach of enterprise datacenters. Private clouds now are a reality, a way to provide services to internal customers of an organization on demand, swiftly, on a consolidated multitenant architecture. For an IT worker, the change of the landscape is dramatic. The datacenter transformation from the established server-OS-application stack to the mesh of the cloud (server-hypervisor-shared storage-virtual networking-virtual OS-dynamic load balancing-automatic scaling-resource metering-automation-application server-AJAX stacks) is so immense that it’s very hard to keep up and sometimes, not understood at all.

Let’s spend 168 seconds and see what a private cloud stack would look like:

  • Uniform servers, which in some cases have dual or triple role (computing, storage, networking) with lots of RAM, CPU cores and network ports
  • The network is a massive switching core, tuned for trouble-free automation and constant reconfiguration, using the same fabric for both data and storage traffic (check out OpenFlow)
  • Massive storage with lots of spindles to cope with mostly write traffic, integrated snapshots and data replication
  • Two or more hypervisor clans (XEN, VMware, HyperV, KVM), each with its own management and provisioning stack
  • A variety of virtual machine templates, all flavors of Linux and Windows, from desktop (VDI) to server
  • Metering of resource consumption and billing
  • Service catalog to end consumers, provisioning workflows
  • Licensing made quite complex (see this and this)
  • Lots of the usual enterprise stacks (eg Citrix, Oracle, Websphere) virtualized and highly available from the hypervisor layer
  • Virtual backup solutions (like Veeam)
  • And an automation stack to rule them all (see this)

Most of these require skills beyond those of an average IT engineer. Storage, networking and virtualization skills are mandatory in order to understand and handle this stack. But that's what a cloud is: more than the sum of its parts.

Ironically, integrators that claim to offer “datacenter transformation” services in essence mean cloud computing, but have failed to realize this fact. They do not grasp the full picture and tend to offer point solutions (a hypervisor here, a smarter backup solution there, and shiny powerful servers). The sooner system integrators realize this, the better for them and their customers.

Virtualization and licensing, a Monty Python story: IBM & Oracle

Following up from my last post… Let’s take a look at how Oracle and IBM deal with virtual environments.

What do they have in common? Well, the weight of a CPU core. How much of a CPU is a CPU core? Oracle’s latest manifest is here. In a nutshell, a CPU core counts as much as Oracle’s licensing scheme dictates. For instance, in a dual-core Xeon, a core reasonably counts as “half” a CPU, so 0.5 + 0.5 (cores) = 1 processor. That’s reasonable. Yet in a 4-core CPU, a core is not 0.25 of a processor – it is still 0.5, so a Xeon 5504 equals 0.5 + 0.5 + 0.5 + 0.5 = 2 processors according to Oracle. Therefore, for Oracle EE licensed per CPU on a dual-socket, 6-core server you need to buy 2 X 6 X 0.5 = 6 CPU licenses. Other CPU variants have different core weights, ranging from 0.25 to 1.
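Oracle's core-weight arithmetic is easy to script. Here is a sketch reproducing the examples above; the 0.5 core factor is the one quoted in the text, so verify against Oracle's current core factor table before relying on it:

```python
import math

# Oracle per-processor license count: physical cores times the per-core
# weight, rounded up. The 0.5 factor applies to the Xeons discussed above;
# other CPUs carry factors from 0.25 to 1.
def oracle_processor_licenses(sockets, cores_per_socket, core_factor):
    return math.ceil(sockets * cores_per_socket * core_factor)

print(oracle_processor_licenses(2, 6, 0.5))  # dual-socket, 6-core server: 6 licenses
print(oracle_processor_licenses(1, 4, 0.5))  # 4-core Xeon 5504: 2 "processors"
```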

What about virtual systems? Study this document. Oracle has additional definitions on what server partitioning is. Simply put, if a system has the ability to be broken into smaller computers, each with its own CPU and memory resources, and this configuration is done “on the iron” (hardware assisted via firmware code), that’s hard partitioning. Examples are IBM LPARs (PPC), LDOMs (SPARC), and vPars & nPars (HP). Quite strangely, Oracle also counts Solaris Containers as hard partitions, although they are implemented entirely inside the Solaris OS without hardware assistance. Now, if you run Oracle inside an OS sitting on a hardware partition, you can license it by the number of CPUs included in the partition. For example, if you have a 4-socket, 6-core Intel server running Solaris (that is, 4 x 6 x 0.5 = 12 “processors” for licensing purposes) but prefer to run Oracle in a Solaris container capped at only two processing cores, then you need just 2 x 0.5 = 1 processor license.

And now the $30000 question: What about a virtualized fabric like vSphere, XEN, KVM and HyperV? Oops. They are not hard partitions. They are soft partitions. Oracle does not care for soft partitions; you have to license the entire physical server. This means that if you prefer to run Oracle inside a VM with 2 virtual CPUs running in an ESX server like the one described above, you have to purchase 12 licenses. And, if you allow this VM to float to another similar ESX server, you have to purchase another 12 processor licenses.

Bad: If you have big multicore boxes for your virtual hosts and need to run only a few VMs with Oracle DBMS, do not virtualize.

Good: If you run tons of small Oracle DBMSs and expect this number to grow, go virtual after you license all virtual hosts that Oracle VMs are allowed to run on. Then you do not need to care about licensing no matter how many DBMSs you use.

What about IBM? Well, being a manufacturer of highly sophisticated, out-of-this-world, next generation computer gear, IBM has chosen to implement an equally sophisticated, out-of-this-world, next generation CPU licensing scheme, here. IBM goes beyond the obsolete concept of CPU core and socket – they define the Processor Value Unit. Just click the link to see for yourselves.

What about virtualization? No worries, here is a full guide, consisting of detailed scenario analyses (in PDF format, downloadable via FTP…). I won’t go into much detail, after all, posts on this blog are meant to be read in 168 seconds, but here’s an example with VMware:

First of all, a VMware vCPU is mapped to a processing core of the ESX server underneath. Then, you must find the corresponding PVU rating for that core, add up all the vCPU->PVU values and voila, you have the PVUs of your virtual machine. Let’s therefore assume that you have 3 VMs running IBM software with a total of 12 vCPUs. Normally, you would pay for 12 cores’ worth of PVUs. But if the ESX server underneath has only 8 CPU cores (real, physical, made of silicon), IBM is generous and will license you for the lesser of vCPUs vs physical CPU cores: only 8 cores. The same applies to a vSphere cluster: if your VMs float across the cluster and their total PVUs exceed the PVUs of the cluster’s physical CPU cores, you will be charged for the lesser quantity.
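The "lesser of vCPUs vs cores" rule boils down to a minimum. A sketch, assuming a flat PVU rating per core; the value 70 below is a placeholder I chose for illustration, and the real rating must come from IBM's PVU table:

```python
# Sketch of IBM's sub-capacity rule as described above: license the lesser of
# the summed vCPUs and the physical cores, times the per-core PVU rating.
# pvu_per_core=70 is an illustrative placeholder, not an official rating.
def ibm_pvus(vm_vcpus, physical_cores, pvu_per_core=70):
    licensed_cores = min(sum(vm_vcpus), physical_cores)
    return licensed_cores * pvu_per_core

print(ibm_pvus([4, 4, 4], physical_cores=8))  # 12 vCPUs on 8 real cores: 560 PVUs
```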

Virtualization and software licensing, a true Monty Python story: The case of Microsoft

I’ve watched lots of presentations and workshops on virtualization and I’ve done a few myself to customers. Quite naturally, all focus on how easy and magical it is to take your real servers, made of metal and plastic, and magically turn them into software bits and pieces, untouchable and pure, running in the Matrix. But few, very few dare to unfold the horrible truth about what happens to your software licenses as soon as you virtualize commercial software.

Straight to the point: we assume that you have a Windows server shop and dare to go virtual. Every system runs some sort of Windows Server (Standard & Enterprise) and your applications are Oracle databases, MS SQL Server, Exchange Server and some IBM Websphere application servers. Windows is licensed by the server; Oracle, WebSphere & SQL Server by CPU cores or sockets; and Exchange by the server. Be prepared for a cosmic effect on your software licenses.

Let’s begin with Microsoft. Luckily, MS has a sort of guide here on how virtualization affects licensing – make sure you read the accompanying Word documents (if you can take it). First, you have to know that Microsoft allows Windows VMs loaded with MS applications to float from server to server, as long as they are in the same “server farm”. What is a server farm? Well: up to two datacenters connected via network, no more than 4 timezones away.

Oh, kindly note that we refer only to Microsoft Volume Licensing, not OEM or FPP (Full Packaged Products). They don’t apply. You have been warned.

Now, how are Windows servers licensed under a virtual fabric (in the same “server farm”, so to speak)? If you believe that a properly licensed Windows 2008 R2 physical server that was sucked into the virtual fabric is allowed to run as a VM and hop from ESX to ESX, you are wrong. It’s not allowed, unless it is the sole Windows Server Standard Edition instance running on your ESX. If it was an Enterprise Edition, well, you can run up to four instances on that ESX. What is the solution??? Go ahead and buy Windows Server Datacenter Edition (licensed per CPU) and assign one license to each and every ESX/XEN/KVM host you have. Only then can you run as many Windows Server VMs as you wish on your entire server farm.

What about Microsoft suites like Exchange, Sharepoint and SQL Server? SQL Server is now licensed per virtual processor – that’s vCPU – meaning that if you have a two-socket, 4-core-per-CPU ESX/XEN/KVM server and two Windows/SQL Server Enterprise VMs with four vCPUs assigned to each VM, you need 2 X 4 = 8 processor licenses, regardless of the physical system having only two processors. The good thing is that your Windows/SQL Server VM is allowed to hop from server to server. For Sharepoint, Exchange etc, a plain old server license is sufficient for Microsoft to allow you to play.
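The SQL Server arithmetic above, as a trivial sketch (function name is mine):

```python
# SQL Server licensed per virtual processor: sum the vCPUs of the SQL VMs;
# the physical socket/core count of the host does not matter.
def sql_processor_licenses(vm_vcpus):
    return sum(vm_vcpus)

print(sql_processor_licenses([4, 4]))  # two 4-vCPU SQL VMs: 8 processor licenses
```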

I won’t calculate relevant costs, this is left as an exercise to the reader (Hint: For an initial P2V migration of 4 to 1, costs only for Windows licenses can rise 6-fold, however, a properly licensed virtual fabric can run an unlimited number of Windows VMs). I would advise you to contact your Microsoft TAM to clarify the details; we have only scratched the surface. VDI licenses and desktop OSs are another story.