Want to do cloud? You need to sell it, stupid

From a technology standpoint, cloud computing is the convergence of many beautiful things: think SOA, governance, virtualization, the Web, data management and process automation, all running inside a lean machine that makes the delivery of computing as easy as shopping in a mall.

However, building a cloud is one thing and making money out of it is another story. It’s not only about the infrastructure; it’s a lot more.

A successful and complete cloud stack has lots of moving parts, and most of them are software, not servers, storage and switches. It’s about your network provider and datacenter SLAs, your ISO 20000/27001 compliance, a decent chargeback and billing system, the applications or vApp templates you build, and a lot more. It’s complex, it involves many stakeholders and, at the end of the day, it’s not cheap to build. It resembles the eTOM model with a twist of ITIL. But if you manage to sell it, it’s a money making machine – either by slashing operating costs by more than half or by generating revenue from your customers.

Now, who are you and to whom should you sell?

  • You are an enterprise or a large organization: If you have fewer than a hundred internal users, forget it; otherwise, build your own private cloud. The benefits of knowing who owns an IT resource, why, and for how long, and of charging that resource back to the end user, are too good to ignore in favor of a break-and-fix IT mentality.
  • You are a managed service provider/systems integrator: Building a public cloud is something you can do well; you know the technology and have the building blocks. But to whom would you sell it? Most likely, you are already consuming private cloud services internally; finding external customers for public cloud services is easy, right? Wrong. Your business organization is set up to deliver vertical, specialized services, but cannot do volume selling: Successful public clouds are built on customer volume, and this is something you do not have (and do not know how to build up). What you can do is go to your customers and sell your existing services portfolio in a cloudified form: Disaster recovery as a service and secure workload bursting are services you can deliver and successfully market.
  • You are in the telco/service provider/web hosting business: Excellent, you know pretty well how to deal with lots and lots of customers. You know very well how to sell services in neat packaging. What you don’t know is how to build cloud services – this is the reason that Verizon, CenturyLink and others long ago swallowed datacenter builders like Terremark and Savvis. However, public clouds fit very well into a service provider business model: Customer volume is there, charging/billing/provisioning are there, governance and compliance are there, datacenters and networking are in place. The technology stack bits and pieces are missing, but they are easy to shop for. What are the speed bumps? Telco strategists have known for decades how to market broadband, data pipes and voice. What they do not understand 100% is how to sell software, and cloud computing is, well, software.

Usage metering and charging with Cloudstack

One of the prominent features of an IaaS cloud is that you can meter its resource usage by its consumers. Metrics are everywhere: from the hypervisor come virtual disk sizes, network I/O, occupied IP addresses, virtual CPUs and RAM; they are all over the place, waiting to be collected. As soon as you can grab a handful of metrics, you can implement chargeback policies and report back to your users on their resource consumption or, if you run a public IaaS shop, somehow transform these metrics into invoices.

Cloud.com’s cloudstack comes with an excellent usage server, recording metrics directly from its accounts. During installation, simply select the “Install Usage Server” option, perform some basic configuration, and you are all set to go. The usage server collects no fewer than thirteen (as of cloudstack release 2.2) metrics, which can be found here. In short, some of the most important ones are:

  • RUNNING_VM: Total hours a virtual machine is started and running on the hypervisor
  • ALLOCATED_VM: Total hours a VM exists (no matter if it’s up or down). Useful parameter for charging OS license usage, for example Microsoft SPLA licenses.
  • IP_ADDRESS: Self-evident; applies to public (Internet) IP addresses consumed by a cloudstack account. These addresses are (per the cloudstack architecture) attached to the user’s virtual router.
  • NETWORK_BYTES_SENT and NETWORK_BYTES_RECEIVED: Traffic passing through the virtual router of a user
  • VOLUME: Size in bytes of user volumes
  • TEMPLATE and ISO: Size in bytes of user-uploaded VM templates and ISO images
(For those who are not familiar with cloudstack’s architecture: cloudstack users are part of accounts. Virtual machines belonging to a single account live in their own private VLAN, totally isolated from other accounts. Access to the Internet, DHCP addressing, DNS and VPN termination all take place in a special cloudstack virtual machine, the virtual router. Every account has its own virtual router, not directly controlled by the end user but managed via the cloudstack API.)

The service (“cloud-usage”) starts along with the rest of the cloud services on your cloudstack controller, and its configuration variables live in the global parameters of cloudstack. The most important are usage.stats.job.aggregation.range and usage.stats.job.exec.time. The first controls the aggregation interval (in minutes) of collected metrics and the second the time the aggregation algorithm kicks in. Remember to restart the usage server service (“cloud-usage”) every time you play with these variables.

All metrics are stored in a second database, called “cloud_usage”. To see whether your usage server really works, connect to that database and check that its tables are starting to fill (all metrics tables start with “usage_*”). Data can be retrieved straight from the database; however, a more elegant way is to use the cloudstack API. The most useful API calls are:

  • listUsageRecords: Takes as arguments an account plus start and end dates, and returns usage records for the specified time interval.
  • generateUsageRecords: Starts the aggregation process asynchronously

Accessing the API is a breeze: Generate the secret and API keys from the console and pass them as arguments to a python script or a simple wget, targeting the API port (8080 for plain http, or a designated SSL port).
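
As a rough illustration, here is a minimal Python sketch of a signed listUsageRecords call. The controller URL, keys, account name and dates are placeholders to substitute with your own; the signing scheme (HMAC-SHA1 over the sorted, lower-cased query string) follows the cloudstack API documentation.

    import base64
    import hashlib
    import hmac
    import json
    import urllib.parse
    import urllib.request

    API_URL = "http://cloudstack.example.com:8080/client/api"  # placeholder controller
    API_KEY = "your-api-key"        # generated from the cloudstack console
    SECRET_KEY = "your-secret-key"  # generated from the cloudstack console

    def signed_request(command, **params):
        """Build, sign and execute a cloudstack API call; return parsed JSON."""
        params.update({"command": command, "apikey": API_KEY, "response": "json"})
        # Sort the parameters and URL-encode the values...
        query = "&".join(
            f"{k}={urllib.parse.quote(str(v), safe='')}"
            for k, v in sorted(params.items())
        )
        # ...then sign the lower-cased query string with the secret key.
        digest = hmac.new(SECRET_KEY.encode(), query.lower().encode(), hashlib.sha1).digest()
        signature = urllib.parse.quote(base64.b64encode(digest).decode(), safe="")
        with urllib.request.urlopen(f"{API_URL}?{query}&signature={signature}") as resp:
            return json.load(resp)

    # Usage records for one account over a billing period (all values are placeholders).
    records = signed_request(
        "listUsageRecords",
        account="customer1",
        startdate="2012-01-01",
        enddate="2012-01-31",
    )
    print(records)

The same helper can kick off aggregation on demand by calling generateUsageRecords in exactly the same fashion.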

So, what do you do with all these collected metrics? Well, there are two ways to deal with them. The first is to write a few (complex) scripts that collect the metrics from the API, sanitize them, implement your billing scheme and export the results to your reporting tool or ERP to generate invoices; a toy example follows.
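
To make the billing idea concrete, here is a toy rating pass over some of the usage types listed above. The prices and the record format are invented for illustration; a real script would feed on the JSON returned by listUsageRecords.

    # Invented prices in cents per hour; substitute your own rate card.
    RATES = {
        "RUNNING_VM": 6,    # cents per hour a VM is up and running
        "ALLOCATED_VM": 2,  # cents per hour a VM merely exists (e.g. SPLA licensing)
        "IP_ADDRESS": 1,    # cents per hour a public IP is held
    }

    def rate_records(records):
        """Aggregate (usage_type, hours) pairs into a charge per usage type."""
        bill = {}
        for usage_type, hours in records:
            if usage_type in RATES:
                bill[usage_type] = bill.get(usage_type, 0) + hours * RATES[usage_type]
        return bill

    # One VM ran 720 hours, existed for 744, and held a public IP all month.
    sample = [("RUNNING_VM", 720), ("ALLOCATED_VM", 744), ("IP_ADDRESS", 744)]
    print(rate_records(sample))
    # {'RUNNING_VM': 4320, 'ALLOCATED_VM': 1488, 'IP_ADDRESS': 744}  (in cents)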

The second is to use an off-the-shelf charging/billing solution. As of January 2012, Amysta has a product in beta and Ubersmith offers complete cloudstack billing in its product, Ubersmith DE.

Bring out your (virtual) dead

Virtualization is cool, especially for someone who was used to racking, cabling, imaging, installing apps, adding memory, swapping disks, powering servers on and off, and the rest. Jumping to vCenter (or XenCenter or virt-manager) is like scuttling your sailboat and getting aboard a hovercraft. We all love it!

Yet, as soon as you pass the thirty-to-forty virtual machine mark, something happens. Quite naturally, your virtual infrastructure goes yellow, then orange, then red. Memory utilization is the first to suffer, then your thin-provisioned disk array starts to fill up… Days before your virtual farm shuts down or refuses to deploy new virtual machines, you realize that lots of virtual servers are useless – and orphaned: You don’t know why they are there.

This is called virtual sprawl. You have lots of “virtual assets” taking up CPU, memory, IP addresses and disk blocks for nothing; they just sit there. Worse, you don’t touch them because they are not yours. How do you clean up?

Oh, he is dead - No, I am not - Yes, you are

How do you control who has the right to create a VM, how long it is used and when it is time to let it go? Virtual infrastructure management tools usually leave the VM lifecycle management aside: What they do best is managing the hypervisors, not your IT policies and business procedures. Options?

The answer here is “private cloud management”. What is required is an automation layer on top of your hypervisors and virtual centers, which implements business procedures, capacity planning, virtual lifecycle management and end user self-service portals. A nifty tool is Embotics vCommander, sitting atop your VMware infrastructure and letting your internal customers order a VM, have it approved by their supervisor, be charged for its use and specify a decommissioning date. In addition, it can do capacity planning and let you know when you will run out of infrastructure.

Another option is an enterprise private/public cloud platform. Abiquo is one of them, able to span all hypervisors (Hyper-V, VMware, Xen, Oracle VirtualBox and KVM) with fine-grained user control and resource tiering.

Whatever you choose, it is good practice to hold your virtual users accountable for the virtual assets they own and use. Whether it’s a note on the virtual machine summary in vCenter or a full private cloud controller, it is a practice that will pay off in the mid term.

SaaS at your fingertips

Have you ever wondered what it takes to develop an SaaS app? What kind of developer resources are needed? Infrastructure? Tools? Platforms?

Let’s say you want to build a small app that shows, on the fly, the value of a stock and its market cap. You need a place to host it, a developer to write the code, and a credible feed of stock price info. Sounds a bit (but not too) complex.

Or, do it in 30 seconds, provided you know what a spreadsheet is… Try this.

(Hint: It’s a Google Docs spreadsheet, hosted somewhere in a Google datacenter, with the stock price and market cap retrieved via Google Finance.)
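
For the curious, the entire “app” boils down to a couple of spreadsheet cell formulas along these lines (the ticker is only an example; Google Docs’ GoogleFinance() function fetches the live quotes):

    =GoogleFinance("GOOG", "price")
    =GoogleFinance("GOOG", "marketcap")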

Breaking up with your cloud provider

Suppose you run an average SMB and you’re proud of having achieved a fairly low TCO by hosting all your operations on your favorite cloud provider, AcmeCloud, who do an excellent job running your virtual machines flawlessly, with 100% uptime and perfect network latency figures. All is well: you have no IT in house, and your computing geeks funnel their resources into developing new services and bringing in new customers every day. AcmeCloud does all the chores for you (operating VMs, allocating storage, snapshotting and replicating volumes, maintaining SSL certificates and network load balancing) and charges you only for what you use. Until that day…

It started with a 2-hour outage, attributed to a lightning strike. Then, two months later, at the peak of the holiday season, you are notified that you must change your SSL certificates due to an intrusion into AcmeCloud’s secure server – and tell your customers, all 2000 of them, to do the same. And a few weeks later, due to a mishap during a network upgrade, your data volumes suddenly become unresponsive for a whole day, causing data loss and forcing a rollback to 2-day-old snapshots. Time to find a new home for your data services.

As easy as it is to start using cloud services, it is equally difficult to change to a new cloud provider. Downloading VM images and uploading them to another provider over the Internet is expensive – you have to utilize high bandwidth for a long time. Exporting data sets from an SaaS data repository and importing them into another is even more difficult, since you may have to adjust data schemas and types. Hoping that all will go smoothly and you will be done in a few days is about as likely as finding a pink elephant grazing in your back yard.

In the traditional, old-fashioned IT world we all love to hate, where you keep your servers in your little computer room and back them up to tapes and NFS volumes, the story described above amounts to a disaster recovery event. Something breaks beyond repair and you have to rush to find new servers and an Internet uplink, then restore everything – well, everything that’s possible – to the new systems and bring services back. This has happened once or twice to IT veterans and it’s quite painful to recall.

One could argue that a disaster recovery site and a BC plan would be the solution, but how many average SMBs do you know that can afford a second datacenter? Very few. So what would be the analogy of a disaster recovery site in the cloudified enterprise? Simply, a second cloud provider. Let’s take some time and weigh the pros and cons of moving your SMB to the clouds using two cloud providers from day one.

The bad stuff first:

  • Costs will double: You pay two providers. That’s straightforward (really?)
  • Complexity will also double: Each provider has their own interfaces and APIs, which you have to familiarize yourself with.
  • You have to maintain data consistency from one provider to the other. If you go for an active-passive scheme, you have to transfer data between providers on a frequent schedule.
  • You have to control your DNS domain, so that you can update your domain entries when you have to switch from one provider to the other.


The good stuff:

  • Costs will not necessarily double! By utilizing both providers at the same time, you can split services between them. When either fails, their elastic cloud infrastructure lets you instantly fire up dormant VMs on the other (see the sketch after this list).
  • You have the luxury of making tactical decisions in your own time, not under time pressure. For example, you can tune your online services at your own pace by balancing them across both providers, or by preferring the one with better network latency while keeping data services on the one with cheaper storage.
  • You can plan a longer-term cloud strategy, eliminating one of the two providers and migrating to a third without losing deployed services.
  • Being forced to move data back and forth between providers will enrich your IT skills in data governance and transformation, and it forces your organization to retain control over its data lifecycle instead of delegating that function to the cloud provider.
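
What does “instantly fire up dormant VMs on the other provider” look like in practice? The detection half can be as simple as the stdlib-only Python sketch below, which probes both providers’ health endpoints (the URLs are hypothetical) and reports the first live one; real failover would also swing your DNS records over to the survivor via your registrar’s API.

    import urllib.request

    # Hypothetical health-check endpoints, one per cloud provider.
    PROVIDERS = [
        "https://app.acmecloud.example.com/health",
        "https://app.othercloud.example.com/health",
    ]

    def first_healthy(endpoints, timeout=5):
        """Return the first endpoint answering HTTP 200, or None if all are down."""
        for url in endpoints:
            try:
                with urllib.request.urlopen(url, timeout=timeout) as resp:
                    if resp.status == 200:
                        return url
            except OSError:
                continue  # unreachable or erroring; try the next provider
        return None

    active = first_healthy(PROVIDERS)
    print("Active provider:", active or "none - wake up the on-call engineer")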


Planning a cloud strategy with two cloud providers instead of one is the same pattern that cloud providers themselves utilize: Reliable services built on unreliable resources. You cannot trust 100% any cloud provider, but you can trust a service model that is built on redundant service blocks.

Adding value to SaaS

Software as a service is an entirely different animal from IaaS or PaaS. The latter two can (almost) be implemented with platforms available off the shelf and a few consultants: grab your favorite cloud automation platform (pick any: Eucalyptus, [Elastic|Open|Cloud]stack, Applogic, Abiquo, even HP SCA), throw in volume servers and storage, host it all in a reliable DC and you are good to go.

On the other hand, SaaS is something you have to:

  1. Conceive: IaaS and PaaS are self-explanatory (infrastructure and platform: virtual computing and database/application engine/analytics for rent); SaaS is… everything, from cloud storage to CRM for MDs.
  2. Implement: SaaS is not sold in shops. You have to develop code, which means finding talented and intelligent humans to write it, and keeping them with you throughout the project lifecycle.
  3. Market: Finding the right market for your SaaS is just as important as building it. SaaS is a service; services are tailored to customers and come in different sizes, colors and flavors. One SaaS to rule them all does not work.
  4. Sell: Will you go retail and address end customers directly? Advertising and social media are the road to take. Wholesale? Strike a good revenue-sharing deal with somebody who already has customers within your target group, say, a datacenter provider or a web host.
  5. Add some value to your SaaS: Cloudifying a desktop application brings little value to your SaaS product: it’s as good as running it on the desktop, and the sole added value is ubiquitous access over the web. Want some real value? Eliminate the need to do backups. Integrate with conventional desktop software. Do auto-sync. Offer break-away capability (take your app and data and host them somewhere else).
Let’s take two hypothetical examples: cloud storage and CRM for doctors.

Cloud storage is a good offering for customers seeking a secure repository, accessible from everywhere. Let’s consider two approaches. The first:
  • High end branded storage array with FC and SSD disks
  • 5-minute snapshots, continuous data protection
  • FTP and HTTP interface
  • Disk encryption
  • Secure deletion
The second approach would be:
  • WebDAV interface
  • Data retention
  • Daily replication
  • Auto sync with customer endpoints
  • Integrated content search

What’s wrong with the first approach? It is typical of the IT mindset: offering enterprise IT features, like OLTP/OLAP-capable storage, in the cloud. Potential customers? Enterprises that need high-powered data storage. Well, if you are an enterprise, most likely you’d rather keep your OLTP/OLAP workloads in house, wouldn’t you? Why bother?

The second approach offers services that are not delivered by your enterprise IT machinery. They add value to a cloud storage service and, at the end of the day, they are the kind of thing deemed too expensive or complicated to implement in house. Potential customers? Enterprises that have not implemented these services but would seriously consider renting them.

Let’s now consider a cloud CRM for doctors. What would be some value-added features for private MDs, apart from a database with patient names and appointment scheduling? I can think of a few:

  • Brief medical history of the patient, delivered to the doctor’s smartphone/pad. Can save lives.
  • List of prescribed medicines, with direct links to the medicare/manufacturer site. Patients can forget or mix up their prescribed drugs; computers never forget.
  • Videochat with the patient.
  • Patient residence on Google Maps, with directions on how to get there.

A quick tour of Cloudstack

Cloud.com, now a part of Citrix, has developed a neat, compact, yet powerful platform for cloud management: enter cloudstack, a provisioning, management and automation platform for KVM, Xen and VMware, already trusted for private and public cloud management by companies like Zynga (got Farmville?), Tata Communications (public IaaS) and KT (a major Korean service provider).

Recently I had the chance to give cloudstack a spin in a small lab installation with one NFS repository and two XenServers. Interested in how it breathes and hums? Read on, then.

Cloudstack was installed in a little VM in our production vSphere environment. Although it does support vSphere 4.1, we decided to try it with Xen and keep it off the production ESX servers. Installation was completed in 5 minutes (including the provisioning of the Ubuntu 10.04 server from a ready VMware template) and cloudstack came to life, waiting for us to log in:

The entire interface is AJAX – no local client needed. In fact, cloudstack can be deployed at a really small scale (a standalone server) or in full-blown fashion, with redundant application and database servers to fulfill scalability and availability policies.

Configuring cloudstack is a somewhat more lengthy process and requires reading the admin guide. We decided to follow the simple networking paradigm, without VLANs, and to use NFS storage for simplicity. Then it was time to define zones, pods and clusters, and primary and secondary storage. In a nutshell:

  • A zone is a datacenter. A zone has a distinct secondary storage, used to store boot ISO images and preconfigured virtual machine templates.
  • A pod is a group of servers and storage inside a zone, sharing the same network segments.
  • A cluster is a group of servers with identical CPUs (to allow VM migration) inside a pod. Clusters share the same primary storage.

We created a single zone (test zone) with one pod and two clusters, each cluster consisting of a single PC (one CPU, 8 GB RAM) running XenServer 5.6. Configuring two clusters was mandatory, since the two XenServers were of different CPU architectures (Core 2 and Xeon). After the configuration was finished, logging in to Cloudstack as administrator brings us to the dashboard.

In a neat window, the datacenter status is shown clearly, with events and status in the same frame. From here an administrator has full power over the entire deployment. This is a host (a processing node, in OpenStack terms) view:

You can see the zone hierarchy in the left pane and the virtual machines (instances) running on the host shown in the pane on the right.

Pretty much, what an administrator can do is more or less what XenCenter and vCenter offer: create networks and virtual machine templates, configure hosts and so on. Let’s see what the cloudstack templates look like:

Cloudstack comes with some sample templates and internal system virtual machine templates. The latter are used internally, but more on them later. The administrator is free to upload templates for all three hypervisor clans: qemu images for KVM, .ova files for VMware and VHDs for XenServer. We created a Windows 2008 Server template quite easily, by creating a new VM in XenCenter, installing XenTools and then uploading the VHD file to cloudstack:

As soon as the VHD upload finishes, the template is stored in the zone’s secondary storage area and is ready to be used by users (or customers).
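
If you script your deployments, the same result can (in principle) be achieved through the API. Here is a sketch reusing the signed_request helper from the usage-metering section; all ids and the URL are placeholders, and the exact parameter set may differ between cloudstack releases.

    # Register a VHD template hosted on a web server the management
    # server can reach; ids would come from listOsTypes and listZones.
    template = signed_request(
        "registerTemplate",
        name="win2008-base",
        displaytext="Windows 2008 base image",
        format="VHD",
        hypervisor="XenServer",
        ostypeid="56",
        url="http://fileserver.example.com/templates/win2008.vhd",
        zoneid="1",
    )
    print(template)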

What does cloudstack look like from the user/customer side? We created a customer account (Innova) and delegated access to our test zone:

Customers (depending on their wallet…) have access to one or more pods and can create virtual machines freely, either from templates or from ISO boot images they have access to, without bringing cloudstack administrators into the loop. Creating a new virtual machine (instance) is done through a wizard. First, select your favorite template:

Then, select a service offering from the preconfigured sizes (looks similar to EC2, doesn’t it?):

Then, select a virtual disk. A template comes with its own disk (in our case the VHD we uploaded earlier), but you can add more disks to your instances. This can also be done after the instance is deployed.

…and after configuring the network (step 4), you are good to go:

The template will be cloned to your new instance, which boots up, and from this point on you can log in through the web browser – no RDP or VNC client needed!
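
The whole wizard also collapses into a single API call. A sketch, again with the hypothetical signed_request helper from earlier; the ids are placeholders you would look up with listZones, listTemplates and listServiceOfferings.

    # Deploy an instance from a template with a given service offering.
    vm = signed_request(
        "deployVirtualMachine",
        zoneid="1",             # placeholder zone id
        templateid="203",       # placeholder template id
        serviceofferingid="7",  # placeholder offering id
    )
    print(vm)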

It’s kind of magic – doing all this from a plain app server seems impossible, right? Correct. Cloudstack silently and automagically deploys its own system VMs, which take care of template deployment to computing nodes and storage. Three special kinds of VMs are used:

  • Console proxies, which relay the VNC, KVM console or RDP sessions of instances to a web browser. One console proxy runs in every zone.
  • A secondary storage VM, which takes care of template provisioning.
  • A virtual router, one per domain (that is, per customer), which supplies instances with DNS services, DHCP addressing and firewalling.
Through the virtual router, users can add custom firewall rules, like this:
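
At the API level, such a rule might look like the sketch below, with the same hypothetical signed_request helper (the command and parameter names are taken from the cloudstack API reference, but double-check them against your release):

    # Forward public TCP port 443 on one of the account's public IPs
    # to an instance behind the virtual router (ids are placeholders).
    rule = signed_request(
        "createPortForwardingRule",
        ipaddressid="10",
        protocol="TCP",
        publicport="443",
        privateport="443",
        virtualmachineid="42",
    )
    print(rule)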

All these system virtual machines are managed directly by cloudstack. Logging in to them is not permitted, and they are restarted upon failure. This was demonstrated during an unexpected XenServer crash, which brought down the zone’s secondary storage VM. After the XenServer booted up, the secondary storage VM was restarted automatically by cloudstack and the relevant messages showed up in the dashboard. Cool, huh?

Customers have full power over their instances; for example, they can directly interact with virtual disks (volumes), including creating snapshots:

All in all, we were really impressed by our little cloudstack deployment. The platform is very solid, all the advertised features actually work (VM provisioning, management, user creation and delegation, templates, ISO booting, VM consoles, networking) and the required resources are literally peanuts: it is open source, and all you need are L2 switches (if you go with basic networking), servers and some NFS storage. Service providers investigating options for a production IaaS platform should definitely look into cloud.com’s offerings – the company has been part of Citrix since July 2011.