Usage metering and charging with Cloudstack

One of the prominent features of an IaaS cloud is that resource usage by its consumers can be metered. Metrics are everywhere: from the hypervisor, virtual disk size, network I/O, occupied IP addresses, virtual CPUs and RAM, they are all over the place, waiting to be collected. As soon as you can grab a handful of metrics, you can implement chargeback policies and report back to your users on their resource consumption or, if you run a public IaaS shop, turn these metrics into invoices.

Cloud.com’s cloudstack comes with an excellent usage server, recording metrics directly from its accounts. During installation, simply select the “Install Usage Server” option and perform some basic configuration, and you are all set to go. The usage server collects no fewer than thirteen (as of cloudstack release 2.2) metrics, which can be found here. In short, some of the most important ones are:

  • RUNNING_VM: Total hours a virtual machine is started and running on the hypervisor
  • ALLOCATED_VM: Total hours a VM exists (no matter if it’s up or down). Useful parameter for charging OS license usage, for example Microsoft SPLA licenses.
  • IP_ADDRESS: Self-evident; applies to public (Internet) IP addresses consumed by a cloudstack account. According to the cloudstack architecture, these addresses are attached to the virtual router of the user
  • NETWORK_BYTES_SENT and NETWORK_BYTES_RECEIVED: Traffic passing through the virtual router of a user
  • VOLUME: Size in bytes of user volumes
  • TEMPLATE and ISO: Size in bytes of user-uploaded VM templates and ISO images
(For those who are not familiar with cloudstack’s architecture, cloudstack users are part of accounts. Virtual machines belonging to a single account live in their own private VLAN, totally isolated from other accounts. Access to the Internet, DHCP addressing, DNS and VPN termination all take place in a special cloudstack virtual machine, a virtual router. Every account has its own virtual router, which the end user controls not directly but via the cloudstack API.)

The service (“cloud-usage”) starts along with the rest of the cloud services on your cloudstack controller, and its configuration variables are among the global parameters of cloudstack. The most important are usage.stats.job.aggregation.range and usage.stats.job.exec.time. The first controls the aggregation interval (in minutes) of collected metrics and the second the time at which the aggregation job kicks in. Remember to restart the usage server service (“cloud-usage”) every time you change these variables.

All metrics are stored in a second database, called “cloud_usage”. To see if your usage server really works, connect to that database and check whether its tables start to fill (all metrics tables have names starting with “usage_”). Data can be retrieved straight from the database, but a more elegant way is to use the cloudstack API. The most useful API calls are:

  • listUsageRecords: Takes as arguments account, start & end date and returns usage records for the specified time interval.
  • generateUsageRecords: Starts the aggregation process asynchronously

Accessing the API is a breeze: generate the secret and API keys from the console, pass them as arguments to a Python script or a simple wget, and target the API port (8080 for plain HTTP, or a designated SSL port).
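As a rough illustration of such a script, here is a minimal Python sketch that builds a signed listUsageRecords request. It assumes the usual cloudstack signing scheme (parameters sorted by name, the query string lowercased, HMAC-SHA1 with the secret key, base64-encoded) and the /client/api path; the host name, the keys and the date format are placeholders, so check them against your own installation.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params, secret_key):
    """Return the base64 HMAC-SHA1 signature for a dict of API parameters."""
    # Sort by parameter name, URL-encode the values, then lowercase everything.
    pairs = sorted(params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in pairs)
    digest = hmac.new(secret_key.encode(), query.lower().encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def build_url(base_url, command, api_key, secret_key, **args):
    """Build a complete, signed API URL for the given command."""
    params = {"command": command, "apikey": api_key, "response": "json", **args}
    signature = sign_request(params, secret_key)
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in params.items())
    return f"{base_url}?{query}&signature={quote(signature, safe='')}"


if __name__ == "__main__":
    # Placeholder host and keys (assumptions, not real values).
    url = build_url(
        "http://cloudstack.example.com:8080/client/api",
        "listUsageRecords",
        api_key="YOUR_API_KEY",
        secret_key="YOUR_SECRET_KEY",
        startdate="2012-01-01",
        enddate="2012-01-31",
    )
    print(url)  # fetch it with wget, or with urllib.request.urlopen(url)
```

The script only prints the URL; fetching it is left to wget or urllib so that the signing step can be inspected on its own.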

So, what do you do with all these collected metrics? Well, there are two ways to deal with them. The first is to write a few complex scripts that collect the metrics from the API, sanitize them, implement your billing scheme and export to your reporting tool or ERP to generate invoices.
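To give an idea of what the core of such a script might look like, here is a minimal Python sketch of the billing step. The record layout (the "account", "metric" and "rawusage" keys) and the price list are assumptions made for illustration; real usage records need to be mapped to this shape first, and the rates are entirely made up.

```python
from collections import defaultdict

# Hypothetical price list: currency units per metric unit.
RATES = {
    "RUNNING_VM": 0.05,    # per VM-hour
    "ALLOCATED_VM": 0.01,  # per VM-hour
    "IP_ADDRESS": 0.005,   # per IP-hour
}


def compute_charges(records, rates=RATES):
    """Sum up charges per account from a list of usage records.

    Each record is assumed to be a dict with "account", "metric" and
    "rawusage" keys. Metrics without a rate are skipped (not billed).
    """
    totals = defaultdict(float)
    for rec in records:
        rate = rates.get(rec["metric"])
        if rate is None:
            continue
        totals[rec["account"]] += float(rec["rawusage"]) * rate
    # Round only at the end, so small per-record errors do not accumulate.
    return {account: round(total, 2) for account, total in totals.items()}


if __name__ == "__main__":
    sample = [
        {"account": "acme", "metric": "RUNNING_VM", "rawusage": "100"},
        {"account": "acme", "metric": "ALLOCATED_VM", "rawusage": "200"},
        {"account": "acme", "metric": "TEMPLATE", "rawusage": "12"},
    ]
    print(compute_charges(sample))
```

A production script would use decimal arithmetic and handle taxes, currencies and minimum charges, which is exactly why these scripts tend to grow complex.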

The second is to use an off-the-shelf charging/billing solution. As of January 2012, Amysta have a product in beta and Ubersmith offer complete cloudstack billing in their product, Ubersmith DE.

Tech dive: Custom incidents in HP NNMi 9.10

This post is a tech dive into HP NNMi 9.10, intended to illustrate a way to create custom incidents. People who make a living from tuning and playing with NNM should find it rather interesting; others are encouraged to seek amusement elsewhere.

NNMi 9.10 is a substantial step forward from NNM editions before 8.xx. It is an entirely new implementation, based on JBoss and designed as NNM should have been all along: multithreaded, multiuser, with a decent database and an open (web services) API. Being new, it does certain things in a different way, and one of them is generating custom incidents.

Recently I was asked to implement in NNMi 9.10 a way to create an event and change the status of a network node whenever a certain condition occurred. The nodes in question were branch office routers with ISDN interfaces as backup lines, and the condition was the activation of the ISDN interface. The customer wanted their NOC to be alerted whenever a branch office router activated its ISDN interface because the primary line went down. The catch is that a switch to the ISDN backup line is not regarded as an event from the NNMi perspective: whenever the router detects that the primary route is down, it turns it administratively down and brings up the ISDN interface, so there is no fault. NNMi polls the router normally over the ISDN side, no incident is created, and the node remains green.

In previous versions of NNM, it was possible to change the internal OvIfUp event of NNM so that it triggered an external action: a Perl script that manually changed the node status and created an NNM event. With NNMi 9.10, this is no longer possible. So, what do we do now?

The first step is to go to the configuration menu -> Custom poller configuration and create a custom poller collection, as above. Enable the custom poller and create a new one, with the name “ISDN Poller”.

Above is the poller creation form. The important fields are the MIB expression and the MIB filter variable. We want to check the operational status of the ISDN interface: the poller should poll the interfaces of the router, pick out the ISDN interface via the “ifDescr” MIB variable and then check the operational status of that interface. If this is “Up”, the router should be set to the “Major” state. This is shown in the form above: the MIB filter variable is set to ifDescr, and in the “MIB Expression” field we create a new expression, select “ifOperStatus” from the MIB mgmt-2 interfaces tree and set the node status to “Major”.

The next step is to make this poller work for us, so we need to bind it to a policy, as shown below. Go to the “Policies” tab and create a new policy. Select the node group that the new custom poller should apply to and type the MIB filter. The MIB filter is matched against the “MIB Filter Variable” of the previous form; in our case it is “Dialer1”, the name under which the ISDN interface is configured on the router.

That’s it. After this configuration, the custom poller is activated according to the defined policy: it runs only for the node group (Collection) you have specified and polls only the interfaces whose description is “Dialer1”, which in our case are the ISDN interfaces. Whenever the operational status (ifOperStatus) of one of these interfaces is “1”, which means “Up”, a new incident is created and the node turns orange on the map (Major status). Straightforward? No. Does it work? Yes.
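For readers who prefer code to forms, the decision the poller makes can be sketched in a few lines of Python. This is not NNMi code and does not talk SNMP; it only mirrors the logic described above on a list of (ifDescr, ifOperStatus) pairs, and the function name and the “Normal” fallback status are assumptions made for illustration.

```python
IF_OPER_STATUS_UP = 1  # IF-MIB: ifOperStatus is up(1), down(2), ...


def evaluate_isdn_poll(interfaces, mib_filter="Dialer1"):
    """Return the node status implied by one polling cycle.

    interfaces: iterable of (ifDescr, ifOperStatus) tuples, as they would
    come back from walking the interfaces table of a router.
    """
    for descr, oper_status in interfaces:
        # Only interfaces matching the MIB filter are considered at all.
        if descr == mib_filter and oper_status == IF_OPER_STATUS_UP:
            return "Major"  # backup line is active: raise the incident
    return "Normal"


if __name__ == "__main__":
    # Primary line up, ISDN backup idle.
    print(evaluate_isdn_poll([("Serial0/0", 1), ("Dialer1", 2)]))
    # Primary line down, ISDN backup active.
    print(evaluate_isdn_poll([("Serial0/0", 2), ("Dialer1", 1)]))
```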