Category Archives: Cloud Computing

Basic guide to the issues around CNAME at the zone apex

What is a CNAME?

In DNS, CNAME stands for “canonical name”.

“Canonical” is one of those words you hear every now and then in technology discussions, but not many people are exactly sure what it means. In effect, it means “official”, which can be further extrapolated to “universally accepted”.

So in a DNS zone file, a CNAME looks like this: 3600 IN CNAME

In this case, the canonical name for the server which hosts is:

An alias (“also known as”) of that canonical name is:

So if you request this name, which is an alias, the response you will get will be data for the canonical record, for any type of record. For example, if you request the MX data for, you will/should receive the MX data for

At this point, it is important to understand that is an alias of, and not the other way around.

What is the zone apex?

The zone apex is that part of a DNS zone for which data exists for the entire domain, rather than a specific host in the domain.

So if you want to define an MX record for the entire domain, you would create that record at the apex: 3600 IN MX

It’s worth noting that there is a difference between the default record type for a domain and the zone apex:

* 3600 IN A

The default value for a particular record type defines the data that should be returned if a request is received for a record type which does not have a specific match in the zone file.

What’s a CNAME at the zone apex?

A CNAME at the zone apex looks like this: 3600 IN CNAME

This is a common requirement for website owners who want to advertise their website without the “www” prefix and who also use the services of a 3rd party web hosting company that cannot provide them with a dedicated IP address and instead provides them with a canonical name for the website running on its infrastructure.

Why is it not allowed?

Given that so many web owners have the requirement outlined above, it seems incredible that this isn’t allowed, which is why this is such a hot topic.

Firstly, let’s clarify that there is no explicit RFC rule that says “you can’t have a CNAME at the zone apex”.

What the RFCs do say is that if you define a CNAME with a name (the left side of the record declaration) of “”, you can’t create any other record of any other type using the same name. For example, the following would be non-compliant: 3600 IN CNAME  3600 IN MX

The reason for this goes back to understanding that the “alias” is the left side and the “official” name is the right side. If something is “official”, there can only be one version of it. The first record in the above sequence tells DNS that is “official” for, so DNS doesn’t want to know about any other records for
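The rule can be expressed as a simple check over a zone. The following Python sketch is illustrative only: the record tuples and the example.com / hosting.example.net names are hypothetical placeholders, not taken from any real zone or DNS library.

```python
from collections import defaultdict

def find_cname_conflicts(records):
    """Return owner names that hold a CNAME alongside any other record type.

    `records` is a list of (name, rtype, value) tuples. This shape is an
    assumption made for the example, not the output of a real zone parser.
    """
    types_by_name = defaultdict(set)
    for name, rtype, _value in records:
        types_by_name[name.lower()].add(rtype.upper())
    return sorted(
        name for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )

# Hypothetical zone: the apex has a CNAME and an MX, which is non-compliant.
zone = [
    ("example.com.", "CNAME", "hosting.example.net."),
    ("example.com.", "MX", "10 mail.example.com."),
    ("www.example.com.", "CNAME", "hosting.example.net."),
]
conflicts = find_cname_conflicts(zone)
print(conflicts)
```

In practice the apex of a zone always carries SOA and NS records, which is why a CNAME placed there will always collide with something.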

But, you’ll ask, why can’t DNS simply segregate “officialness” based on the record type? The answer is that the DNS standard came into being long before HTTP or any contemplation of shared web hosting, and it is no longer practical to re-engineer all of the DNS software that has grown out of the standard to fit this specific use case.

Is this strict?

This is where it starts to get interesting.

Given that so many people want CNAMEs at their zone apex ($$$$), many software developers and their product managers have taken a sideways look at the RFC and determined that it permits a degree of flexibility in its implementation:

If a CNAME RR is present at a node, no other data should be present; this ensures that the data for a canonical name and its aliases cannot be different.

The key phrase is “should be”. The argument runs that the absence of the phrase “must be” is in fact a license to interpret the standard more liberally, and there are DNS server implementations on the market that will allow CNAME records to co-exist with other records with the same name.

If you’re using a web hosting or DNS provider who says you can have a CNAME at the zone apex, they will be using such an implementation. This isn’t something that exists only in the dark backstreets of web hosting. Global internet service providers like Cloudflare have permitted CNAMEs at the zone apex.

In truth, this interpretation exists on very shaky foundations. There is a more detailed discussion of this issue here.

How does this problem manifest itself?

RFCs exist to allow different people to design software that will interoperate. Suppose person A is writing code to implement a DNS server, and person B is writing software to implement a DNS client. If they both follow the standard, the 2 pieces of software should work together. The RFC doesn’t tell them how to write code, but it does tell them how their code should behave.

When people begin interpreting the intent of an RFC, nothing good comes of it. It may not be immediately apparent, but the longer software exists, the more edge cases it has to deal with, and that’s where it becomes important that one piece of software can anticipate the response of another.

In terms of a practical example, this is really good:

In this case, MS Exchange Server 2010 stopped delivering mail to addresses whose DNS zone had a CNAME at the zone apex. The Exchange mail server was relying on a local DNS cache. Previously, someone had queried an A record for, and received a CNAME response. That data was cached. Later, when the MX record for was queried, the cache ignored the fact that the cached record was an A record (this was compliant behaviour) and returned a CNAME. The Exchange server correctly rejected this as CNAMEs are not valid data for MX queries.

Are there workarounds?

The first workaround is to not implement the RFC standard. Some providers will tell you that this is their workaround, but it isn’t. It’s just chicanery, and you should avoid it.

The big cloud hosting companies are the best place to go for workarounds. Amazon AWS have a black box solution in Route53 which allows CNAMEs at the zone apex if the canonical name is an AWS resource identifier (like an ARN for an Elastic Load Balancer).

The most en vogue workaround at the moment is what is called CNAME flattening.

What is CNAME flattening?

DNS server software that implements CNAME flattening permits the user to create multiple CNAMEs of the same name with different official values, which allows the user to create a CNAME at the zone apex. When you configure the zone file in this way, the server will accept it and start as normal.

When a query is then received for one of these records, rather than return the CNAME value, the server will go off and query the CNAME value, and any subsequent values, until it gets to an IP address, and then return that IP address as the response to the original query.
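The lookup behaviour described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the zone is a plain dictionary standing in for real recursive queries, and every name and address in it is a hypothetical placeholder (192.0.2.10 is a documentation address).

```python
def flatten(name, zone, max_hops=8):
    """Follow CNAMEs from `name` until an A record is found, then return its IP.

    `zone` maps a name to a (rtype, value) pair. A real server would issue
    DNS queries here instead of reading a dict.
    """
    for _ in range(max_hops):
        rtype, value = zone[name]
        if rtype == "A":
            return value
        if rtype != "CNAME":
            raise ValueError(f"unexpected record type {rtype} at {name}")
        name = value  # hop to the next name in the chain
    raise RuntimeError("CNAME chain too long or looping")

# Hypothetical data: the apex points at a hosting name, which resolves in two hops.
zone = {
    "example.com.": ("CNAME", "lb.hosting.example.net."),
    "lb.hosting.example.net.": ("CNAME", "node1.hosting.example.net."),
    "node1.hosting.example.net.": ("A", "192.0.2.10"),
}
apex_ip = flatten("example.com.", zone)
print(apex_ip)
```

The client only ever sees the final address, so from the outside the response looks like an ordinary A record. The hop limit is a design choice for the sketch, to stop a looping chain from running forever.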

Is CNAME flattening standards compliant?

Yes and no.

On the one hand it permits the existence of something that the RFC says is not permitted, but equally, it behaves in a way that is RFC compliant.

Whether a user wants to rely on CNAME flattening is something they will have to make a call on according to their individual circumstances.

Everything you want to know about AWS Direct Connect but were afraid to ask

AWS Direct Connect is one of those AWS services that everybody knows about but not too many people use. I’ve recently been involved in the setup of a redundant AWS Direct Connect link. To assist others considering doing the same, I’m sharing what I’ve learned.


This is big.

Within an AWS availability zone, and between availability zones in the same region, EC2 instances use jumbo frames. However, jumbo frames are not supported on AWS Direct Connect links, so you will be limited to a maximum MTU of 1500. You may wish to consider the implications of this before you consider using AWS Direct Connect.

Update: as of Nov 2018, Jumbo Frames are now supported on AWS Direct Connect!
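To get a feel for what the MTU means in practice, the sketch below counts how many full-size TCP/IPv4 packets are needed to move a fixed payload at each MTU. It is rough arithmetic only: the 40-byte header figure assumes no IP or TCP options, and 9001 is assumed here as the jumbo MTU.

```python
TCP_IPV4_HEADERS = 40  # 20-byte IPv4 header + 20-byte TCP header, no options

def packets_needed(payload_bytes, mtu):
    """Number of packets needed to carry the payload at a given MTU."""
    per_packet = mtu - TCP_IPV4_HEADERS
    # Integer ceiling division, so no floating point rounding is involved.
    return (payload_bytes + per_packet - 1) // per_packet

one_gib = 1024 ** 3
standard = packets_needed(one_gib, 1500)
jumbo = packets_needed(one_gib, 9001)  # assumed jumbo frame MTU
print(standard, jumbo, round(standard / jumbo, 1))
```

The ratio between the two counts is roughly the factor by which per-packet processing increases when traffic drops from jumbo frames to a 1500 MTU.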


What is AWS Direct Connect?

It’s a dedicated link between a 3rd party network and AWS. That means data flows over a dedicated isolated connection, which means you get dedicated, consistent bandwidth, unlike a VPN, which flows over the public Internet.

How is it provisioned?

You have 2 choices. AWS partners with co-location data centre providers across their various regions. This involves AWS dropping wholesale connectivity directly into the Meet Me Rooms in these 3rd party data centres. If your equipment is located in one of these data centres, your AWS Direct Connect connection is then simply patched from your cabinet into the Meet Me Room. This is called a Cross Connect.

If you are not using one of AWS’s co-location data centre partners, you can still make a Direct Connect link from your corporate network to AWS. This involves linking your corporate network to one of the data centres where AWS has a presence in their Meet Me Room, from where you can make an onward connection to AWS. The Direct Connect documentation lists telecoms providers in each region who can provide this service, and the data centres to which they can make connections.

What speeds are available?

By default, you can get either a 10Gbps or 1Gbps connection, but you can also consult directly with the AWS partners to get lower speed connections.

What do you pay?

You pay per hour for the amount of time your connection is “up” (connected at both ends). What you pay per hour depends on the speed of your connection. If you provision a connection but it isn’t “up”, you don’t pay, unless you leave that unconnected connection in place for > 90 days (after which you start paying the standard rate).

You also pay per GB of data transferred from AWS to your location. You don’t pay for data transferred from your location to AWS.
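As a rough illustration of how the two charges combine, here is a small Python sketch. The rates are made-up placeholders, not real AWS prices; check the current Direct Connect pricing page for your region and port speed.

```python
def monthly_cost(port_hour_rate, egress_gb_rate, hours_up, egress_gb):
    """Port-hours while the connection is up, plus data transferred out of AWS.

    Data transferred in to AWS is not charged, so it does not appear here.
    """
    return port_hour_rate * hours_up + egress_gb_rate * egress_gb

# Hypothetical figures: a port that is up for a 30-day month (720 hours)
# and 5,000 GB transferred from AWS to your location.
estimate = monthly_cost(
    port_hour_rate=0.30,   # placeholder rate per port-hour
    egress_gb_rate=0.02,   # placeholder rate per GB out
    hours_up=720,
    egress_gb=5000,
)
print(round(estimate, 2))
```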

What if I need more than 10Gbps?

You can aggregate multiple 10Gbps connections together.

How stable are the connections?

Whereas connecting to AWS with a VPN provides for 2 BGP routes from your location to AWS, a Direct Connect link is a single point of failure. It is thought (presumed?) that AWS provide for a certain level of redundancy once the connection leaves the Meet Me Room in the data centre, but there are no guarantees about this and AWS do not offer an SLA for connectivity.

What hardware do I need?

You will need L3 network hardware. It will need to be able to do BGP routing and support encrypted BGP passphrases. It will need to have sufficient port speed to connect to the Direct Connect uplinks you have provisioned. If this is a virgin install in a co-location data centre, there are switches available that can do both L3 and L2, handle BGP and provide redundancy for 2 Direct Connect connections. This negates the need to purchase both routers and switches. You should be able to get this kit for < €20,000. Providers will almost certainly try to sell you more expensive kit. If you’re using Direct Connect, they presume money is no object for you.

What are the steps required to set up a connection?

Decide if you need a single connection or if you’re going to need a pair of redundant connections.

Decide what speed connection you need. Don’t guess this. Estimate it based on current network traffic in your infrastructure.

Design your IP topology.

If you are going to use one of the co-location data centres, contact them. Otherwise, contact one of the Telecoms Provider partners. They will provide pricing/guidance in terms of connecting your equipment or location to the relevant Meet Me Room.

Procure the termination hardware on your side of the connection.

Once you have provisioned your connection and hardware, start building your configuration on the AWS side of the connection.
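For the speed decision in the steps above, one way to turn traffic measurements into a port speed is sketched below. The 95th percentile and the 1.5x headroom factor are assumptions chosen for illustration, and the sample figures are invented. Substitute your own measurements and margins.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]

def pick_port_speed(samples_mbps, headroom=1.5, speeds_mbps=(1000, 10000)):
    """Smallest port speed that covers the 95th percentile plus headroom."""
    needed = percentile(samples_mbps, 95) * headroom
    for speed in speeds_mbps:
        if needed <= speed:
            return speed
    return None  # more than the largest single port can carry

# Hypothetical throughput samples in Mbps, e.g. taken from router counters.
samples = [120, 300, 450, 800, 620, 510, 90, 700, 640, 380]
print(pick_port_speed(samples))
```

A return value of None in this sketch would mean a single port is not enough and several connections would have to be aggregated.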

What do I need in terms of configuring the VPC I am connecting to?

Typically, you will be connecting resources in a VPC to your co-location data centre or on-premises infrastructure. There are a number of hops between a VPC and a Direct Connect connection.

Working out from the VPC, the first thing you need is a Virtual Private Gateway (AWS denotes these as VGW, rather than VPG). This is logically a point of ingress/egress to your VPC. You will be asked to choose a BGP identifier when creating this. If you use BGP already, supply what you need. Otherwise, let AWS generate one for you.

When you have created this, you next create a Route Table that contains a route for the CIDR of your co-location data centre or on-premises infrastructure that points to the VGW you created earlier.

Next, create a subnet(s) (or use an existing one) and attach the Route Table to that subnet. Any resources that need to use the Direct Connect connection need to be deployed in that subnet(s). It’s probably worth deploying an EC2 instance in that subnet for testing.

This is all you need to do in the VPC configuration (you can apply NACLs, security etc later. Leave everything open for now for testing.)

How do I set up the Direct Connect configuration on the AWS side?

Once you’ve configured your VPC, you now need to configure your Direct Connect service (you don’t need to do these in any particular order. You can start with Direct Connect if you like).

Create the connections (dxcon) you require in the AWS Direct Connect console. You’ll be asked for a location to connect to and to choose a speed of either 10Gbps or 1Gbps (if you want a lower speed, you’ll need to talk to your Telco or data centre before you can proceed).

The connection will be provisioned fairly quickly, and show itself in a “provisioning” state. After a few hours, it will be in a “down” state. At this point, you can select actions and download what is called a Letter of Authority (LOA) for the connection. This will specify what ports in the Meet Me Room your connection should be patched in to. You need to forward this to your co-location data centre or Telco for them to action.

Note: it is not infrequent to find the ports you have been allocated are already in use by someone else. In this case, delete the connection and start again. If you can, check with the data centre verbally that the ports are free before you submit the LOA to them. Repeat all of the above if you have multiple connections. Redundancy is dealt with later in the process.

To be able to use your connection, you now need to attach a Virtual Interface (dxvif) to it. You have options here, and as is always the case, options make things a bit more complicated.

You can connect a Virtual Interface to either a VGW (Virtual Private Gateway) or a Direct Connect Gateway (not the same thing as a Direct Connect connection).

If you connect to a VGW, you will only ever be able to connect to the VPC to which that VGW provides access.

If you connect to a Direct Connect Gateway, you can associate multiple VGWs with that Gateway, allowing you access to multiple VPCs *across all AWS regions*. If you want to use this option, you need to create a Direct Connect Gateway before you create a Virtual Interface.

I can’t see any reason other than corporate governance and security why you would not want to use a Direct Connect Gateway, so I’d suggest using that option if in doubt.

So now proceed and create your Virtual Interface. If you only want to attach it to the VGW you created earlier, that option is there for you. Otherwise, attach it to the Direct Connect Gateway you created.

Once you have your Virtual Interface, go back to the Connections panel and associate that with one of your connections. You will need a dedicated Virtual Interface for each connection (you can also attach multiple Virtual Interfaces to the same connection, but that isn’t relevant here).

The final step here only occurs if you are using a Direct Connect Gateway. If you are, you need to associate the VGW you created in your VPC with the Direct Connect Gateway. It should be presented as an option for you in the list of available VGWs. Start typing its identifier into the search field if not. The UI can be a bit flaky here.

That should be everything. Redundancy is the next piece.

How do I configure redundancy on the AWS side?

If you want redundant connectivity, you really need to use a Direct Connect Gateway rather than linking your connection directly to a VGW. I *think* this is a requirement for redundancy. If not, it’s still my recommendation.

If you have done that, you should now have 2 Virtual Interfaces and 1 VGW associated with your Direct Connect Gateway. Think of the Direct Connect Gateway as a router. The 2 Virtual Interfaces are on the external side of the router, linking in to 2 Direct Connect connections. The VGW is on the AWS side of the router, linking back to the VPC.

That should be all that is required. Traffic will flow out of the VPC through the VGW into the Direct Connect Gateway, which is BGP enabled and links into the 2 Virtual Interfaces, which are also BGP enabled. If one connection goes down, BGP routes the traffic on to the other connection. This is transparent to the VPC.

What about redundancy on the other side of the connection?

This is a matter for your network administrator or service provider. Typically, the 2 connections will terminate in a logical stack of redundant routers/switches which are BGP enabled and can transfer traffic flow between the external connections.

How do I know it’s working?

You won’t see the state of your connections and Virtual Interfaces switch to “available” until L2 connectivity is established and the necessary BGP authentication handshake has occurred. At that point, you should be able to send ICMP requests from your termination hardware to the EC2 instance you created in your VPC earlier.

Good luck!

Using Elasticsearch Logstash Kibana (ELK) to monitor server performance

There are myriad tools that claim to be able to monitor server performance for you, but when you’ve already got a sizeable bag of tools doing various automated operations, it’s always nice to be able to fulfil an operational requirement using one of those rather than having to onboard another one.

I love Elasticsearch. It can be a bit of a minefield to learn, but when you get to grips with it, and bolt on Kibana, you realise that there is very little you can’t do with it.

Even better, Amazon AWS now have their own Elasticsearch Service, so you can reap all the benefits of the technology without having to worry about maintaining a cluster of Elasticsearch servers.

In this case, my challenge was to expose performance data from a large fleet of Amazon EC2 server instances. Yes, there is a certain amount of data available in AWS Cloudwatch, but it lacks key metrics like memory usage and load average, which are invariably the metrics you most want to review.

One approach to this would be to put some sort of agent on the servers and have a server poll the agent, but again, that’s extra tools. Another approach would be to put scripts on the servers that push metrics to Cloudwatch, so that you can augment the existing EC2 Cloudwatch data. This was something we considered, but with this method, the metrics aren’t logged to the same place in Cloudwatch as the EC2 data, so it all felt a bit clunky. And you only get 2 weeks of backlog.

This is where we turned to Elasticsearch. We were already using Elasticsearch to store information about access to our S3 buckets, which we were happy with. I figured there had to be a way to leverage this to monitor server performance, so set about some testing.

Our basic setup was a Logstash server using the S3 Input plugin, and the Elasticsearch output plugin, which was configured to send output to our Elasticsearch domain in AWS:

output {
 if [type] == "s3-access" {
     elasticsearch {
         index => "s3-access-%{+YYYY.MM.dd}"
         hosts => ["search-*********"]
         ssl => true
     }
 }
}

We now wanted to create a different type of index, which would hold our performance metric data. This data was going to be taken from lots of servers, so Logstash needed a way to ingest the data from lots of remote hosts. The easiest way to do this is with the Logstash input plugin syslog. We first set up Logstash to listen for syslog input.

input {
     syslog {
         type => syslog
         port => 8514
     }
}

We then get our servers to send their syslog output to our Logstash server, by giving them a universal rsyslogd configuration, where is our Logstash server:

#Logstash Configuration
$WorkDirectory /var/lib/rsyslog # where to place spool files
$template LogFormat,"%HOSTNAME% ops %syslogtag% %msg%"

We now update our output plugin in Logstash to create the necessary Index in Elasticsearch:

output {
 if [type] == "syslog" {
    elasticsearch {
       index => "test-syslog-%{+YYYY.MM.dd}"
       hosts => ["search-*********"]
       ssl => true
    }
 } else {
    elasticsearch {
       index => "s3-access-%{+YYYY.MM.dd}"
       hosts => ["search-*********"]
       ssl => true
    }
 }
}

Note that I have called the syslog Index “test-syslog-…”. I will explain this in a moment, but it’s important that you do this.

Once these steps have been completed, it should be possible to see syslog data in Kibana, as indexed by Logstash and stored in our AWS Elasticsearch domain.

Building on this, all we had to do next was get our performance metric data into the syslog stream on each of our servers. This is very easy. Logger is a handy little utility that comes pre-installed on most Linux distros and allows you to send messages to syslog (/var/log/messages by default).

We trialled this with Load Average. To get the data to syslog, we set up the following cronjob on each server:

* * * * * root cat /proc/loadavg | awk '{print "LoadAverage: " $1}' | xargs logger

This writes the following line to /var/log/messages every minute:

Jun 21 17:02:01 server1 root: LoadAverage: 0.14
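If you would rather do this from a script than a shell pipeline, the same message can be built in Python. This is only a sketch of the formatting step: the sample string mimics the layout of /proc/loadavg, and the function name is made up for the example.

```python
def format_load_average(loadavg_text):
    """Build the syslog message from the contents of /proc/loadavg.

    The first whitespace-separated field is the 1-minute load average,
    which is the value the awk command above prints as $1.
    """
    one_minute = loadavg_text.split()[0]
    return f"LoadAverage: {one_minute}"

# Sample contents in the /proc/loadavg layout; on a real server you would
# read the file instead, and hand the result to logger or a syslog client.
sample = "0.14 0.10 0.05 1/234 5678\n"
message = format_load_average(sample)
print(message)
```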

It should then be possible to search for this line in Kibana

message: "LoadAverage"

to verify that it is being stored in Elasticsearch. When we do find results in Kibana, we can see that the LogFormat template we used in our server rsyslog conf has converted the log line to:

server1 ops root: LoadAverage: 0.02

To really make this data useful however, we need to be able to perform visualisation logic on the data in Kibana. This means exposing the fields we require and making sure those fields have the correct data type for numerical visualisations. This involves using some extra filters in your Logstash configuration.

filter {
   if [type] == "syslog" {
       grok {
          match => { "message" => '(%{HOSTNAME:hostname})\s+ops\s+root:\s+(%{WORD:metric-name}): (%{NUMBER:metric-value:float})' }
       }
   }
}

This filter operates on the message field after it has been converted by rsyslog, rather than on the format of the log line in /var/log/messages. The crucial part of this is to expose the Load Average value (metric-value) as a float, so that Kibana/Elasticsearch can deal with it as a number rather than a string. If you only specify NUMBER as your grok data type, it will be exposed as a string, so you need to add the “:float” to complete the conversion to a numeric type.
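To test the pattern outside Logstash, here is a rough Python equivalent of what the grok expression does to the message. It is an approximation, not grok itself: the regular expression is a simplified stand-in for the HOSTNAME, WORD and NUMBER patterns, and the field names use underscores because Python group names cannot contain hyphens.

```python
import re

# Simplified stand-in for the grok pattern: hostname, the literal "ops root:",
# a metric name, then a numeric value.
LINE_RE = re.compile(
    r"(?P<hostname>\S+)\s+ops\s+root:\s+"
    r"(?P<metric_name>\w+): (?P<metric_value>-?\d+(?:\.\d+)?)"
)

def parse_metric(message):
    """Return the extracted fields, or None if the message does not match."""
    match = LINE_RE.match(message)
    if match is None:
        return None
    fields = match.groupdict()
    # Equivalent of the ":float" suffix: convert the captured text to a number.
    fields["metric_value"] = float(fields["metric_value"])
    return fields

parsed = parse_metric("server1 ops root: LoadAverage: 0.02")
print(parsed)
```

Without the conversion step the value would remain the string "0.02", which is the same problem the “:float” suffix solves in grok.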

To verify the data type, look in Kibana under Settings -> Indices. You should only have a single Index Pattern at this point (test-syslog-*). Refresh the field list for this, and search for “metric-value”. At this point, it may indicate that the data type for this is “String”, which we can now deal with. If it already has data type “Number”, you’re all set.

In Elasticsearch indices, you can only set the data type for a field when the index is created. If your “test-syslog-” index was created before we properly converted “metric-value” to a number, you can now create a new index and verify that metric-value is a number. To do this, update the output plugin in your Logstash configuration and restart Logstash.

output {
 if [type] == "syslog" {
    elasticsearch {
       index => "syslog-%{+YYYY.MM.dd}"
       hosts => ["search-*********"]
       ssl => true
    }
 }
}

A new Index (syslog-) will now be created. Delete the existing Index pattern in Kibana and create a new one for syslog-*, using @timestamp as the default time field. Once this has been created, Kibana will obtain an updated field list (after a few seconds), and in this, you should see that “metric-value” now has a data type of “Number”.

(For neatness, you may want to replace the “test-syslog-” index with a properly named index even if your data type for “metric-value” is already “Number”).

Now that you have the data you need in Elasticsearch, you can graph it with a visualisation.

First, set your interval to “Last Hour” and create/save a Search for what you want to graph, eg:

metric-name: "LoadAverage" AND hostname: "server1"

Now, create a Line Graph visualisation for that Search, setting the Y-Axis to Average for field “metric-value” and the X-axis to Date Histogram. Click “Apply” and you should see a graph like below:

[Screenshot: Kibana line graph of average “metric-value” over time]



Migrating MySQL from AWS RDS to EC2

Applications that use MySQL as their underlying RDBMS commonly evolve as follows:

  1. Application and MySQL server on same EC2 instance
  2. Application balanced between multiple EC2 instances and MySQL server moved to RDS instance
  3. MySQL server moved back to EC2 with DIY High Availability infrastructure
  4. MySQL server moved to Bare Metal with DIY High Availability infrastructure in Co-Lo data centre

As each of these migration steps arrives, the size of the dataset under MySQL management is larger, and the availability of the application more critical, making each step exponentially more complex.

In recent months, I have had to manage Step 3 in this life cycle (the migration back to EC2 from RDS). The following is an account of my experience.

The dataset involved was 4TB in size. That isn’t huge by today’s standards, but it’s large enough to involve multiple days of data transfer and to require something more than a mysqldump and import in your planning.

The dataset was also highly volatile, in that it was being augmented 24/7, and relied on stored procedures to aggregate data on a daily basis on which commercial SLAs were based. In other words, stopping updates to the dataset for anything more than a couple of hours was not an option.

Time pressure was a further consideration. RDS has a hard limit of 6TB of disk space for an instance (and a 2TB file size limit), and our application was due to introduce new functionality that would increase the rate of data accumulation dramatically. We estimated that we had 2-3 months to complete the transition before the 6TB limit appeared on the horizon.

We did our research and decided on a strategy. We would create a Read Replica of our RDS master and allow it to come into sync. When it was in sync, we would promote it to a standard RDS instance and note the replication point in the Bin Log. We would then do a full mysqldump of the database and inject that directly into our EC2 master, which we estimated would take 96 hours. When this was complete, we would make the EC2 master a slave of the RDS master, and start replication from the point in the Bin Log we had previously noted. We estimated that the data gap would take 18-20 hours to fill, after which we would have a full and intact dataset in EC2.

This plan was fine except for one detail. Because our data relies extensively on stored procedures, it requires a lot of RAM and CPU grunt to get through its workload. Under normal circumstances, we maintained a Read Replica for the RDS master, to allow for intensive read queries that would not impact on the processing capability of the RDS master. On occasion, when there were replication issues, the Bin Log on the RDS would grow rapidly, consuming several hundred GBs of disk space. This isn’t supposed to be an issue in MySQL, but the internal mechanics of RDS and how the Bin Log is managed seem to make it an issue. When we saw the Bin Log growing to this extent, performance on the RDS master rapidly degraded, requiring us to terminate replication completely (in order that RDS would flush the Bin Log).

Given that our plan involved allowing the Bin Log to grow over 96 hours, we were obviously concerned. We discussed this with our support partners, Percona, who recommended an alternative strategy.

They suggested using the MySQL Bin Log utility (mysqlbinlog) to back up the Bin Log to a location outside RDS, which we could then stream into our EC2 master. This would involve extra steps in the process, and tighter co-ordination, but it seemed to be a lot less risky in terms of impacting on the RDS master. Our new plan was therefore as follows:

  1. Ensure all applications are using a DNS record for MySQL server that has 0 sec TTL
  2. Create a Read Replica of the RDS master and allow to come in sync
  3. Stop replication on the replica, note the replication point and promote to master
  4. Configure RDS master to retain at least 12 hours of Bin Log, and wait for 12 hours (ensuring that Bin Log growth does not impact on performance during this time)
  5. Start Bin Log backup from RDS master to disk on EC2 master
  6. Commence mysqldump from RDS master and inject directly into EC2 master
  7. On completion of mysqldump and injection, start restore of Bin Log file into EC2 master
  8. Verify that RDS master and EC2 master are approximately in sync
  9. Pause updates to dataset in RDS master for approx. 1 hour
  10. Verify that RDS master and EC2 master are fully in sync
  11. Stop Bin Log backup and Bin Log restore
  12. Re-create stored procedures on EC2 master
  13. Change DNS record for MySQL Server to point to EC2 master
  14. Re-commence updates to dataset
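The “in sync” checks in steps 8 and 10 come down to comparing Bin Log coordinates (file name and position) on both sides. The Python sketch below shows one way to compare them. The file names and positions are invented, and how you collect the coordinates on each side is left out.

```python
def binlog_key(coords):
    """Turn ('mysql-bin.000123', 4567) into a sortable (123, 4567) tuple."""
    filename, position = coords
    sequence = int(filename.rsplit(".", 1)[1])  # numeric suffix of the file
    return (sequence, position)

def replication_state(source, target):
    """Compare the coordinates applied on the target with the source's."""
    if binlog_key(target) == binlog_key(source):
        return "in sync"
    if binlog_key(target) < binlog_key(source):
        return "behind"
    return "ahead"  # should not happen; worth investigating if it does

# Hypothetical coordinates: the source's current position versus the last
# event restored on the target.
rds_master = ("mysql-bin.000123", 4567)
ec2_master = ("mysql-bin.000122", 99120)
state = replication_state(rds_master, ec2_master)
print(state)
```

Comparing the numeric file suffix first matters because positions restart in each new Bin Log file, so a larger position in an older file is still behind.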

On completion of this process, we had moved our 4TB dataset from RDS to EC2 with only a 1 hour interruption in the data update process. For High Availability, we created 2 slaves and managed these with MySQL Utilities. We placed 2 HA Proxy nodes in front of this MySQL server farm and balanced traffic to the HA Proxy nodes with an Elastic Load Balancer listening for TCP (rather than HTTP) connections.

It’s probably also worth mentioning that EC2 also has disk limits. A single EBS volume can have a maximum size of 16TB. To overcome this, you can combine multiple EBS volumes into an LVM set, or use software based RAID 0. We were initially concerned about using this sort of virtual disk for storing data, but this should be less of a concern when you remember that EBS itself has multiple layers of redundancy. We went for an LVM configuration.




Rackspace – Engineered to Fail

I’ve read several articles over the last few years comparing the cloud infrastructure services offered by Rackspace and Amazon AWS.

Typically, these articles arrive at no firm conclusion as to which is better, referring to issues like cost, support, availability etc.

As someone who has used both services for over 5 years, I find these articles incomprehensible. From a technical viewpoint, there is no comparison between these services. It’s like comparing an iPhone 6 to a pocket calculator. Both have a screen, a battery and a digital pulse, but when it comes to sophistication and functionality, they are for all intents and purposes different services.

To put it bluntly, Rackspace is a truly awful experience. They position themselves as a “managed” cloud services provider, which should begin to give an indication of the problem. The beauty of cloud services is that they don’t need to be managed. You buy them, consume them and dispose of them.

Being a “managed” cloud services provider is like being a “managed” self-service car wash. If the car wash machine is so complex, inflexible and unreliable as to require the constant attention of a human being to ensure that users can wash their cars, then those users might as well just go home and wash their cars in the drive (ie have on-premises infrastructure).

From what I can see, the difference between Amazon and Rackspace in this regard stems from their inception.

Amazon’s AWS platform was a spin-off from their shopping function. They had lots of spare compute capacity outside peak periods and decided to hire it out, along with the tools they used to manage it. This was a battle-hardened infrastructure that was used in real, live-fire web environments, and felt familiar and well-designed to actual system engineers.

Rackspace’s service seems to have been designed by marketing professionals. It’s ridiculously basic, doesn’t seem to accommodate any future-proofing, and is totally inflexible. Much more attention seems to have gone into the marketing strategy (check out the number of pretty people on the Rackspace website, compared to the JPG-free Amazon AWS site) than their actual technology.

To illustrate this, I’m going to give a specific example. As well as backing up the points made above, I’m also hoping that this article will be picked up by search engines, as it highlights a major flaw in a certain Rackspace functionality, which would cause problems for Rackspace users if not addressed.

When you create a Rackspace Cloud server, you are given an option to schedule daily imaging of the server. That means you can create an offline copy of the server at a point in time, which you can restore at a later time to re-establish the functionality of that server.

To most people working in infrastructure operations, this means one thing: backup.

You think: “If I can make a daily image of my server, and hold the 7 most recent images, my backups are sorted.” Inevitably, that’s what a lot of Rackspace users are doing.

But here’s the thing (that you only find out when you ask the question): because of the way this imaging process works, the creation of the images will inevitably start to fail, and there is no mechanism in the platform to alert you when they do fail.

The explanation of the technology is given here:

To summarise:

A Rackspace server image is composed of 2 parts: the base image file from when the image was first created, and the extended image file that contains all the changes to the image that have been made since the base was created.

That means that when you restore the image, the base is restored first, and then all the changes are applied to the base from the extended image.

That means that if the data on your server is changing, but not necessarily growing (eg you could be writing huge logs, or a huge database, but pruning effectively) the size of your image is constantly growing. For First Generation Rackspace servers, if an image gets to greater than 160GB/250GB (which is peanuts in today’s Big Data world) the imaging will fail. For Next Generation servers, there is apparently no limit, but check out the comments of the “Racker” on this Rackspace support thread:

“Next Gen has no limits for either Windows or Linux, but as an image gets really large, there may be an increased chance of the process failing (Things sometimes go wrong when you are talking about moving hundreds of GBs of data).”

Wow! Like who would need to manage “hundreds of GBs” of data in 2016?! What is this? Star Trek!?

This is consistent with what I was told on a support thread by another Racker, namely that imaging is offered on a “Best Effort” basis. Remember, this is bits and bytes technology we’re talking about here, where stuff normally works or doesn’t. We’re not talking about nuclear fusion.

The same Racker goes on to say:

“For customers who run into these limits, there is generally a larger issue though. The truth is that you really should NOT be using imaging as a backup solution. Think about it, does it really make sense to backup tons and tons of data every day when only a few things changed on the server? Do you really want to spin up a new Cloud Server just to recover a single file?”

That’s a sort of valid point, but here’s a question: if scheduled daily imaging isn’t suitable for backup, why the hell is scheduling daily imaging made available as a feature, inviting hapless Ops Engineers to think that their servers are being reliably backed up when really they are not? What exactly is the purpose of scheduled daily imaging if not for backup?

And the reason the point is only “sort of” valid is that there are times when you will need to make a full daily image of a server. Let’s say you have a MySQL server that has a 200GB data payload. You can’t run a mysqldump against that every night, because it will grind the server to a standstill. You have to do a bit-for-bit image of the system to back it up (as recognised by Amazon RDS service, where you can schedule daily snapshots of your RDS instances).

It actually gets worse.

Imaging can fail not only because of image size, but also because of “bugs” in the Rackspace platform. A few weeks back, I noticed that imaging on one of our smaller Rackspace servers had stopped working. I dialled up a support chat and asked the guy who responded what was going on.

Theodore R: Garreth! thank you for holding. We have a known bug in ORD that we've seen a few failures on scheduled images. To help with this. Go ahead and cancel the two jobs stuck at 0%. Then de-activiate the schedule then re-enable the schedule. I'm sorry about this it is a known issue we are working on resolving this.

Me: If you knew there was a bug why didn't you tell your customers?

Theodore R: I don't have that answer. As I'm front line support but I will bring that up to my manager in our team meeting today.

Theodore R: I do apologize about this

So they had a bug in their platform that has probably disabled scheduled images for hundreds of customers, which isn’t alerted, and they haven’t told anyone!

This is just a sample of the grind I go through with Rackspace every week. While writing this, I am monitoring a ticket they’ve opened to tell me that one of my servers has failed and they are working on it. I have been instructed:

“Please do not access or modify ‘<server-name>’ during this process.”

Of course, it doesn’t seem to dawn on them that this could be a public web server, with thousands of users knocking on the http door all the time, and the only way I can stop this is to log in to the server to shut down the web server, which I am apparently not supposed to do.

If you still don’t believe me, you can look at another piece of evidence. For the last year, Rackspace have been offering a service called “Fanatical Support for Amazon AWS” (Pretty People on web page? Check.)

Yes, you can pay Rackspace to “manage” your investment in their main competitor. This is basically Rackspace saying “Yes, we know our service is dogfood, but in order to keep the lights on, we’re going to try and squeeze a few dollars out of customers who’ve seen the light and are moving elsewhere.”

Like I said at the start, ignore the clickbait “comparison” articles. Rackspace is something you should avoid in your IT organisation in the same way you avoid IE6 and Blackberrys.

Allowing puppet agents manage their own certificates


Why would you want to allow a puppet agent manage the certificates the puppet master holds for that agent? Doesn’t that defeat the whole purpose of certificate based authentication in puppet?

Well, yes, it does, but there are situations in which this is useful, but only where security is not a concern!!

Enter Cloud Computing.

Servers in Cloud Computing environments are like fruit flies. There are millions of them all over the world being born and dying at any given time. In an advanced Cloud configuration they can have lifespans of hours, if not minutes.

As puppet generally relies on fully qualified domain names to match agent requests to stored certificates, this can become a bit of a problem, as server instances that come and go can sometimes be required to have the same hostname at each launch.

Imagine the following scenario:

You are running automated performance testing, in which you want to test the amount of time it takes to re-stage an instance with a specific hostname and run some tests against it. Your script both launches the instance and expects the instance to contact a puppet master to obtain its application.

In this case, the first time the instance launches, the puppet agent will generate a client certificate signing request, send that to the master, get it signed and pull the necessary catalog. The puppet master will then have a certificate for that agent.

Now, you terminate the instance and re-launch it. The agent presents another signing request, with the same hostname, but this time the puppet master refuses to play, telling you that it already has a certificate for that hostname, and the one you are presenting doesn’t match.

You’re snookered.

Or so you think. The puppet master has a REST API that is disabled by default, but which you can open up to receive HTTP requests to manage certificates. To enable the necessary feature, add the following to your auth.conf file:

path /certificate_status
auth any
method find, save, destroy
allow *

Restart the puppet master when you’ve done this.

sudo service puppetmaster restart

Next, when you start your server instance, include the following script at boot. It doesn’t actually matter when this is run, provided it is run after the hostname of the instance has been set.


curl -k -X PUT -H "Content-Type: text/pson" --data '{"desired_state":"revoked"}' https://puppet:8140/production/certificate_status/$HOSTNAME

curl -k -X DELETE -H "Accept: pson"  https://puppet:8140/production/certificate_status/$HOSTNAME

rm -Rf /var/lib/puppet/ssl/*

puppet agent -t

This will revoke and delete the agent certificate on the master, delete the agent’s copy of the certificate and renew the signing process, giving you new certs on the agent and master and allowing the catalog to be ingested into the agent.
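For anyone who would rather make the two certificate_status calls from a script than from curl, here is a minimal Python sketch of the same requests. It assumes the master URL (https://puppet:8140) and the production environment from the commands above; the function names are my own, not anything puppet provides. As with curl -k, it does not verify the master's TLS certificate.

```python
import ssl
import urllib.error
import urllib.request

# Assumed values, taken from the curl commands above.
PUPPET_MASTER = "https://puppet:8140"
ENVIRONMENT = "production"


def certificate_status_url(hostname):
    """URL of the certificate_status endpoint for one agent hostname."""
    return "%s/%s/certificate_status/%s" % (PUPPET_MASTER, ENVIRONMENT, hostname)


def build_revoke_request(hostname):
    """PUT request that asks the master to revoke the agent's certificate."""
    return urllib.request.Request(
        certificate_status_url(hostname),
        data=b'{"desired_state":"revoked"}',
        headers={"Content-Type": "text/pson"},
        method="PUT",
    )


def build_delete_request(hostname):
    """DELETE request that removes the agent's certificate from the master."""
    return urllib.request.Request(
        certificate_status_url(hostname),
        headers={"Accept": "pson"},
        method="DELETE",
    )


def reset_certificate(hostname):
    """Send both requests, skipping TLS verification (the equivalent of curl -k)."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    for request in (build_revoke_request(hostname), build_delete_request(hostname)):
        try:
            urllib.request.urlopen(request, context=context).close()
        except urllib.error.HTTPError as err:
            # For example, there may be no certificate yet on the very first launch.
            print("master returned HTTP %d for %s" % (err.code, request.get_method()))
```

You would call reset_certificate() with the instance hostname, then clear /var/lib/puppet/ssl and run puppet agent -t exactly as in the shell version.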

You can also pass a script like this as part of the Amazon EC2 process of launching an instance.

aws ec2 run-instances  --user-data file://./

Where is the name of the locally saved script file, and it is saved in the same directory as your working directory (otherwise include the absolute path).

With this in place, each time you launch a new instance, regardless of its hostname, it will revoke any existing cert that has the same hostname, and generate a new one.

Obviously, if you are launching hundreds of instances at the same time, you may have concurrency issues, and some other solution will be required.

Again, this is only a solution for environments where security is not an issue.

Install Ruby for Rails on Amazon Linux

A quick HOWTO on how to install Ruby for Rails on Amazon Linux

Check your Ruby version (bundled in Amazon Linux)

ruby -v
ruby 2.0.0p481 (2014-05-08 revision 45883) [x86_64-linux]

Check your sqlite3 version (bundled with Amazon Linux)

sqlite3 --version
3.7.17 2013-05-20 00:56:22 118a3b35693b134d56ebd780123b7fd6f1497668

Check Rubygems version (bundled with Amazon Linux)

gem -v

Install Rails (this sticks on the command line for a while, be patient. The extra parameters exclude the documentation, which if installed, can melt the CPU on smaller instances whilst compiling)

sudo gem install rails --no-ri --no-rdoc

Check Rails installed

rails --version
Rails 4.1.6

Install gcc (always handy to have)

sudo yum install -y gcc

Install ruby and sqlite development packages

sudo yum install -y ruby-devel sqlite-devel

Install node.js (Rails wants a JS interpreter)

 sudo bash
curl -sL | bash -
sudo yum install -y nodejs

Install the sqlite3 and io-console gems

gem install sqlite3 io-console

Make a blank app

mkdir myapp
cd myapp
rails new .

Start it (in the background)

bin/rails s &

Hit it

wget -qO- http://localhost:3000

Debug (Rails console)

bin/rails c

Application monitoring with Nagios and Elasticsearch

As the applications under your control grow, both in number and complexity, it becomes increasingly difficult to rely on predictive monitoring.

Predictive monitoring is monitoring things that you know should be happening. For instance, you know your web server should be accepting HTTP connections on TCP port 80, so you use a monitor to test that HTTP connections are possible on TCP port 80.

In more complex applications, it is harder to predict what may or may not go wrong; similarly, some things can’t be monitored in a predictive way, because your monitoring system may not be able to emulate the process that you want to monitor.

For example, let’s say your application sends Push messages to a mobile phone application. To monitor this thoroughly, you would have to have a monitor that persistently sends Push messages to a mobile phone, and some way of monitoring that the mobile phone received them.

At this stage, you need to invert your monitoring system, so that it stops asking if things are OK, and instead listens for applications that are telling it that they are not OK.

Using your application log files is one way to do this.

Well-written applications are generally quite vocal when it comes to being unwell, and will always describe an ERROR in their logs if something has gone wrong. What you need to do is find a way of linking your monitoring system to that message, so that it can alert you that something needs to be checked.

This doesn’t mean you can dispense with predictive monitoring altogether; what it does mean is that you don’t need to rely on predictive monitoring entirely (or in other words, you don’t need to be able to see into the future) to keep your applications healthy.

This is how I’ve implemented log based monitoring. This was something of a nut to crack, as our logs arise from an array of technologies and adhere to very few standards in terms of layout, logging levels and storage locations.

The first thing you need is a logstash implementation. Logstash comprises a stack of technologies: an agent to ship logs out to a Redis server; a Redis server to queue logs for indexing; a logstash server for creating indices and storing them in elasticsearch; an elasticsearch server to search your indices.

The setup of this stack is beyond this article; it’s well-described over on the logstash website, and is reasonably straightforward.

Once you have your logstash stack set up, you can start querying the elasticsearch search api for results. Queries are based on HTTP POST and JSON, and results are output in JSON.

Therefore, to test your logs, you need to issue an HTTP POST query from Nagios, check the results for ERROR strings, and alert accordingly.

The easiest way to have Nagios send a POST request with a JSON payload to elasticsearch is with the Nagios jmeter plugin, which allows you to create monitors based on your jmeter scripts.

All you need then is a correctly constructed JSON query to send to elasticsearch, which is where things get a bit trickier.

Without going into this in any great detail, formulating a well-constructed JSON query that will parse just the right log indices in elasticsearch isn’t easy. I cheated a little in this. I am familiar with the Apache Lucene syntax that the Logstash Javascript client, Kibana, uses, and was able to formulate my query based on this.

Kibana sends encrypted queries to elasticsearch, so you can’t pick them out of the HTTP POST/GET variables. Instead, I enabled logging of slow queries on elasticsearch (threshold set to 0s) so that I could see in the elasticsearch logs what exact queries were being run against elasticsearch. Here’s an example:

{
  "size": 100,
  "sort": {
    "@timestamp": {
      "order": "desc"
    }
  },
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "NOT @source_host:\"uatserver\"",
          "default_field": "_all",
          "default_operator": "OR"
        }
      },
      "filter": {
        "range": {
          "@timestamp": {
            "from": "2014-10-06T11:05:25+00:00",
            "to": "2014-10-06T12:05:25+00:00"
          }
        }
      }
    }
  },
  "from": 0
}

You can test a query like this by sending it straight to your elasticsearch API:

curl -XPOST 'http://localhost:9200/_search' -d '{"size":100,"sort":{"@timestamp":{"order":"desc"}},"query":{"filtered":{"query":{"query_string":{"query":"NOT @source_host:\"uatserver\"","default_field":"_all","default_operator":"OR"}},"filter":{"range":{"@timestamp":{"from":"2014-10-06T11:05:25+00:00","to":"2014-10-06T12:05:25+00:00"}}}}},"from":0}'

This returns a batch of up to 100 log entries whose source host is not “uatserver”, from the time window given in the range filter (one hour in this example).
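A monitor needs the time window to move with the clock rather than use fixed timestamps. The following is a rough Python sketch of generating the same query body for "the previous N minutes". The function name and its parameters are my own inventions for illustration; the JSON structure simply mirrors the query shown above.

```python
import json
from datetime import datetime, timedelta, timezone


def build_log_query(minutes=5, exclude_host="uatserver", size=100, now=None):
    """Return the elasticsearch query body for the last `minutes` minutes.

    `now` must be a timezone-aware datetime; it defaults to the current UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = now - timedelta(minutes=minutes)
    fmt = "%Y-%m-%dT%H:%M:%S+00:00"  # same timestamp layout as the example query
    return {
        "size": size,
        "sort": {"@timestamp": {"order": "desc"}},
        "query": {
            "filtered": {
                "query": {
                    "query_string": {
                        "query": 'NOT @source_host:"%s"' % exclude_host,
                        "default_field": "_all",
                        "default_operator": "OR",
                    }
                },
                "filter": {
                    "range": {
                        "@timestamp": {
                            "from": start.strftime(fmt),
                            "to": now.strftime(fmt),
                        }
                    }
                },
            }
        },
        "from": 0,
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into curl -d or a JMeter sampler.
    print(json.dumps(build_log_query(minutes=5)))
```

The printed JSON can be sent to the _search endpoint in the same way as the curl example.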

Now that we know what we want to send to elasticsearch, we can construct a simple jmeter script. In this, we simply specify an HTTP POST request, containing Body Data of the JSON given above, and include a Response Assertion for the strings we do not want to see in the logs.

We can then use that script in Nagios with the jmeter plugin. If the script finds the ERROR string in the logs, it will generate an alert.

2 things are important here:

The alert will only tell you that an error has appeared in the logs, not what that error was; and if the error isn’t persistent, the monitor will eventually recover.

Clearly, there is a lot of scope for false positives in this, so if your logs are full of tolerable errors (they shouldn’t be really) you are going to have to be more specific about your search strings.
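To make the "be more specific" point concrete, here is a small Python sketch of the decision the Response Assertion is making, with a list of tolerated strings added. It is not the JMeter script itself. It assumes the log text lives in an @message field of each hit (the old logstash layout) and uses the standard Nagios exit codes; the function and parameter names are hypothetical.

```python
OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def check_hits(response, bad_strings=("ERROR",), tolerated=()):
    """Inspect a parsed elasticsearch search response and return (exit_code, text).

    A hit counts as a problem if its message contains any of `bad_strings`
    and none of the `tolerated` strings.
    """
    hits = response.get("hits", {}).get("hits", [])
    offending = []
    for hit in hits:
        # Assumption: the raw log line is stored in the @message field.
        message = hit.get("_source", {}).get("@message", "")
        if not any(bad in message for bad in bad_strings):
            continue
        if any(skip in message for skip in tolerated):
            continue
        offending.append(message)
    if offending:
        return CRITICAL, "CRITICAL - %d matching log entries" % len(offending)
    return OK, "OK - no matching log entries"
```

As with the JMeter version, the result only says that something matched, so you still have to go to the logs to find out what.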

The good news is that if you get this all working, it’s very easy to create new monitors. Rather than writing bespoke scripts and working with Nagios plugins, all you need to do is change the queries and the Response Assertions in your JMeter script, and you should be able to monitor anything that is referenced in your application logs.

To assist in some small way, here is a link to a pre-baked JMeter script that includes an Apache Lucene query, and is also set up with the necessary Javascript-based date variables to search over the previous 15 minutes.

Command line tool for checking status of instances in Amazon EC2

I manage between 10 and 15 different Amazon AWS accounts for different companies.

When I needed to find out information about a particular instance, it was a pain to have to log into the web interface each time. Amazon do provide an API that allows you to query data about instances, but to use that, you need to store an Access Key and Secret on your local computer, which isn’t very safe when you’re dealing with multiple accounts.

To overcome this, I patched together Tim Kay’s excellent aws tool with GPG and a little PHP, to create a tool which allows you to query the status of all instances in a specific region in an Amazon EC2 account, using access credentials that are locally encrypted, so that storing them locally isn’t an issue.

Output from the tool is presented on a line by line basis, so you can use grep to filter the results.

Sample output: aws.account1 us-east-1

"logs-use"  running  m1.medium  us-east-1a  i-b344b7cb
"adb2-d-use"  running  m1.small  us-east-1d  i-07d3e963
"pms-a-use"  running  m1.medium  us-east-1a  i-90852ced
"s2-sc2-d-use"  running  m1.medium  us-east-1d  i-3d40b442
"ks2-sc3-d-use"  running  m1.small  us-east-1d  i-ed2ed492
"ks1-sc3-c-use"  running  m1.small  us-east-1c  i-6efb9612
"adb1-c-use"  running  m1.small  us-east-1c  i-98cf44e4
"s1-sc1-c-use"  running  m1.medium  us-east-1c  i-956a76e8
"sms2-d-use"  running  m1.medium  us-east-1d  i-a86ef686
"uatpms-a-use"  running  m1.small  us-east-1a  i-b8cf5399
"uatks1-sc3-c-use"  running  t1.micro  us-east-1c  i-de336dfe
"uats1-sc1-c"  running  m1.medium  us-east-1c  i-35396715
"uatadb1-c-use"  running  t1.micro  us-east-1c  i-4d316f6d
"sms1-c-use"  running  m1.medium  us-east-1c  i-31b29611

(Note that public ips have been changed in this example)
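Because each instance is on its own line, the output is also easy to post-process in a script if grep isn’t enough. A minimal Python sketch follows, assuming the five columns are name, state, instance type, availability zone and instance id as in the sample above; the field names are my own labels, not something the tool prints.

```python
import shlex

# Assumed column order, based on the sample output above.
FIELDS = ("name", "state", "type", "zone", "instance_id")


def parse_line(line):
    """Turn one output line into a dict; shlex strips the quotes around the name."""
    return dict(zip(FIELDS, shlex.split(line)))


def filter_instances(lines, **criteria):
    """Return the parsed instances whose fields equal all the given criteria."""
    parsed = (parse_line(line) for line in lines if line.strip())
    return [
        inst for inst in parsed
        if all(inst.get(key) == value for key, value in criteria.items())
    ]
```

For example, filter_instances(lines, zone="us-east-1d", type="m1.small") would pick out the small instances in that zone.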

You can obtain the tool from Bitbucket:

How to monitor the Amazon Linux AMI Security Patch RSS feed with Nagios

People who use Amazon AWS will be familiar with the Amazon Linux AMI, which is a machine image provided by Amazon with a stripped down installation of Linux.

The AMI acts as a starting point for building up your own AMIs, and has its own set of repos maintained by Amazon for obtaining software package updates.

Amazon also maintains an RSS feed, which announces the availability of new security patches for the AMI.

One of the requirements of PCI DSS V2 compliance is as follows:

6.4.5 Document process for applying security patches and software updates

That means you have to have a written down process for being alerted to and applying software patches to servers in your PCI DSS scope.

You could of course commit to reading the RSS feed every day, but that’s human intervention, which is never reliable. You could also set up your Amazon servers to simply take a system wide patch update every day, but if you’d prefer to review the necessity and impact of patches before applying them, that isn’t going to work.

Hence, having your monitoring system tell you if a new patch has been released for a specific software component would be a nice thing to have, and here it is, in the form of a Nagios plugin.

The plugin is written in PHP (I’m an ex-Web Developer), but PHP is just as capable when it comes to Nagios as Perl and Python (without the need for all those extra modules).

I’ve called it check_rss.php, as it can be used on any RSS feed. There is another check_rss Nagios plugin, but it won’t work in this instance, as it only checks the most recent post in the RSS stream, and doesn’t include any way to retire alerts.

You can obtain the Plugin from Bitbucket:

The script takes the following arguments:

“RSS Feed URL”

“Quoted, comma Separated list of strings you want to match in the post title”

“Number of posts you want to scan”

“Number of days for which you want the alert to remain active”



define command {
    command_name check_rss
    command_line $USER1$/check_rss.php $ARG1$ $ARG2$ $ARG3$ $ARG4$
}

check_command   check_rss!!"openssl"!30!3

You need to tell Nagios how long you want the alert to remain active, as you have no way of resolving the alert (ie you can’t remove it from the RSS feed).

This mechanism allows you to “silence” the alert after a number of days. This isn’t a feature of Nagios, rather of the script itself.

The monitor will alert if it finds *any* patches, and include *all* matching patches in its alert output.
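For anyone who wants to see the logic without reading the PHP, here is a rough Python sketch of the behaviour described above: scan the first N posts, match the title against a list of strings, ignore posts older than the "active" window, and report every match. This is not the plugin’s actual code. It assumes a standard RSS 2.0 feed with title and pubDate elements, and the choice of CRITICAL as the alert level is mine.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def check_feed(rss_xml, keywords, max_posts, active_days, now=None):
    """Return (exit_code, text) for an RSS document given as a string."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=active_days)
    matches = []
    items = ET.fromstring(rss_xml).findall("./channel/item")
    for item in items[:max_posts]:
        title = item.findtext("title", default="")
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue
        published = parsedate_to_datetime(pub_date)
        if published.tzinfo is None:
            # Treat dates without a usable offset as UTC.
            published = published.replace(tzinfo=timezone.utc)
        if published < cutoff:
            continue  # older than the active window, so the alert is "silenced"
        if any(word.lower() in title.lower() for word in keywords):
            matches.append(title)
    if matches:
        return CRITICAL, "CRITICAL - " + "; ".join(matches)
    return OK, "OK - no matching posts in the last %d days" % active_days
```

Fetching the feed over HTTP and splitting the comma-separated keyword argument are left out to keep the sketch short.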