Basic guide to the issues around CNAME at the zone apex

What is a CNAME?

In DNS, CNAME stands for “canonical name”.

“Canonical” is one of those words you hear every now and then in technology discussions, but not many people are exactly sure what it means. In effect, it means “official”, which can be further extrapolated to “universally accepted”.

So in a DNS zone file, a CNAME looks like this:

www.nightbluefruit.com. 3600 IN CNAME www.nightbluefruit.hostingcompany.com.

In this case, the canonical name for the server which hosts www.nightbluefruit.com is:

www.nightbluefruit.hostingcompany.com

An alias (“also known as”) of that canonical name is:

www.nightbluefruit.com

So if you request this name, which is an alias, the response you get will be the data for the canonical name, whatever type of record you ask for. For example, if you request the MX data for www.nightbluefruit.com, you should receive the MX data for www.nightbluefruit.hostingcompany.com.

At this point, it is important to understand that www.nightbluefruit.com is an alias of www.nightbluefruit.hostingcompany.com, and not the other way around.
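
To see this in action, you can query the alias with a tool like dig. A rough sketch of what comes back (the hostnames are the placeholders used above, and the IP address is purely illustrative):

dig www.nightbluefruit.com A

;; ANSWER SECTION:
www.nightbluefruit.com.                 3600 IN CNAME www.nightbluefruit.hostingcompany.com.
www.nightbluefruit.hostingcompany.com.  3600 IN A     203.0.113.10

The resolver follows the alias for you: the answer contains both the CNAME record and the A record of the canonical name.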

What is the zone apex?

The zone apex is the part of a DNS zone for which data exists for the entire domain, rather than for a specific host in the domain.

So if you want to define an MX record for the entire domain, nightbluefruit.com, you would create that record at the apex:

nightbluefruit.com. 3600 IN MX 10 mail.nightbluefruit.com.

It’s worth noting that there is a difference between a wildcard (default) record for a domain and the zone apex:

*.nightbluefruit.com. 3600 IN A 192.168.100.1

A wildcard defines the data that should be returned if a query is received for a name which does not have a specific match in the zone file.

What’s a CNAME at the zone apex?

A CNAME at the zone apex looks like this:

nightbluefruit.com. 3600 IN CNAME www.nightbluefruit.hostingcompany.com.

This is a common requirement for website owners who want to advertise their website without the “www” prefix, and who use the services of a third-party web hosting company that cannot provide them with a dedicated IP address and instead provides a canonical name for the website running on its infrastructure.

Why is it not allowed?

Given that so many website owners have the requirement outlined above, it seems incredible that this isn’t allowed, which is why it is such a hot topic.

Firstly, let’s clarify that there is no explicit RFC rule that says “you can’t have a CNAME at the zone apex”.

What the RFCs do say is that if you define a CNAME with a name (the left side of the record declaration) of “nightbluefruit.com”, you can’t create any other record of any other type using the same name. For example, the following would be non-compliant:

nightbluefruit.com. 3600 IN CNAME www.nightbluefruit.hostingcompany.com.
nightbluefruit.com. 3600 IN MX 10 mail.nightbluefruit.com.

The reason for this goes back to understanding that the “alias” is the left side and the “official” name is the right side. If something is “official”, there can only be one version of it. The first record in the above sequence tells DNS that www.nightbluefruit.hostingcompany.com is “official” for nightbluefruit.com, so DNS doesn’t want to know about any other records for nightbluefruit.com.

But, you’ll ask, why can’t DNS simply segregate “officialness” based on the record type? The answer to this is that the DNS standard came into being long before HTTP or any contemplation of shared web hosting, and it is no longer practical to reverse engineer all of the DNS software that has grown out of the standard to fit this specific use case.

Is this strict?

This is where it starts to get interesting.

Given that so many people want CNAMEs at their zone apex ($$$$), many software developers and their product managers have taken a sideways look at the RFC and determined that it permits a degree of flexibility in its implementation:

If a CNAME RR is present at a node, no other data should be present; this ensures that the data for a canonical name and its aliases cannot be different.

The key phrase is “should be”. The argument runs that the absence of the phrase “must be” is in fact a license to interpret the standard more liberally, and there are DNS server implementations on the market that will allow CNAME records to co-exist with other records with the same name.

If you’re using a web hosting or DNS provider who says you can have a CNAME at the zone apex, they will be using such an implementation. This isn’t something that exists only in the dark backstreets of web hosting. Global internet service providers like Cloudflare have permitted CNAMEs at the zone apex.

In truth, this interpretation exists on very shaky foundations. There is a more detailed discussion of this issue here.

How does this problem manifest itself?

RFCs exist to allow different people to design software that will interoperate. If person A is writing code to implement a DNS server, and person B is writing code to implement a DNS client, and they both follow the standard, the two pieces of software should work together. The RFC doesn’t tell them how to write code, but it does tell them how their code should behave.

When people begin interpreting the intent of an RFC, nothing good comes of it. It may not be immediately apparent, but the longer software exists, the more edge cases it has to deal with, and that’s where it becomes important that one piece of software can anticipate the response of another.

In terms of a practical example, this one is really good:
https://social.technet.microsoft.com/Forums/exchange/en-US/b3beefee-e353-44ec-b456-f2c70bcd1913/cname-issue?forum=exchange2010

In this case, MS Exchange Server 2010 stopped delivering mail to addresses whose DNS zone had a CNAME at the zone apex. The Exchange mail server was relying on a local DNS cache. Previously, someone had queried the A record for company.com and received a CNAME in response. That data was cached. Later, when the MX record for company.com was queried, the cache returned the cached CNAME, ignoring the fact that it had been obtained by an A query (this is compliant behaviour). The Exchange server correctly rejected this, as CNAMEs are not valid data for MX queries.

Are there workarounds?

The first workaround is to not implement the RFC standard. Some providers will tell you that this is their workaround, but it isn’t. It’s just chicanery, and you should avoid it.

The big cloud hosting companies are the best place to go for workarounds. Amazon AWS has a black-box solution in Route 53, called alias records, which behaves like a CNAME at the zone apex provided the target is an AWS resource (such as an Elastic Load Balancer).

The most en vogue workaround at the moment is what is called CNAME flattening.

What is CNAME flattening?

DNS server software that implements CNAME flattening permits the user to create a CNAME record alongside other records with the same name, which allows the user to create a CNAME at the zone apex. When you configure the zone file in this way, the server will accept it and start as normal.

When a query is then received for one of these records, rather than return the CNAME value, the server will go off and query the CNAME value, and any subsequent values, until it gets to an IP address, and then return that IP address as the response to the original query.
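
As a rough illustration (placeholder names and address again), the zone contains the CNAME, but clients querying the apex never see it:

nightbluefruit.com. 3600 IN CNAME www.nightbluefruit.hostingcompany.com.

; what a client querying the apex A record actually receives:
; nightbluefruit.com. 3600 IN A 203.0.113.10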

Is CNAME flattening standards compliant?

Yes and no.

On the one hand, it permits the existence of something that the RFC says should not be present, but equally, it responds to queries in a way that is RFC compliant.

Whether a user wants to rely on CNAME flattening is a call they will have to make according to their individual circumstances.

Using the map directive to resolve multiple conditions in Nginx

As your Nginx configuration expands and becomes more complex, you will inevitably be faced with a situation in which you have to apply configuration directives on a conditional basis.

Nginx includes an if directive, but they really don’t want you to use it: https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/

The alternative that is generally recommended is the map directive, which is super-efficient as map directives are only evaluated when they are used: http://nginx.org/en/docs/http/ngx_http_map_module.html

This isn’t quite as intuitive as the if directive, and newcomers to Nginx can struggle with its logic.

Basically put, map will create a new variable and assign the value of that variable based on the value of another variable, e.g.

$new_variable = 1 if $existing_variable = true

or

$new_variable = 0 if $existing_variable = false.

In Nginx configuration, that looks like:

map $existing_variable $new_variable {
    "true" 1;
    "false" 0;
}

You can leverage this in conditional configuration by assigning a configuration value to a new variable, and using that variable in your configuration. For example, to use a different whitelist depending on the source IP address of the request:

map $remote_addr $whitelist {
    "192.168.100.1" "whitelist1.incl";
    "192.168.100.2" "whitelist2.incl";
}

location / {
   ...
   include $whitelist;
   ...
}

This works fine when you want to set the value of a new variable based on the evaluation of one other variable, but in a typical if statement, you can evaluate multiple variables at the same time:

if ( $a == 1 && $b == 2 ) etc etc

Can you do the same thing with map?

Some people will tell you that you can’t, which is technically true, but you can “couple” map blocks to produce the same effect. Let’s use the following example:

If a request is for host "example.com" and the source ip address is 192.168.100.1, return a different home page.

In this instance, we need 2 map blocks:

  • One to test if the host name is “example.com”
  • One to test if the source ip address is 192.168.100.1

and 2 variables:

  • One to hold the value of the home page file ($index_page)
  • One to link the map conditions together ($map_link)

#Test the host name, and assign the home page filename value to the $map_link variable
map $host $map_link {
    default "index.html";
    "example.com" "index_new.html";
}

#Test the source address, and if it matches the relevant address, interpolate the value assigned from the previous map
map $remote_addr $index_page {
    default "index.html";
    "192.168.100.1" "${map_link}";
}

location / {
    ....
    index $index_page;
    ....
}

The full logic here is as follows:

If the hostname is “example.com”, we should provisionally make the home page file “index_new.html”. Otherwise, the home page file should be “index.html”.

If the source ip address is 192.168.100.1, which is the second part of our test, we should refer to the result of the first part of our test. Otherwise, we can ignore the first part of our test and use the default value “index.html”.

Stagduction

Stagduction

(Noun) A web application state in which the service provided is not monitored, not redundant and has not been performance tested, but which is in use by a large community of people as a result of poor planning, poor communication and over-zealous sales people.

Is Skype an appropriate tool in corporate environments?

This is a question that has plagued me for several years, in that I have never been able to establish a consistent level of Skype quality in a corporate environment, despite having lots of bandwidth and engaging the consultancy services of CCIE-level network experts.

The answer to the question is ultimately, no.

Let me explain by running through the questions.

1. How does Skype work at a network level?

Skype is a “Peer To Peer” (P2P) application. That means that when 2 people are having a Skype conversation, their computers *should* be directly connected, rather than connected via a 3rd computer. For the sake of comparison, Google Hangouts is not a P2P application. Google Hangout participants connect to each other via Google Conference Servers.

2. Does Skype work with UDP or TCP?

Skype’s preference is for UDP, and when Skype can establish a direct P2P connection using UDP, which is typically the case for residential users, call quality is very good. This is because UDP is a much faster protocol than TCP when used for streaming audio and video.

3. What’s the difference between residential and corporate users?

Residential internet connections are typically allocated their own public IP address (even if it changes from time to time). This IP gets registered to a Skype user on Skype’s servers, so when someone needs to contact that user, Skype knows where to direct the call, and can use UDP to establish a call between the participating users.

In corporate environments, where there are lots of users using the same internet connection, a single public IP address has to be shared between those users (Port Address Translation). That means that the Skype servers will have registered the same public IP address for all the users in that organisation. As a result, Skype is not able to establish a direct UDP P2P connection between a user outside that organisation and a user inside it, and has to use other means to make that connection.

4. What are those other means?

When direct connectivity between clients is not possible, Skype uses a process called “UDP hole punching”. In this mechanism, 2 computers that cannot communicate directly with each other communicate with one or more third party computers that can communicate with both computers.

Connection information is passed between the computers in order to try and establish a direct connection between the 2 computers participating in the Skype call.

If ultimately a direct connection cannot be established, Skype will use the intermediary computers to relay the connection between the 2 computers participating in the conversation.

In Skype terminology, these are known as “relay nodes”, which are basically just computers running Skype that have direct UDP P2P capability (typically residential users with good broadband speeds).

From the Skype Administrators Manual:

http://download.skype.com/share/business/guides/skype-it-administrators-guide.pdf

2.2.4 Relays

If a Skype client can’t communicate directly with another client, it will find the appropriate relays for the connection and call traffic. The nodes will then try connecting directly to the relays. They distribute media and signalling information between multiple relays for fault tolerance purposes. The relay nodes forward traffic between the ordinary nodes. Skype communication (IM, voice, video, file transfer) maintains its encryption end-to-end between the two nodes, even with relay nodes inserted.

As with supernodes, most business users are rarely relays, as relays must be reachable directly from the internet. Skype software minimizes disruption to the relay node’s performance by limiting the amount of bandwidth transferred per relay session. 

5. Does that mean that corporate Skype traffic is being relayed via anonymous third party computers?

Yes. The traffic is encrypted, but it is still relayed through other unknown hosts if a direct connection between 2 Skype users is not possible.

6. Is this why performance in corporate environments is sometimes not good?

Yes. If a Skype conversation is dependent on one or more relay nodes, and one of those nodes experiences congestion, this will impact the quality of the call.

7. Surely, there is some solution to this?

A corporate network can deploy a proxy server, which is directly mapped to a dedicated public ip address. Ideally, this should be a UDP-enabled SOCKS5 server, but a TCP HTTP Proxy server can also be used. If all Skype connections are relayed through this server, Skype does not have to use relay nodes, as Port Address Translation is not in use.

8. So what’s the catch?

The problem with this solution is that it is not generally possible to force the Skype client to use a Proxy Server. When the client is configured to use a Proxy Server, it will only use it if there is no other way to connect to the Internet. So, if you have a direct Internet connection, even one based on Port Address Translation, which impacts on Skype quality, Skype will continue to use this, even if a better solution is available via a Proxy Server.

9. Why would Skype do this?

Skype is owned by Microsoft. Skype has a business product that attaches to Microsoft Active Directory and allows you to force a Proxy connection. So if you invest in a Microsoft network, Microsoft will give you a solution to enable better Skype performance in corporate networks. If you don’t want to invest in a Microsoft network, you’re stuck, and your only option is to block all outbound Internet access from your network and divert it via your Proxy server.

For a lot of companies, particularly software development companies who depend on 3rd party web services, this is not a practical option.

10. What is the solution?

At this time the primary options for desktop Audio/Video conferencing are either Skype or Google Hangouts.

When Skype can be used in an environment where P2P UDP connectivity is “always on”, it provides a superior audio/video experience to Google Hangouts, which is not P2P, and which communicates via central Google Servers.

Where an environment uses Port Address Translation, Skype performance will depend on the ability of Skype client to establish connections via relays, which means Skype performance becomes dependent on the resources available to those relays.

In this instance, Google Hangouts may be a better choice where consistent quality is required, as quality can be guaranteed by providing sufficient bandwidth between the corporate network and Google.

How to install and setup Logstash

So you’ve finally decided to put a system in place to deal with the tsunami of logs your web applications are generating, and you’ve looked here and there for something Open Source, and you’ve found Logstash, and you’ve had a go at setting it up…

…and then you’ve lost all will to live?

And maybe, too, you’ve found that every trawl through Google for some decent documentation leads you to a video of some guy giving a presentation about Logstash at some geeky conference, in which he talks in really general terms about Logstash, and doesn’t give you any clues as to how you go about bringing it into existence?

Yes? Well, hopefully by landing here your troubles are over, because I’m going to tell you how to set up Logstash from scratch.

First, let’s explain the parts and what they do. Logstash is in fact a collection of different technologies, in which the Java programme, Logstash, is only a part.

The Shipper

This is the bit that reads the logs and sends them for processing. This is handled by the Logstash Java programme.

Grok

This is the bit that takes logs that have no uniform structure and gives them a structure that you define. This occurs prior to the logs being shipped. Grok is a standalone technology. Logstash uses its shared libraries.

Redis

This is a standalone technology that acts as a broker. Think of it like a turnstile at a football ground. It allows multiple events (ie lines of logs) to queue up, and then spits them out in a nice orderly line.

The Indexer

This takes the nice ordered output from Redis, which is neatly structured, and indexes it, for faster searching. This is handled by the Logstash Java programme.

Elasticsearch

This is a standalone technology, into which The Indexer funnels data, which stores the data and provides search capabilities.

The Web Interface

This is the bit that provides a User Interface to search the data that has been stored in Elasticsearch. You can run the web server that is provided by the Logstash Java programme, or you can run the Ruby HTML/Javascript based web server client, Kibana. Both use the Apache Lucene structured query language, but Kibana has more features, a better UI and is less buggy (IMO).

(Kibana 2 was a Ruby based server side application. Kibana 3 is a HTML/Javascript based client side application. Both connect to an ElasticSearch backend).

That’s all the bits, so let’s talk about setting it up.

First off, use a server OS that has access to lots of RPM repos. CentOS and Amazon Linux (for Amazon AWS users) are a safe bet, Ubuntu slightly less so.

For Redis, Elasticsearch and the Logstash programme itself, follow the instructions here:

http://logstash.net/docs/1.2.1/

(We’ll talk about starting services at bootup later)

Re. the above link, don’t bother working through the rest of the tutorial beyond the installation of the software. It demos Logstash using STDIN and STDOUT, which will only serve to confuse you. Just make sure that Redis, Elasticsearch and Logstash are installed and can be executed.

Now, on a separate system, we will set up the Shipper. For this, all you need is the Java Logstash programme and a shipper.conf config file.

Let’s deal with 2 real-life, practical scenarios:

1. You want to send live logs to Logstash
2. You want to send old logs to Logstash

1. Live logs

Construct a shipper.conf file as follows:

input {

   file {
      type => "apache"
      path => [ "/var/log/httpd/access.log" ]
   }

}

output {
   stdout { debug => true debug_format => "json"}
   redis { host => "" data_type => "list" key => "logstash" }
}

What this says:

Your input is a file, located at /var/log/httpd/access.log, and you want to record the content of this file as the type “apache”. You can use wildcards in your specification of the log file, and type can be anything.

You want to output to 2 places: firstly, your terminal screen, and secondly, to the Redis service running on your Logstash server

2. Old logs

Construct a shipper.conf file as follows:

input {

tcp {
type => "apache"
port => 3333
}

}

output {
stdout { debug => true debug_format => "json"}
redis { host => "" data_type => "list" key => "logstash" }
}

What this says:

Your input is whatever data is received on TCP port 3333, and you want to record that data as the type “apache”. As before, the type can be anything.

You want to output to 2 places: firstly, your terminal screen, and secondly, to the Redis service running on your Logstash server.

That’s all you need to do for now on the Shipper. Don’t run anything yet. Go back to your main Logstash server.

In the docs supplied at the Logstash website, you were given instructions on how to install Redis, Logstash and Elasticsearch, including the Logstash web server. We are not going to use the Logstash web server; we will use Kibana instead, so you’ll need to set up Kibana (3, not 2; version 2 is a Ruby based server side application).

https://github.com/elasticsearch/kibana/

Onward…

(We’re going to be starting various services in the terminal now, so you will need to open several terminal windows)

Now, start the Redis service on the command line:

./src/redis-server --loglevel verbose

Next, construct an indexer.conf file for the Indexer:

input {
   redis {
      host => "127.0.0.1"
      type => "redis-input"
      # these settings should match the output of the agent
      data_type => "list"
      key => "logstash"

      # We use json_event here since the sender is a logstash agent
      format => "json_event"
   }
}

output {
   stdout { debug => true debug_format => "json"}

   elasticsearch {
      host => "127.0.0.1"
   }
}

This should be self-explanatory: the Indexer is taking input from Redis, and sending it to Elasticsearch.

Now start the Indexer:

java -jar logstash-1.2.1-flatjar.jar agent -f indexer.conf

Next, start Elasticsearch:

./elasticsearch -f

Finally, crank up Kibana.

You should now be able to access Kibana at:

http://yourserveraddress:5601

Now that we have all the elements on the Logstash server installed and running, we can go back to the shipping server and start spitting out some logs.

Regardless of how you’ve set up your shipping server (live logs or old logs), starting the shipping process involves the same command:

java -jar logstash-1.2.1-flatjar.jar agent -f shipper.conf

If you’re shipping live logs, that’s all you will need to do. If you are shipping old logs, you will need to pipe them to the TCP port you opened in your shipper.conf file. Do this in a separate terminal window:

nc localhost 3333 < /var/log/httpd/old_apache.log

Our shipping configuration is setup to output logs both to STDOUT and Redis, so you should see lines of logs appearing on your terminal screen. If the shipper can’t contact Redis, it will tell you it can’t contact Redis.

Once you see logs being shipped, go back to your Kibana interface and run a search for content.

IMPORTANT: if your shipper is sending old logs, you need to search for logs from a time period that exists in those logs. There is no point in searching for content from the last 15 mins if you are injecting logs from last year.

Hopefully, you’ll see results in the Kibana window. If you want to learn the ins and outs of what Kibana can do, have a look at the Kibana website. If Kibana is reporting errors, retrace the steps above, and ensure that all of the components are running, and that all necessary firewall ports are open.

2 tasks now remain: using Grok and setting up all the components to run as services at startup.

Init scripts for Redis, ElasticSearch and Kibana are easy to find through Google. You’ll need to edit them to ensure they are correctly configured for your environment. Also, for the Kibana init script, ensure you use the kibana-daemon.rb Ruby script rather than the basic kibana.rb version.

Place the various scripts in /etc/init.d, and, again on CentOS, set them up to start at boot using chkconfig, and control them with the service command.
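
On CentOS, that looks something like the following (assuming you have named the init scripts redis, elasticsearch and kibana; adjust to whatever names you actually used):

chkconfig --add redis
chkconfig redis on
service redis start

# and the same again for the other components
chkconfig --add elasticsearch && chkconfig elasticsearch on && service elasticsearch start
chkconfig --add kibana && chkconfig kibana on && service kibana start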

Grok isn’t quite so easy.

The code is available from here:

https://github.com/jordansissel/grok/

You can download a tarball of it from here:

https://github.com/jordansissel/grok/archive/master.zip

Grok has quite a few dependencies, which are listed in its docs. I was able to get all of these on CentOS using yum and the EPEL repos:

rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/$(uname -i)/epel-release-5-4.noarch.rpm

then

yum install -y gcc gperf make libevent-devel pcre-devel tokyocabinet-devel

Also, after you have compiled grok, make sure you run ldconfig, so that its libraries are shared with Logstash.
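
For reference, the build steps look roughly like this (assuming the standard make targets in the grok source tree):

unzip master.zip
cd grok-master
make
sudo make install

# share the newly installed libraries with other programmes, including Logstash
sudo ldconfig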

How to explain Grok?

In the general development of software over the last 20-30 years, very little thought has gone into the structure of log files, which means we have lots of different structures in log files.

Grok allows you to "re-process" logs from different sources so that you can give them all the same structure. This structure is then saved in Elasticsearch, which makes querying logs from different sources much easier.

Even if you are not processing logs from different sources, Grok is useful, in that you can give the different parts of a line of a log field names, which again makes querying much easier.

Grok "re-processing", or filtering, as it is called, occurs in the same place as your Shipper, so we add the Grok config to the shipper.conf file.

This involves matching the various components in your log format to Grok data types, or patterns as they are referred to in Grok. Probably the easiest way to do this is with this really useful Grok debugger:

http://grokdebug.herokuapp.com/

Cut and paste a line from one of your logs into the input field, and then experiment with the available Grok patterns until you see a nice clean JSON object rendered in the output field below.
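
Once you have a working pattern, it goes into a filter block in shipper.conf. A minimal sketch for a standard Apache combined log (the %{COMBINEDAPACHELOG} pattern ships with Logstash; option names vary slightly between Logstash versions, so check the docs for yours):

filter {
   grok {
      type => "apache"
      # split each log line into named fields (clientip, verb, request, response, bytes, etc.)
      match => [ "message", "%{COMBINEDAPACHELOG}" ]
   }
}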

The Joy of SIP

It’s hard to think of a technology that has more fundamentally revolutionized the way we do things than IP telephony, and SIP in particular.

Back in the early 1990s, I was charged with installing a telephone system in an office building that would house 50+ users. I had to buy an Ericsson PABX, which took up an entire corner of the comms room, pull up all the floor panels to route new cables to the desks, buy all sorts of expansion cards to accommodate voicemail and PSTN interfaces, and buy 50 bulky handsets so that users could actually use the thing from their desks. I then had to go off to ESAT and get 8 actual telephone lines, which took 4 weeks to get installed. The whole project cost about €17k, and that was nearly 20 years ago.

Let’s compare that to my most recent installation of a telephone system, earlier this year, again in an office of approximately 50 users.

For the PABX, I bought a tower PC from Amazon which cost €299. I installed the OpenSource software PABX, Asterisk, on this, which took about 20 mins. I then purchased a business package from the VOIP provider, Blueface, which included 8 dedicated Direct Dial extensions, which took about 30 minutes. I then configured Asterisk to use the Blueface SIP trunks, which took another 30 minutes.

Next to handsets. Did I buy 50 handsets?

No, not this time.

This time I purchased 1 handset for the main office number, which cost €99. For everybody else, we installed a SIP Client on their mobile phone, such as CSipSimple, and allocated an internal extension to everybody who wanted one. This client connects to Asterisk over the office’s main WIFI network.
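
For a flavour of what that looks like in Asterisk (a minimal chan_sip sketch; the provider hostname, account details and extension numbers are all placeholders, and your provider will supply the real settings):

; sip.conf
[blueface]                     ; the SIP trunk to the VOIP provider
type=peer
host=sip.provider.example      ; placeholder hostname
username=your_account
secret=your_password
context=from-trunk

[101]                          ; an internal extension for a mobile SIP client
type=friend
host=dynamic
secret=ext101_password
context=internal

; extensions.conf
[internal]
exten => 101,1,Dial(SIP/101,20)             ; ring extension 101's SIP client
exten => _X.,1,Dial(SIP/blueface/${EXTEN})  ; anything else goes out via the trunk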

Basically, that means that everybody’s mobile phone doubles up as their internal office extension. When somebody rings their extension, their mobile phone rings, or when they want to make a call, either internally or externally, they use their mobile phone. Any external calls they make are charged to the company via the Blueface account. In addition, all users are now fully mobile, so if they are in a meeting in another part of the building, they can still be contacted on their internal extension.

This company also has external IPSEC VPN access, so for users who are on the road a lot, I installed a VPN client on their phones, which means they can use their internal office phone extension from anywhere in the world, whether they are on a WIFI/3G/LTE connection.

This company is also multi-site. They have offices in different continents, linked together by a VPN. That means that if a manager is on trip to the office in another continent, they can still be contacted on their internal extension, all over IP, because their mobile SIP client is connected into the main corporate LAN over the WIFI.

Meanwhile, that PABX I bought way back when is probably still rotting in a landfill somewhere.

Cache behavior in mobile browsers

I recently had to deal with a situation in which a web server, which served content primarily to mobile devices, was constantly running out of disk space.

The reason for this was that Apache was generating GBs of logs each day. The site associated with the server was a busy site, but it still seemed strange that the logs would grow to that magnitude.

A cursory examination of the logs showed that there were endless requests for static assets, like images, CSS files and JavaScript files, which would normally be cached by the browser. The requests were being responded to with HTTP 304 responses, which was even stranger, as the web server was configured to set an Expires header of 14 days on all such files.

A word about HTTP responses before we continue.

When a browser first requests a file, the web server will fetch that file from the file system and deliver it over the network with an HTTP 200 response. By default, the browser will then store that file in its cache. If the browser needs to get the same file again, because it considers the copy in its cache to be too old, it will send what is referred to as a Conditional GET request to the web server (by including a special HTTP header in the request, which includes the date the browser last accessed the file), asking the web server to send the file again only if it has not been modified since the last request.

The web server then checks the last modified date on the file, and if it is the same as before, the web server will not send the actual file, but will instead issue a HTTP 304 response. This tells the browser that the file has not been modified since it was last accessed, and that it is safe to load that file from its cache.

Setting an Expires header of 14 days on files means that when those files are stored in the cache, the browser will only make a Conditional GET request for any such file once 14 days have elapsed since it was first accessed and stored in the cache. This is a trade-off between performance and control: your web server gets fewer requests, but there may be a delay in a user seeing an updated file.
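
To make that concrete, a Conditional GET exchange looks roughly like this (the file name and dates are purely illustrative):

GET /images/logo.png HTTP/1.1
Host: www.example.com
If-Modified-Since: Mon, 01 Oct 2012 09:00:00 GMT

HTTP/1.1 304 Not Modified
Date: Mon, 15 Oct 2012 10:30:00 GMT
Expires: Mon, 29 Oct 2012 10:30:00 GMT

The 304 response carries no body, so only a few hundred bytes cross the network instead of the full image.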

In theory, this means that something like an image file should only be requested from the web server once every 14 days, which meant the behavior I was seeing in the logs was very strange indeed.

To get to the bottom of this, I ran some Analog analysis on a week’s worth of logs. I targeted requests for a single image file: in the first pass, I looked for the number of HTTP 200 responses for that image, and in the second pass, I looked for the number of HTTP 304 responses for the same file. I then did a comparison of the profile of mobile browsers making those requests.

The results are given below:

Browser        Status 200        %   Status 304        %
Total              206448  100.00%       159266  100.00%
Android 2.1           917    0.44%         1449    0.91%
Android 2.2          3964    1.92%        10750    6.75%
Android 2.3         31692   15.35%        78638   49.38%
Android 3.0            57    0.03%          298    0.19%
Android 3.1           626    0.30%         2096    1.32%
Android 3.2          1938    0.94%         7332    4.60%
Android 4.0         28381   13.75%          468    0.29%
Android 4.1          1693    0.82%           78    0.05%
iOS 2                  16    0.01%            0    0.00%
iOS 3                 357    0.17%           61    0.04%
iOS 4               16975    8.22%        51845   32.55%
iOS 5               53818   26.07%         1412    0.89%
iOS 6*              57079   27.65%         2343    1.47%
iPod 4                878    0.43%         2181    1.37%
iPod 5               1502    0.73%           24    0.02%
iPod 6               1115    0.54%           30    0.02%
iPad 5                 48    0.02%            2    0.00%
iPad 6                 72    0.03%            2    0.00%

*iOS 6 = iPhone 5, iOS 5 = iPhone 4, etc…please don’t go out looking for the iPhone 6 in the shops!

The rows for Android 2.3 and iOS 4 show the source of the problem.

Based on this data, it would appear that Android 2.3 browsers, and iOS 4 browsers (iPhone 3), have very limited caching capability.

Between them, they account for 23.57% of traffic on the site in the period in question, but 81.93% of Conditional GET requests. This would seem to suggest issues with the caching function in these browsers, which is most likely due to the cache space available to them reaching capacity.

What seems to be happening is that either new files are not being written to the cache (these phones had limited disk space), causing the browser to constantly refer to an out-of-date Expires date on the files in the cache, or the cache is simply not functioning correctly, causing the browser to issue unnecessary Conditional GET requests.

The implications of this are pretty significant in terms of mobile web performance, which typically relies on a much lower bandwidth capacity than the PC web.

Yes, it is true that Android 2.3 and iOS 4.0 are dropping out of the mix as new handsets come on the market, but given the amount of HTTP requests they generate, even a small population of older devices will have an impact on server performance.

Compare the relative data for iOS 4 and iOS 5 in the table above. There are 4 times as many standard HTTP 200 responses for iOS 5 (iPhone 4), indicating that there are 4 times as many iPhone 4s as iPhone 3s in use on the site, but when you factor in the HTTP 304 responses, the total number of actual HTTP requests issuing from iPhone 3s is greater!

I hope to run this analysis again in 6 months time. The results should make for interesting reading.

Creating an IPSEC VPN between a Cisco ASA router and an Amazon EC2 VPC

Amazon Web Services provides a service whereby you can extend your corporate network into the Amazon Cloud using an IPSEC VPN. They call this Virtual Private Cloud, and it’s a nifty solution when you need to connect something that you can’t put in the Cloud with something that you can put in the Cloud.

The premise is quite simple.

Amazon allows you to create an IP subnet with a CIDR of your own choosing, and then creates a sort of virtual router for you that both terminates a VPN to your corporate network and provides the necessary routing table to route packets between that CIDR and your corporate network.

Once the setup is complete, the Amazon interface spurts out configuration syntax for various types of devices (Cisco, Juniper) which you then apply on your end to bring the VPN between the network to life.

Fine, except here’s the problem.

If you have a Cisco router on your side of the VPN, which you most probably do, Amazon requires that you use VTI and BGP to create the VPN.

These technologies are not available on all Cisco routers, and in particular not on mid-range Cisco ASA routers, which are still in common usage around the Internet. Where these routers are used to create VPNs, they typically use the traditional “crypto map” method, which is a little less intuitive than using VTI and BGP.

So, what can you do?

Well, the good news is that it is still possible to create an IPSEC connection between your corporate network and your VPC while using a CIDR of your own choosing (as opposed to the randomly allocated one that Amazon gives you when you create a server instance).

The process is as follows:

Create a VPC in the normal way, and choose your CIDR. It doesn’t matter what kind of VPC you create. When asked for the peer address on your side, give the public address of your router (although we won’t actually use this).

Create a server instance in that VPC

Create an Internet Gateway for that VPC

Allocate an Elastic IP Address for the VPC and associate it with your server instance

Make sure you can connect to your instance and open a shell

Install the OpenSource IPSEC package, OpenSwan, and configure your Cisco router, as described in this post.
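
As a rough idea of the OpenSwan side, a minimal /etc/ipsec.conf connection might look like this (all addresses, subnets and algorithm choices are placeholders and need to match what you configure on the ASA; the matching pre-shared key goes in /etc/ipsec.secrets):

conn corporate
    type=tunnel
    authby=secret
    left=%defaultroute            # the EC2 instance (behind its Elastic IP)
    leftid=203.0.113.10           # the Elastic IP
    leftsubnet=10.10.0.0/16       # the VPC CIDR you chose
    right=198.51.100.1            # the public address of the office ASA
    rightsubnet=192.168.0.0/24    # the corporate LAN
    ike=aes128-sha1;modp1024
    phase2alg=aes128-sha1
    pfs=no
    auto=start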

Make sure that this establishes a VPN between the server instance and your Cisco ASA (be careful with ACLs).

Now, go back to your VPC configuration in Amazon, and look at the route table that has been created for you. The target for your local CIDR should be “local”, and the target for everything else (0.0.0.0/0) should be your Internet Gateway (igw-something). Delete any other entries.

If you want other server instances in the VPC to use the VPN, you will have to set up local routes (to your corporate LAN CIDR) on them that point to the server instance hosting the VPN.

And that should be it.

So what is actually happening here?

Basically, you’re setting up a VPC so that you can use a CIDR of your own choosing, but then you’re replacing the VTI-dependent VPN that Amazon creates for you with an OpenSwan-Cisco ASA VPN. As such, the Amazon VPN is actually obsolete, although you will have to keep the VPC running (and pay for it) in order that you can use the CIDR of your own choosing.

The reason that many people need to choose their own CIDRs is that they are setting up permanent services, which need to have fixed ip addresses which they can create permanent routes to.

If this isn’t required for your project, and you’re happy for Amazon to change the private ip address of your server instance when it is stopped/started, you don’t need to worry about creating (or paying for) a VPC.

You can just deploy OpenSwan on your server instance and away you go.

Facebook and Heroku hosting

Update: Full HOWTO instructions here

Creating customised content for Fan Pages has become a big deal for companies, large and small alike. Whether it’s a competition, a promotion or some sort of gimmick, the use of content to get a user to Like your Fan Page is now standard practice on the Facebook platform.

One of the challenges that many companies, particularly small ones, come up against in this regard, is the issue of hosting, in that content, typically presented on Fan Page tabs, has to be hosted somewhere.

Up until the middle of last year, this wasn’t too big a problem, in that most companies had access to basic hosting by using the same service that their website was hosted on.

Then, Facebook decided to make it a requirement that all Tab content be hosted on secure links, which meant that if you wanted to host content for a Tab, your hosting service had to be capable of serving content over the https protocol, which meant that you needed a digital certificate installed with your website.

This is something that isn’t needed for the vast majority of websites, so all those companies who had previously hosted their tab content on the same system as their website suddenly found that their hosting service was no longer suitably equipped to meet Facebook’s requirements.

Their only option was to go off and pay some tech type to set up a digital certificate for them, which they would then have to renew at regular intervals, paying the tech type each time.

This obviously generated a bit of consternation in the Facebook user community, which finally provoked a response.

Facebook have teamed up with the Cloud Computing provider, Heroku, who offer basic Facebook-ready hosting services, for free, to anyone who wants to host content on a Tab on their Fan Page.

Sounds great, doesn’t it?

Well, all in all, it is, but the only problem is that it isn’t that user friendly.

Yes, when you set up your app on Facebook you have the option to set up Heroku hosting at the same time, but the process of actually publishing your content using that hosting is another matter altogether.

Also, the basic package is very basic. For instance, the database service that they provide will accommodate only 5MB of data. That’s probably fine for most short-term promotions or competitions, but not for any sort of permanent app that is going to collect data over a period of time.

If you want more thrust, you’ve got to get your credit card out.

Anyway, it’s still a welcome development, so let’s not be too cynical.

The basic mechanics of using Heroku are thus:

After you select Heroku as your hosting option when setting up your app, you will be guided through the process of setting up a Heroku account on foot of which you will receive an instructional email.

This will show you how to verify your account and give you basic usage instructions.

On Ubuntu, this involves downloading what is called the Heroku Toolbelt, which is a suite of shell tools that allow you to work with your Heroku hosting package.

The Toolbelt for Ubuntu and Linux in general is based on the version control system git, which allows you to update your code and push that code to the hosting server provided by Heroku.

When you create a Facebook application that uses Heroku, Heroku automatically creates a sample application in your git repository. You can trial the tools by “cloning” that app to your local environment, updating it and then pushing it out to the hosting platform. This can seem a bit daunting at first, but it becomes pretty straightforward once you follow the tutorials and do it a few times.
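
The basic cycle, once your key is set up, looks something like this (the app name is a placeholder; the welcome email tells you the real one):

# grab the sample app Heroku created for you
git clone git@heroku.com:your-app-name.git
cd your-app-name

# make your changes, then commit and push them back to Heroku's hosting servers
git commit -am "update tab content"
git push origin master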

Use of the Toolbelt is also based on OpenSSL public/private key encryption, which means one of the first things you have to do is upload a public key to the Heroku system. Heroku presumes that you’ve only ever created one public/private key pair, so if like me you have multiple pairs, you will need to create a .ssh/config file that ensures that Heroku is presented with the correct private key.
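
A minimal ~/.ssh/config entry for that scenario might look like this (the key file name is whatever you called the pair you uploaded to Heroku):

Host heroku.com
    User git
    IdentityFile ~/.ssh/id_rsa_heroku
    IdentitiesOnly yes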

Using databases is also a bit of a challenge.

Heroku provides access to a Postgres database instance for each application you create, which is a bit odd, given that 99.9% of Facebook apps are created with MySQL.

Heroku tells you that most of your MySQL code will work with Postgres, but the word “most” is never very comforting, so if you want access to a genuine MySQL system, you will need to use what Heroku refers to as an “add-on”, such as the add-on for the cloud MySQL database service, ClearDB.

To use this, you have to give Heroku your credit card number, but the basic Ignite service provided by ClearDB is free to use. Like Heroku’s own database, this gives you 5MB of data storage.

ClearDB is based in the Amazon EC2 Cloud Computing Environment in the US, which means that your MySQL instance is going to be at a remove from your hosting environment. This shouldn’t be an issue provided you aren’t transferring large amounts of data in/out of your database.

Once you set up the add-on, you’ll be given the username/password combo for the MySQL server you will be using, plus the hostname and the database name, which means you can log in with the MySQL client or phpMyAdmin and manipulate it in the normal fashion.

There’s a lot more detail to the use of Heroku than described above, and it isn’t for novices.

However, if you are a Facebook app developer, and you consistently get development requests from clients who don’t have access to digitally-certified hosting services, learning how to use Heroku will be well worth your while.

Design a Simple Tab Based Widget

Widgets appear all over web sites these days. Little boxes of content that you can unwrap to reveal all sorts of wonderful information.

As a web developer, you will often be asked to provide them, and there are various ways of doing this.

Here is my recipe. It doesn’t use any graphics, and relies instead entirely on CSS and JQuery, which makes it really light and easy to use.