Category Archives: The Internet

Stagduction

(Noun) A web application state in which the service provided is not monitored, not redundant and has not been performance tested, but which is in use by a large community of people as a result of poor planning, poor communication and over-zealous sales people.

Is Skype an appropriate tool in corporate environments?

This is a question that has plagued me for several years, in that I have never been able to establish a consistent level of Skype quality in a corporate environment, despite having lots of bandwidth and having engaged the consultancy services of CCIE-level network experts.

The answer to the question is, ultimately, no.

Let me explain by running through the questions.

1. How does Skype work at a network level?

Skype is a “Peer To Peer” (P2P) application. That means that when 2 people are having a Skype conversation, their computers *should* be directly connected, rather than connected via a 3rd computer. For the sake of comparison, Google Hangouts is not a P2P application. Google Hangout participants connect to each other via Google Conference Servers.

2. Does Skype work with UDP or TCP?

Skype’s preference is for UDP, and when Skype can establish a direct P2P connection using UDP, which is typically the case for residential users, call quality is very good. This is because UDP carries far less overhead than TCP for streaming audio and video: there are no retransmissions or ordering guarantees to add latency.

3. What’s the difference between residential and corporate users?

Residential internet connections are typically allocated a public IP address of their own (albeit one that may change from time to time). This IP gets registered to a Skype user on Skype’s servers, so when someone needs to contact that user, Skype knows where to direct the call, and can use UDP to establish a call between the participating users.

In corporate environments, where there are lots of users using the same internet connection, a single public IP address has to be shared between those users (Port Address Translation). That means that the Skype servers will have registered the same public IP address for all the users in that organisation. This means that Skype is not able to establish a direct UDP P2P connection between a user on the outside of that organisation and a user in that organisation, and has to use other means to make that connection.

4. What are those other means?

When direct connectivity between clients is not possible, Skype uses a process called “UDP hole punching”. In this mechanism, 2 computers that cannot communicate directly with each other communicate with one or more third party computers that can communicate with both computers.

Connection information is passed between the computers in order to try and establish a direct connection between the 2 computers participating in the Skype call.
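Skype’s actual implementation is proprietary, but the basic idea of hole punching can be sketched in a few lines of Python. Everything here is illustrative: in reality, the peer’s public address and port would be learned from the intermediary nodes rather than hard-coded.

import socket

# Hypothetical public endpoint of the other caller, learned from an intermediary
PEER = ("203.0.113.5", 40000)
LOCAL_PORT = 40001

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", LOCAL_PORT))

# Sending first "punches the hole": the outbound packet creates a mapping in
# this side's NAT, through which the peer's packets can then come back in.
sock.sendto(b"punch", PEER)

sock.settimeout(5)
try:
    data, addr = sock.recvfrom(1024)
    print("Direct UDP path established with", addr)
except socket.timeout:
    print("No direct path; the call would have to be relayed")

Both parties attempt this at roughly the same time, each opening a mapping in its own NAT for the other’s traffic.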

If ultimately a direct connection cannot be established, Skype will use the intermediary computers to relay the connection between the 2 computers participating in the conversation.

In Skype terminology, these are known as “relay nodes”, which are basically just computers running Skype that have direct UDP P2P capability (typically residential users with good broadband speeds).

From the Skype Administrators Manual:

http://download.skype.com/share/business/guides/skype-it-administrators-guide.pdf

2.2.4 Relays

If a Skype client can’t communicate directly with another client, it will find the appropriate relays for the connection and call traffic. The nodes will then try connecting directly to the relays. They distribute media and signalling information between multiple relays for fault tolerance purposes. The relay nodes forward traffic between the ordinary nodes. Skype communication (IM, voice, video, file transfer) maintains its encryption end-to-end between the two nodes, even with relay nodes inserted.

As with supernodes, most business users are rarely relays, as relays must be reachable directly from the internet. Skype software minimizes disruption to the relay node’s performance by limiting the amount of bandwidth transferred per relay session. 

5. Does that mean that corporate Skype traffic is being relayed via anonymous third party computers?

Yes. The traffic is encrypted, but it is still relayed through other unknown hosts if a direct connection between 2 Skype users is not possible.

6. Is this why performance in corporate environments is sometimes not good?

Yes. If a Skype conversation is dependent on one or more relay nodes, and one of those nodes experiences congestion, this will impact the quality of the call.

7. Surely, there is some solution to this?

A corporate network can deploy a proxy server, which is directly mapped to a dedicated public ip address. Ideally, this should be a UDP-enabled SOCKS5 server, but a TCP HTTP Proxy server can also be used. If all Skype connections are relayed through this server, Skype does not have to use relay nodes, as Port Address Translation is not in use.

8. So what’s the catch?

The problem with this solution is that it is not generally possible to force the Skype client to use a Proxy Server. When the client is configured to use a Proxy Server, it will only use it if there is no other way to connect to the Internet. So, if you have a direct Internet connection, even one based on Port Address Translation, which impacts on Skype quality, Skype will continue to use this, even if a better solution is available via a Proxy Server.

9. Why would Skype do this?

Skype is owned by Microsoft. Skype has a business product that attaches to Microsoft Active Directory and allows you to force a Proxy connection. So if you invest in a Microsoft network, Microsoft will give you a solution to enable better Skype performance in corporate networks. If you don’t want to invest in a Microsoft network, you’re stuck, and your only option is to block all outbound Internet access from your network and divert it via your Proxy server.

For a lot of companies, particularly software development companies who depend on 3rd party web services, this is not a practical option.

10. What is the solution?

At this time the primary options for desktop Audio/Video conferencing are either Skype or Google Hangouts.

When Skype can be used in an environment where P2P UDP connectivity is “always on”, it provides a superior audio/video experience to Google Hangouts, which is not P2P, and which communicates via central Google Servers.

Where an environment uses Port Address Translation, Skype performance will depend on the ability of the Skype client to establish connections via relays, which means Skype performance becomes dependent on the resources available to those relays.

In this instance, Google Hangouts may be a better choice where consistent quality is required, as quality can be guaranteed by providing sufficient bandwidth between the corporate network and Google.

 

How to install and setup Logstash

So you’ve finally decided to put a system in place to deal with the tsunami of logs your web applications are generating, and you’ve looked here and there for something Open Source, and you’ve found Logstash, and you’ve had a go at setting it up…

…and then you’ve lost all will to live?

And maybe, too, you’ve found that every trawl through Google for some decent documentation leads you to a video of some guy giving a presentation about Logstash at some geeky conference, in which he talks in really general terms about Logstash, and doesn’t give you any clue as to how you actually go about bringing it into existence?

Yes? Well, hopefully by landing here your troubles are over, because I’m going to tell you how to set up Logstash from scratch.

First, let’s explain the parts and what they do. Logstash is in fact a collection of different technologies, of which the Java programme, Logstash, is only one part.

The Shipper

This is the bit that reads the logs and sends them for processing. This is handled by the Logstash Java programme.

Grok

This is the bit that takes logs that have no uniform structure and gives them a structure that you define. This occurs prior to the logs being shipped. Grok is a standalone technology. Logstash uses its shared libraries.

Redis

This is a standalone technology that acts as a broker. Think of it like a turnstile at a football ground. It allows multiple events (ie lines of logs) to queue up, and then spits them out in a nice orderly line.

The Indexer

This takes the nice ordered output from Redis, which is neatly structured, and indexes it, for faster searching. This is handled by the Logstash Java programme.

Elasticsearch

This is a standalone technology, into which The Indexer funnels data, which stores the data and provides search capabilities.

The Web Interface

This is the bit that provides a User Interface to search the data that has been stored in Elasticsearch. You can run the web server that is provided by the Logstash Java programme, or you can run the Ruby HTML/Javascript based web server client, Kibana. Both use the Apache Lucene structured query language, but Kibana has more features, a better UI and is less buggy (IMO).

(Kibana 2 was a Ruby based server side application. Kibana 3 is a HTML/Javascript based client side application. Both connect to an ElasticSearch backend).

That’s all the bits, so let’s talk about setting it up.

First off, use a server OS that has access to lots of RPM repos. CentOS and Amazon Linux (for Amazon AWS users) are a safe bet, Ubuntu slightly less so.

For Redis, Elasticsearch and the Logstash programme itself, follow the instructions here:

http://logstash.net/docs/1.2.1/

(We’ll talk about starting services at bootup later)

Re. the above link, don’t bother working through the rest of the tutorial beyond the installation of the software. It demos Logstash using STDIN and STDOUT, which will only serve to confuse you. Just make sure that Redis, Elasticsearch and Logstash are installed and can be executed.

Now, on a separate system, we will set up the Shipper. For this, all you need is the Java Logstash programme and a shipper.conf config file.

Let’s deal with 2 real-life, practical scenarios:

1. You want to send live logs to Logstash
2. You want to send old logs to Logstash

1. Live logs

Construct a shipper.conf file as follows:

input {

   file {
      type => "apache"
      path => [ "/var/log/httpd/access.log" ]
   }

}

output {
   stdout { debug => true debug_format => "json"}
   redis { host => "" data_type => "list" key => "logstash" }
}

What this says:

Your input is a file, located at /var/log/httpd/access.log, and you want to record the content of this file as the type “apache”. You can use wildcards in your specification of the log file, and type can be anything.

You want to output to 2 places: firstly, your terminal screen, and secondly, to the Redis service running on your Logstash server.

2. Old logs

Construct a shipper.conf file as follows:

input {

   tcp {
      type => "apache"
      port => 3333
   }

}

output {
   stdout { debug => true debug_format => "json"}
   redis { host => "" data_type => "list" key => "logstash" }
}

What this says:

Your input is whatever data is received on TCP port 3333, and you want to record that data as the type “apache”. There is no file path this time; instead, you will pipe the old logs into that port (shown further down).

You want to output to 2 places: firstly, your terminal screen, and secondly, to the Redis service running on your Logstash server.

That’s all you need to do for now on the Shipper. Don’t run anything yet. Go back to your main Logstash server.

In the docs supplied at the Logstash website, you were given instructions on how to install Redis, Logstash and Elasticsearch, including the Logstash web server. We are not going to use the Logstash web server; we’ll use Kibana instead, so you’ll need to set up Kibana (3, not 2; version 2 is a Ruby based server side application).

https://github.com/elasticsearch/kibana/

Onward…

(We’re going to be starting various services in the terminal now, so you will need to open several terminal windows)

Now, start the Redis service on the command line:

./src/redis-server --loglevel verbose

Next, construct an indexer.conf file for the Indexer:

input {
   redis {
      host => "127.0.0.1"
      type => "redis-input"
      # these settings should match the output of the agent
      data_type => "list"
      key => "logstash"

      # We use json_event here since the sender is a logstash agent
      format => "json_event"
   }
}

output {
   stdout { debug => true debug_format => "json"}

   elasticsearch {
      host => "127.0.0.1"
   }
}

This should be self-explanatory: the Indexer is taking input from Redis, and sending it to Elasticsearch.

Now start the Indexer:

java -jar logstash-1.2.1-flatjar.jar agent -f indexer.conf

Next, start Elasticsearch:

./elasticsearch -f

Finally, crank up Kibana.

You should now be able to access Kibana at:

http://yourserveraddress:5601

Now that we have all the elements on the Logstash server installed and running, we can go back to the shipping server and start spitting out some logs.

Regardless of how you’ve set up your shipping server (live logs or old logs), starting the shipping process involves the same command:

java -jar logstash-1.2.1-flatjar.jar agent -f shipper.conf

If you’re shipping live logs, that’s all you will need to do. If you are shipping old logs, you will need to pipe them to the TCP port you opened in your shipper.conf file. Do this in a separate terminal window.

nc localhost 3333 < /var/log/httpd/old_apache.log

Our shipping configuration is setup to output logs both to STDOUT and Redis, so you should see lines of logs appearing on your terminal screen. If the shipper can’t contact Redis, it will tell you it can’t contact Redis.

Once you see logs being shipped, go back to your Kibana interface and run a search for content.

IMPORTANT: if your shipper is sending old logs, you need to search for logs from a time period that exists in those logs. There is no point in searching for content from the last 15 mins if you are injecting logs from last year.

Hopefully, you’ll see results in the Kibana window. If you want to learn the ins and outs of what Kibana can do, have a look at the Kibana website. If Kibana is reporting errors, retrace the steps above, and ensure that all of the components are running, and that all necessary firewall ports are open.

2 tasks now remain: using Grok and setting up all the components to run as services at startup.

Init scripts for Redis, ElasticSearch and Kibana are easy to find through Google. You’ll need to edit them to ensure they are correctly configured for your environment. Also, for the Kibana init script, ensure you use the kibana-daemon.rb Ruby script rather than the basic kibana.rb version.

Place the various scripts in /etc/init.d, and, again on CentOS, set them up to start at boot using chkconfig, and control them with the service command.
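For example, on CentOS, assuming you have saved an init script as /etc/init.d/redis (the script name here is just an example):

chkconfig --add redis
chkconfig redis on
service redis start

Repeat for the Elasticsearch, Kibana and Logstash scripts.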

Grok isn’t quite so easy.

The code is available from here:

https://github.com/jordansissel/grok/

You can download a tarball of it from here:

https://github.com/jordansissel/grok/archive/master.zip

Grok has quite a few dependencies, which are listed in its docs. I was able to get all of these on CentOS using yum and the EPEL repos:

rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/$(uname -i)/epel-release-5-4.noarch.rpm

then

yum install -y gcc gperf make libevent-devel pcre-devel tokyocabinet-devel

Also, after you have compiled grok, make sure you run ldconfig, so that its libraries are shared with Logstash.

How to explain Grok?

In the general development of software over the last 20-30 years, very little thought has gone into the structure of log files, which means we have ended up with lots of different log formats.

Grok allows you to "re-process" logs from different sources so that you can give them all the same structure. This structure is then saved in Elasticsearch, which makes querying logs from different sources much easier.

Even if you are not processing logs from different sources, Grok is useful, in that you can give the different parts of a line of a log field names, which again makes querying much easier.

Grok "re-processing", or filtering, as it is called, occurs in the same place as your Shipper, so we add the Grok config to the shipper.conf file.

This involves matching the various components in your log format to Grok data types, or patterns as they are referred to in Grok. Probably the easiest way to do this is with this really useful Grok debugger:

http://grokdebug.herokuapp.com/

Cut and paste a line from one of your logs into the input field, and then experiment with the available Grok patterns until you see a nice clean JSON object rendered in the output field below.
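Once you have a pattern that matches, it goes into a filter block in shipper.conf. As a rough example, the Apache access logs shipped above could be matched with the built-in COMBINEDAPACHELOG pattern; do check the exact filter syntax against the docs for your Logstash version, as it has changed between releases:

filter {
   grok {
      type => "apache"
      match => [ "message", "%{COMBINEDAPACHELOG}" ]
   }
}

With that in place, each line is stored in Elasticsearch with named fields (clientip, response, bytes and so on) rather than as one long string.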

The Joy of SIP

It’s hard to think of a technology that has more fundamentally revolutionized the way we do things than IP telephony, and SIP in particular.

Back in the early 1990s, I was charged with installing a telephone system in an office building that would house 50+ users. I had to buy an Ericsson PABX, which took up an entire corner of the comms room, pull up all the floor panels to route new cables to the desks, buy all sorts of expansion cards to accommodate voicemail and PSTN interfaces, and buy 50 bulky handsets so that users could actually use the thing from their desks. I then had to go off to ESAT and get 8 actual telephone lines, which took 4 weeks to get installed. The whole project cost about €17k, and that was nearly 20 years ago.

Let’s compare that to my most recent installation of a telephone system, earlier this year, again in an office of approximately 50 users.

For the PABX, I bought a tower PC from Amazon which cost €299. I installed the OpenSource software PABX, Asterisk, on this, which took about 20 mins. I then purchased a business package from the VOIP provider, Blueface, which included 8 dedicated Direct Dial extensions, which took about 30 minutes. I then configured Asterisk to use the Blueface SIP trunks, which took another 30 minutes.
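For the curious, the Asterisk side of a SIP trunk registration is only a handful of lines in sip.conf. The account name, password and hostname below are made up; the real values come from the provider:

; register our account with the provider so that inbound calls reach us
register => myaccount:mypassword@sip.blueface.ie

[blueface]
type=peer
host=sip.blueface.ie
defaultuser=myaccount
fromuser=myaccount
secret=mypassword
insecure=port,invite
context=incoming-calls

Outbound calls are then routed to the [blueface] peer from the dialplan in extensions.conf.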

Next, on to handsets. Did I buy 50 handsets?

No, not this time.

This time I purchased 1 handset for the main office number, which cost €99. For everybody else, we installed a SIP Client on their mobile phone, such as CSipSimple, and allocated an internal extension to everybody who wanted one. This client connects to Asterisk over the office’s main WIFI network.

Basically, that means that everybody’s mobile phone doubles up as their internal office extension. When somebody rings their extension, their mobile phone rings, or when they want to make a call, either internally or externally, they use their mobile phone. Any external calls they make are charged to the company via the Blueface account. In addition, all users are now fully mobile, so if they are in a meeting in another part of the building, they can still be contacted on their internal extension.

This company also has external IPSEC VPN access, so for users who are on the road a lot, I installed a VPN client on their phones, which means they can use their internal office phone extension from anywhere in the world, whether they are on a WIFI, 3G or LTE connection.

This company is also multi-site. They have offices in different continents, linked together by a VPN. That means that if a manager is on a trip to the office in another continent, they can still be contacted on their internal extension, all over IP, because their mobile SIP client is connected into the main corporate LAN over the WIFI.

Meanwhile, that PABX I bought way back when is probably still rotting in a landfill somewhere.

Cache behavior in mobile browsers

I recently had to deal with a situation in which a web server, which served content primarily to mobile devices, was constantly running out of disk space.

The reason for this was that Apache was generating GBs of logs each day. The site associated with the server was a busy site, but it still seemed strange that the logs would grow to that magnitude.

A cursory examination of the logs showed that there were endless requests for static assets, like images, CSS files and JavaScript files, which would normally be cached by the browser. The requests were being answered with HTTP 304 responses, which was even more strange, as the web server was configured to set an Expires header of 14 days on all such files.

A word about HTTP responses before we continue.

When a browser first requests a file, the web server will fetch that file from the file system and deliver it over the network with a HTTP 200 response. By default, the browser will then store that file in its cache. If the browser needs to get the same file again, because it considers the copy in its cache to be too old, it will send what is referred to as a Conditional GET request to the web server (by including a special HTTP header in the request, which includes the date the browser last accessed the file), which asks the web server to send the file again only if it has not been modified since the last request.

The web server then checks the last modified date on the file, and if it is the same as before, the web server will not send the actual file, but will instead issue a HTTP 304 response. This tells the browser that the file has not been modified since it was last accessed, and that it is safe to load that file from its cache.
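You can watch this exchange with curl (the URL and date below are placeholders). The first request should come back with a 200 and a Last-Modified header; repeating it with an If-Modified-Since header set to that date should come back with a 304 and no body:

# First request: the server sends the file and a Last-Modified header
curl -s -I http://www.example.com/images/banner.png

# Conditional request: only send the file if it has changed since this date
curl -s -I -H "If-Modified-Since: Tue, 01 Oct 2013 10:00:00 GMT" \
   http://www.example.com/images/banner.png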

Setting the Expires header of 14 days for files means that when those files are stored in cache, the browser will only make a Conditional Get request for any such file if 14 days have elapsed since it was first accessed and stored in the cache. This is a trade off between performance and control: your web server gets fewer requests, but there may be a delay in a user seeing an updated file.
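On Apache, that sort of policy is typically set with mod_expires. This isn’t the exact configuration from the server in question, just the general form:

# requires mod_expires to be enabled
ExpiresActive On
ExpiresByType image/png "access plus 14 days"
ExpiresByType image/jpeg "access plus 14 days"
ExpiresByType text/css "access plus 14 days"
ExpiresByType application/javascript "access plus 14 days"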

In theory, this means that something like an image file should only be requested from the web server once every 14 days, which meant the behavior I was seeing in the logs was very strange indeed.

To get to the bottom of this I ran some Analog analysis on a week’s worth of logs. I targeted requests for a single image file, and in the first pass looked for the number of HTTP 200 responses for that image, and in the second pass looked for the number of HTTP 304 responses for the same file. I then did a comparison on the profile of mobile browsers making those requests.

The results are given below:

              Status 200        %   Status 304        %
Total             206448  100.00%       159266  100.00%
Android 2.1          917    0.44%         1449    0.91%
Android 2.2         3964    1.92%        10750    6.75%
Android 2.3        31692   15.35%        78638   49.38%
Android 3.0           57    0.03%          298    0.19%
Android 3.1          626    0.30%         2096    1.32%
Android 3.2         1938    0.94%         7332    4.60%
Android 4.0        28381   13.75%          468    0.29%
Android 4.1         1693    0.82%           78    0.05%
iOS 2                 16    0.01%            0    0.00%
iOS 3                357    0.17%           61    0.04%
iOS 4              16975    8.22%        51845   32.55%
iOS 5              53818   26.07%         1412    0.89%
iOS 6*             57079   27.65%         2343    1.47%
iPod 4               878    0.43%         2181    1.37%
iPod 5              1502    0.73%           24    0.02%
iPod 6              1115    0.54%           30    0.02%
iPad 5                48    0.02%            2    0.00%
iPad 6                72    0.03%            2    0.00%

*iOS6 = iPhone 5, iOS5 = iPhone 4 etc etc…please don’t go out looking for the iPhone 6 in the shops!

The rows for Android 2.3 and iOS 4 show the source of the problem.

Based on this data, it would appear that Android 2.3 browsers, and iOS 4 browsers (iPhone 3), have very limited caching capability.

Between them, they account for 23.57% of traffic on the site in the period in question, but 81.93% of Conditional GET requests. This would seem to suggest issues with the caching function in these browsers, which is most likely due to the cache space available to them reaching capacity.

What seems to be happening is that either new files are not being written to the cache (these phones had limited disk space), causing the browser to constantly refer to an out-of-date expires date on the files in the cache, or the cache is simply not functioning correctly, causing the browser to issue unnecessary Conditional GET requests.

The implications of this are pretty significant in terms of mobile web performance, which typically relies on much lower bandwidth capacity than the PC web.

Yes, it is true that Android 2.3 and iOS 4.0 are dropping out of the mix as new handsets come on the market, but given the amount of HTTP requests they generate, even a small population of older devices will have an impact on server performance.

Compare the relative data for iOS4 and iOS5 in the table above. There are over three times as many standard HTTP 200 responses for iOS5 (iPhone 4), indicating that there are over three times as many iPhone 4s as iPhone 3s in use on the site, but when you factor in the HTTP 304 responses, the total number of actual HTTP requests issuing from iPhone 3s is greater!

I hope to run this analysis again in 6 months’ time. The results should make for interesting reading.

Creating an IPSEC VPN between a Cisco ASA router and an Amazon EC2 VPC

Amazon Web Services provides a service whereby you can extend your corporate network into the Amazon Cloud using an IPSEC VPN. They call this Virtual Private Cloud, and it’s a nifty solution when you need to connect something that you can’t put in the Cloud with something that you can put in the Cloud.

The premise is quite simple.

Amazon allows you to create an IP subnet with a CIDR of your own choosing, and then creates a sort of virtual router for you that both terminates a VPN to your corporate network and provides the necessary routing table to route packets between that CIDR and your corporate network.

Once the setup is complete, the Amazon interface spurts out configuration syntax for various types of devices (Cisco, Juniper), which you then apply on your end to bring the VPN between the networks to life.

Fine, except here’s the problem.

If you have a Cisco router on your side of the VPN, which you most probably do, Amazon requires that you use VTI and BGP to create the VPN.

These technologies are not available on all Cisco routers, and in particular on mid-range Cisco ASA routers, which are still in common usage around the Internet. Where these routers are used to create VPNs, they typically use the traditional “crypto map” method, which is a little less intuitive than using VTI and BGP.

So, what can you do?

Well, the good news is that it is still possible to create an IPSEC connection between your corporate network and your VPC, and to use a CIDR of your own choosing (as opposed to the randomly allocated one that Amazon gives you when you create a server instance).

The process is as follows:

Create a VPC in the normal way, and choose your CIDR. It doesn’t matter what kind of VPC you create. When asked for the peer address on your side, give the public address of your router (although we won’t actually use this).

Create a server instance in that VPC

Create an Internet Gateway for that VPC

Allocate an Elastic IP Address for the VPC and associate it with your server instance

Make sure you can connect to your instance and open a shell

Install the OpenSource IPSEC package, OpenSwan, and configure your Cisco router, as described in this post (a bare-bones sketch of the OpenSwan side follows after these steps).

Make sure that this establishes a VPN between the server instance and your Cisco ASA (be careful with ACLs).

Now, go back to your VPC configuration in Amazon, and look at the route table that has been created for you. The target for your local CIDR should be “local”, and the target for everything else (0.0.0.0/0) should be your Internet Gateway (igw-something). Delete any other entries.

If you want other server instances in the VPC to use the VPN, you will have to set up local routes (to your coporate LAN CIDR) on them to route to the server instance hosting the VPN.
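The linked post has the full details, but for reference, the OpenSwan side boils down to a conn definition along these lines (all addresses and subnets here are made up):

# /etc/ipsec.conf on the EC2 instance
config setup
   protostack=netkey
   nat_traversal=yes

conn corporate-to-vpc
   type=tunnel
   authby=secret
   # this EC2 instance
   left=%defaultroute
   # the CIDR you chose for the VPC
   leftsubnet=10.10.0.0/16
   # the Cisco ASA's public address
   right=203.0.113.10
   # the corporate LAN behind the ASA
   rightsubnet=192.168.0.0/24
   pfs=yes
   auto=start

The matching pre-shared key goes in /etc/ipsec.secrets, and the same key is configured on the ASA side.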

And that should be it.

So what is actually happening here?

Basically, you’re setting up a VPC so that you can use a CIDR of your own choosing, but then you’re replacing the VTI-dependent VPN that Amazon creates for you with an OpenSwan-Cisco ASA VPN. As such, the Amazon VPN is actually obsolete, although you will have to keep the VPC running (and pay for it) in order that you can use the CIDR of your own choosing.

The reason that many people need to choose their own CIDRs is that they are setting up permanent services, which need to have fixed IP addresses to which they can create permanent routes.

If this isn’t required for your project, and you’re happy for Amazon to change the private ip address of your server instance when it is stopped/started, you don’t need to worry about creating (or paying for) a VPC.

You can just deploy OpenSwan on your server instance and away you go.

Facebook and Heroku hosting

Update: Full HOWTO instructions here

Creating customised content for Fan Pages has become a big deal for companies, large and small alike. Whether it’s a competition, a promotion or some sort of gimmick, the use of content to get a user to Like your Fan Page is now standard practice on the Facebook platform.

One of the challenges that many companies, particularly small ones, come up against in this regard is the issue of hosting: content, typically presented on Fan Page tabs, has to be hosted somewhere.

Up until the middle of last year, this wasn’t too big a problem, in that most companies had access to basic hosting by using the same service that their website was hosted on.

Then, Facebook decided to make it a requirement that all Tab content be hosted on secure links, which meant that if you wanted to host content for a Tab, your hosting service had to be capable of serving content using the https protocol, which meant that you needed a digital certificate installed with your website.

This is something that isn’t needed for the vast majority of websites, so all those companies who had previously hosted their tab content on the same system as their website suddenly found that their hosting service was no longer suitably equipped to meet Facebook’s requirements.

Their only option was to go off and pay some tech type to set up a digital certificate for them, which they would then have to renew at regular intervals, paying the tech type each time.

This obviously generated a bit of consternation in the Facebook user community, which finally provoked a response.

Facebook have teamed up with the Cloud Computing provider, Heroku, who offer basic Facebook-ready hosting services, for free, to anyone who wants to host content on a Tab on their Fan Page.

Sounds great, doesn’t it?

Well, all in all, it is, but the only problem is that it isn’t that user friendly.

Yes, when you set up your app on Facebook you have the option to set up Heroku hosting at the same time, but the process of actually publishing your content using that hosting is another matter altogether.

Also, the basic package is very basic. For instance, the database service that they provide will accommodate only 5MB of data. That’s probably fine for most short term promotions or competitions, but not for any sort of permanent app that is going to collect data over a period of time.

If you want more thrust, you’ve got to get your credit card out.

Anyway, it’s still a welcome development, so let’s not be too cynical.

The basic mechanics of using Heroku are thus:

After you select Heroku as your hosting option when setting up your app, you will be guided through the process of setting up a Heroku account on foot of which you will receive an instructional email.

This will show you how to verify your account and give you basic usage instructions.

On Ubuntu, this involves downloading what is called the Heroku Toolbelt, which is a suite of shell tools that allow you to work with your Heroku hosting package.

The Toolbelt for Ubuntu and Linux in general is based on the version control system git, which allows you to update your code and push that code to the hosting server provided by Heroku.

When you create a Facebook application that uses Heroku, Heroku automatically creates a sample application in your git repository. You can trial the tools by “cloning” that app to your local environment, updating it and then pushing it out to the hosting platform. This can seem a bit daunting at first, but it becomes pretty straightforward once you follow the tutorials and do it a few times.

Use of the Toolbelt is also based on SSH public/private key authentication, which means one of the first things you have to do is upload a public key to the Heroku system. Heroku presumes that you’ve only ever created one public/private key pair, so if like me you have multiple pairs, you will need to create a .ssh/config file that ensures that Heroku is presented with the correct private key.
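A minimal ~/.ssh/config entry for that looks something like this (the key filename is just an example):

Host heroku.com
   HostName heroku.com
   User git
   IdentityFile ~/.ssh/id_rsa_heroku
   IdentitiesOnly yes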

Using databases is also a bit of a challenge.

Heroku provides access to a Postgres database instance for each application you create, which is a bit odd, given that 99.9% of Facebook apps are created with MySQL.

Heroku tells you that most of your MySQL code will work with Postgres, but the word most is never very comforting, so if you want access to a genuine MySQL system, you will need to use what Heroku refers to as an “add-on”, such as the add-on for the cloud MySQL service, ClearDB.

To use this, you have to give Heroku your Credit Card number, but the basic Ignite service provided by ClearDB is free to use. Like Heroku, this gives you 5MB of data storage.

ClearDB is based in the Amazon EC2 Cloud Computing environment in the US, which means that your MySQL instance is going to be at a remove from your hosting environment. This shouldn’t be an issue provided you aren’t transferring large amounts of data in/out of your database.

Once you set up the add-on, you’ll be given the username/password combo for the MySQL server you will be using, plus the hostname and the database name, which means you can log in with the MySQL client or phpMyAdmin and manipulate it in the normal fashion.

There’s a lot more detail to the use of Heroku than described above, and it isn’t for novices.

However, if you are a Facebook app developer, and you consistently get development requests from clients who don’t have access to digitally-certified hosting services, learning how to use Heroku will be well worth your while.

Design a Simple Tab Based Widget

Widgets appear all over web sites these days. Little boxes of content that you can unwrap to reveal all sorts of wonderful information.

As a web developer, you will often be asked to provide them, and there are various ways of doing this.

Here is my recipe. It doesn’t use any graphics, and relies instead entirely on CSS and JQuery, which makes it really light and easy to use.
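The original recipe isn’t reproduced here, but the general shape of it is a row of tabs whose click handlers swap the visible panel, with CSS doing the styling. A minimal illustration (not the original code):

<style>
  .tab-panel { display: none; }
  .tab-panel.active { display: block; }
  .tab.selected { font-weight: bold; }
</style>

<ul>
  <li class="tab selected" data-panel="#panel-1">Tab 1</li>
  <li class="tab" data-panel="#panel-2">Tab 2</li>
</ul>

<div id="panel-1" class="tab-panel active">First panel content</div>
<div id="panel-2" class="tab-panel">Second panel content</div>

<script src="https://code.jquery.com/jquery-1.10.2.min.js"></script>
<script>
  // Clicking a tab highlights it and shows only the panel it points at
  $('.tab').on('click', function () {
    $('.tab').removeClass('selected');
    $(this).addClass('selected');
    $('.tab-panel').removeClass('active');
    $($(this).data('panel')).addClass('active');
  });
</script>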

The Metamorphosis of Apple

In 1984, Apple first aired their Big Brother commercial at half-time in the SuperBowl. Remember it?

The voiceover recites the following monologue:

Today we celebrate the first glorious anniversary of the Information Purification Directives. We have created, for the first time in all history, a garden of pure ideology: where each worker may bloom, secure from the pests of any contradictory... thoughts.
Our Unification of Thoughts is more powerful a weapon than any fleet or army on Earth. We are one people: with one will, one resolve, one cause. Our enemies shall talk themselves to death and we will bury them with their own confusion. We shall prevail!

Apple’s message in this commercial was that they were the antidote to this tyranny of thought purification, that they were the guardians of  free expression, that they could empower the individual.

And, for several years, that was true. Apple did bring the PC to the masses, and served to undermine the dominance of the computing megaliths like IBM.

The problem was that as Apple became more of a corporation, with shareholders and sales forecasts, and not least, competitors, it became increasingly difficult to maintain this ideal, and nowhere was this more true than with the iPhone.

In the beginning, the iPhone was a technological, and commercial, miracle. It was beautiful to look at and use, and Apple sold millions of them, propelling Apple 2.0 into the stratosphere of corporate IT giants.

While the handsets were flying off the shelves, and no one else was making them, Apple weren’t too concerned with what users did with their phones. Provided they used iTunes to buy music, they could use their hardware, that they owned, to do as they wished. The spirit of 1984 was alive, or at least not dead.

But then, inevitably, the competition arrived, and lots of people started developing software for iPhones, and suddenly Apple realised that their users might start using their right to “free expression” to do things that weren’t in sync with the commercial interests of the Apple Corporation.

Apple now faced a choice. They could get on board with the likes of Google, Mozilla, HP, Sun etc and participate freely and enthusiastically in the development of standards based technology, or they could do a Microsoft, and put all their efforts into holding their ground, and hope that clever marketing would make up for the inevitable deficiencies that would result from maintaining a Closed Shop approach to product development.

The crunch came with the release of the Apple iPhone 3GS, the first major revision of Apple’s flagship product.

Prior to the 3GS, iPhone users were free to load any sort of operating system onto their iPhone they so wished. There were numerous variations out there, which could be loaded via a shortcut in iTunes. This allowed users to do things on their iPhone (that they had bought and paid for) that were not possible with the software loaded by Apple.

This changed with the 3GS. From that point on, it was only possible to load an operating system to your iPhone if that software was signed by Apple. A new version of iTunes was released, which checked the software you were trying to load, and if that was not approved by Apple, iTunes would not complete the installation to your phone.

The significance of this is probably lost on most iPhone users, but it is significant none the less.

For millions of users worldwide, Apple now has complete control of what software they use on their hardware.

Imagine if this were the case with a PC that you bought from PC World. Imagine if you were not allowed to install any software on that PC unless it was approved by a private company like Microsoft?

Now, go back and watch the SuperBowl commercial again.

Which protagonist reminds you more of the Apple Corporation in 2012? The girl with the sledgehammer, or the face on the screen?

How to update a Facebook Fan Page from PHP

Note: Facebook changes its API as frequently as you change your socks. These instructions worked as of Apr 19th 2011.

In a previous post, I outlined how to update your Facebook Profile from a PHP script.

This is fine if you are using Facebook as an individual, but if you are a business or an organisation, you won’t have a profile, you will have a Fan Page, which people ‘Like’.

Updating a Fan Page from a PHP script is a lot more difficult, because Fan Pages are managed by users with standard profile accounts, and you need to obtain an extended range of permissions in order to make updates to their Pages.

Anyway, it is possible, and here’s how.

Firstly, this is what you will need:

A proper browser (ie anything other than Internet Explorer) that you can be confident will send long HTTP GET requests

cURL, which can be run from a command line (for testing)

A good text editor

A credit card (weird, I know, but go with it)

A standard Facebook account

Overview

We are going to create a Facebook Application, and add that Application to the Facebook User Account/Profile which manages the Page we want to update. The Application will have the necessary permissions to update Pages which the Facebook User Account/Profile manages.

Step 1: Add the Facebook Developer Application to your Facebook Account

Login to Facebook

(If you have added the Facebook Developer application previously, you can skip this bit)

Go to: http://www.facebook.com/developers/

Look for the button that allows you to add the Facebook Developer Application

If you haven’t added this before, you will need to confirm your identity. You may be able to do this with your mobile phone, but if not you will have to enter valid Credit Card details. No charge will be made to your Credit Card.

Step 2: Create an Application

Once the Facebook Developer Application is added, you now create an Application. Call it something relevant. You don’t need to enter much in the way of detail for the application. The only important data you need to enter is under Website, where you should enter a Site URL and Site Domain.

These should be something relevant, for instance, the website address of the business/organisation to which the page relates. Note: enter a trailing slash for the Site URL, e.g. http://www.mysite.com/

Now, save the details, and you will be returned to a summary page which lists the details for your application. You do not need to submit your application to the Facebook Directory.

From this page, copy and paste your Application ID and Application Secret into your text editor

Step 3: Establish access to your Account/Profile for your Application

Starts to get a bit tricky now, so pay attention. Read everything twice.

You now want to obtain an Authorisation Code for your Application. This code will be generated by Facebook based on the permissions your Application requests from your Facebook User Account/Profile when you add your Application to your Facebook Account/Profile.

Construct the following URL in your text editor:

https://www.facebook.com/dialog/oauth?client_id=<YOUR APPLICATION ID>&redirect_uri=<YOUR SITE URL>&scope=manage_pages,offline_access,publish_stream

YOUR APPLICATION ID = Application ID you took from your Application Details

YOUR SITE URL = The Site URL you entered when setting up your Application

scope = The permissions you are requesting:

manage_pages = allows application to manage pages for which the user is the administrator

offline_access = allows updates to occur indefinitely

publish_stream = allows application to update the feed of the pages for which the user is administrator

Now, ensuring that you are logged into Facebook, paste the URL into your browser and hit enter.

A Facebook page will render, prompting you to add the application with the permissions as described above. Accept this, and you will be redirected to your Site URL. The actual query string to which you are re-directed will contain a long ‘code’ value. Copy this from your browser address bar and paste into your text editor.

This code is your Authorisation Code.

Step 4: Get an Access Token to allow your Application to access your Profile

Now that you have an Authorisation Code, which is a sort of once-off thing, you can request an Access Token, which will allow your Application to have permanent access to your User Profile.

Again in your text editor, construct the following URL:

https://graph.facebook.com/oauth/access_token?client_id=<YOUR APPLICATION ID>&redirect_uri=<YOUR SITE URL, WITH A TRAILING SLASH>&client_secret=<YOUR APPLICATION SECRET>&code=<YOUR AUTHORISATION CODE>

Be extra careful with this. It’s a very long string. Ensure you include a trailing slash in your Site URL.

Now, paste the URL into your browser and hit enter. Facebook should return a single line like:

access_token=220088774674094|c7cb68d51ae2f40e9878ab14.xxxxxxxxxx etc etc

(Note: this is not a real access token, its one I made up.)

You now have an Access Token that allows your Application to do stuff to your User Profile.

Step 5: Get an Access Token for the Page you want to update

Yes, more Access Tokens are needed! You need a specific Access Token for each Page you want to update! Jesus wept!

Construct the following URL in your text editor:

https://graph.facebook.com/me/accounts?access_token=<YOUR ACCESS TOKEN FOR YOUR USER PROFILE>

Paste it into your browser and hit enter. Facebook will now return JSON objects for each of the Pages and Applications that are under your User Profile. You should see an object for the Page you want to update, which will include an ‘id’ and an ‘access_token’.

Copy and paste these into your text editor.

You now have an Access Token and Page ID for the Page you want to update.

Step 6: Test!

Finally!

Construct the following command in your text editor:

curl -F 'access_token=<THE ACCESS TOKEN FOR THE PAGE YOU WANT TO UPDATE>' -F 'message=It works.' https://graph.facebook.com/<THE ID FOR THE PAGE YOU WANT TO UPDATE>/feed

Paste to a command line and hit enter. If it works, you will get back a JSON object containing an update id, and the message will appear on the Feed of the Facebook Page.

If not, you will get some class of error. Retrace your steps and try again. I can’t emphasise enough how unforgiving the tolerances are here, but if you persist, it will work!

Once you have it working, you can then build the update into your PHP application using cURL.
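A minimal sketch of that, using the Page ID and Page Access Token obtained in Steps 4 and 5 (the placeholder values obviously need to be replaced with your own):

<?php
// Post a message to a Facebook Page feed via the Graph API
$pageId      = 'YOUR_PAGE_ID';
$accessToken = 'YOUR_PAGE_ACCESS_TOKEN';

$ch = curl_init('https://graph.facebook.com/' . $pageId . '/feed');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, array(
    'access_token' => $accessToken,
    'message'      => 'Posted from PHP',
));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
curl_close($ch);

// On success this is a JSON object containing the id of the new post
echo $response;
?>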

More here:

http://developers.facebook.com/docs/authentication/
http://developers.facebook.com/docs/reference/api/