Category Archives: Linux

Renaming files with non-interactive sftp

SFTP hangs around the IT Operations world like a bit of a bad smell.

It’s pretty secure, it works, and it’s similar enough to FTP for software developers and business managers to understand, so it’s not uncommon to find it underpinning vast data transfer processes that have been designed in a hurry.

Of course, it’s very rudimentary in terms of what it can do, and very dependent on the underlying security of the OS on which it resides, so it’s not really something that should find a home in Enterprise IT solutions.

Anyway, sometimes you just have to deal with it. One problem that you will often encounter is that while you have SFTP access to a system, you may not have shell access via OpenSSH. This makes bulk operations on files a bit more difficult, but not impossible.

SFTP has a batch mode that allows you to pass commands to the processor via STDIN. If used in conjunction with non-interactive login (i.e. an OpenSSH public/private key pair), you can actually process bulk operations.

Let’s say you want to rename 500 files in a particular directory:

You can list the files as follows:

echo "ls -l1" | sftp -q -i ~/.ssh/id_rsa -b - user@sftp.mycompany.com:/dir1/

In this case, the parameter:

-b -

tells sftp to read its commands from STDIN.

You can now incorporate this into a BASH loop to complete the operation:

for f in `echo "ls -1" | sftp -q -i ~/.ssh/id_rsa -b - user@sftp.mycompany.com:/dir1/ | grep -v sftp | grep -v Changing`;
    do
    echo "Renaming $f...";
    echo "rename $f $f.renamed" | sftp -q -i ~/.ssh/id_rsa -b - user@sftp.mycompany.com:/dir1/;
done
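Bear in mind that this spawns a new sftp session for every file, which gets slow over 500 files. Since -b also accepts a file, an alternative is to build a single batch file of rename commands and run one session; a sketch, using the same example paths as above:

# Build a batch file with one rename command per file
echo "ls -1" | sftp -q -i ~/.ssh/id_rsa -b - user@sftp.mycompany.com:/dir1/ \
    | grep -v sftp | grep -v Changing \
    | awk '{print "rename "$1" "$1".renamed"}' > /tmp/renames.batch

# Run all the renames in a single sftp session
sftp -q -i ~/.ssh/id_rsa -b /tmp/renames.batch user@sftp.mycompany.com:/dir1/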

 

Application monitoring with Nagios and Elasticsearch

As the applications under your control grow, both in number and complexity, it becomes increasingly difficult to rely on predictive monitoring.

Predictive monitoring is monitoring things that you know should be happening. For instance, you know your web server should be accepting HTTP connections on TCP port 80, so you use a monitor to test that HTTP connections are possible on TCP port 80.

In more complex applications, it is harder to predict what may or may not go wrong; similarly, some things can’t be monitored in a predictive way, because your monitoring system may not be able to emulate the process that you want to monitor.

For example, let’s say your application sends Push messages to a mobile phone application. To monitor this thoroughly, you would have to have a monitor that persistently sends Push messages to a mobile phone, and some way of verifying that the mobile phone received them.

At this stage, you need to invert your monitoring system, so that it stops asking if things are OK, and instead listens for applications that are telling it that they are not OK.

Using your application logs files is one way to do this.

Well-written applications are generally quite vocal when it comes to being unwell, and will always describe an ERROR in their logs if something has gone wrong. What you need to do is find a way of linking your monitoring system to that message, so that it can alert you that something needs to be checked.

This doesn’t mean you can dispense with predictive monitoring altogether; what it does mean is that you don’t need to rely on it entirely (or in other words, you don’t need to be able to see into the future) to keep your applications healthy.

This is how I’ve implemented log-based monitoring. It was something of a nut to crack, as our logs arise from an array of technologies and adhere to very few standards in terms of layout, logging levels and storage locations.

The first thing you need is a logstash implementation. Logstash comprises a stack of technologies: an agent to ship logs out to a Redis server; a Redis server to queue logs for indexing; a logstash server for creating indices and storing them in elasticsearch; an elasticsearch server to search your indices.

The setup of this stack is beyond the scope of this article; it’s well described over on the logstash website, and is reasonably straightforward.

Once you have your logstash stack set up, you can start querying the elasticsearch search api for results. Queries are based on HTTP POST and JSON, and results are output in JSON.

Therefore, to test your logs, you need to issue an HTTP POST query from Nagios, check the results for ERROR strings, and alert accordingly.
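Before getting to the approach I actually used, the idea can be sketched as a minimal Nagios-style shell check that POSTs a query to elasticsearch and goes critical if any matching entries come back; the host, search string and GNU date usage here are assumptions you would adapt:

#!/bin/bash
# Nagios-style check: alert if elasticsearch holds any ERROR entries
# from the last 15 minutes. Adjust the host and query string to suit.

FROM=$(date -u -d '15 minutes ago' +%Y-%m-%dT%H:%M:%S+00:00)
TO=$(date -u +%Y-%m-%dT%H:%M:%S+00:00)

QUERY='{"size":1,"query":{"filtered":{"query":{"query_string":{"query":"ERROR","default_field":"_all"}},"filter":{"range":{"@timestamp":{"from":"'$FROM'","to":"'$TO'"}}}}}}'

# hits.total in the response is the number of matching log entries
HITS=$(curl -s -XPOST 'http://localhost:9200/_search' -d "$QUERY" \
        | grep -o '"hits":{"total":[0-9]*' | grep -o '[0-9]*$')

if [ "${HITS:-0}" -gt 0 ]; then
    echo "CRITICAL: $HITS log entries containing ERROR in the last 15 minutes"
    exit 2
fi

echo "OK: no ERROR entries in the last 15 minutes"
exit 0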

The easiest way to have Nagios send a POST request with a JSON payload to elasticsearch is with the Nagios JMeter plugin, which allows you to create monitors based on your JMeter scripts.

All you need then is a correctly constructed JSON query to send to elasticsearch, which is where things get a bit trickier.

Without going into any great detail, formulating a well-constructed JSON query that will search just the right log indices in elasticsearch isn’t easy. I cheated a little here: I am familiar with the Apache Lucene syntax that Kibana, the Logstash Javascript client, uses, and was able to formulate my query based on that.

Kibana sends encrypted queries to elasticsearch, so you can’t pick them out of the HTTP POST/GET variables. Instead, I enabled logging of slow queries on elasticsearch (threshold set to 0s) so that I could see in the elasticsearch logs exactly what queries were being run. Here’s an example:


{
  "size": 100,
  "sort": {
    "@timestamp": {
      "order": "desc"
    }
  },
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "NOT @source_host:\"uatserver\"",
          "default_field": "_all",
          "default_operator": "OR"
        }
      },
      "filter": {
        "range": {
          "@timestamp": {
            "from": "2014-10-06T11:05:25+00:00",
            "to": "2014-10-06T12:05:25+00:00"
          }
        }
      }
    }
  },
  "from": 0
}

You can test a query like this by sending it straight to your elasticsearch API:


curl -XPOST 'http://localhost:9200/_search' -d '{"size":100,"sort":{"@timestamp":{"order":"desc"}},"query":{"filtered":{"query":{"query_string":{"query":"NOT @source_host:\"uatserver\"","default_field":"_all","default_operator":"OR"}},"filter":{"range":{"@timestamp":{"from":"2014-10-06T11:05:25+00:00","to":"2014-10-06T12:05:25+00:00"}}}}},"from":0}'

This searches a batch of 100 log entries whose source host is not “uatserver”, from the one-hour window specified in the range filter.

Now that we know what we want to send to elasticsearch, we can construct a simple JMeter script. In this, we simply specify an HTTP POST request, containing Body Data of the JSON given above, and include a Response Assertion for the strings we do not want to see in the logs.

We can then use that script in Nagios with the JMeter plugin. If the script finds the ERROR string in the logs, it will generate an alert.

2 things are important here:

1. The alert will only tell you that an error has appeared in the logs, not what that error was.
2. If the error isn’t persistent, the monitor will eventually recover.

Clearly, there is a lot of scope for false positives in this, so if your logs are full of tolerable errors (they shouldn’t be, really) you are going to have to be more specific about your search strings.

The good news is that if you get this all working, it’s very easy to create new monitors. Rather than writing bespoke scripts and working with Nagios plugins, all you need to do is change the queries and the Response Assertions in your JMeter script, and you should be able to monitor anything that is referenced in your application logs.

To assist in some small way, here is a link to a pre-baked JMeter script that includes an Apache Lucene query, and is also set up with the necessary Javascript-based date variables to search over the previous 15 minutes.

Negative matching on multiple ip addresses in SSH

In sshd_config, you can use the

Match

directive to apply different configuration parameters to ssh connections depending on their characteristics.

In particular, you can match on ip address, both positively and negatively.

You can specify multiple conditions in the match statement. All conditions must be matched before the match configuration is applied.

To negatively match an ip address, that is, to apply configuration if the connection is not from a particular ip address, use the following syntax:

Match Address *,!62.29.1.162/32
ForceCommand /sbin/sample_script

To negatively match more than one ip address, that is, to apply configuration if the connection is not from one or more ip addresses, use the following syntax:

Match Address *,!62.29.1.162/32,!54.134.118.96/32
ForceCommand /sbin/sample_script
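Conditions can also be combined; as noted above, every condition in the Match line must hold before the block applies. For illustration (the user name here is made up):

# Force the command for the hypothetical user "backup", except when
# the connection comes from 62.29.1.162
Match User backup Address *,!62.29.1.162/32
ForceCommand /sbin/sample_script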

How to use DJ Bernstein’s daemontools

When I first started working in IT, one of the first projects I had to undertake was to set up a QMail server, which first brought me into contact with DJ Bernstein and his various software components.

One of these was daemontools, which is “a collection of tools for managing UNIX services”, and which is most frequently used in connection with Qmail.

The daemontools website is from another time. Flat HTML files, no CSS, horizontal rules…it’s like visiting some sort of online museum. In fact, the website hasn’t changed in over 20 years, and daemontools has been around for that long, and hasn’t changed much in the interim.

The reason for daemontools’ longevity is quite simple. It works. And it works every time, all the time, which isn’t something you can say about every software product.

So if you need to run a process on a UNIX/Linux server, and that process needs to stay up for a very long time, without interruption, there probably isn’t any other software that can offer the same reliability as daemontools.

Here’s a quick HOWTO:

Firstly, install it, exactly as described here:

http://cr.yp.to/daemontools/install.html

If you get an error during the installation about a TLS reference, edit the file src/conf-cc, and add

-include /usr/include/errno.h

to the gcc line.

Once installed, check:

1. That you have a /service directory
2. That the command /command/svscanboot exists

If this is the case, daemontools has been successfully installed.

Now, you can create the process/service that you want daemontools to monitor.

Create a directory under /service, with a name appropriate to your service, eg

/service/growfile

(you can also use a symbolic link for this directory, to point to an existing service installation)

In that directory, create a file called run, and give it 755 permission


touch /service/growfile/run
chmod 755 /service/growfile/run

Next, update the run file with the shell commands necessary to run your service


#!/bin/sh

while :
do
echo "I am getting bigger..." >> /tmp/bigfile.txt
sleep 1
done

Your service is now set up. To have daemontools monitor it, run the following command:


/command/svscan /service &

(To start this at boot, add /command/svscanboot to /etc/rc.local, if the install hasn’t done this already)

To see this in action, run ps -ef and have a look at your process list. You will see

1. A process called svscan, which is scanning the /service directory for new processes to monitor
2. A process called “supervise growfile”, which is keeping the job writing to the file alive

Also, run


tail -f /tmp/bigfile.txt

Every 1 second, you should see a new line being appended to this file:


I am getting bigger...
I am getting bigger...
I am getting bigger...
I am getting bigger...

To test daemontools, delete /tmp/bigfile.txt


rm -f /tmp/bigfile.txt

It should be gone, right?

No! It’s still there!


tail -f /tmp/bigfile.txt


I am getting bigger...
I am getting bigger...
I am getting bigger...
I am getting bigger...

Finally, if you want to actually kill your process, you can use the “svc” command supplied with daemontools:

svc -h /service/yourdaemon: sends HUP
svc -t /service/yourdaemon: sends TERM, and automatically restarts the daemon after it dies
svc -d /service/yourdaemon: sends TERM, and leaves the service down
svc -u /service/yourdaemon: brings the service back up
svc -o /service/yourdaemon: runs the service once
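You can also check the state of a supervised service at any time with the svstat tool that ships with daemontools (the pid and uptime shown here are just indicative):

/command/svstat /service/growfile
/service/growfile: up (pid 12345) 120 seconds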

This is the basic functionality of daemontools. There is a lot more on the website.

How to install and setup Logstash

So you’ve finally decided to put a system in place to deal with the tsunami of logs your web applications are generating, and you’ve looked here and there for something Open Source, and you’ve found Logstash, and you’ve had a go at setting it up…

…and then you’ve lost all will to live?

And maybe too, you’ve found that every trawl through Google for some decent documentation leads you to this video of some guy giving a presentation about Logstash at some geeky conference, in which he talks in really general terms about Logstash, and doesn’t give you any clues as to how you go about bringing it into existence?

Yes? Well, hopefully by landing here your troubles are over, because I’m going to tell you how to set up Logstash from scratch.

First, let’s explain the parts and what they do. Logstash is in fact a collection of different technologies, of which the Java programme, Logstash, is only a part.

The Shipper

This is the bit that reads the logs and sends them for processing. This is handled by the Logstash Java programme.

Grok

This is the bit that takes logs that have no uniform structure and gives them a structure that you define. This occurs prior to the logs being shipped. Grok is a standalone technology. Logstash uses its shared libraries.

Redis

This is a standalone technology that acts as a broker. Think of it like a turnstile at a football ground. It allows multiple events (ie lines of logs) to queue up, and then spits them out in a nice orderly line.

The Indexer

This takes the nice ordered output from Redis, which is neatly structured, and indexes it, for faster searching. This is handled by the Logstash Java programme.

Elasticsearch

This is a standalone technology, into which The Indexer funnels data, which stores the data and provides search capabilities.

The Web Interface

This is the bit that provides a User Interface to search the data that has been stored in Elasticsearch. You can run the web server that is provided by the Logstash Java programme, or you can run the Ruby HTML/Javascript based web server client, Kibana. Both use the Apache Lucene structured query language, but Kibana has more features, a better UI and is less buggy (IMO).

(Kibana 2 was a Ruby based server side application. Kibana 3 is a HTML/Javascript based client side application. Both connect to an ElasticSearch backend).

That’s all the bits, so let’s talk about setting it up.

First off, use a server OS that has access to lots of RPM repos. CentOS and Amazon Linux (for Amazon AWS users) are a safe bet, Ubuntu slightly less so.

For Redis, Elasticsearch and the Logstash programme itself, follow the instructions here:

http://logstash.net/docs/1.2.1/

(We’ll talk about starting services at bootup later)

Re. the above link, don’t bother working through the rest of the tutorial beyond the installation of the software. It demos Logstash using STDIN and STDOUT, which will only serve to confuse you. Just make sure that Redis, Elasticsearch and Logstash are installed and can be executed.

Now, on a separate system, we will set up the Shipper. For this, all you need is the Java Logstash programme and a shipper.conf config file.

Let’s deal with 2 real-life, practical scenarios:

1. You want to send live logs to Logstash
2. You want to send old logs to Logstash

1. Live logs

Construct a shipper.conf file as follows:

input {

   file {
      type => "apache"
      path => [ "/var/log/httpd/access.log" ]
   }

}

output {
   stdout { debug => true debug_format => "json"}
   redis { host => "" data_type => "list" key => "logstash" }
}

What this says:

Your input is a file, located at /var/log/httpd/access.log, and you want to record the content of this file as the type “apache”. You can use wildcards in your specification of the log file, and type can be anything.

You want to output to 2 places: firstly, your terminal screen, and secondly, to the Redis service running on your Logstash server.

2. Old logs

Construct a shipper.conf file as follows:

input {

   tcp {
      type => "apache"
      port => 3333
   }

}

output {
   stdout { debug => true debug_format => "json"}
   redis { host => "" data_type => "list" key => "logstash" }
}

What this says:

Your input is whatever data arrives on TCP port 3333, and you want to record that content as the type “apache”. As before, the type can be anything.

You want to output to 2 places: firstly, your terminal screen, and secondly, to the Redis service running on your Logstash server.

That’s all you need to do for now on the Shipper. Don’t run anything yet. Go back to your main Logstash server.

In the docs supplied at the Logstash website, you were given instructions on how to install Redis, Logstash and Elasticsearch, including the Logstash web server. We are not going to use the Logstash web server; we’ll use Kibana instead, so you’ll need to set up Kibana (version 3, not 2; version 2 is a Ruby-based server-side application).

https://github.com/elasticsearch/kibana/

Onward…

(We’re going to be starting various services in the terminal now, so you will need to open several terminal windows)

Now, start the Redis service on the command line:

./src/redis-server --loglevel verbose

Next, construct an indexer.conf file for the Indexer:

input {
   redis {
      host => "127.0.0.1"
      type => "redis-input"
      # these settings should match the output of the agent
      data_type => "list"
      key => "logstash"

      # We use json_event here since the sender is a logstash agent
      format => "json_event"
   }
}

output {
   stdout { debug => true debug_format => "json"}

   elasticsearch {
      host => "127.0.0.1"
   }
}

This should be self-explanatory: the Indexer is taking input from Redis, and sending it to Elasticsearch.

Now start the Indexer:

java -jar logstash-1.2.1-flatjar.jar agent -f indexer.conf

Next, start Elasticsearch:

./elasticsearch -f

Finally, crank up Kibana.

You should now be able to access Kibana at:

http://yourserveraddress:5601

Now that we have all the elements on the Logstash server installed and running, we can go back to the shipping server and start spitting out some logs.

Regardless of how you’ve set up your shipping server (live logs or old logs), starting the shipping process involves the same command:

java -jar logstash-1.2.1-flatjar.jar agent -f shipper.conf

If you’re shipping live logs, that’s all you will need to do. If you are shipping old logs, you will need to pipe them to the TCP port you opened in your shipper.conf file. Do this in a separate terminal window.

nc localhost 3333 < /var/log/httpd/old_apache.log

Our shipping configuration is set up to output logs both to STDOUT and Redis, so you should see lines of logs appearing on your terminal screen. If the shipper can’t contact Redis, it will tell you it can’t contact Redis.

Once you see logs being shipped, go back to your Kibana interface and run a search for content.

IMPORTANT: if your shipper is sending old logs, you need to search for logs from a time period that exists in those logs. There is no point in searching for content from the last 15 minutes if you are injecting logs from last year.

Hopefully, you’ll see results in the Kibana window. If you want to learn the ins and outs of what Kibana can do, have a look at the Kibana website. If Kibana is reporting errors, retrace the steps above, and ensure that all of the components are running, and that all necessary firewall ports are open.

2 tasks now remain: using Grok and setting up all the components to run as services at startup.

Init scripts for Redis, ElasticSearch and Kibana are easy to find through Google. You’ll need to edit them to ensure they are correctly configured for your environment. Also, for the Kibana init script, ensure you use the kibana-daemon.rb Ruby script rather than the basic kibana.rb version.

Place the various scripts in /etc/init.d, and, again on CentOS, set them up to start at boot using chkconfig, and control them with the service command.
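That boils down to something like the following, assuming you named the scripts redis, elasticsearch and kibana:

# Register each init script and have it start at boot
chkconfig --add redis && chkconfig redis on
chkconfig --add elasticsearch && chkconfig elasticsearch on
chkconfig --add kibana && chkconfig kibana on

# Control them like any other service
service elasticsearch start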

Grok isn’t quite so easy.

The code is available from here:

https://github.com/jordansissel/grok/

You can download a tarball of it from here:

https://github.com/jordansissel/grok/archive/master.zip

Grok has quite a few dependencies, which are listed in its docs. I was able to get all of these on CentOS using yum and the EPEL repos:

rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/$(uname -i)/epel-release-5-4.noarch.rpm

then

yum install -y gcc gperf make libevent-devel pcre-devel tokyocabinet-devel

Also, after you have compiled grok, make sure you run ldconfig, so that its libraries are shared with Logstash.

How to explain Grok?

In the general development of software over the last 20-30 years, very little thought has gone into the structure of log files, which means we have lots of different structures in log files.

Grok allows you to "re-process" logs from different sources so that you can give them all the same structure. This structure is then saved in Elasticsearch, which makes querying logs from different sources much easier.

Even if you are not processing logs from different sources, Grok is useful, in that you can give the different parts of a line of a log field names, which again makes querying much easier.

Grok "re-processing", or filtering, as it is called, occurs in the same place as your Shipper, so we add the Grok config to the shipper.conf file.

This involves matching the various components in your log format to Grok data types, or patterns as they are referred to in Grok. Probably the easiest way to do this is with this really useful Grok debugger:

http://grokdebug.herokuapp.com/

Cut and paste a line from one of your logs into the input field, and then experiment with the available Grok patterns until you see a nice clean JSON object rendered in the output field below.
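Once you have a pattern that matches, the corresponding filter block in shipper.conf looks something like the following; the option names are those of the Logstash 1.2.x grok filter, and the stock COMBINEDAPACHELOG pattern is assumed to suit the Apache logs shipped earlier:

filter {
   grok {
      # Parse the standard Apache combined log format into named fields
      match => [ "message", "%{COMBINEDAPACHELOG}" ]
   }
}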

How to change the hostname on an Amazon linux system without rebooting

I use the following config in the .bash_profile file for the root account on Linux systems:

PS1="${HOSTNAME}:\${PWD##*/} \$ "

This prints the server’s hostname in the shell command prompt, which is handy if you are working on lots of servers simultaneously.

However, I also do a lot of cloning of servers in Amazon. When the command prompt is carried over into the clone, you end up with the hostname of the clone source in the clone itself. Normally, you would solve this by changing /etc/sysconfig/network and rebooting, but this isn’t always practical.

Instead, just change /etc/sysconfig/network as usual, and then issue the following command:

echo "server name" > /proc/sys/kernel/hostname

Then log out and open a new shell. New hostname sorted.
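For the record, the whole change on a CentOS/Amazon Linux system boils down to something like this (webserver01 is just a placeholder name):

# Persist the new name for future boots
sed -i 's/^HOSTNAME=.*/HOSTNAME=webserver01/' /etc/sysconfig/network

# Apply it to the running kernel without a reboot
echo "webserver01" > /proc/sys/kernel/hostname

# Confirm
hostname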

How to use git to maintain code between 2 computers

I recently decided to switch from using a traditional Intel based laptop to a Macbook (I know Macbooks also use Intel chips now, but you know what I mean).

Rather than simply make a tarball of all my code projects that I would ship across my network to the new system, I decided to try and be a bit more professional and give git a try.

All things being equal, a single developer working on a code project probably doesn’t need source control, but I figured it was probably wise to learn this, in case some situation arose in future where I was required to collaborate, and where it was important I didn’t look like a complete novice.

I have had some experience with git before, but this was pretty pedestrian, and it didn’t touch on the main challenge with git, which is working from multiple sources with remote repositories.

For the uninitiated, git is a source control system, but not like a traditional source control system.

In a traditional system, the history of changes to a particular code project is stored centrally. In the git system, each participant in the project retains a local history, which they can work on locally, or merge back to the central repository. This logic is somewhat convoluted when you are used to working with a centrally stored repository, which means git can take a bit of getting used to. This isn’t really helped by the syntax of the commands used in git, which are anything but self-explanatory.

I won’t go into git any more than this, because it’s way more complex than can be explained in a single post. Instead, I’ll focus on the task described by the post title; if you want to learn more about git than this provides for, you’ll have to read a bit more elsewhere, and do quite a bit of trial and error.

OK, so the basic components of the task are:

1. An Internet hosted git repository (repo)

2. Computer 1 with the git client installed

3. Computer 2 with the git client installed

There are several options available for creating an Internet hosted git repo, many of which offer a free basic level of service. I chose Bitbucket. I registered, signed in and created a repo. In completing this task, Bitbucket displayed the commands and URL I would need to work with the repo. Thank you, Bitbucket, for making life with git a little easier. For the purposes of this post, let’s say I’m creating a repo to manage the code for this blog, so I’ve called it “nbfblog”.

Next, to Computer 1. The installation of git is very straightforward, so I won’t cover it here. Just Google it for your particular OS.

Once installed, you now need to create your local repo. Change to the directory in which your code is stored, and issue the following commands:

git init

git add .

git commit -am "initial checkin"

These commands will create your local repo.

The next task is to create what is referred to as a remote of this repo. When we were on Bitbucket, we basically created an empty repo called nbfblog. We are now going to populate that repo by creating a remote of our local repository. The terminology isn’t great here, which is kind of an issue throughout git.

To create your remote, issue the following command:

git remote add origin https://garrethmcdaid@bitbucket.org/garrethmcdaid/nbfblog.git

This creates a remote called origin, in the Bitbucket account that is identified by the username garrethmcdaid, in the repo nbfblog. Depending on how git is set up on Computer 1, you may or may not be asked for your Bitbucket password when you issue this command.

At this point, all you have done is create a remote. No code has been moved between Computer 1 and the central repository. To do this, issue the following command:

git push -u origin --all

If you’re used to using the command line, you’ll probably presume that the -u argument here means user. Again, it’s no harm to remind yourself that the git command syntax is misleading; in this instance, -u means upstream, and tells git to remember origin as the default remote for this branch, so it has nothing to do with usernames. The --all switch means push all branches. You specify this because git provides functionality to push only parts of a local repo to a remote, which is beyond what we are dealing with here.

At this point, a copy of your code should exist in Bitbucket, which you can check via the web interface. You will see that the code has been created under a branch called master. What’s a branch you ask? A good question, and there are as many answers as there are developers. We are only going to be working with the master branch, so its probably better if we leave the topic of branching for another day.

For now, lets just use an analogy: think of your code as a recipe for sourdough bread. One day, you wake up and decide you want to make sourdough bread with black olives. Rather than throw your original recipe (your master branch) in the bin, you make a photocopy of it (your black olive branch), and then make changes to the photocopy, leaving the original in your recipe folder. When you’re happy with your new recipe, you merge the annotations into the original and throw away the photocopy. You have now merged the blackolive branch into the master branch.

Like I say, we’re only going to deal with the master branch here (think of it as having 2 recipe folders in 2 different houses, and you want to make sure the recipe for sourdough bread is identical in each of the folders). The only other thing to say is that branches exist in your repo regardless of where it is stored, locally or centrally.

So enough about bread and back to git. To be sure we’re on the same page here, you should probably test making further changes to your local code base, and push them to the central repo. Try this:

Edit a file

git commit -am "Computer 1 update" (should report re. your edit)

git push origin master (should ship all your files, with edits, off to your remote).

Again, check the Bitbucket interface to make sure your changes (commits) have been pushed to the central repo.

Now, having created a remote repo, a local repo, and mastered the process of moving code changes from the local repo to the remote repo, lets see if we can introduce Computer 2.

Simple stuff first, make sure the git client is installed.

Now, in this case, we must remember that we have already got a git repo, so we are not going to init another one. Instead, we are going to clone the existing git repo from the remote repository. Change to the parent directory where you want to locate the code base on Computer 2. We will allow git to create a new directory at this location to store the code.

sudo git clone https://garrethmcdaid@bitbucket.org/garrethmcdaid/nbfblog.git httpdocs

Hopefully, this command is self-explanatory. We are cloning the nbfblog repo, and its branches, that exists on Bitbucket under the account identifier garrethmcdaid in the new directory httpdocs.

If this command has worked (again, your Bitbucket password may be required), you should be able to change into the httpdocs directory and view your code.

Now, the real test. Make some changes to the code on Computer 2. The objective here is to make sure those changes are replicated back to Computer 1 via the central repo.

Remember, git is based on local history rather than central history, so when you make a change to code, the first place to commit that is to the local repo. Let’s do that now:

git commit -am "Computer 2 update"

That change now exists in the local repo on Computer 2. To get that change into the central repo, lets push it:

git push origin master

That is, push my local history to the master branch in the “origin” remote repo, which git knows is a remote, as the origin repo was cloned from a remote source. To be more precise, you are merging the code in your local master branch to the master branch in the origin remote.

Once again, to verify that this has worked, go back to the Bitbucket web interface and view your source. The changes you made to your code on Computer 2 should now be visible in that source.

Presuming that they are, well done. You have now mastered the core functionality of git, which is actually the hardest part of git to master. To really push the envelope, lets now make sure that when we go back to Computer 1, our code base has the changes we made on Computer 2.

Change back into the directory in which your code was stored on Computer 1. Now, we want to pull the current history of the master branch in the central repo into the master branch of our local repo.

git pull origin master

git knows that origin refers to the remote repo, because we added it as a remote earlier, and we’re telling git that we want the history of the master branch, which again is the only branch we are dealing with. Once this command has processed, check your local code base and make sure the changes made on Computer 2 are there.

If they are, well done again. Now, all you have to do is remember the sequence:

Change a file

Commit locally

Push to remote

Pull from remote

Change a file

Commit locally

Push to remote

Pull from remote

ad infinitum…
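Or, as a plain command cheat sheet (the commit message is just an example):

# On the computer where you made the change
git commit -am "describe your change"
git push origin master

# On the other computer, before you start work
git pull origin master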

How to host a Facebook app on Heroku

Heroku is a Cloud Computing service that has recently teamed up with Facebook to provide free hosting for Facebook apps as described here.

The default instructions issued for hosting apps are somewhat brief, and don’t really describe how you migrate an existing application to Heroku, so I’ve provided a bit more detail here.

If you want to host an app on Heroku, I’d advise following these instructions, rather than choosing the Heroku hosting option when setting up your app on Facebook. It will give you a better understanding of how Facebook and Heroku work together.

Create an Heroku account:

https://api.heroku.com/signup

As part of this you will have to authenticate yourself, which requires entering Credit Card details. Nothing is charged to your Credit Card, but you do need to enter them.

Install the Heroku Toolbelt on your local system:

https://toolbelt.heroku.com/

This will install the Heroku client, the software revision tool Git, and Foreman which is a tool for running apps locally (you only need the Heroku client and Git, however).

Make sure you have an OpenSSH public/private key pair which you can use

You can create one on any Linux system using ssh-keygen, or you can download PuttyGen for Windows which will also create a key pair for you.

Login to Heroku:

Change to the directory in which you application is stored and login to Heroku

prompt> heroku login

Enter your email address and password.

You should now be prompted to upload the public part of your public/private key pair. You can also do this manually, by issuing the command:

heroku keys:add

You can then check that it is uploaded by typing

heroku keys

Once this is done, you are ready to create your application

Initialise a git repo for your application

Before creating your application in Heroku, you need to create a local git repo for it. To do this, run the following commands from the directory in which your application is stored:

git init
git add .
git commit -m "initial checkins"

This will initialise the repo, add the contents of your current directory for tracking and commit the files to the repo

Create the Heroku application

Next, you need to create a Heroku application. Type the following command:

heroku create

Heroku will respond with the name of your application

You can also view the name of your application by typing

git remote -v

Add a MySQL addon to your application

(You can not do this unless your Heroku account has been authenticated)

You need to provide a MySQL addon if your application uses MySQL. ClearDB provides a basic, free addon called Ignite

You can add this by typing

heroku addons:add cleardb:ignite

Once this is added, you can find out the login credentials and DB details by typing

heroku config | grep CLEARDB_DATABASE_URL

(This is a Linux command. In Windows, just type heroku config and look for the values you require.)

You can then add these to your application configuration

Import your database

Using a MySQL command line client, import your DB schema into the DB provided by ClearDB

mysql -h cleardb_hostname -u cleardb_username --password=cleardb_password -D cleardb_database_name < path_to_dump_of_your_MySQL_database

Push your application to Heroku:

You are now ready to push your application to Heroku. Just type

git push heroku master

This will push your code to the remote git repository in Heroku and launch it on the hosting service.

You should then be able to access it at the URL you obtain from the command:

heroku info --app your_app_name

Configure your Facebook application:

The final step is to configure your Facebook application so that it uses the URL provided in the previous step. The URL serves both HTTP and HTTPS.

Making changes to your code:

If you need to make a change to a code file, do the following after you have edited the file:

git commit -am "Reason for change"
git push heroku master

If you have to add a file/directory, do the following:

git add <file_or_directory>
git commit -am "Reason for change"
git push heroku master

If you have to delete a file/directory, do the following:

git rm <file_or_directory>
git commit -am "Reason for change"
git push heroku master

Heroku guide

https://devcenter.heroku.com/articles/git

How to clone an Amazon EC2 instance between 2 different accounts

So here’s the deal.

(note: this only applies to linux, ebs-backed instances)

Let’s say you’ve built a complex server instance under one account, and now, for whatever reason, you need that server to run under a different account.

Yes, you could go about creating an AMI, and packing that up and sharing it with the other account, but let’s say this server has multiple EBS volumes attached, and has a larger data footprint than the 8GB that comes with a standard EBS instance root partition.

Not a problem. Amazon recently provided functionality to share volume snapshots with other accounts, which is an ideal way to transfer server instances between accounts.

Here’s the step by step:

1. For each of the volumes that is included in your system, including the root volume, make a snapshot

2. Update the permissions of each of those snapshots so that they are shared with the other account (a CLI sketch of this follows the list)

3. Logout and login to the other account

4. Make sure you are in the same region as the one where the snapshots were created

5. Set the filter to Private Snapshots; the snapshots you previously shared should now appear

6. Create a volume from each of these, and make sure they are all in the same zone

7. Now, create an EBS-backed instance using the same AMI as was used to create the instance you are trying to transfer (the AMI doesn’t need to be the same if you are using Amazon Linux; again, make sure it’s in the same zone as the volumes you just created)

8. After the instance fully deploys, stop it

9. Detach the default 8GB root volume from the instance. Attach the root volume that you created from the snapshot; attach as /dev/sda1

10. Attach all the other volumes you created from the snapshots, making sure to give them the same device names as on the original instance

11. Start the instance

12. You’re done. To connect, use the same method as on the original instance, rather than how Amazon tells you to connect.
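For reference, steps 2 and 6 can also be done with the AWS CLI rather than the console; a sketch, with made-up snapshot ID, account ID and availability zone:

# Step 2, from the source account: share the snapshot with the target account
aws ec2 modify-snapshot-attribute --snapshot-id snap-1a2b3c4d \
    --attribute createVolumePermission --operation-type add \
    --user-ids 123456789012

# Step 6, from the target account: create a volume from the shared snapshot
aws ec2 create-volume --snapshot-id snap-1a2b3c4d --availability-zone eu-west-1a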

Building a Video Library with FFMPEG

The video above is imported from http://www.centerforclinicalexcellence.com, for whom I’ve recently constructed a Buddypress Video Library using FFMPEG and the JW Player.

The owners of the site had originally wanted to use a third party like Vimeo or Twistage for this solution, but I persuaded them that they’d achieve a lot more flexibility and functionality if they went with a bespoke solution.

This would allow them to integrate seamlessly with their Buddypress User Database, which was not something that was going to happen very easily with a third party API.

I’m pretty happy with the finished product. Users can upload video, rate videos, comment on videos, embed videos in other sites, and linkback to videos through Facebook and Twitter.

FFMPEG isn’t for the faint hearted, however. It generally doesn’t come installed on hosting platforms, and has a long list of dependencies about which it is very particular when installing.

Normally, you can overcome this by installing through a package manager like yum on CentOS, which I have used before, but the current version of FFMPEG uses a version of libmp3lame (3.98.2, which is used for encoding audio) that contains a nasty little bug that prevents the duration of a clip being embedded in Flash-encoded videos.

This in turn plays havoc with Flash players, which don’t know how long the video they are playing will run.

Unfortunately, there is no easy way in yum to specify the version of dependencies you want to use, so you have to go through all of FFMPEG’s dependencies and install them manually, just so you can install a downgraded version of libmp3lame (3.97) which doesn’t contain the bug.

You then need to compile FFMPEG from source.

This is a tricky process, but thankfully I found this article which gives a pretty good summary of what you have to do (there are one or two typos in it, but you’ll catch them as you proceed; and install lame 3.97, not 3.982 as listed). You also need to pay close attention re. the linking of libraries as described, and be sure to run ldconfig.

You can also leave out one or two of the slightly less common codecs if they are giving you errors. The ones you really need are lame, faad, faac and vorbis.
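For what it’s worth, the configure invocation for that set of codecs looks something like the following; flags vary between FFMPEG releases (libfaad support was later dropped), so check ./configure --help against your source tree:

./configure --enable-gpl --enable-nonfree \
    --enable-libmp3lame --enable-libfaac --enable-libfaad --enable-libvorbis
make
make install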

JW Player by comparison is a breeze to install. The license and FB and Twitter plugins were purchased for the very reasonable sum of €77. It’s a great player, and I’d recommend it to anyone.