The problem

Wikipedia states the travelling salesman problem as follows: “The travelling salesman problem asks the following question: Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?”

The mathematics behind this problem is very complex, and the explanation is probably best left to the Wikipedia page.

My use case

I wrote an application that takes X number of orders and has to split them between 9 vehicles, based on their weight, dimensions and delivery location. I was first going to use my own formula to do this, but the time and computing power needed to run it in production was a bit out of the question. So instead, I decided to sort all of the orders into the vehicle they would end up in (code available for sale).
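My production code for this isn't shown here (it's the part for sale), but the idea can be sketched with a simple first-fit heuristic. The capacity figure and the (order_id, weight) input shape below are assumptions for illustration only:

```python
# A minimal first-fit sketch of splitting orders between vehicles by weight.
# This is an illustration, not the production code from this post; the
# capacity value and the (order_id, weight) input shape are assumptions.
def assign_orders(orders, vehicle_count=9, capacity=1000.0):
    """orders: list of (order_id, weight) tuples.
    Returns a dict mapping vehicle index -> list of order ids."""
    loads = [0.0] * vehicle_count
    plan = dict((v, []) for v in range(vehicle_count))
    # place the heaviest orders first, so large items aren't stranded
    for order_id, weight in sorted(orders, key=lambda o: -o[1]):
        for v in range(vehicle_count):
            if loads[v] + weight <= capacity:
                loads[v] += weight
                plan[v].append(order_id)
                break
        else:
            raise ValueError("order %s does not fit in any vehicle" % order_id)
    return plan
```

A real version would also check dimensions and group orders by delivery area, but the loop structure stays the same.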

Now I have 10-15 different addresses that I need to calculate the most efficient route for. So instead of writing a system from scratch, I decided to connect to a third-party API.

It started with Google…

So I first wrote the following script in Python to work with the Google Maps API. (I didn’t realise they wanted £7,500 a year ex VAT to use their API, so I moved on to an open source solution.) If you are a Google Maps user and want to take advantage of my code, it is shown below with comments. (It could be compressed a lot more, but I have tried to make it easy to read for those from a less technical background.)

###############################################################################
# Start and finish at the same place, go via waypoints, with route optimisation
###############################################################################
import urllib2
import json

# this method takes advantage of the optimisation that Google lets us use:
# set the waypoints in the url, with the optimize:true flag
url = "https://maps.googleapis.com/maps/api/directions/json?origin=Portsmouth,UK&destination=Portsmouth,UK&waypoints=optimize:true|Southampton,UK|Poole,UK|Reading,UK|Bridport,Dorset&sensor=false&key=YOURAPIKEYHERE"
data = urllib2.urlopen(url)

# now parse the json response into a dict
pre_json = data.read()
d = json.loads(pre_json)

routes = []
# get the starting address
start_address = d['routes'][0]['legs'][0]['start_address']
start_lat = d['routes'][0]['legs'][0]['start_location'][u'lat']
start_long = d['routes'][0]['legs'][0]['start_location'][u'lng']

routes.append({'address': start_address, 'lat': start_lat, 'lng': start_long})
total_time = 0
total_distance = 0
for x in d['routes'][0]['legs']:
	total_time += x['duration']['value']
	total_distance += x['distance']['value']
	address = x['end_address']
	lat = x['end_location'][u'lat']
	lng = x['end_location'][u'lng']
	routes.append({'address': address, 'lat': lat, 'lng': lng})
print '##############################################################################'
print 'Optimised Route'
print '##############################################################################'
tt = (total_time / 3600.00)
td = total_distance*0.000621371192
print "Total Time %.2f hours" % round(tt,2)
print "Total Distance %.2f miles" % round(td,2)
for x in routes:
	print x

## ORIG ROUTE ##
# Now we put the route in, without any optimisation flags so we can compare
url2 = "https://maps.googleapis.com/maps/api/directions/json?origin=Portsmouth,UK&destination=Portsmouth,UK&waypoints=Southampton,UK|Poole,UK|Reading,UK|Bridport,Dorset&sensor=false&key=YOURAPIKEYHERE"
data2 = urllib2.urlopen(url2)

# parse the json response into a dict
pre_json2 = data2.read()
d2 = json.loads(pre_json2)

routes2 = []

start_address2 = d2['routes'][0]['legs'][0]['start_address']
start_lat2 = d2['routes'][0]['legs'][0]['start_location'][u'lat']
start_long2 = d2['routes'][0]['legs'][0]['start_location'][u'lng']

routes2.append({'address': start_address2, 'lat': start_lat2, 'lng': start_long2})
total_time2 = 0
total_distance2 = 0

for x in d2['routes'][0]['legs']:
	total_time2 += x['duration']['value']
	total_distance2 += x['distance']['value']
	address = x['end_address']
	lat = x['end_location'][u'lat']
	lng = x['end_location'][u'lng']
	routes2.append({'address': address, 'lat': lat, 'lng': lng})
print '##############################################################################'
print 'Original Route'
print '##############################################################################'
tt2 = (total_time2 / 3600.00)
td2 = total_distance2*0.000621371192
print "Total Time %.2f hours" % round(tt2,2)
print "Total Distance %.2f miles" % round(td2,2)
for x in routes2:
	print x

print '##############################################################################'
print 'Summary'
print '##############################################################################'
total_saving_time = tt2 - tt
total_saving_distance = td2 - td
print 'The optimised route is %.2f miles shorter' % round(total_saving_distance,2)
print 'And also saves %.2f hours of time' % round(total_saving_time,2)

Here is an example output:

##############################################################################
Optimised Route
##############################################################################
Total Time 5.95 hours
Total Distance 274.23 miles
{'lat': 50.8166745, 'lng': -1.0833259, 'address': u'Portsmouth, UK'}
{'lat': 51.4544776, 'lng': -0.9781547, 'address': u'Reading, UK'}
{'lat': 50.7150463, 'lng': -1.9872525, 'address': u'Poole, UK'}
{'lat': 50.7335746, 'lng': -2.7583416, 'address': u'Bridport, Dorset DT6, UK'}
{'lat': 50.9096948, 'lng': -1.4047779, 'address': u'Southampton, UK'}
{'lat': 50.8166745, 'lng': -1.0833259, 'address': u'Portsmouth, UK'}
##############################################################################
Original Route
##############################################################################
Total Time 7.32 hours
Total Distance 334.71 miles
{'lat': 50.8166745, 'lng': -1.0833259, 'address': u'Portsmouth, UK'}
{'lat': 50.9096948, 'lng': -1.4047779, 'address': u'Southampton, UK'}
{'lat': 50.7150463, 'lng': -1.9872525, 'address': u'Poole, UK'}
{'lat': 51.4544776, 'lng': -0.9781547, 'address': u'Reading, UK'}
{'lat': 50.7335746, 'lng': -2.7583416, 'address': u'Bridport, Dorset DT6, UK'}
{'lat': 50.8166745, 'lng': -1.0833259, 'address': u'Portsmouth, UK'}
##############################################################################
Summary
##############################################################################
The optimised route is 60.48 miles shorter
And also saves 1.36 hours of time

###############################################################################
time python tsp.py
real	0m5.843s
user	0m0.128s
sys	0m0.040s
###############################################################################

Open Source Style..

Now, let's do the same thing, but with a free licence and Open Source data, using the brilliant API from MapQuest:

import requests
import json

# set your app key
my_app_key = "YOUR_KEY"

# add points to the route
route = {"locations":[]}
route['locations'].append( {"latLng":{"lat":51.524410144966154,"lng":-0.12989273652335526}})
route['locations'].append( {"latLng":{"lat":51.54495915136182,"lng":-0.16518885449221493}})
route['locations'].append( {"latLng":{"lat":51.52061842826141,"lng":-0.1495479641837033}})
route['locations'].append( {"latLng":{"lat":51.52850609658769,"lng":-0.20170525707760403}})

# urls to POST to on the API
url = "http://open.mapquestapi.com/directions/v2/optimizedroute?key=" + my_app_key
url_basic = "http://open.mapquestapi.com/directions/v2/route?key=" + my_app_key
# Important, we need to add headers saying we are posting JSON
headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}

# build the request
r = requests.post(url, data=json.dumps(route), headers=headers)
r_basic = requests.post(url_basic, data=json.dumps(route), headers=headers)
# load the json responses straight into python data structures
data = r.json()
data_basic = r_basic.json()

# Map URL to generate a map showing the routes we are taking on this trip
map_placeholder = "http://open.mapquestapi.com/staticmap/v4/getmap?key={my_app_key}&bestfit={bestfit}&shape={shape}&size=600,600&type=map&imagetype=jpeg"

# Define some functions to return clean data
def get_bounding_box(data):
	# load the bounding box data which forms the edges
	# of the static map we will be using
	bestfit = "{lat1},{long1},{lat2},{long2}"
	bestfit_clean = bestfit.format(lat1=data['ul']['lat'], 
		long1=data['ul']['lng'], lat2=data['lr']['lat'], 
		long2=data['lr']['lng'])
	# now return the clean variable for use in the static map url
	return bestfit_clean

def get_shape(data):
	seq = ""
	for x in data['locations']:
		latLng = str(x['displayLatLng']['lat']) + ',' + str(x['displayLatLng']['lng']) + ','
		seq = seq + latLng
	# remove last trailing comma
	shape_clean = seq[:-1]
	return shape_clean

# run data through the functions
shape_basic = get_shape(data_basic['route'])
shape_optimised = get_shape(data['route'])
bestfit_basic = get_bounding_box(data_basic['route']['boundingBox'])
bestfit_optimised = get_bounding_box(data['route']['boundingBox'])

# generate map URL
map_basic = map_placeholder.format(my_app_key=my_app_key, bestfit=bestfit_basic, 
	shape=shape_basic)
map_optimised = map_placeholder.format(my_app_key=my_app_key, bestfit=bestfit_optimised,
	shape=shape_optimised)

# Get a printout of the data
print '>------------<'
print 'Original Route'
print '>------------<'
# Show original order
for x in data_basic['route']['locationSequence']:
	print route['locations'][int(x)]['latLng']['lat']
print "Total Distance " + str(data_basic['route']['distance']) + ' miles'
print "Total Fuel Used " + str(data_basic['route']['fuelUsed']) + ' litres'
print "Total Time " +  str(data_basic['route']['formattedTime'])
print 'Map ' + map_basic
print ''
print '>-------------<'
print 'Optimised Route'
print '>-------------<'
# Show optimised order
for x in data['route']['locationSequence']:
	print route['locations'][int(x)]
print "Total Distance " + str(data['route']['distance']) + ' miles'
print "Total Fuel Used " + str(data['route']['fuelUsed']) + ' litres'
print "Total Time " +  str(data['route']['formattedTime'])
print 'Map ' + map_optimised

The results are shown below. Note that you can use their free Geocoding service to take address strings and turn them into coordinates.

luke@xbu64:~/python$ python open_tsp.py 
>------------<
Original Route
>------------<
51.524410145
51.5449591514
51.5206184283
51.5285060966
Total Distance 8.482 miles
Total Fuel Used 0.52 litres
Total Time 00:25:13
Map http://open.mapquestapi.com/staticmap/v4/getmap?key=YOURKEY&bestfit=51.547508,-0.201385,51.519817,-0.127893&shape=51.524410145,-0.129892736523,51.5449591514,-0.165188854492,51.5206184283,-0.149547964184,51.5285060966,-0.201705257078&size=600,600&type=map&imagetype=jpeg

>-------------<
Optimised Route
>-------------<
{'latLng': {'lat': 51.524410144966154, 'lng': -0.12989273652335526}}
{'latLng': {'lat': 51.52061842826141, 'lng': -0.1495479641837033}}
{'latLng': {'lat': 51.54495915136182, 'lng': -0.16518885449221493}}
{'latLng': {'lat': 51.52850609658769, 'lng': -0.20170525707760403}}
Total Distance 7.116 miles
Total Fuel Used 0.43 litres
Total Time 00:21:37
Map http://open.mapquestapi.com/staticmap/v4/getmap?key=YOURKEY&bestfit=51.547508,-0.201385,51.519962,-0.130076&shape=51.524410145,-0.129892736523,51.5206184283,-0.149547964184,51.5449591514,-0.165188854492,51.5285060966,-0.201705257078&size=600,600&type=map&imagetype=jpeg

The above code also generates a static map so you can see a visual representation of the route you just solved :)
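Since the MapQuest examples above use raw coordinates, here is a sketch of using their free Geocoding API to turn an address string into the latLng structure the route payload expects. The response layout assumed here (results → locations → latLng) is from my own usage; check their docs before relying on it:

```python
def extract_latlng(response):
    # the first location of the first result is the best geocoding match
    # (this response layout is an assumption; verify against MapQuest's docs)
    latlng = response["results"][0]["locations"][0]["latLng"]
    return {"latLng": {"lat": latlng["lat"], "lng": latlng["lng"]}}

def geocode(address, key):
    # requests is only needed for the network call itself
    import requests
    url = "http://open.mapquestapi.com/geocoding/v1/address"
    r = requests.get(url, params={"key": key, "location": address})
    return extract_latlng(r.json())

# e.g. route['locations'].append(geocode("Portsmouth,UK", my_app_key))
```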

Need any help with a project that uses Maps, Geo-positional or GIS? Hit me up luke@pumalo.org

The problem

Anyone who has worked in/for a startup will know you aren’t just (your job role); your job is (your job role) + everything else tech related. Back end programmers are writing javascript, front end developers are configuring nginx, you get the gist.

I am part of a new startup, GetWork2Day. We currently have 10 people working for the startup, and I am the only one doing anything technical, which means I am responsible for: back end, front end, database, server architecture and everything else. Our current server configuration is a basic Ubuntu server with nginx and everything else needed; it sits behind a load balancer to handle scaling and it does the job. However, I don’t want to manage a server full time, and when we get more staff, I don’t want people with no server experience poking around with my server. I’ll come back from a few days off and I can foresee the problems already. I need a service where I can just push code to GitHub and it deploys for me.

Wait, what about….

Elastic Beanstalk, Heroku, Digital Ocean.

Well, all of these are great, they really are, but running things outside of the normal remit causes problems. GetWork2Day uses very unusual and obscure libraries for some parts, so I wanted more control over the deployment; also, we can’t afford to run on any of the three major players.

So how can I homebrew this?

Well, I first looked at Puppet and Bamboo. Both of these are great bits of software, but they are overpriced (for us) and they also work on a fixed IP address (from what I was reading) to connect to your application. If your application is behind a load balancer, your IP is always changing, so I needed something where my application reached out and connected to my git server, downloaded the source code (if there were new commits), rebuilt it and then re-launched the application. This led me to Jenkins.

Why Jenkins?

Jenkins is free and open source, and with its great array of plugins I can make it do whatever I want (within reason). What I did was create a basic Jenkins setup with a git project and set it to check for new git commits every hour; if there was new code to download, Jenkins would pull the changes, copy the code to the running location with rsync and then restart the local application server. This means all I have to do now is push code to the master branch (or however you configure Jenkins), and Jenkins will automatically sort everything out for the re-deployment. Even if my application had scaled across 20 servers, each server would stay up to date with the latest code, without any further intervention from me (or anyone). Jenkins can also run a custom command or script after a build completes successfully; mine is:

# After successful build

# rsync the build dir to the app dir
rsync -av --delete --delete-excluded --exclude "*.pyc" /var/lib/jenkins/jobs/g1/workspace/ /home/app/

# restart supervisor
supervisorctl restart tangable:
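For anyone not using Jenkins, the check-and-deploy cycle it performs for me can be sketched in plain Python. The paths, branch name and the supervisor group here mirror this post's setup but are assumptions, not a drop-in script:

```python
# Sketch of a pull-based deploy check; repo_dir, app_dir, the branch name
# and the supervisor group "tangable" are assumptions for illustration.
import subprocess

def needs_deploy(local_rev, remote_rev):
    # a deploy is needed whenever the remote tip differs from what we run
    return local_rev != remote_rev

def check_and_deploy(repo_dir="/home/app-src", app_dir="/home/app"):
    local = subprocess.check_output(
        ["git", "-C", repo_dir, "rev-parse", "HEAD"]).split()[0]
    remote = subprocess.check_output(
        ["git", "-C", repo_dir, "ls-remote", "origin", "refs/heads/master"]
    ).split()[0]
    if needs_deploy(local, remote):
        # pull the new commits, sync them into place, restart the app
        subprocess.check_call(["git", "-C", repo_dir, "pull"])
        subprocess.check_call(["rsync", "-av", "--delete", "--exclude", "*.pyc",
                               repo_dir + "/", app_dir])
        subprocess.check_call(["supervisorctl", "restart", "tangable:"])
```

Run it from cron every hour and you get the same effect as the Jenkins polling job, minus the build history and plugins.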

What does this have to do with Docker?

Well, with the current setup I still have a server to maintain, and I didn’t want that. I wanted this entire server bundled into one build script, and now I am going to tell you how I built this entire framework with a script of under 40 lines, using the brilliant Dockerfile.

The basics

OK, so let's start:

# Set base image
FROM ubuntu
MAINTAINER Luke Crooks "luke@pumalo.org"

Now we have the basics set, we can install some packages:

### APT SECTION ###
# Update the apt package lists
RUN apt-get update
# Install software 
RUN apt-get install -y git nginx supervisor wget python-virtualenv curl 
RUN apt-get install -y python-dev postgresql-server-dev-9.3 rsync
# Install libraries
RUN apt-get install -y libjpeg62 libjpeg62-dev zlib1g-dev libpng12-0
RUN apt-get install -y libtiff5 libgif4 libgeos-dev

Now we have the basics we need, we can go ahead and start configuring. In the same folder where you are building your Dockerfile, you will also need a copy of your ssh key (the one used at GitHub or Bitbucket to verify your credentials).

### SSH SECTION ###
# Make ssh dir
RUN mkdir /root/.ssh/
# Copy over private key, and set permissions
ADD id_rsa /root/.ssh/id_rsa
RUN chmod 0600 /root/.ssh/id_rsa
# Add bitbucket's key to known_hosts
RUN ssh-keyscan bitbucket.org >> /root/.ssh/known_hosts

Now we can clone our private repositories, one for all our docker configuration, the other for our application:

# Clone the conf files into the docker container
RUN git clone git@bitbucket.org:Username/docker-conf.git /home/docker-conf
# Clone the repo locally
RUN git clone git@bitbucket.org:Username/application.git /home/app

Awesome, now we have all the configuration files we need, we can remove the default files and copy our custom configuration files to where they need to be:

# remove default nginx configs
RUN rm /etc/nginx/sites-available/default && rm /etc/nginx/sites-enabled/default
# Copy new settings
RUN cp /home/docker-conf/configs/nginx/nginx.conf /etc/nginx/nginx.conf
RUN cp /home/docker-conf/configs/nginx/sites-available/site.conf /etc/nginx/sites-available/
# Enable the new site with a symbolic link
RUN ln -s /etc/nginx/sites-available/site.conf /etc/nginx/sites-enabled/site.conf

Next up, we are going to install Jenkins and copy over our local config files:

# Install Jenkins
RUN wget -q -O - http://pkg.jenkins-ci.org/debian/jenkins-ci.org.key | apt-key add -
RUN sh -c 'echo deb http://pkg.jenkins-ci.org/debian binary/ > /etc/apt/sources.list.d/jenkins.list'
RUN apt-get update
RUN apt-get install -y jenkins

# Copy over configs
RUN rm /etc/default/jenkins
RUN cp /home/docker-conf/jenkins/default /etc/default/jenkins 
RUN chmod +x /home/docker-conf/jenkins/after-build
RUN rm -rf /var/lib/jenkins/*
RUN cp -a /home/docker-conf/jenkins/root/* /var/lib/jenkins/
# Reset file permissions after moving files
RUN chown jenkins:jenkins -R /var/lib/jenkins/
RUN chmod 775 -R /var/lib/jenkins/

# Give jenkins permissions to manage the supervisor service
RUN echo "jenkins        ALL = NOPASSWD: ALL" >> /etc/sudoers

Now, this is fairly Python specific, but the same principles apply for other languages: we need to create an environment to install application dependencies into, and install the dependencies from the requirements file:

# Install the python requirements from requirements.txt
RUN virtualenv --no-site-packages "/home/env"
RUN /home/env/bin/pip install -r /home/app/requirements.txt
# Lastly install the app into the virtualenv
RUN /home/env/bin/python2.7 /home/app/setup.py install

Not normal docker behaviour

Usually a docker container is meant to run one process/command, e.g. just an application, or a web server, or whatever you want. But if you use a program such as Supervisor, you can configure it to run multiple programs; in old money, you can run an entire LAMP stack inside a docker container. Explaining how to configure Supervisor is outside the scope of this post, but check out their website for some guides.
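For orientation, a minimal supervisord.conf that runs several programs side by side looks roughly like this; the program names and paths here are assumptions for illustration, not copied from my repo:

```ini
[supervisord]
nodaemon=true

[program:nginx]
; nginx must run in the foreground so supervisor can manage it
command=/usr/sbin/nginx -g "daemon off;"
autorestart=true

[program:app]
command=/home/env/bin/python /home/app/run.py
directory=/home/app
autorestart=true
```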

### SUPERVISOR ###
# Now copy the supervisor configs over
RUN cp /home/docker-conf/configs/supervisor/supervisord.conf /etc/supervisor/supervisord.conf

# Web app configs
RUN cp /home/docker-conf/configs/supervisor/site.conf /etc/supervisor/conf.d/app.conf
# Nginx configs
RUN cp /home/docker-conf/configs/supervisor/nginx.conf /etc/supervisor/conf.d/nginx.conf
# Jenkins confgs
RUN cp /home/docker-conf/configs/supervisor/jenkins.conf /etc/supervisor/conf.d/jenkins.conf

Now supervisor will run our application, nginx and jenkins; all we need to do is expose port 80 and tell docker to run the supervisor command on start:

# Expose the python app to the world. Note jenkins will not be accessible, as
# we are only exposing port 80, not 8080 which is where jenkins runs.
EXPOSE 80
CMD ["supervisord", "-n"]

In Summary

So in summary, we have just created one script that builds an Ubuntu image, installs all dependencies, configures the build process, manages the application and web server and takes care of everything else. It is also capable of running behind a load balancer, so if your docker container were scaled into 20 different instances, they would all stay up to date, without you having to configure a management application to deploy new code to different IP addresses.

Note

This is not usually the norm for docker deployment, but it works and it works well. Use at your own peril! No database configuration is shown here, as we are using a remote Amazon RDS database. You should never host a database behind your own load balancer, as those instances are destroyed and created frequently.

This is my own personal opinion; many people love Django, but I am not one of those users.

Background

I was an avid Django user for at least 18 months, using and contributing to several projects. At first everything was rosy; then I experienced upgrading from 1.2.5 to 1.3.

When I first started with Django, it was more due to frustration with Ruby on Rails; I had several projects using Spree and Radiant CMS. I had been using Ruby/Rails for around a year, doing fairly small projects. It was great for what I needed, but I never really felt comfortable with it. I honestly can’t remember what versions of Ruby I was using, but I think it was 1.8. I had a Spree site linked to a Radiant CMS; then Spree upgraded to Ruby 1.9 and Radiant didn’t. To get the new features (and support) of Spree, I ended up having to split my site across two applications. This ran for about 6 months, until I had given up fighting and looked at writing a new platform.

I eventually migrated to Python (which I had used in college) and re-wrote my setup for Django. I used the brilliant django-cms, and I wrote a custom application adapted from Arkestra. This was a time when the CMS and Arkestra were both under rapid development, but I loved both projects and even helped contribute.

The honeymoon period

Everything was great, then Django 1.3 became a final release and apps started moving. It wasn’t really a problem, but I remember migrating from Django admin media to static files and various other code changes, not something I really enjoyed. Then everything was fine again, until 1.4 came out. This whole process really got on my nerves; changing code with every version upgrade was a pain, especially when different libraries updated at different paces. Which basically means: if you want to use code that others have written for Django, expect to be updating a lot of it as you migrate, because projects get left behind.

I ended up re-writing a lot of third-party apps, just so I could migrate my CMS to the latest version.

And then I needed more

I was then approached to write a custom CRM system for a company. Whilst evaluating the best options, I stumbled across Pyramid + SQLAlchemy. It was love at first sight: it was writing pure Python with a few web renderers. I could see that major version upgrades wouldn’t be a problem, as I was just writing Python.

SQLAlchemy is just a godsend; words cannot describe how awesome this package is.

Coding with SQLAlchemy and Pyramid feels brilliant after coming from Django; there are literally no limitations, you can write what you want quickly and easily, and knowing that I am not going to have a headache when a new major version comes out is even better. It’s been 18 months with Pyramid now, and the documentation and the community are brilliant.

If you have a rainy day, have a go with Pyramid. You won’t regret it.

This post will be updated with any cool cheats/commands that I use on a daily basis, that may help others out.

Make a local user, superuser

Every time I work on a new machine/environment, I need to be able to create and delete databases as a local user quickly and easily. I am by no means a PostgreSQL expert, but I need to be able to create databases ready for web application development. The commands below should get you started.

First, log in as the postgres user and access the psql shell:

$ sudo su - postgres
$ psql

Now we need to create a new postgres user. For the sake of this guide, let's assume the name “luke”, and obviously replace 'somepassword' with your own secure password.

postgres=# create user luke with password 'somepassword';
CREATE ROLE

Now we have the user set up, let's give him ‘Super User’ credentials:

postgres=# ALTER USER luke with SUPERUSER;
ALTER ROLE

Now, open up a new terminal and you should be able to create and delete databases:

$ createdb staff
$ dropdb staff