Sunday, 18 November 2012

How to minify GeoJSON files?

You can't do web mapping these days without knowing your GeoJSON. It's the vector format of choice among popular mapping libraries like Leaflet, D3.js and Polymaps. Size matters on the web, especially if you want to distribute complex geometries, like the world's countries. The challenge is even bigger if you want to target mobile users - or support web browsers with poor vector handling (IE < 9). This blog post will show you how to minify your GeoJSON files before sending them over the wire.

The first thing you should do is to generalize your vectors so they don't contain more detail than you need. In a previous blog post, I was able to remove 90% of the coordinates without losing too much detail for the map scale I wanted to use. This will of course have a great effect on the file size.

Today, I'm going to use country borders from the Natural Earth dataset. These datasets are already generalized for different scales (1:10m, 1:50m and 1:110m), so I'll use them as they are. The 1:110m (small scale) and 1:50m (medium scale) shapefiles will cover the needs for the thematic world maps I plan to make:

The 110m and 50m country polygons shown in QGIS.

Let's open the datasets in QGIS. If you look at the attribute table you'll see that each dataset contains 63 attributes, which makes them very versatile. For your web maps, you probably need just a few of the attributes, and you should remove the ones you don't need. I'm keeping the country name and the ISO 3166-1 country codes (alpha-2, alpha-3, and numeric), which can be used to link country geometries to statistical data. 

Only keep the attributes you need.
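If your data is already in GeoJSON, the same pruning could be done with a few lines of JavaScript instead of in QGIS. This is only a minimal sketch: the function name pruneProperties and the property names in the sample are assumptions for illustration, so check the attribute table for the real field names.

```javascript
// Keep only the listed properties on every feature of a FeatureCollection.
// The function name and the property names below are illustrative assumptions.
function pruneProperties(collection, keep) {
  return {
    type: 'FeatureCollection',
    features: collection.features.map(function (feature) {
      var props = {};
      keep.forEach(function (key) {
        if (feature.properties && key in feature.properties) {
          props[key] = feature.properties[key];
        }
      });
      return { type: 'Feature', properties: props, geometry: feature.geometry };
    })
  };
}

// Tiny made-up sample with one attribute we don't need.
var sampleCollection = {
  type: 'FeatureCollection',
  features: [{
    type: 'Feature',
    properties: { name: 'Sample', iso_a2: 'XX', unused: 123 },
    geometry: { type: 'Point', coordinates: [0, 0] }
  }]
};

var slim = pruneProperties(sampleCollection, ['name', 'iso_a2']);
```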

Next, we can convert the shapefiles to GeoJSON with ogr2ogr:

ogr2ogr -f "GeoJSON" -lco COORDINATE_PRECISION=1 ne_110m_admin_0_countries.json ne_110m_admin_0_countries.shp

ogr2ogr -f "GeoJSON" -lco COORDINATE_PRECISION=2 ne_50m_admin_0_countries.json ne_50m_admin_0_countries.shp

The important thing is that I'm only keeping one decimal (coordinate precision) for the 110m dataset, and two decimals for the 50m dataset, which is sufficient for my map scales. This will reduce the size of the GeoJSON files by more than half. The size of the 110m GeoJSON is now 207 kB and the 50m version is 1,897 kB. But we can do better.
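As an aside, the rounding that COORDINATE_PRECISION performs can be sketched in JavaScript for a GeoJSON file you already have. The helper names roundCoords and roundGeometry are my own, and the sketch assumes a geometry with a coordinates member (so not a GeometryCollection).

```javascript
// Recursively round every number in a (possibly nested) coordinate array.
function roundCoords(coords, decimals) {
  if (typeof coords === 'number') {
    var factor = Math.pow(10, decimals);
    return Math.round(coords * factor) / factor;
  }
  return coords.map(function (c) { return roundCoords(c, decimals); });
}

// Return a copy of a geometry with rounded coordinates.
// Assumes the geometry has a "coordinates" member.
function roundGeometry(geometry, decimals) {
  return {
    type: geometry.type,
    coordinates: roundCoords(geometry.coordinates, decimals)
  };
}

// Made-up sample line with more decimals than a small-scale map needs.
var roundedLine = roundGeometry(
  { type: 'LineString', coordinates: [[10.7461, 59.9127], [5.3242, 60.3925]] },
  1
);
```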

The files contain a lot of whitespace, which is a waste of space. I planned to use Sublime Text to remove the whitespace, but it was not able to handle the 50m GeoJSON file, so I switched to Notepad++. I used these regular expressions:

Find: "([^a-z.]) "
Replace: "$1"
This will remove every space that does not follow a lowercase letter or a dot, so the spaces inside country names are kept.

Find: "\n,"
Replace: ","
Remove line breaks (keeping some for readability).

Find: "\.0([,\]])"
Replace: "$1"
Remove trailing zeros.

This will reduce the file size of the 110m GeoJSON from 207 to 156 kB, without losing any data quality. More than 400k whitespace characters were removed from the 50m GeoJSON file, reducing the file size from 1,897 to 1,481 kB.
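If you prefer a script to a text editor, the three find/replace steps above translate directly to JavaScript. This is a sketch using the same patterns; the function name minifyGeoJsonText and the sample text are my own. Note that the first pattern also strips a space that follows an uppercase or accented letter, so check your country names afterwards.

```javascript
// Apply the three find/replace steps to a GeoJSON string.
function minifyGeoJsonText(text) {
  return text
    .replace(/([^a-z.]) /g, '$1')   // spaces not following a lowercase letter or dot
    .replace(/\n,/g, ',')           // line breaks in front of a comma
    .replace(/\.0([,\]])/g, '$1');  // trailing ".0" before a comma or bracket
}

// Made-up sample in the spaced-out style of the converted files.
var sampleText =
  '{ "type": "Feature", "properties": { "name": "New Zealand" }, ' +
  '"geometry": { "type": "Point", "coordinates": [ 174.0, -41.5 ] } }\n' +
  ',\n' +
  '{ "type": "Feature" }';

var minified = minifyGeoJsonText(sampleText);
```

If you don't need to keep any line breaks, JSON.stringify(JSON.parse(text)) is a simpler alternative: it drops all insignificant whitespace and writes 174.0 as 174.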

If your web server supports on-the-fly gzipping, the 110m GeoJSON will end up being 45 kB and the 50m version will be 430 kB. Not bad!
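To get a rough idea of the gzipped size before you upload a file, you could compress it locally with Node.js and its built-in zlib module. This is only a sketch: the function name gzipSizes is my own, and your web server may use a different compression level, so the numbers will not match exactly.

```javascript
var zlib = require('zlib');

// Return the raw and gzipped byte sizes of a string.
function gzipSizes(text) {
  var raw = Buffer.from(text, 'utf8');
  return { raw: raw.length, gzipped: zlib.gzipSync(raw).length };
}

// Made-up, highly repetitive sample; real GeoJSON compresses less well.
var sizes = gzipSizes('[10.7,59.9],'.repeat(1000));
console.log(sizes.raw + ' bytes raw, ' + sizes.gzipped + ' bytes gzipped');
```

For a real file, read it with fs.readFileSync(path, 'utf8') and pass the result to gzipSizes.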

And if this is too much work, you can always download the final GeoJSON files on

NB! Mike Bostock’s TopoJSON would allow us to compress the GeoJSON even more, while preserving topology (shared borders between countries) - but we would need to use a map client supporting the format. Looks promising!

Sunday, 4 November 2012

Mapping New Zealand - a summary

I've had a fantastic two-month study trip to New Zealand. Unfortunately, I had to go back to Norway this week to fill up my bank account - just when the summer was arriving in New Zealand. I'm going to miss the beautiful country with its great people.

New Zealand is the perfect country to map, as an isolated country surrounded by a vast ocean, and because of all the free data available. I hope my Mapping New Zealand blog series has been useful for others as well:

There are a lot of exciting things happening on the New Zealand mapping scene, and the Kiwis are very welcoming people. I want to thank all the nice people I met on my journey, who were very willing to share their knowledge and experiences:

I've been a real map nerd in New Zealand, but I had lots of time to explore the country too. Here are a few of my photos: 


New Leaflet plugin to handle multiple TileMill layers

My setup for the population density map of New Zealand made it easy to create new choropleth maps with New Zealand census data. This blog post explains how you can use Leaflet to switch between multiple interactive layers created with TileMill.

I wanted to create a map of the social geography of New Zealand, using the Index of Deprivation from the Department of Public Health, University of Otago. I downloaded an Excel sheet containing data for the 2006 census area units, which I also used for my population density map. I simply added the data to the same SQLite database, and created the map using the same techniques described in two previous blog posts (1, 2).

The Index of Deprivation is constructed from nine Census 2006 variables, and provides a summary deprivation score from 1 to 10. A score of 1 is allocated to the least deprived 10 percent of areas, and 10 is allocated to the most deprived 10 percent of areas. You'll find more information about the index in the Atlas of Socioeconomic Deprivation in New Zealand.

Wax allows you to easily add an interactive TileMill map to Leaflet or other mapping libraries. Adding more than one interactive layer is not that straightforward, so I wrote a Leaflet plugin:

This plugin allows you to switch between various layers (interactive or not) and it will automatically load and display map legends, and remove the elements when they're no longer needed. It's easy to use the plugin:

Include a wax property when you create a tile layer, containing a TileJSON object or a URL to a TileJSON file.

var population = L.tileLayer('tiles/nz-popden/{z}/{x}/{y}.png', {
  attribution: 'Statistics New Zealand',
  wax: 'tiles/nz-popden.tilejson'
});

After you've created the map you simply add:

That's it! :-)

You can try to switch between the various basemaps using the layers control below:

Fullscreen map

Creating map labels with TileMill

There's one obvious thing missing from my maps of New Zealand: map labels. This blog post will show you how to create a transparent layer with map labels in TileMill, which can be added as a map overlay in Leaflet or other mapping libraries.

To create map labels, you need a point dataset containing at least a position and the label text. As we'll see later, information about type of place, importance etc. will also be useful. For my New Zealand maps, I'm using a dataset from LINZ Data Service: NZ Geographic Names (Topo, 1:500k)

I started by using QGIS to convert the shapefile into an SQLite database (by right-clicking on the layer name in QGIS and selecting “Save as…”). This allows us to run SQL queries against the data in TileMill. The dataset includes three attributes or columns: name, size and a code describing the type of place:

I'm opening the SQLite file in TileMill:

I'm using this SQL query to load the data from the SQLite database:  

SELECT * FROM nz_labels ORDER BY size DESC

This will sort the labels by size, in descending order. I'm assuming that larger size means more important labels, and this will instruct TileMill to start with the most important labels (there is not enough space to show all labels on all zoom levels). Remember also to include the SRS projection string for the data source (NZTM2000):

+proj=tmerc +lat_0=0 +lon_0=173 +k=0.9996 +x_0=1600000 +y_0=10000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs

CartoCSS provides several ways to style map labels or text, and I haven't had the time to explore all the possibilities. This is my CartoCSS for zoom levels 5-7:

I start by defining the fonts I'm going to use. It's a convention in cartography to label water features (lakes, sea, rivers) in serif italic faces, but small serif fonts on screen can be hard to read. New high resolution screens, like Apple's Retina display, will probably make serif fonts more popular for web maps. You can use TypeBrewer to test various font combinations.

I'm starting with the places marked as "METR" (probably metropolitan areas, although using this term in New Zealand sounds a bit strange...). I'm aligning the labels (text-horizontal-alignment) to place them over the ocean, so they're not obscuring the map. A halo is added around the labels (text-halo-fill / text-halo-radius) to make the text visible on top of various basemaps.

Auckland, Wellington, Christchurch and Dunedin are the only labels shown on zoom level 5 (in the Web Mercator projection). I'm then adding towns on zoom level 6, and populated places on zoom level 7.

Labels shown on zoom level 6.

The minimum distance between each label (text-min-distance) is 30 pixels to avoid label collisions. This means that many towns or populated places will not be drawn because there's not enough space. More labels will be revealed as you zoom in. Although I sorted the features by the size attribute, it's probably not enough to make a good selection of labels for each zoom level. I'm sorry if I left out your town! :-)
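The combination of sorting by size and a minimum distance can be illustrated with a small greedy selection in JavaScript. This is a simplified sketch of the idea, not how TileMill actually places labels; the function name selectLabels, the pixel coordinates and the sample labels are all made up.

```javascript
// Greedy label selection: visit labels from most to least important and keep
// a label only if it is at least minDistance pixels from every kept label.
function selectLabels(labels, minDistance) {
  var sorted = labels.slice().sort(function (a, b) { return b.size - a.size; });
  var kept = [];
  sorted.forEach(function (label) {
    var collides = kept.some(function (other) {
      var dx = label.x - other.x;
      var dy = label.y - other.y;
      return Math.sqrt(dx * dx + dy * dy) < minDistance;
    });
    if (!collides) {
      kept.push(label);
    }
  });
  return kept;
}

// Made-up labels with screen positions in pixels.
var visibleLabels = selectLabels([
  { name: 'Small town', x: 10, y: 0, size: 2 },
  { name: 'Big city', x: 0, y: 0, size: 3 },
  { name: 'Far village', x: 100, y: 0, size: 1 }
], 30);
```

Here the small town is dropped because it sits only 10 pixels from the more important city, while the village is far enough away to be kept.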

I'm continuing like this until zoom level 12, gradually adding more labels for features like suburbs, lakes, mountains etc.

The map labels were exported as a separate map using the MBTiles format, and added to my Leaflet map as a map overlay. This is the result:

Fullscreen map

Exploring the MapBox stack: MBTiles, TileJSON, UTFGrids and Wax

In my last blog post, we created a population density map of New Zealand using QGIS, SQLite and TileMill. Today, we’re going to publish this map to the web using various MapBox inventions. I'll also show you how to publish an interactive TileMill map on your own web server using some PHP and JavaScript wizardry. 

I love MapBox. The team behind this platform has created a series of new specifications, allowing us to create fast, good looking and interactive maps. The downside is the limited support for other map projections than Web Mercator.

TileMill allows you to add legends and tooltips to your maps. I’ve added a legend to my population density map with an HTML snippet describing the map and the color scale.

The tooltip shows when the user hovers over or clicks on the map. It allows us to show dynamic content - additional data, images, charts - for each map feature. I want to show the name, total population, area and population density for each feature:

The data fields for the layer are wrapped in curly Mustache tags. These tags will be replaced by data when you interact with the map. You can use the full Mustache template language.
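To show what the tag replacement amounts to, here is a minimal sketch in JavaScript. It only handles plain {{field}} tags (no sections, partials or HTML escaping), so it is not a substitute for a real Mustache library. The function name renderTemplate and the field names are assumptions for illustration.

```javascript
// Replace {{field}} tags with values from a data object.
// Unknown fields are replaced by an empty string.
function renderTemplate(template, data) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, function (match, key) {
    return key in data ? String(data[key]) : '';
  });
}

// Made-up template and feature data.
var tooltip = renderTemplate(
  '<strong>{{name}}</strong><br>Population: {{population}}',
  { name: 'Sample area', population: 1234 }
);
```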

The easy way to publish this map is to upload it to MapBox Hosting, and use the embed code provided. If you want to publish your map on your own web server, this is an alternative route:

To export an interactive map from TileMill, you need to use the MBTiles format. This is an innovative SQLite-based format specification capable of storing millions of map tiles in a single file. The format is also supported by various 3rd-party applications, and I'm sure we'll see a greater adoption in the future.
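One detail worth knowing if you read tiles from an MBTiles file yourself: the tiles table stores each tile by zoom_level, tile_column and tile_row, and the rows follow the TMS scheme, counted from the bottom. Leaflet's default {z}/{x}/{y} addresses count rows from the top, so the row has to be flipped. A minimal sketch, with a function name of my own choosing:

```javascript
// Convert an XYZ tile address (y counted from the top) to the values used
// in the tiles table of an MBTiles file (rows counted from the bottom).
function xyzToMbtiles(z, x, y) {
  return {
    zoom_level: z,
    tile_column: x,
    tile_row: Math.pow(2, z) - 1 - y
  };
}

var tileAddress = xyzToMbtiles(3, 2, 5);
```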

Within the MBTiles file, the map legend, the tooltip template and information about map extent, zoom levels etc. are stored in a format named TileJSON. This is also an open specification, providing a consistent way of describing a map, making it easier to load and display a map the way it’s meant to be seen. The TileJSON for my map looks like this:

If you add interactivity to your map (tooltips), your MBTiles file will also include the most impressive part of the MapBox specifications: UTFGrids. This JSON format allows us to add thousands of interactive points or polygons through interactivity data grids, and it will even work in older browsers with limited support for vector data.
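Wax does the lookup for you, but the idea behind a UTFGrid is simple enough to sketch. Each grid cell is a character, and its character code is decoded to an index into the keys array, which in turn points to the feature data. The function name utfGridLookup and the tiny 2x2 grid below are made up for illustration; x and y are grid cell indices, not screen pixels.

```javascript
// Look up the feature data for one grid cell of a UTFGrid.
function utfGridLookup(utfgrid, x, y) {
  var code = utfgrid.grid[y].charCodeAt(x);
  // Decode the character: two codes are skipped in the encoding
  // (the double quote and the backslash), then the offset of 32 is removed.
  if (code >= 93) code--;
  if (code >= 35) code--;
  code -= 32;
  var key = utfgrid.keys[code];
  // An empty key means there is no feature in this cell.
  return key ? utfgrid.data[key] : null;
}

// Made-up 2x2 grid with a single feature in the lower right cell.
var sampleGrid = {
  grid: ['  ', ' !'],
  keys: ['', '1'],
  data: { '1': { name: 'Sample area' } }
};

var hit = utfGridLookup(sampleGrid, 1, 1);
var miss = utfGridLookup(sampleGrid, 0, 0);
```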

So how do we turn our MBTiles file into an interactive map? Previously, I've used MBUtil to extract the contents from MBTiles into a directory structure. But by doing this, we lose the benefits of the MBTiles format, like storing a map in a single file and dealing with redundant images. What we need is a script on our web server that will extract content from our MBTiles file on demand. I decided to try a PHP script from infostreams (this is probably not the most scalable solution). The script supports the full MBTiles specification, including TileJSON and UTFGrids. Installation is simple: just put the .php file and the .htaccess file in the same directory as your .mbtiles files. The .htaccess file includes a rule that rewrites requested URLs on the fly, so the map data is available at URLs like:

So when we have our backend sorted, how can we recreate our interactive map with Leaflet or other JavaScript mapping libraries? This is why the MapBox team created Wax, which is a client implementation of the MBTiles interaction specification. You just include the wax script together with your mapping library of choice, and then you can add interactivity with a few lines of code:

I've also done some extra JavaScript coding to allow switching between various interactive map layers. I'll save that for a later blog post.

The Leaflet map looks like this (there seems to be an issue with the latest Wax distribution and Google Chrome):

Fullscreen map