A few questions and answers about the last posts

I have had a few questions about TWKB and the websocket DEMO

When does the geoemtries actually load?
It is not obvious when the geoemtries actually gets loaded from the server to the browser.

That happens first time you click a layer. Then the geometries are streamed row by row via websocket to the browser which parses the geometry and adds it to the map, also geometry by geometry. Then if you switch off a layer and back again it is just loaded from Leaflet internally.

Can TWKB handle more than 2 dimmensions
Yes, TWKB can handle up to 7 dimmensions.

Can this websocket approach be used for writing back to the database?
Yes that is easy to implement. Just send the geometry back to nodejs with ws.send(), and insert it to the database. There is no function to import from TWKB into PostGIS. That is no big thing to write, but I don’t think there is the same performance need when posting back to the database, since that will be one or two geometries, not thousands of them. So the easiest is to just send it back as WKT and use ST_Geomfromtext to get it in the database.

Mapservice from Websocket with TWKB

For those who don’t know what I am talking about TWKB is a compressed binary export format from PostGIS described here, and here.

It is just in the experimental stadium. The source for the PostGIS part can be found here

The Mapservice
What I find maybe most interesting is the websocket thing. I haven’t played with that before. Maybe this is old news for all of you out there. But websockets works cross-domain. So, a websocket can be approached from a page on my webserver or from a page on your desktop. That makes no difference.

You can download the index.htm from the demo:
The DEMO
put it someone on your computer, open it with a browser and it should load the maps.

You can also put this in a javascript:

var ws = new WebSocket('ws://178.79.156.122:8088');
ws.send(JSON.stringify({"nr":nr,"map_name":map_name}));

and you should get a reply if you use one of the map names that my “service” has.

If you don’t know the map names you can send:

ws.send(JSON.stringify({"request":"getcapabilities"}));

and you will get back a json-object with some metadata (That is demonstrated in the demo too)

All this is very unfinished, but it shows the idea.

Test it in OpenLayers too
The demo I have written is in Leaflet. That is just because it seemed easier to get started to test this in Leaflet. But it would be very interesting if someone took TWKB for a ride with OpenLayers (3). Since OpenLayers 3 promises webGL support a compressed binary format ought to be interesting.

What parameters the websocket takes
This is not even tested all of it, but here is what you can send to the wesocket:

map_name
srid (if not passed the srid of the table will be used, which can be found as default_srid in with getcapabilities)
precision how many decimals the coordinates shall have. Consider what unit your srid has. If not set the default value will be used
center.x & center.y Those coordinates gives the point that the result is ordered from. The idea is to get it order from the middle of the map, which makes sence if you are zoomed in and don’t see the whole map
inverted_lat_lng, boolean

As showed here it can be used directly from the websocket. Then there is not even any compiling involved.

I will, as I have said come up with a post about the technical aspects of the format, but I am afraid that will take some time. Meanwhile I will gladly answer any questions to make it easier getting started.

Nodejs, Websockets and TWKB

A short update about twkb, described in this earlier post

I have been doing some testing, sending the geometries from PostGIS to the client as twkb through a websocket.

This is the first time I have been playing with nodejs and websockets. It is really nice things.

Here is the demo:

http://178.79.156.122/twkb_node/

Wait till the page is properly loaded, and click “Municipalities” or “Areal types”. then you should see the geometries start showing. It should start showing in the middle of the map and going outwards. The neat thing about that is that when you are zoomed in at any place in the map and click “Areal types” for instance, you will almost at once get the geometries where you are zoomed in.

But if you click Municipalities before the “Areal types” are finished, you will have to wait a few seconds. That is because all geometries of the first layer is already queued at the client, and I haven’t found any way to manipulate that queue.

To get the stream ordered by distance to the center of the map is only possible because the geometries is taken directly from the database.
The query for the Municipalities layer looks something like this:

SELECT kommunenr, ST_Astwkb(geom,3,kommunenr,'NDR') geom
FROM kom_geom
ORDER BY ST_Setsrid(ST_Point($1,$2),4326) geom;

where $1 and $2 is lat long from the center of the map.

I think it works pretty fast. Municipalities layer has 435 geometries and 149363 vertex-points in total, and the Areal types layer has 3553 geometries and 179025 vertex-points.

Tiny WKB

Lately I have spent some time on a compressed binary output format from PostGIS. It is so far just some sort of “proof of concept”.

The idea is a binary format with some of the features found in common text-based gis formats. The main features is:

  1. Controllable precision (number of decimals)
  2. Relative coordinates
  3. The ID of the geoemtry is stored in the geoometry

I have called the format Tiny WKB since the wkb-format was the closest I know of. But probably the name should be something different since wkb means “well known binary” and this is not “well known”. But Tiny WKB or twkb will have to do for now. The function in PostGIS to create a twkb geometry is I have called ST_ASTWKB, ST_ASTWKB(geometry, precision, ID, endianess).
The sourse code for the TWKB creator is found in http://svn.osgeo.org/postgis/spike/nicklas/twkb.
Just get it with subversion and compile as usual with PostGIS.

So, let’s take a look at the advertised features:

Control over number of decimals

The geometries stored in the database have far more precision than needed for presentation purposes. Often when showing a map on the web, a precision more than one meter is overkill, even when zoomed in. So, if you are using a meter based projection and you want just full meter precision you set the precision parameter to 0. If you want 10 meters precision you set precision to -1.

Relative coordinates

For instance a line like
‘LINESTRING(352400 6752414, 352415 6752418, 352452 6752402)’
, with relative coordinates looks like this:
‘LINESTRING(352400 6752414, 15 4, 37 -16)’

The more coordinates in a geometry the more space we save by using relative coordinates. This is used in formats like SVG and TopoJSON. It gets more complicated when dealing with a binary format since there is no separator between the numbers. In wkb-format for instance that is no problem since all numbers uses the same number of bytes or bits. The reader just counts the bits and knows where the number stops. But then we would gain nothing from our relative coordinates. So twkb handles 3 different storage sizes of the coordinates 1, 2 or 4 bytes.

Storage of the geometry ID  inside the geometry

In the header of the geometry there is a 4 byte integer for storing an ID of the geometry. This gives some possibilities. For instance we can write an aggregating variant of ST_ASTWKB. Then we can aggregate the twkb geometries to geometry collections grouped by intersection with a grid. Then we get vector-tiles directly from the database with ID inside the tile to each single geometry. So at client side each geometry can be identified and joined to it’s attribute data.

The ID should also make it easier to implement support for typologies. All edges can be sent separately with included ID.

OK, so how small does it get

To give some numbers in bytes:

geometry WKB TWKB incl 4 bytes of ID
POINT(1 1) 21 14
LINESTRING(1 1, 10 15) 41 20
LINESTRING(1 1, 10 15, 22 30) 57 22

Ok, you get the point. Bigger geometries gains more from twkb than smaller. But the gain gets smaller if the need of precision is higher. If we take the last example and wants to store 3 decimals the difference is smaller. Also if the distance between the coordinates increases we need more space to store the relative coordinates. There is also an overhead in changes of sizes. So, to make it extreme:

geometry WKB TWKB incl 4 bytes of ID
LINESTRING(1 1, 1000000 15, 1010 30) 57 36

TWKB is still quite a lot smaller than WKB but the difference is smaller.

But how fast is it?

That is not easy to answer. It takes some overhead to create the TWKB since the geometry have to be analyzed in the database and each coordinate calculated, not just copied. But that overhead seems to disappear in the gain of decreased IO.

I have a layer of all the roads in Norway. Some stats of the layer:
Number of linestrings: 1224248
Total number of vertex points: 23485321

To just check the cost of creating WKB vs TWKB we can do like this (and get the total size as bonus):
SELECT SUM(LENGTH(ST_ASBinary(geom))) FROM veger;
that takes on this machine 979 ms

The corresponding query for TWKB looks like this:
SELECT SUM(LENGTH(ST_ASTWKB(geom,0,gid,’NDR’))) FROM veger;
and takes 2770 ms

So we have an almost 2 seconds overhead.

But if we instead of just checking the size of the result actually puts the result in a table:

CREATE TABLE wkb_veger as
SELECT ST_ASBinary(geom) geom FROM veger;

takes 6437 ms

and

CREATE TABLE wkb_veger as
SELECT ST_ASTWKB(geom,0,gid,’NDR’) geom FROM veger;

takes 3680 ms

So the smaller size even in internal handling in the database eats up the overhead.

To be fair we should mention that we reduce the number of decimals to 0. But actually the original layer had no more than 1 decimal precision even if there is a lot of trailing zeros. So if we create TWKB with 1 decimal instead it takes 4346 ms.

The geometries as WKB uses 368 mb

If stored as TWKB with 0 decimals it uses 63 mb, with 1 decimal 79 mb and with 2 decimals (1 trailing zero), 108 mb. Don’t forget that includes 4 bytes of ID to each geometry.

So, the database is quite fast in handling TWKB. But for web-mapping, how about php and javascript?

The DEMO

I think there is quite a lot of optimization to do in my demo. Maybe NodeJS is faster than php for getting the binary data out for example. Also the javascript TWKB reader I have written probably suffers from bad coding.

But it works, and my hope is that other people finds this interesting enough to build interesting clients. It would for instance be very interesting to see how QGIS would react on more slimmed geometries. I think it would give new possibilities to get faster rendering.

The demo can be found here:

http://179.78.156.122/twkb

It is a Leaflet map. To turn on the TWKB layers check the check boxes in the bottom of the page. I have tested in Chrome and Firefox. As you can see from the timer that appears after a TWKB layer is loaded there is several bottlenecks. I think php seems to work quite slow here and also the addition of the geometries in leaflet. The reading of the TWKB geometries (parsing) is just a very small part of the time it takes. The layers is stored in srid 4326. The first Municipalities layer has 3 decimals and Municipalities HD has 5 decimals. You can see the difference when zooming close. The “Areal Types layer” also has 5 decimals.

Summary

Now this is just a “prof of concept” in my sandbox in PostGIS resporitory.
If this sounds interesting give some feedback what is needed to make something good out of it. If you have the possibility it would be very valuable with a better and more sophisticated client. As mentioned QGIS rendering TWKB would be very interesting. If there is an interest I will write a new post describing the technical aspects of the format.

If the interest is big enough it might go into PostGIS some time 🙂

The future of PostGISonline.org

Is there any interest from someone to support or take over the site postgisonline.org?

Background:

About 3 years ago I started postgisonline.org. It is a site that aims to help people, finding the beauty of spatial sql.

It still has between 300 and 600 unique visits per month and many of the visitors actually looks around on the site and stays for a while. I think that is quite good for a site that has not been refreshed for a few years and doesn’t contain any nudity.

The site runs at a linode virtual server which costs me 20$ a month. I restarted the server the other day and found that it had been up for more than 2 years. That means there is a lot of maintenance that should be done. Most important move it to an os-version that is not obsolete.

My problem now is that I do not have the amount of spare time that I had by the time setting this site up and the 20 $ per month hurts a little.

So, I can see a few options:

1) I close it down in March when the prepaid period at linode is over (say no, say no)

2) Someone likes the concept and wants to take over. (To develop it in any direction but free to use for anybody)

3) Someone want to corporate with me to get things happen

4) Someone wants to pay the cost for hosting (or has some other hosting option) and I will try to do the necessary maintenance

You can give a reply here or write a line to: nicklas.aven@jordogskog.no

Also all feedback is welcome in the decision of closing down or continue and in what direction if continuing.

FOSS4G 2010 in Barcelona

FOSS4G  ConferenceThat’s it. I have been to my first FOSS4G conference.  It has been great days. A lot of interesting talks and presentations, and most important, to meet many of the people you otherwise only know by mail.

For the presentations, I have mainly focused on PostGIS related stuff, but being at FOSS4G conference makes you realize how wide the FOSS4G world is. Now, at home I have also opened the LiveDVD and realize it will take quite a while to browse all all the software included.

Me and Marc Jansen from Terrestris held a presentation about WYTIWYS and postgisonline, two web based applications to show the result of a spatial query presented as a map. There were more people attending our presentation than I had expected since it was at the same time as the presentation of the WMS benchmark. I think we had over 100 persons in the audience. The tutorial I used during the presentation can be found here:
http://www.postgisonline.org/map.php?version=trunk&tutorial=http://jordogskog.no/foss4gtut.htm It is nothing fancy at all, just a try to show most of the functionality in a few minutes. I think the presentation slides will be published too on the website of the conference.

The talk that left me with the strongest impression was not a technical talk. It was a talk of Schuyler Erle with the title “How Crowdsourcing Changed Disaster Relief Forever”. He told us about how Open Street Maps was used during the rescue work after the earth quake on Haiti in january 2010. He told us how people built up the Open Street Maps from almost nothing to a very detailed and usable tool in a very short time after the quake. His point was that the work of those volunteers creating the map from aerial imagery together with open source software like PostGIS, Openlayers and so on made a very big difference. The Open Street Maps was preferred over other alternatives for quality reasons.

When I sat on the plenary sessions with all the attendees in the same room it struck me what an enormous amount of GIS knowledge and competence that was gathered at the same place. Many of the well known names in open source GIS were gathered and covered together a much wider field than any proprietary software conference can cover.

Next year the FOSS4G conference will be in Denver and I really hope it will be possible for me to attend.

What is behind PostGISonline.org

If you are curious and want to know what is behind PostGIS online you can download the database, php-pages and others from:
http://download.jordogskog.no/pgo
There is two files, pgo.sql.tar.gz is the database as a dump and pgo.tar.gz is the rest of the files.

Just so it’s said:
Do whatever you want with this code, but don’t blame me if things goes wrong.
If you develop it I would appreciate to get a note, just of curiosity.

You will find no rocket science but maybe a few ideas that might be interesting. Now it is just presented as a quite raw copy of the site without much explanation so I will give a few notes here.

The basic idea behind the site is to take the sql-string the user gives and create a table from the result and then show that table through MapServer. In the beginning it created a view instead of a table to save some disc writing, but I found that the view had to be evaluted several times to find resulting geometry types,srid and finally to show the view. So my conclusion was that it is worth writing down the table. So what is actually happening when the sql-string is sent is that the “CREATE TABLE” and a table name is put in front.

The table gets the name map1, map2 or map3 depending on what button the user clicked. To make it working with many users the session is is also added after mapX, for example map139853657d34373592f51e5e8c45f06b7.

What also happens is that when a user sends his first sql-string the session id is stored in table adm.sessions with a timestamp. then the timestamp is updated for each new query the user runs. That makes it possible for a function in the database found in file functions.sql to drop all tables belonging to session id’s that has not been used for half an hour.

On the Mapserver side of the story there is a mapfile with definitions of 9 different layers. One for Map1 point, one for Map1 linestring, one for Map1 polygon, one for Map2 point and so on. To make it possible to run queries that return geometry collections that contains different types I use the new function in PostGIS 1.5, ST_CollectionExtract(geometry collection, integer type). It gives som overhead but makes the system working with results containing mixed types and as mentioned even mixed geometry collections. So each geometry is checked both for points, linestrings and polygons. The DATA definition for points in Map1 looks like this for instance:

DATA “the_geom FROM (
select ST_CollectionExtract(ST_Force_Collection(the_geom),1) as the_geom,
idmap1 from userdata.%relation%) as a
USING UNIQUE idmap1 USING SRID=%srid%”

As you see on the above example there is also an unique id added to the table. That id, here called idmap1, is never showed to the user and is just added to the table after it was created to make Mapserver happy.

What I think from this maybe is most noticeable is the last part, how to make Mapserver work with geometry collections. As said earlier it gives some overhead, but can be usable anyway.

That’s it for now. Any comments or suggestion for improvement is very welcome.

Testing PostGIS 2.0 on PostGIS online

Now it is possible to try and test new functionality soon after it is added in PostGIS trunk. How up to date the version on the site is might vary because there will be changes on the road towards PostGIS 2.0 that can not be automatically updated on the site.

How to use the trunk version you can learn from this tiny tutorial.

To see what new functions is there take a look at the SVN documentation snapshot

Ship to Gaza 2

I have hijacked my own blog to write some updates of what is happening in Mediterranean sea right now.

From what I have understood from media, what has happened is that Israeli military has boarded all the ships, killed 19 (now they talk about “only”  9 killed) and wounded 60 persons on International water. Boats coming with humanitarian aid and supply.

It looks from pictures on internet like there might have been activists on the ships that started to hit the Israel soldiers when the boarded from helicopter but at the other hand Israel had no right what so ever to board the boats on international water.

Ship to Gaza

This is not a blog post within my usual area, but since I think this initiative Ship to Gaza seems like an truly humanitarian project I would like to do what I can to put the light on what is happening right now in the Mediterranean sea. On board the ships are besides supplies like medical equipment and building material,  many hundreds of people supporting the sake like parliamentarians, and activists, both Muslims and Jewish among others.

There is a lot of blogs out there following what is happening, but I don’t know which of the English blogs to consider trustworthy so do a search yourself to get more information.

What is happening right now is that Israeli military are following the ships and have promised to stop them. That I have read on a Swedish blog post from one of the boats, but the same blog also send posts in English. A lot of things are happening right now.