Foss4G in Boston and TilelessMap

Foss4G 2017 Boston

Are you going to Boston to attend Foss4G 2017?
Lucky you!
I am not able to go myself, but I have put together a map of Boston for TilelessMap.

It is map data from MassGIS, packed for the TilelessMap client.

If you want to, you can try it out on an Android device, or compile it on Linux (it should be fairly easy to compile on other platforms too).
Then you will have a fully offline map (handy for avoiding roaming charges).

The Boston Map
You can download the map (a SQLite database with both the map data and the map project included) from here:
boston.tileless (approx 13 MB)

The application
For Android:
Download the apk:
tileless.apk
and install it.

When you start the app it will point to your Download directory, which is where the Boston map should have landed when you downloaded it.

For Linux:
Go to https://github.com/TilelessMap/TilelessMap and follow the compile instructions.

What you will get
You should get a quite detailed, fully offline map of Boston.

It should show your GPS position as a green dot.

You can open the Layers menu, select Foss4G in the right column of boxes and then use the info tool to get some information about the Foss4G locations on the map.

Some words about the map
As said above, the maps are from MassGIS.
What I have done is to pack them with the pg_tileless tools.
That packs the geometry data as TWKB. That means the coordinates have reduced precision, but the data is intact in other ways.
I have reduced the precision to 1 meter, except for the buildings, where I thought it made sense to keep 1 dm precision.

tileless Boston map

Last words for now
This is not the best way to show off TilelessMap. It is not built primarily for city use. That means it lacks, for instance, rotatable text, which is important for showing street names.
But when you are out in the forest, rotated street names aren’t the first priority :-). It is not a very big thing to implement, but it is not in the top 10 features I would like to implement.

Back again

I haven’t written a blog post for a long time. But now I think the TilelessMap project is starting to look so promising that I have to tell you about it.
Hopefully I will find time to write a series of posts about what it is and how to use it.

I have also changed hosts for this blog, and after a long time I have moved the old posts over here. I haven’t edited all the image links, so most posts are without images (I might do that for some old PostGIS posts where I guess the images are important).

Most of the old posts about the TilelessMap project are out of date. A lot has happened since last autumn.

noTile – The project database

Here I will describe how the project database defines the map project. As we have seen earlier, data is stored in SQLite databases, and so is the information about the map project: where to find the databases with map data, at what scales to render the different layers, which layers are turned on when the project loads, how to style the layers, and some more.

The idea about a project database

Many map clients use config files or JavaScript files to define the mapping project. That is convenient. But here are my two reasons for not storing the project definition in a text file:

  1. We already handle SQLite in the project, so I didn’t have to search for and evaluate any other formats and techniques when choosing SQLite.
  2. It is easy to maintain from many different interfaces: by updating values directly with SQL queries, by sending another SQLite DB to the client, attaching it to the one in production and copying rows between them, or of course by just exchanging the whole db.

So, when it starts, the noTile client takes the project db as an argument. By reading that project db it finds the other databases with the data in them.
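To make that concrete, here is a minimal sketch (not the actual client code) of what opening the project db and attaching the data databases could look like with the sqlite3 C API. The column name path in the dbs table, and the data1, data2, ... aliases, are just assumptions for the illustration.

#include <sqlite3.h>
#include <stdio.h>

static int attach_data_dbs(sqlite3 *proj)
{
    sqlite3_stmt *stmt;
    char attach[512];
    int n = 0;

    /* "path" is a guess at the column holding the database paths */
    if (sqlite3_prepare_v2(proj, "SELECT path FROM dbs;", -1, &stmt, NULL) != SQLITE_OK)
        return 1;

    while (sqlite3_step(stmt) == SQLITE_ROW)
    {
        /* Attach each data database so its tables can be read with plain SQL */
        snprintf(attach, sizeof(attach), "ATTACH DATABASE '%s' AS data%d;",
                 (const char *) sqlite3_column_text(stmt, 0), ++n);
        sqlite3_exec(proj, attach, NULL, NULL, NULL);
    }
    sqlite3_finalize(stmt);
    return 0;
}

int main(int argc, char **argv)
{
    sqlite3 *proj;
    if (argc < 2 || sqlite3_open(argv[1], &proj) != SQLITE_OK)
        return 1;
    attach_data_dbs(proj);
    /* ... read layers, styles, shaders and programs, then start rendering ... */
    sqlite3_close(proj);
    return 0;
}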

 

The tables in the project database are:


Tables to manipulate when adding a new layer to the project

table “dbs”:
In this table the paths to the databases with data are stored.

table “layers”:
In this table a lot of information about the layers is stored. The geometry type is stored as an integer:

1 – Point
2 – Line
3 – Polygon

This table also holds information about at what scales the layer shall be rendered. For simplicity the value is just “units per pixel”. That means that if you use a meter-based projection and set maxScale to 30, the layer will only show when the scale gives less than 30 meters per pixel.

This table also keeps information about what database to get the data from, of course, and the name of the table to read from. It also keeps information about the geometry column and, for polygons, the triangle index column (more about that later).

We also keep information about what “program” to use for rendering. A program in this context is a combination of a vertex shader and a fragment shader in OpenGL. More about this later too.

The layer table also has information about what field to use for styling and in what order the layers shall be rendered (the first layers end up in the back).
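As a sketch, a row in the layers table boils down to something like this on the client side, together with the “units per pixel” test described above. The field names (and the minimum scale) are my own assumptions for the illustration, not the actual column names.

/* Hypothetical in-memory representation of a row in the layers table */
typedef struct
{
    int    geom_type;            /* 1 = point, 2 = line, 3 = polygon         */
    double min_scale;            /* units per pixel (assumed)                */
    double max_scale;            /* units per pixel                          */
    char   db_name[64];          /* which attached database to read from     */
    char   table_name[64];       /* table holding the geometries             */
    char   geom_column[64];      /* the geometry column                      */
    char   tri_index_column[64]; /* triangle index column, polygons only     */
    int    program_id;           /* which shader program to render with      */
    char   style_field[64];      /* attribute column used for styling        */
    int    render_order;         /* lower numbers are drawn first (in back)  */
} LAYER;

/* With a meter-based projection and maxScale = 30, the layer only
   shows when one pixel covers less than 30 meters. */
static int layer_is_visible(const LAYER *l, double units_per_pixel)
{
    return units_per_pixel >= l->min_scale && units_per_pixel < l->max_scale;
}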

table “styles”

Here all the styles are stored. Styles are very simple so far: just a color, an outline color, a line width and the value for which the particular style is to be used. The value referenced here is the value in the column that is given in the layer table for styling. So, for now, the client supports very few styling options.
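Again as a rough sketch (the field names are mine, for illustration only), a style row carries roughly this:

/* Hypothetical in-memory representation of a row in the styles table */
typedef struct
{
    float color[4];          /* fill color, RGBA                             */
    float outline_color[4];  /* outline color, RGBA                          */
    float line_width;        /* line width in pixels                         */
    char  value[64];         /* value of the styling column this applies to  */
} STYLE;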


Tables that usually don’t need to be modified

table “shaders”

This is a table holding the source code for the shaders. Maybe it is overkill to make this flexible. But from what I have read about OpenGL, I have understood that shaders are the heart of OpenGL. Shaders are small programs that get loaded into the GPU at runtime. Those programs are what the GPU actually executes in the rendering process.

So, what I have done is to put the source code of the shaders into this project db. That means there can be different shader programs for different layers. Maybe that can be useful. But so far I have only used the simplest possible shaders, to reproject the vertex points to pixels and to set the right colors.

But, anyway, the possibility is there to do something crazy with it, for someone with that knowledge.

table “programs”

Here we combine vertex shaders and fragment shaders into a “program”. Since a vertex shader and a fragment shader are always compiled together, it makes sense to give the combination a unique id. This id is what the layers reference. So a layer using “program 1” gets the sources for its vertex shader and fragment shader from the shaders table, as referenced in the programs table where programID is 1.
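For illustration, the “simplest possible shaders” mentioned above could look roughly like this. This is a sketch in GLSL ES style of the kind of source that goes into the shaders table, not the actual contents of the repository; the offset, scale and color uniform names are assumptions.

static const char *vertex_src =
    "attribute vec2 coord;\n"
    "uniform vec2 offset;\n"
    "uniform vec2 scale;\n"
    "void main()\n"
    "{\n"
    "    /* reproject the map coordinate to screen (clip) space */\n"
    "    gl_Position = vec4((coord - offset) * scale, 0.0, 1.0);\n"
    "}\n";

static const char *fragment_src =
    "precision mediump float;\n"
    "uniform vec4 color;\n"
    "void main()\n"
    "{\n"
    "    /* paint every fragment with the layer's style color */\n"
    "    gl_FragColor = color;\n"
    "}\n";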


Ok, that was the story about the project database.

You can find an example project db in the repository https://github.com/nicklasaven/noTile
It is called “norge_proj.sqlite” and is used to render the data in norge.sqlite at http://twkb.jordogskog.no/maps/

The next post will probably be about how the client selects and uses geometry data and how it gets rendered.

noTile – intro

I haven’t written a blog post in a few years. But I think it is time to give it a try again.

Why?

Well, I have spent too much time on a map client technique that will need some explanation.

Some history, in non-chronological order

This goes back to 2011 or 2012 when I started to think about a new format for transporting geometry data between a server (PostGIS) and a client (a web client). I wanted a compact but fast format to send data around. The result is TWKB (Tiny WKB), now implemented in PostGIS. When I started this there were no MapBox Vector Tiles (at least I had not heard of them), so compact geometry formats were quite unknown.

The wish for a compact format originated a decade earlier. When I worked at Skogsstyrelsen (the Swedish Forest Agency) we started to use PDAs with ArcPad. Back then I was able to put shapefiles with all the data I needed for my daily work on that PDA: detailed data for Torsby kommun, about 4000 km2. A few years later things like that were just a dream. ArcPad maybe still existed, but everything went online, and as maps became something everyone demanded, the user focus changed from field work to “search for a restaurant”. On the new platforms like smartphones and tablets there were no good tools for detailed offline work. Then some clients came along that could download “offline areas”, but mostly just smaller areas.

Knowing that the PDA back in 2003/2004 had 64 MB of RAM and a 1 GB SD card, I think it is quite strange that we don’t have a lot of techniques for storing whole countries with detailed data today.

But I have more requirements than just rendering. I also want to be able to keep the data updated without exchanging whole data sets. With the future in mind, I also think that easy access to the geometries for analysis and processing is a quite natural request.

The command from the daydream part of me to the doer part of me

So, from this, my conclusion is that we need structured data, compressed and fast to access. That is great, but how do we balance the trade-offs? Some basic rules:

The first thing to trade away is “eye candy”. If you have been out in the bush trying to do a job (of course in the rain, and hungry), eye candy is one of the most provoking things in the world, as all you see is a well-designed circle going around. I am surprised that anyone using a computer professionally asks for smooth transitions before snappy transitions. That is like a carpenter asking about the color of the power drill instead of the torque. So, performance is more important than eye candy.

When performance is decent, more performance is classified as eye candy. This means that once we have enough performance, optimizing for small storage is more important.

Data can never be too small. There is always more data to bring that might be “good to have”.

So, to summarize:
Forget about eye candy. It just eats the power of your device, and melts down our planet. Enjoy snappy over smooth!

A funny thing I have found when working with this: there is no contradiction between performance and small size. Often the smaller size in itself gives better performance, even locally. Both reading and writing big amounts of data from disk have a cost.

Disclaimer of what is coming

All this is still at a very early stage. Big parts of it are also in domains that I have only learnt about from tutorials. So the ideas and techniques are very naive, and some parts are just the result of trial and error. So there is a lot of room for more competent brains to take a look.

Diving in

This is it. Data is stored in SQLite at the client. The layers and styling are defined in a separate SQLite db. The “project” db can get data from many “data” dbs to form a project.

The system

The client is written in C.
SDL2 is used to abstract away platform differences and to easily get user input.
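As a sketch of what that means in practice (not code from the repository, just a minimal illustration of starting SDL2 with an OpenGL context and an event loop):

#include <SDL2/SDL.h>

int main(int argc, char **argv)
{
    /* SDL2 hides the platform-specific window creation and input handling */
    if (SDL_Init(SDL_INIT_VIDEO) != 0)
        return 1;

    SDL_Window *win = SDL_CreateWindow("noTile sketch",
                                       SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED,
                                       800, 600, SDL_WINDOW_OPENGL);
    SDL_GLContext ctx = SDL_GL_CreateContext(win);

    /* User input arrives as SDL events in the same way on every platform */
    int running = 1;
    while (running)
    {
        SDL_Event e;
        while (SDL_PollEvent(&e))
            if (e.type == SDL_QUIT)
                running = 0;
        /* ... render the layers with OpenGL here ... */
        SDL_GL_SwapWindow(win);
    }

    SDL_GL_DeleteContext(ctx);
    SDL_DestroyWindow(win);
    SDL_Quit();
    return 0;
}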

The code for the client and link to prepacked map-data can be found at github:
https://github.com/nicklasaven/noTile

 

I will get back with info describing the project technically. I will also write about how to pack the map-data.

If this sounds interesting, please join in the effort to make something good out of it.

I have put the GPL v2 license on the code. But that can be discussed if there are other opinions.

 

Bitten by an exotic bug

I have been tearing my hair out for a couple of evenings, not understanding anything about what was happening in my code. It has been very frustrating, but now I think I have just understood the whole problem.

It might be obvious, but to me it is very exotic and I am still not sure that I have solved the problem.

So, my problem was this:

I am rewriting the memory handling in the twkb function in PostGIS. Why I am doing that is another story that I will tell when things are working as expected.

I have a structure that looks like this:

typedef struct
{
    uint8_t *buf_start;
    uint8_t *buf_end;
    uint8_t **buf;
} BUFFER_STORAGE;

Then I wrote a small helper function to do the allocations in this structure.

It looks something like this, a little simplified:

int alloc_buffers(BUFFER_STORAGE *buffers, size_t size)
{
    uint8_t *bulk;
    bulk = (uint8_t*) lwalloc(size); /* a PostGIS-specific allocator function */
    buffers->buf = &bulk;
    buffers->buf_start = bulk;
    buffers->buf_end = bulk + size;
}

Ok, do you see the problem?
Well, it bites me in a very subtle way. Nothing happens until I am back in the calling function and initialize a new variable. Then things go totally wrong: in my struct the buf pointer goes bananas.

What I just realized is why.

This is what happens (I think):
In my helper function, when I initialize the bulk variable, it is created on the stack. I mean the pointer variable; the bulk itself is allocated on the heap. Then, when I put the address of the local pointer variable into my structure as **buf, I store a short-lived local address in my long-lived structure. When I get back to the calling function, that address is not valid any more, and it gets reused when I initialize a new variable on the stack.

So, what I saw was that my new variable in the calling function had the same address as the one I had stored in my structure. When I used my buf address, it just pointed to my new variable instead of to the bulk I was expecting.

I cannot see any warning telling me that I am doing something stupid.

Does this make sense, or will I find out tomorrow evening that I still haven’t found my problem?
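For what it is worth, here is a sketch of one way out of it, assuming the idea is that buf should act as a write cursor into the bulk: let the structure own the cursor itself instead of pointing at a pointer that lives on the helper function’s stack. This is just my illustration, not necessarily the fix that ends up in PostGIS.

typedef struct
{
    uint8_t *buf_start;
    uint8_t *buf_end;
    uint8_t *buf; /* the write cursor lives in the struct, not on a stack frame */
} BUFFER_STORAGE;

int alloc_buffers(BUFFER_STORAGE *buffers, size_t size)
{
    uint8_t *bulk = (uint8_t*) lwalloc(size); /* a PostGIS-specific allocator function */
    if (!bulk)
        return 0;
    buffers->buf_start = bulk;
    buffers->buf_end = bulk + size;
    buffers->buf = bulk; /* valid for as long as the allocation lives */
    return 1;
}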

 

Comparing TWKB with compressed geoJSON

One question about TWKB is whether there is any gain compared to just compressing geoJSON. That question is worth some investigation. In PHP, for instance, you can compress all data on the fly before sending it to the client. The browser will then decompress the message and use it as normal. This works really well and fast. So, the question is whether it is worth the effort to build a twkb geometry.

I have three reasons why it is.

  1. twkb seems to actually be smaller than compressed geoJSON.
  2. Compressing takes CPU time and power.
  3. If you take the geometry from the database as geoJSON, there will be a lot more data to send from the database to the web server.

To demonstrate this I have published a web page http://178.79.156.122/twkb_test.
It is a little bit messy so I will guide you.

I find the dev tools in Google Chrome nice to compare the sizes and timing of the compressed data.

First press the button “Get available Layers”. That will send a websocket message to the server to return what layers we can use. Then you should get a list in the list box “Choose layer”.

Ok, now choose “Areal Types” (or whatever you want, but choose “Areal Types” if you want to follow my numbers).

Now you have 4 buttons to choose from (see the uppermost row in the picture above). The two buttons on the left will both give you twkb geometries: the websocket button to the far left sends twkb “as is”, uncompressed, and the second button sends the twkb geometries compressed through PHP. To the right you get the corresponding buttons for geoJSON.

Behind the twkb-buttons is a query against the database that looks like this:

SELECT ST_AsTWKB(geom,5) FROM prep.n2000_arealtyper;

and the geoJSON query looks like this:

SELECT ST_AsGeoJSON(geom,5) FROM prep.n2000_arealtyper;

So, we are querying the same data in both cases and asking for 5 decimals in both cases. The SRID of the table is 4326, so the coordinates are lat/lon; that is why 5 decimals makes sense.

Comparing the timing of the websocket and PHP implementations makes no sense; there are too many unknowns in that. But it is interesting to compare the sizes, to see how much the data gets compressed.

The sizes of the uncompressed data get displayed in the table under the buttons. I have found no way to get the compressed size in JavaScript, but in the Google Chrome dev tools you can see it under the Network tab.

There you can see that the compressed version of the areal types in geoJSON is 1.2 MB. If you test the uncompressed version of TWKB you will see that it is about 720 kB. geoJSON compresses from 3.6 MB down to 1.2 MB, so it gets compressed quite a lot. TWKB only compresses from 720 kB to 660 kB, so geoJSON compresses much better in relative terms. But anyway, uncompressed twkb is still quite a lot smaller than compressed geoJSON.

The differences will vary depending on the dataset and the number of decimals, but it seems like TWKB comes out smaller across the board.

The second and third arguments I gave in the beginning for why twkb is a better choice are about server load. I have no good way of investigating what takes time on the server, but how long we have to wait before we get any data back says something about the work the server needs to do. In the Network tab in the Chrome dev tools you can hold the mouse over the bar showing the time spent getting the data. Then you will get the timing divided into “waiting” and “receiving”, like in the picture below.

The first thing you can note is that my connection is quite slow: getting 1.2 MB takes about 7 seconds today. So, reducing the size of the content sent over the internet is still important, even if you as a developer are sitting on a gigabit line. What you can also see, and which is more important, is that you have to wait 1.37 seconds before you start to get anything back, no matter how fancy your internet connection is. If you do the same thing with twkb, by asking for compressed twkb data, you will probably get a timing of around 0.8 seconds. So there is a difference of at least 0.5 seconds.

In both cases all the data gets sent from the database to PHP before anything gets sent to the client. So what we compare is how long it takes for twkb vs geoJSON to:

  1. Be read from disk, with the twkb or geoJSON constructed in the database
  2. Be sent from the database to PHP
  3. Be used by PHP to build the response
  4. Be compressed

In all those steps 3.6 MB needs to be handled in the geoJSON case, compared to 720 kB in the TWKB case. In the TWKB case the size is reduced already when the twkb geometry gets constructed.

Maybe 0.5 seconds doesn’t sound like much. But this load does not only affect me, sitting and waiting for a map to show up; it affects resources shared by everyone asking for a map from the same server.

You can play around with the different data sets and compare timing and sizes of twkb vs geojson and compressed vs uncompressed.

Some TWKB updates

Last week I gave my first talk about TWKB. One good thing about presenting what you are doing is that it makes you actually do it.

So, what is new?

  • TWKB is in PostGIS trunk! That means it will be included in PostGIS 2.2.
  • ID is optional in the spec. In the PostGIS implementation that means that if you don’t add an ID, no space is occupied for an ID. A minimal point is then only 4 bytes (a rough byte layout sketch follows after this list).
  • The one and only serialization method is VarInt, the same way integers get serialized in protocol buffers.
  • I have added a quite generic JavaScript example of how to read twkb into a geoJSON object.
  • The demo page http://178.79.156.122/twkb_test is refreshed, and PHP examples have been added to show the effect of compressing geoJSON compared to twkb. More about that in a separate blog post.
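To make the 4-byte claim a bit more concrete, here is a rough sketch of how a minimal point can be laid out, as I read the spec right now. Treat the exact byte values as an illustration rather than as normative.

/* A minimal TWKB point, roughly: one type-and-precision byte, one
   metadata byte with all optional parts (bbox, size, id list) switched
   off, and then the X and Y coordinates as zigzag-encoded varints.
   With precision 0 and small coordinates each varint fits in one byte,
   so the whole geometry is 4 bytes. */
uint8_t minimal_point[4] = {
    0x01, /* geometry type 1 (point), precision 0          */
    0x00, /* metadata header: no bbox, no size, no id list */
    0x02, /* x = 1 as a zigzag varint                      */
    0x04  /* y = 2 as a zigzag varint                      */
};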

Call for brain power

I believe that TWKB can be something good. It can be a very fast format for moving geometries around.

If TWKB is going to be something more than a few demos like here and here, more brain power is needed.

My vision for TWKB is that in 2013 it will:

  • be supported in Leaflet and OpenLayers 3
  • be in the trunk for release in PostGIS 2.2
  • be supported by OGR
  • have more than 5 contributors to the specification

What I have so far with TWKB is collected here. It is a GitHub repository, divided into 3 parts.

The first part is the specification.

The second part is the PostGIS implementation of the spec (types 1 to 24 are implemented).

The last part is the web-related scripts, like the web server and the client implementation.

All this above is just meant as something to start the discussion from. The goal is to find a very efficient and flexible binary format for geometries.

Personally, I also hope that someone will hire or employ me so I can work on things like this in the daytime 🙂 I have a lot of ideas I would like to test.

TWKB aggregates

TWKB (Tiny WKB) can be aggregated and nested. The result is a special type. Since a TWKB geometry holds its own ID (like many text-based GIS formats), the result of aggregating many TWKB geometries also nests the IDs into the new aggregated TWKB geometry.

This gives us possibilities like creating a type of vector tiles on the fly. I have tried to demonstrate it here, but I didn’t get it as visual as I had hoped.

Those types are described as types 21-24 in the first draft of the TWKB specification.