Geocoding using OSM data and SQL/Server 2008

I’ve just completed importing the OpenStreetMap data into SQL/Server 2008 using the new geography data types. Originally I linked the data to Manifold GIS to verify the import process but have just started on my first real project using the data which is address geocoding for a GPS tracking site located in Australia. The OSM data doesn’t contain enough street number information to be useful for most areas but the aim is to provide nearest street names at no charge, versus some local commercial geocoding services that charge around ten cents per address.

Loading the OSM ‘ways’ as a series of around 250,000 linestrings seems to be giving a performance of around 40 geocoding operations per second. As you’d expect performance of the STDistance method depends heavily on the distance, so in the SQL/Server stored procedure I first start by locating streets within 100 meters, followed by 1Km, 10Km and finally 100Km. The series of queries only attempts to find street names not matched by the previous query, and for Australia these numbers seem to give a match rate of about 90%, 9%, 1% and not much for the final distance that results in good performance.

For optimisation of the spatial index I found that changing the default of medium for each level in the index grid to high gave approximately a 100% increase in performance. It makes sense that the finest level of granularity would work best when looking for the nearest distance to points. The following was what I ended up using for the spatial index definition, changing the cells per object didn’t seem to have a great deal of effect for this dataset / application so I left it at the default value of 16.

CREATE SPATIAL INDEX [spatial_IX] ON [dbo].[Ways] 
(
	[LineString]
) USING  GEOGRAPHY_GRID 
WITH (
    GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH), 
    CELLS_PER_OBJECT = 16, PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, 
    ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) 
ON [PRIMARY]

JScript to automatically retrieve / check web page

Recently I discovered a problem with Manifold WMS where if the page wasn’t accessed for a while it would cause the w3wp.exe IIS worker thread to loop. Until I find out the underlying cause I’ve written the map_keepalive.wsf script below to automatically retrieve a web page and scheduled it to run every 5 minutes. I assume something gets spun down and the concurrent WMS requests cause a problem, the map_test.asp page that this script retrieves is just a default IMS page generated from within Manifold.

<package>
  <job id="js">
     <script language="JScript">
	var finished = false;
	var timeout = 0;
	var timeoutSecs = 30;
	var forAppending = 8;
	var xmlhttp = WScript.CreateObject("MSXML2.XMLHTTP");
	xmlhttp.open("GET", "http://www.satsleuth.com", false);
	xmlhttp.onreadystatechange = function() {
		if(xmlhttp.readyState == 4)
			finished = true;
	}
	try {
		xmlhttp.Send("");
	} catch (e) {
	}
	while ((!finished) && (timeout < timeoutSecs))
	{
		timeout++;
		WScript.Sleep(1000);
	}
	var msg = ""
	if (!finished)
		msg = "Timeout occurred";
	else if (xmlhttp.status != 200)
		msg = "Error retrieving map_test.asp, error code: " + xmlhttp.status;
	if (msg != "") {
		var oFSO = WScript.CreateObject("Scripting.FileSystemObject");
		var oFile = oFSO.OpenTextFile("map_keepalive.log", forAppending, true);
		var dt = new Date();
		oFile.WriteLine(dt.toLocaleString() + ": " + msg);
		oFile.Close();
	}
      </script>
   </job>
</package>

Manifold GIS IMS (Internet Map Server) error 0x80004005

I’m working on an iPhone 3GS mapping application using Manifold as a map tile server via the Web Map Service (WMS) protocol. I did initial development using a Vista x64 development machine on my intranet via wifi but had considerable difficulty moving to my Windows Server 2008 64-bit machine so that I could access the map tiles remotely via the 3G network. It’s been a while since I’d deployed a Manifold IMS solution but I remembered the usual steps:

  • Web site must be running under 32-bit mode because the COM object is 32-bit only
  • Map files, config.txt and any linked resources must be available to the ‘NETWORK SERVICE’ account
  • Same account needs access to Windows\serviceprofiles\networkservice\AppData\Local\Temp

    After my initial attempts failed I did numerous searches of the Manifold site and other resources and worked through the Manifold “Problems with the Internet Map Server” guide but kept getting the 80004005 error that I thought pointed at a permissions error. I thought to install the excellent SysInternals Process Monitor utility and the first error that became apparent was a file permissions error writing to the file “c:\Manifold.log”. I created a blank file with that name and granted full control permissions to the NETWORK SERVICE account used by IIS 7 and found several errors including the following:

    Invalid component name
    Incompatible component type
    Can’t generate unique name for temporary file

    Creating a new blank map file solved the problem so it appeared to be a problem within the Manifold map file itself, I decided to copy back a fresh copy of the map from my development machine so I could try removing a few elements until it worked but found it worked first go with the fresh copy. I didn’t modify anything with the map file so can only conclude it got corrupted at some initial stage while I had configuration problems. Anyway to avoid similar problems in the future I’ll make sure that during Manifold IMS deployment I always include a blank “c:\Manifold.log” file with appropriate permissions and be careful to archive any changes no matter how small they seem in source code control as a recovery point.

    I love Manifold GIS in general and it represents great value for money, but it would be nice if the object model was structured so that it would throw back a response code in the form of a text message on the call to MapServer.Create so it could be passed back to the client browser. Additionally rather than the non-standard practice of writing a text file to the root directory it would also be nice to see it write a meaningful errors to the Windows application event log.

  • GPS / Asset tracking AJAX web page

    I’ve just completed an AJAX based GPS tracking page for a client based on the OpenLayers Javascript library using the OpenStreetMap map tile server for data, although it can be easily be adapted to other data sources. The following page offers a login to the demonstration system for evaluation:

    GPS tracking page

    Data is served from an SQL database with a modular set of stored procedures translating data from its raw format allowing integratation with a wide range of data sources and GPS tracking devices. Columns displayed can be customised on a per-user basis and dynamically generated SQL queries allow great flexibility in the data generated. Please contact me if you are interesting in licensing the software for your own site.

    OSM planet.osm disk requirements

    After loading the planet.osm file (planet-090916.osm.bz2) into an Ubuntu 9.04 virtual machine the total disk usage was 133GB after the import. The total dropped to 115GB after doing a VMWare shrink operation however it looks like to account for future near-term expansion of the OSM data I should allow for at least 200GB of storage. I’d originally hoped to store the data on a small but fast SAS 15K RAID array on my server but the current version will only barely fit so I’ll need to consider other alternatives. It looks like costs are dropping fairly fast for some of the larger 15K SAS drives so I might defer the decision for a few months as the system isn’t required for production until the start of the year. Meanwhile I’ll just slip a 1TB SATA drive into the server for testing.

    OSM (OpenStreeMap) planet.osm statistics

    I’ve recently started importing the planet.osm file into a postgresql database using the osm2pgsql utility. When I started I couldn’t find any up-to-date statistics on the number of primitives in the database to estimate the time it would take so I’ve posted them here. I’m running on a machine with only 2GB RAM allocated under VMWare and a slow 7200RPM 1TB drive and it looks like the process will take around 5 days to complete. The final stages of creating indexes on the tables seems to be taking as long as the initial import.

    Reading in file: /home/mapnik/osm/planet-090916.osm.bz2
    Processing: Node(434807k) Way(33142k) Relation(212k)
    Node stats: total(434807934), max(497541099)
    Way stats: total(33142281), max(40900619)
    Relation stats: total(212012), max(253016)

    SatSleuth Electronic Circuit Diagrams

    I maintain a list of electronic circuit diagrams at http://www.satsleuth.com/schematics.htm and have just added links to the individual categories on the blogroll. As of today it contains 2876 circuit diagrams and with everything on a single page it’s a convenient resource for searching or just for a general browse for your next Sunday afternoon project. On the page there is also now a way to add links to your own circuit diagrams, although the current hit rate is about 98% SPAM to 2% genuine submissions.

    Soon I’ll need to update the other web pages to indicate the project status. The Satsleuth GPS tracking pages were a placeholder for a new company / product but we ended up offloading marketing to another company that sell it under a different product name. Other than the circuit diagrams the pages have remained unchanged for quite some time so I’ll update the pages to showcase some turnkey GPS tracking and asset management solutions that I have available for licensing to solution providers.

    ACARS Reception with a WinRadio G305e

    During an idle moment I downloaded the PlanePlottter software published by COAA and hooked it up to my WinRadio G305e receiver. Much to my suprise I’m getting fairly decent results with an antenna just sitting inside without a proper ground plane, not a great deal of activity because the Hobart (Tasmania) airport is hardly buzzing with activity but I have managed to pick up one signal close to 200KM away. When the weather fines up a bit I’ll try a proper antenna installation and see how it goes, at the moment I noticed I’m getting quite a few CRC errors and coverage is fairly directional as I’d expect with its current location.

    So far the best ACARS settings for the WinRadio seem to be 131.550MHz (Australian / International ACARS frequency) in AM mode with the IF bandwidth set to 14KHz, smaller IF bandwidths seemed to get less reliable. I’ve got the RF section set for fast AGC with the preamp enabled but disabled the SAGC feature and any form of audio filtering. Setting the audio level also seems fairly critical for best performance, using the signal view in PlanePlotter I have it set so the noise floor is not a great deal larger than the graticules on the chart. Here’s an example message received earlier that contains GPS information:

    —[2009-09-07 08:33:24]—
    {SYN}{SYN}{SOH}
    M {mode}
    .VH-JQG {rego}
    {NAK} {ack}
    3L {unknown message type}
    4 {block}
    {STX}
    M55A {seq}
    JQ0711 {flt}
    S 41.272/E146.515 /UTC 0834
    {ETX}

    PlanePlotter has displayed all positions so far without altitude, course or speed information. I’m not too sure what that is about, I thought especially the altitude would have been critical information. Anyway I’ll get it setup a bit better and keep my eye on it, the PlanePlotter software also supports Mode-S receivers which I haven’t looked into but maybe the standard ACARS transmissions don’t include the additional information.

    Firefox AJAX requests from different domain / site

    Recently I’ve been writing a GPS mapping web page based on the OpenLayers JavaScript library that retrieves GPS data from the server via AJAX. For development convenience I was just storing the HTML file on my local drive for development and fetching the XML data from a different SQL/Server 2008 machine. This worked fine from IE but from FireFox the XMLHttpRequest I was getting a status of 0 (zero) returned and no data.

    It turns out this is a security measure to avoid cross-site scripting which is quite sensible but not that convenient for testing, and in this application the authentication happens at a different level so cross-site scripting isn’t an issue. I found the solution is to add an HTTP header of Access-Control-Allow-Origin: * to the response. Of course you should consider the implications carefully outside of a test environment. As my code is written as a traditional ASP file this just consisted of adding the following line:

    response.AddHeader “Access-Control-Allow-Origin”, “*”

    TomTom V3 GPS feed finally resolved

    After believing I had the problem solved with the TomTom GPS feed after some quick bench testing I was looking forward to gathering some good test data during a business trip to Victoria that involved quite a bit of driving around. Unfortunately the problem cropped back up during mobile testing and I didn’t have time to troubleshoot before leaving. Presumably the extra processing the TomTom navigator application was doing while moving and calculating routes caused the timing of everything to alter a little and rendered the data I collected uselesss.

    After further investigation today despite trying a few different techniques the C++ stream failbits and eof bit I was relying on don’t seem to get set correctly for the GPS feed. I’m not sure yet if this by design, a bug in the GCC ARM C++ standard libraries or a problem with the TomTom GPS driver. Once I knew the underlying problem it was easy to change the application logic to cope with the blocking that not having the eof indicator caused. I’ve now confirmed during further driving tests I can collect 100% of data to progress the KML and HTML data generation. Soon I will change my GPS C++ library to use threads rather than relying on the eof bit, a side-effect of my current workaround is that the signals I use to shut down the application gracefully won’t be received in a timely manner.