Installing PostGIS packaged address_standardizer on Ubuntu

One of the changes coming to you in PostGIS 2.2 is additional extensions. Two close to my heart are the address_standardizer (previously a separate project, now folded into PostGIS in the upcoming 2.2) and the SFCGAL extension for doing very advanced 3D stuff (just an SQL script in older versions, but made an extension in 2.2, with new functions added). We needed the address standardizer running on our Ubuntu box, but since PostGIS 2.2 isn't released yet, you can't get it without some compiling. Luckily the steps are fairly trivial if you are already running PostGIS 2.1. In this article, I'll walk through building and installing just the address_standardizer extension from the PostGIS 2.2 code base. Though I'm doing this on Ubuntu, the instructions are pretty much the same on any Linux; just substitute your distribution's package manager.

Compiling and installing address_standardizer 2.2 on PostGIS 2.1

If you don't have PostGIS already, you can install it via PGDG. Instructions are here: PostGIS Ubuntu 9.3 Apt. You shouldn't really need PostGIS anyway except possibly to get past the PostGIS configure step.

  1. In order to build PostgreSQL extensions, you need the PostgreSQL dev package, which you install with:

    apt-get install postgresql-server-dev-9.3
  2. In order to get past the configure step of PostGIS, you need these additional dev packages:

    apt-get install libxml2-dev libgeos-dev libproj-dev libpcre3-dev
  3. Next, download the PostGIS 2.2 tarball and follow the steps below to make and install the address_standardizer extension:
    Updated 2016-01-20: tarball link changed to the released PostGIS 2.2.1.
    wget http://download.osgeo.org/postgis/source/postgis-2.2.1.tar.gz
    tar xvf postgis-2.2.1.tar.gz
    cd postgis-2.2.1
    ./configure --without-raster

    You should have an output that looks something like:

    PostGIS is now configured for x86_64-unknown-linux-gnu
    
     -------------- Compiler Info -------------
      C compiler:           gcc -g -O2
      C++ compiler:         g++ -g -O2
      SQL preprocessor:     /usr/bin/cpp -traditional-cpp -w -P
    
     -------------- Dependencies --------------
      GEOS config:          /usr/bin/geos-config
      GEOS version:         3.4.2
      PostgreSQL config:    /usr/bin/pg_config
      PostgreSQL version:   PostgreSQL 9.3.5
      PROJ4 version:        48
      Libxml2 config:       /usr/bin/xml2-config
      Libxml2 version:      2.9.1
      JSON-C support:       no
      PCRE support:       yes
      PostGIS debug level:  0
      Perl:                 /usr/bin/perl
    
     --------------- Extensions ---------------
      PostGIS Raster:       disabled
      PostGIS Topology:     enabled
      SFCGAL support:       disabled
      Address Standardizer support:       enabled
    
     -------- Documentation Generation --------
      xsltproc:
      xsl style sheets:
      dblatex:
      convert:
      mathml2.dtd:          http://www.w3.org/Math/DTD/mathml2/mathml2.dtd
  4. Now we are ready to compile and install:
    cd extensions/address_standardizer
    make && make install

    If all goes well, your final output looks something like:

    /usr/bin/perl mk-sql.pl 'PostgreSQL 9.3.5' address_standardizer.sql > address_standardizer--2.2.0dev.sql
    /usr/bin/perl pagc-data-psql lex lexicon.csv > us-lex.sql
    /usr/bin/perl pagc-data-psql gaz gazeteer.csv > us-gaz.sql
    /usr/bin/perl pagc-data-psql rules rules.txt > us-rules.sql
    /bin/mkdir -p '/usr/lib/postgresql/9.3/lib'
    /bin/mkdir -p '/usr/share/postgresql/9.3/extension'
    /bin/mkdir -p '/usr/share/postgresql/9.3/extension'
    /bin/mkdir -p '/usr/share/doc/postgresql-doc-9.3/extension'
    /usr/bin/install -c -m 755  address_standardizer-2.2.so '/usr/lib/postgresql/9.3/lib/address_standardizer-2.2.so'
    /usr/bin/install -c -m 644 address_standardizer.control '/usr/share/postgresql/9.3/extension/'
    /usr/bin/install -c -m 644 address_standardizer--2.2.0dev.sql us-lex.sql us-gaz.sql us-rules.sql '/usr/share/postgresql/9.3/extension/'
    /usr/bin/install -c -m 644 README.address_standardizer '/usr/share/doc/postgresql-doc-9.3/extension/'
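
To double-check that the install step put the files where PostgreSQL looks for them, a small check along these lines can help. This is only a sketch and not part of the PostGIS build: the function name check_standardizer_install is made up for illustration, and it assumes the file names seen in the install output above (address_standardizer.control and a versioned address_standardizer .so library).

```shell
#!/bin/sh
# Hypothetical helper (not part of PostGIS): confirm that the files copied by
# "make install" exist in the given PostgreSQL directories.
#   $1 = share directory, $2 = package library directory
check_standardizer_install() {
    sharedir=$1
    pkglibdir=$2
    status=0

    if [ -f "$sharedir/extension/address_standardizer.control" ]; then
        echo "control file: found"
    else
        echo "control file: missing"
        status=1
    fi

    # The library name carries a version suffix, so match it with a glob.
    found_lib=no
    for lib in "$pkglibdir"/address_standardizer*.so; do
        if [ -f "$lib" ]; then
            found_lib=yes
        fi
    done
    if [ "$found_lib" = yes ]; then
        echo "shared library: found"
    else
        echo "shared library: missing"
        status=1
    fi

    return "$status"
}
```

You could call it as check_standardizer_install "$(pg_config --sharedir)" "$(pg_config --pkglibdir)", which on the box above should resolve to /usr/share/postgresql/9.3 and /usr/lib/postgresql/9.3/lib.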

Using Address Standardizer

In order to enable the address standardizer in a specific database, connect to the database and run:

CREATE EXTENSION address_standardizer;

If you prefer a GUI, once the address_standardizer binaries are installed (as we did above), the extension should appear in pgAdmin's extension drop-down list.

I'm still in the middle of packaging the standardization data sets into a separate data extension for easier consumption by end-users. For now you can just use psql to load the us-lex.sql, us-gaz.sql, and us-rules.sql files, which get installed in the /usr/share/postgresql/9.3/extension/ folder.
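
Loading the three data files can be wrapped in a small shell function. The following is a rough sketch rather than anything shipped with PostGIS: the function name load_standardizer_data, the DRY_RUN switch, and the database name gisdb in the usage note are all invented for illustration. Only the psql options (-d, -f, and -v ON_ERROR_STOP=1) and the file locations come from psql itself and from the install output above.

```shell
#!/bin/sh
# Sketch: load the three standardizer data files into one database.
#   $1 = database name
#   $2 = directory holding the .sql files (defaults to the 9.3 install path)
# Set DRY_RUN=1 (an illustrative switch) to print the commands instead of
# running them.
load_standardizer_data() {
    db=$1
    extdir=${2:-/usr/share/postgresql/9.3/extension}
    for f in us-lex.sql us-gaz.sql us-rules.sql; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "psql -v ON_ERROR_STOP=1 -d $db -f $extdir/$f"
        else
            # Stop at the first file that fails to load.
            psql -v ON_ERROR_STOP=1 -d "$db" -f "$extdir/$f" || return 1
        fi
    done
}
```

For example, load_standardizer_data gisdb would load all three files into a database called gisdb, assuming your current user can connect to it.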

If you want to experiment with using the extension, refer to the PostGIS 2.2 dev manual: Installing and using Address Standardizer, which is still a bit of a work in progress.
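
As a starting point for experimenting, here is a minimal sketch that prints a standardize_address query you can pipe into psql. The helper name standardize_query and the database name gisdb are made up for illustration. More importantly, the default table names us_lex, us_gaz, and us_rules are an assumption: check which tables the data scripts actually created in your database (for example with \dt in psql) and pass those names instead.

```shell
#!/bin/sh
# Sketch: print a query that standardizes a single address.
# ASSUMPTION: us_lex, us_gaz and us_rules are placeholder table names; pass
# the lexicon, gazetteer and rules tables that exist in your database.
standardize_query() {
    lex=${1:-us_lex}
    gaz=${2:-us_gaz}
    rules=${3:-us_rules}
    # :'addr' is a psql variable; psql substitutes it as a quoted literal.
    cat <<EOF
SELECT house_num, name, suftype, city, state, postcode
FROM standardize_address('$lex', '$gaz', '$rules', :'addr');
EOF
}
```

A run might look like standardize_query | psql -d gisdb -v addr='1 Devonshire Place PH301, Boston, MA 02109'. Passing the address as a psql variable avoids hand-quoting it inside the SQL.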

More books coming

Leo and I are still very busy writing PostgreSQL related books, so I haven't had quite as much time to devote to PostGIS documentation as I would have liked. If you didn't know - our PostgreSQL: Up and Running, 2nd edition recently came out in print. PostGIS in Action 2nd Edition is due out late February / early March. We have also started our 3rd PostgreSQL/PostGIS book, which we will announce once we've gotten further into it. If you've seen me on #postgis IRC and wondered "Why is Regina so engrossed in graph theory?", you can probably guess what that book is about. Part of the joy of writing is learning new things as you go along and pushing yourself to experiment in different ways with technologies you love.