Table Of Contents
OGR foreign data wrapper on Windows first taste
Foreign Data Wrappers for PostgreSQL 9.4 Windows
Using PostgreSQL Extensions
Oracle FDW 1.1.0 with SDO_Geometry PostGIS spatial support
Oracle FDW is a foreign data wrapper PostgreSQL extension that allows you to read and write to Oracle database tables from a PostgreSQL database. You can get it via the PGXN network or the main website http://laurenz.github.io/oracle_fdw/.
What is new about the latest 1.1.0 release is that there is now support for the Oracle SDO_GEOMETRY type that allows you to map the most common geometry types POINT, LINE, POLYGON, MULTIPOINT, MULTILINE and MULTIPOLYGON to PostGIS geometry type. Much of the spatial plumbing work was done by Vincent Mora of Oslandia. If we have any Windows Oracle users out there, yes there are binaries available for windows for PostgreSQL 9.1- 9.4 for both 32-bit and 64-bit. The FDW does have a dependency on the OCI.dll which I think comes shipped with Oracle products. Unfortunately, we are not Oracle users so can't kick the tires.
Using PostgreSQL Extensions
OGR foreign data wrapper on Windows first taste
This christmas I received something very special from Paul Ramsey and Even Roualt as detailed in Foreign Data Wrappers for PostGIS. It's been something I've been patiently waiting for for 4 years. I think it has a few issues I'm working to replicate, but overall it's much faster than I expected and pretty slick.
So why is ogr_fdw so special, because GDAL/OGR is an avenue to many data sources, NOT JUST GEOSPATIAL. It's the NOT JUST that I am most excited about. Though the focus is geospatial you can use it with non-geospatial datasources, as we described a long time ago in OGR2OGR for data loading
Windows Bag O' FDWs for PostgreSQL 9.4 coming soon
To celebrate this new gift, I've compiled ogr_fdw binaries for PostgreSQL 9.4 windows (both 32-bit and 64-bit). I'll be doing the same for other PostgreSQL versions in due time and hope to include this in upcoming PostGIS 2.2 windows bundle. For now, we'll be including the ogr_fdw in our bag-of-fdws for both 64-bit and 32-bit PostgreSQL 9.4. I'm still testing these out before I release, and will provide download links for these in a separate article.
Now this new bag o fdws, contains my other favorite FDWs -- the file_textarray_fdw, which we can't imagine living without for the kind of work we do. I also included www_fdw. Both file_textarray_fdw and www_fdw are show-cased in our new PostgreSQL: Up and Running, 2nd edition book, which just came out in print. Our PostGIS In Action 2nd edition, is still sadly one or 2 months away from hard-copy print and in final stages of proofing.
PostgreSQL 9.4 Windows OGR_FDW
At first pass, I used the same GDAL library (currently 1.11.1) as we use to build PostGIS 2.1 with. This unfortunately was not good enough for our needs, so I recompiled GDAL 1.11.1 to include Expat 2.1 and FreeXL 1.0.0h, as well as Curl (since I had compiled that for www_fdw) dependencies. With expat dependency, you get access to being able to read Microsoft Excel 2007+ XSLX workbooks and OpenOffice spreadsheets via this FDW and with FreeXL you get access to reading older Microsoft spreadsheet binary format .XLS files.
For those interested in the gory details of my compile, I have it on my ogr_fdw_depends_build.sh gist which I'm fine tuning a bit.
I wanted to include ODBC, but it seems under mingw the ODBC is called odbc32 and haven't dug in enough to figure out how to trick GDAL into using that instead of the odbc it is looking for. With ODBC you get a lot of other stuff like any data source you can access via ODBC.
This version of GDAL I tested with the PostGIS 2.1.5 windows binaries we recently released on application stackbuilder and seems to be a simple drop and replace affair (for the packaged libgdal-1.dll) (just make sure you also copy the included expat dll, lib-curl,and libfreexl as well) and keep in mind, if you use these, they may be overwritten by a PostGIS upgrade you do.
When we built GDAL for PostGIS, we took only the defaults, because we didn't want to deal with extra dependencies. This meant really useful common drivers like ODBC, Geospatial webservices, OpenOffice Spreadsheet, Excel XLSX, Excel XLS, and SQLite/SpatiaLite will not be accessible with the version of GDAL packaged with PostGIS windows stackbuilder install. I'm rethinking this for PostGIS 2.2. Right now, PostGIS doesn't need the OGR (vector) api of GDAL, and GDAL is only used by the PostGIS raster api to access the raster drivers. So the GDAL vector (OGR) bindings have just been conveniently hybernating waiting for someone to wake them up. Also since this FDW doesn't actually rely on PostGIS, you don't need PostGIS to use it.
I also don't compile with PostgreSQL support, which at first glance seems like a real oddity. The main reason for that is that I use the same GDAL for multiple versions of PostgreSQL, but I don't distribute libpq.dll since that comes packaged with PostgreSQL EDB installers. So I didn't want to run into weird dependency issues with GDAL trying to hook into a different version of libpq.dll than what it was compiled with. If you do compile with PostgreSQL, then you get PostgreSQL (and for raster -- ability to read PostGIS raster), which are very useful for client side use, but of marginal interest for server side (when you've got postgres_fdw already).
The unhighlighted vector drivers are what you get when you build GDAL with no extra dependencies. Still a very rich list. The yellow highlighted ones are the additional ones you get compiling with expat, freexl and Curl. The green highlighted ones are the additional you get compiling with ODBC.
Supported Formats: -> "ESRI Shapefile" (read/write) -> "MapInfo File" (read/write) -> "UK .NTF" (readonly) -> "SDTS" (readonly) -> "TIGER" (read/write) -> "S57" (read/write) -> "DGN" (read/write) -> "ODBC" (read/write) -> "VRT" (readonly) -> "REC" (readonly) -> "Memory" (read/write) -> "BNA" (read/write) -> "CSV" (read/write) -> "GML" (read/write) -> "GPX" (read/write) -> "KML" (read/write) -> "GeoJSON" (read/write) -> "GMT" (read/write) -> "WAsP" (read/write) -> "PGeo" (readonly) -> "MSSQLSpatial" (read/write) -> "PCIDSK" (read/write) -> "OpenFileGDB" (readonly) -> "XPlane" (readonly) -> "AVCBin" (readonly) -> "AVCE00" (readonly) -> "DXF" (read/write) -> "Geoconcept" (read/write) -> "GeoRSS" (read/write) -> "GPSTrackMaker" (read/write) -> "PGDump" (read/write) -> "GPSBabel" (read/write) -> "SUA" (readonly) -> "OpenAir" (readonly) -> "PDS" (readonly) -> "WFS" (readonly) -> "HTF" (readonly) -> "AeronavFAA" (readonly) -> "Geomedia" (readonly) -> "EDIGEO" (readonly) -> "GFT" (read/write) (Google Fusion tables) -> "GME" (read/write) (Google Map tables) -> "SVG" (readonly) -> "CouchDB" (read/write) -> "Idrisi" (readonly) -> "ARCGEN" (readonly) -> "SEGUKOOA" (readonly) -> "SEGY" (readonly) -> "XLS" (readonly) -> "ODS" (read/write) -> "XLSX" (read/write) -> "ElasticSearch" (read/write) -> "PDF" (read/write) -> "Walk" (readonly) -> "CartoDB" (readonly) -> "SXF" (readonly)
Compile with more, and you could get as many as this. I'll be putting the MapInfo, ESRI Shapefile and MS Office excel drivers to use right away.
Taking ogr_fdw for a test drive
How you connect to datasources varies a bit from driver to driver. For example an excel workbook is considered a server, and each worksheet is a layer (table in PostgreSQL speak). For datasources like ESRI Shapefile or CSV files, the server is the folder that the files sit in and each individual file is a table.
Paul already demonstrated some examples of using Web Feature service (WFS) and ESRI OpenFileGDB (ESRIs geodatabase) examples https://github.com/pramsey/pgsql-ogr-fdw. Those examples may be exciting to folks who get high on spatial data, but for most people, I imagine they are yawn material. But what about good'ol spreadsheets or other relational databases. Something most people work with?
To take ogr_fdw for a ride
- I downloaded a random xlsx file from: Student weight status category reporting results: beginning 2010 and saved to my C:/fdw_data/excel folder on my PostgreSQL server.
- Then using the command line tool: ogr_fdw_info that comes packaged with ogr_fdw, I checked to see what spreadsheets the xlsx file had in it with this command:
ogr_fdw_info -s "C:/fdw_data/excel/Student_Weight_Status_Category_Reporting_Results__Beginning_2010.xlsx"
and got this:Student Weight Status Category
Sadly the government had no excel workbooks I could find that had more than one worksheet; a big waste of a workbook. I did try with some of the beauties that people send us and confirmed that yes each worksheet gets listed as a separate layer.
- To get the table structure for the spreadsheet I do this:
ogr_fdw_info -s "C:/fdw_data/excel/Student_Weight_Status_Category_Reporting_Results__Beginning_2010.xlsx" -l "Student Weight Status Category"
Which outputs:
CREATE SERVER myserver FOREIGN DATA WRAPPER ogr_fdw OPTIONS ( datasource 'C:/fdw_data/excel/Student_Weight_Status_Category_Reporting_Results__Beginning_2010.xlsx', format 'XLSX' ); CREATE FOREIGN TABLE student_weight_status_category ( fid integer, location_code varchar, county varchar, area_name varchar, region varchar, school_years varchar, no__overweight real, pct_overweight real, no__obese real, pct_obese real, no__overweight_or_obese real, pct_overweight_or_obese real, grade_level varchar, area_type varchar, street_address varchar, city varchar, state varchar, zip_code varchar, location_1 varchar ) SERVER myserver OPTIONS ( layer 'Student Weight Status Category' );
Now this isn't a geospatial file, but GDAL has this narrow-minded view of the world that everything worth recording involves location or space. Unfortunately you can't simply remove that geom column, but you can ignore it. Just set the type to bytea and call it a day and ignore that column when you query. Also I'm going to change the name of the server to reflect its a single workbookThis was a bug in the outdated version of ogr_fdw_info I was using. There is no geometry column output if you feed it a non-spatial file
So when all is said and done, you should run a script that looks something like:
CREATE SERVER student_weight_workbook
FOREIGN DATA WRAPPER ogr_fdw
OPTIONS (
datasource 'C:/fdw_data/excel/Student_Weight_Status_Category_Reporting_Results__Beginning_2010.xlsx',
format 'XLSX' );
CREATE FOREIGN TABLE student_weight_status_category (
fid integer,
location_code varchar,
county varchar,
area_name varchar,
region varchar,
school_years varchar,
no__overweight real,
pct_overweight real,
no__obese real,
pct_obese real,
no__overweight_or_obese real,
pct_overweight_or_obese real,
grade_level varchar,
area_type varchar,
street_address varchar,
city varchar,
state varchar,
zip_code varchar,
location_1 varchar )
SERVER student_weight_workbook
OPTIONS ( layer 'Student Weight Status Category' );
Then you can run any sql and even join your new table with any local table like this:
SELECT county , sum(no__overweight) As tot_overweight
FROM student_weight_status_category
WHERE region ILIKE '%North%'
GROUP BY county
ORDER BY county;
and get an output like:
county | tot_overweight ------------+---------------- ALBANY | 8084 CLINTON | 2638 COLUMBIA | 1903 DELAWARE | 1581 ESSEX | 943 FRANKLIN | 2570 FULTON | 1209 GREENE | 1805 HAMILTON | 46 MONTGOMERY | 1794 N/A | 18927 OTSEGO | 2039 RENSSELAER | 5366 SARATOGA | 6747 SCHENECTADY | 4700 SCHOHARIE | 893 WARREN | 2714 WASHINGTON | 2646
I did run into one issue that on occasion if I started with just querying a subset of the columns, I got weird errors like: value too long for type character varying(11)
which would mysteriously disappear the second time I run the query or if I do a select of the whole table twice the issue would disappear. I haven't determined if its an issue with the specific data source driver, the ogr_fdw, or somethng about my compile. Haven't had quite the same issue with ESRI shapefiles, though I have had others with things like counts not being the same as first time I ran if I run two simulatenous queries or server crashing if I run two queries and cancel one out. So I suspect there is a dirty memory issue somewhere, but not sure where at the moment.
Using PostgreSQL Extensions
Foreign Data Wrappers for PostgreSQL 9.4 Windows
As stated in last article, I've packaged FDW binaries for PostgreSQL 9.4 windows 32-bit and 64-bit and added in the ogr_fdw one. These we've tested with the standard EDB VS built PostgreSQL windows installs and work fine with those.
FDW binaries for PostgreSQL 9.4
Below are the zip files that contain the binaries and dependency files for our favorite FDWs (that aren't normally available for windows):
FDWs in the 9.4 downloads
- file_textarray_fdw - this one is still our favorite because it has no extra depenendencies and handles delimited files (even large ones very fast) and can handle delimited data with varying columns in each row. We've turned a few of our clients that have to deal with third-party flat files onto this one. This is one of those that I really feel should be part of the core PostgreSQL distribution because it's more useful than the packaged file_fdw and fairly trivial to compile.
- https://github.com/cyga/www_fdw - for querying web services
- ogr fdw - A monster of an FDW which has already won my heart. Have some weird data source/file you've never heard of before (especially spatial)? There's a good chance OGR FDW can query it. Note this GDAL build has more drivers than what gets packaged with the PostGIS version. I have also included ODBC, so full set of drivers supported is (read-only since this fdw only supports read at the moment):
Supported Formats: -> "ESRI Shapefile" (read/write) -> "MapInfo File" (read/write) -> "UK .NTF" (readonly) -> "SDTS" (readonly) -> "TIGER" (read/write) -> "S57" (read/write) -> "DGN" (read/write) -> "VRT" (readonly) -> "REC" (readonly) -> "Memory" (read/write) -> "BNA" (read/write) -> "CSV" (read/write) -> "GML" (read/write) -> "GPX" (read/write) -> "KML" (read/write) -> "GeoJSON" (read/write) -> "GMT" (read/write) -> "ODBC" (read/write) -> "WAsP" (read/write) -> "PGeo" (readonly) -> "MSSQLSpatial" (read/write) -> "PCIDSK" (read/write) -> "OpenFileGDB" (readonly) -> "XPlane" (readonly) -> "AVCBin" (readonly) -> "AVCE00" (readonly) -> "DXF" (read/write) -> "Geoconcept" (read/write) -> "GeoRSS" (read/write) -> "GPSTrackMaker" (read/write) -> "PGDump" (read/write) -> "GPSBabel" (read/write) -> "SUA" (readonly) -> "OpenAir" (readonly) -> "PDS" (readonly) -> "WFS" (readonly) -> "HTF" (readonly) -> "AeronavFAA" (readonly) -> "Geomedia" (readonly) -> "EDIGEO" (readonly) -> "GFT" (read/write) -> "GME" (read/write) -> "SVG" (readonly) -> "CouchDB" (read/write) -> "Idrisi" (readonly) -> "ARCGEN" (readonly) -> "SEGUKOOA" (readonly) -> "SEGY" (readonly) -> "XLS" (readonly) -> "ODS" (read/write) -> "XLSX" (read/write) -> "ElasticSearch" (read/write) -> "PDF" (read/write) -> "Walk" (readonly) -> "CartoDB" (readonly) -> "SXF" (readonly)
We'll be updating the 9.3 ones next with the latest code and adding in the ogr_fdw.