ODBC FDW now supports 9.5 and 9.6

A while ago when Foreign Data Wrappers in PostgreSQL was a fairly new thing, we talked about the ODBC_FDW foreign data wrapper. Since then, people have been asking us how to get the ODBC FDW to work on newer PostgreSQL. Sadly the ODBC_FDW was stuck in time not having updated to newer FDW API standards. Our recommendation was just to use OGR_FDW, which many distributions both Linux and Windows have compiled OGR_FDW with ODBC support. True that OGR_FDW is coined as a spatial data wrapper, but the reality is spatial data rarely lives apart from regular attribute data so a good spatial vector driver supports both vector data and bread and butter data types. OGR_FDW is still our go to for working with spreadsheets and folders of CSV files.

Recently the fine folks at Carto patched the ODBC FDW to work with PostgreSQL 9.5. I do hope they accept my modest patch to make it work with PostgreSQL 9.6 as well. So now 2 FDWs to choose from for connecting to ODBC datasources. Which one is better? The answer as most always is IT DEPENDS.

For users on windows, we've compiled binaries for PostgreSQL 9.5, 9.6 both (32/64 bit). The 64-bit I've tested on both EDB and BigSQL distributions and I compiled them using Mingw64-w64 gcc 4.9.2.

We've also updated the FDW bag of extensions to include this extension

If you are on windows compiling with Mingw64 and want to use Native windows ODBC, you can more or less follow the compile instructions on https://github.com/CartoDB/odbc_fdw, except for the Makefile, make the change:

SHLIB_LINK = -lodbc

Change to:

SHLIB_LINK = -lodbc32

This applies both for 64-bit and 32-bit compiles. It always seemed counter intuitive to me that the windows 64-bit odbc is also called odbc32.

For this exercise, I'll connect to a SQL Server 2008 R2 with the FDW ODBC driver.

-- this is an example connecting to a SQL Server.  
-- The DSN name should be one registered in 64-bit (or 32-bit if you are using PostgreSQL 32-bit) ODBC as a system dsn
CREATE SERVER sql_server

-- The user mapping should be an account on SQL Server
  SERVER sql_server
  OPTIONS (odbc_UID 'sa', odbc_PWD 'whatever');
-- change this to a database that exists on your SQL Server
  FROM SERVER sql_server
  INTO public
    odbc_DATABASE 'AdventureWorks',
    table 'fdt_columns', -- this will be the name of the created foreign table
    sql_query 'SELECT * FROM information_schema.columns'
-- should list all columns in that database
SELECT * FROM fdt_columns;

The odbc_fdw preserves the casing of column names which is sometimes nice, but often not nice. Because now to select columns from the foreign table, you've got to do this quote nonsense.


If you don't want to be bothered with the hassle of quoting when you query the foreign table, you could define a query for your table like so:

  FROM SERVER sql_server
  INTO public
    odbc_DATABASE 'AdventureWorks',
    table 'fdt_columns_subset', -- this will be the name of the created foreign table
    sql_query 'SELECT table_name AS "table_name", column_name AS "column_name" 
        FROM information_schema.columns'

With my new subsetted foreign table, I can now do:

SELECT column_name FROM fdt_columns_subset;

Which is better for querying ODBC datasources, OGR_FDW or ODBC_FDW?

Getting back to the "It depends" statement, I'll qualify that with a, but OGR_FDW is generally better.

Here is why:

Lighter weight, no additional dependencies beyond ODBC and PostgreSQLRequires GDAL which might have a whole chain of dependencies. If you're using PostGIS, you've probably already got GDAL, so no additional requirement.
Push down, if it exists is pretty weak, try to do something like a simple LIMIT or a WHERE something='whatever' against a big table, and you could be waiting for days.Push down support for LIMIT and simple WHERE clauses like WHERE x > 80 or x = 'whatever', so pretty fast even for big tables. I've tried on a SQL server table with 14 million records and was not disappointed.
You can define a remote query as a foreign table - YES, number one win. This makes it ideal where you've got a complex query you want to run only on the remote server (ala SQL Server's OPENQUERY)No query as a foreign table support. Only remote tables and views can be foreign tables.
IMPORT FOREIGN SCHEMA, while existent, seems to just save in defining columns for a single table, can't pull in a whole load of tables. Also always preserves table names. On upside you can choose name for new table. Downside, no quick way to launder columns, so if you have an ODBC with upper case or mixed case, be prepared to write a lot of "sNot only does it pull in a bunch of tables, but you can use the schema part as a table filter e.g IMPORT FOREIGN SCHEMA "dbo.Finance" will pull in all tables in dbo schema that start with Finance. Can't choose table names, but you can choose to launder table names and columns, so you don't have to quote the columns every time you use them.
maintains both basic data type and modifiers in foreign table so nvarchar(50) becomes varchar(50). No support for spatial columns.It's a spatial driver first and foremost, so it can understand things like MSSQL geometry column. Doesn't maintain data type modifiers, so nvarchar(50) will come in as just varchar.

Our PostgreSQL Up and Running 3rd edition is now available in early preview.