Feature or Frustration

Lately I'm reminded that one person's feature is another person's frustration. I've been following Paul's PostGIS Apologia, which details why things are done a certain way in PostGIS, written in response to Nathaniel Kelso's A friendlier PostGIS? Top three areas for improvement. I've also been following Henrik Ingo's comparing Open Source GIS Implementation to get a MySQL user's perspective on PostGIS / PostgreSQL. Jo Cook has some interesting thoughts as well in her PostGIS for beginners amendment to Paul's comments. I have to say that Nathaniel, Henrik, Jo and the commenters on those entries have overlapping frustrations with PostgreSQL and PostGIS. The number one frustration is caused by how bad a job we do at pointing out avenues to a friendly installation experience. I do plan to change this soon, at least by documenting the most popular PostGIS package maintainers.

One of the things Henrik mentioned was his frustration with trying to install PostGIS via the PostgreSQL Yum repository, and in fact not even knowing about the PostgreSQL Yum repository, which carries current and bleeding-edge versions of PostgreSQL. I was surprised he didn't know because, as a long-time user of PostgreSQL, I had dismissed this as common knowledge. This made me realize just how out of touch I've become with my former newbie self, and I consider this a very bad thing. I was also very surprised by another feature he complained about - CREATE EXTENSION did not work for him because he accidentally installed the wrong version of PostGIS in his PostgreSQL 9.1. The main reason for his frustration was something I thought was a neat feature of PostGIS: PostGIS is not packaged into PostgreSQL core, and you can in fact have various versions of PostGIS installed in the same PostgreSQL cluster. Unlike the OGC spatial offerings of other databases (SQL Server, Oracle, MySQL), this allows the PostGIS dev group to work on its own schedule, largely apart from PostgreSQL development group pressures. It also means we can introduce breaking changes in PostGIS 2.0 and later, for example, without impacting existing apps people have running on 1.5, and it allows people to take advantage of newer features even if they are running an earlier PostgreSQL version.
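For anyone who hits the same wall with CREATE EXTENSION, here is a rough sketch of how you might check what the server thinks it can install before trying. It assumes PostgreSQL 9.1 or later and relies only on the standard pg_available_extension_versions catalog view; the version number in the last statement is a placeholder, not a recommendation.

```sql
-- List every PostGIS version this server can install as an extension.
-- If no rows come back, the PostGIS build on this server most likely
-- does not ship extension files for this PostgreSQL version.
SELECT name, version, installed
FROM pg_available_extension_versions
WHERE name = 'postgis'
ORDER BY version;

-- Install a specific version from the list above.
-- '2.0.1' is a placeholder; substitute a version the query returned.
CREATE EXTENSION postgis VERSION '2.0.1';
```

Leaving off the VERSION clause installs whatever the packager marked as the default version.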

There is a frustrating side to this. PostGIS is built on other software and gives you the option of leaving features out. So it's a pluggable system designed to be installed on yet another pluggable system. On the good side, many of the PostGIS developers are also developers of said core dependencies: GEOS and GDAL in particular (and PostgreSQL, mostly in support, though Mark and Paul have been known to throw some money and elbow grease at the PostgreSQL code side to get supporting features implemented faster). So you'll see Sandro, for example, plugging some feature into the latest GEOS development version and then plugging logic into the latest PostGIS version to take advantage of that new feature. This good also hurts us, because we assume people are intimately aware of how all these packages work together. It also means PostGIS isn't available out of the box like the spatial features of other databases. We have GEOS, Proj, and now GDAL (and did I mention GDAL has dependencies too, which are optional?). This GDAL entanglement in fact drove Devrim nuts when I begged him to include raster support in his packages, and I am extremely grateful to him (and still owe him that tutorial). All that wouldn't be so bad except that, for GEOS and GDAL, what you can do with PostGIS differs depending on which version you use and what you have packaged in. Some functions in PostGIS 2.0 are disabled, for example, if you are using GEOS < 3.3, and the same goes for PostGIS 2.1: if you are using GEOS < 3.4 (which is not even released yet), you are missing out on some pretty slick stuff. You can compile without GDAL if you don't want raster support, but you also won't get CREATE EXTENSION if you didn't compile with raster.
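Since what you can do depends on what your package was built against, it helps to ask PostGIS itself. A minimal sketch, assuming PostGIS 2.0 or later is already installed in the current database (the last function only exists when raster support was compiled in):

```sql
-- One-line summary of the PostGIS build and the libraries behind it
-- (GEOS, Proj, GDAL, and so on).
SELECT postgis_full_version();

-- GEOS version on its own, handy when checking whether
-- GEOS-dependent functions should be available.
SELECT postgis_geos_version();

-- GDAL version; this call fails if raster support is not installed.
SELECT postgis_gdal_version();
```

Comparing the GEOS version reported here against the thresholds mentioned above is usually the quickest way to explain a function that works on one server and is missing on another.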

If you are a savvy Linux user like Sandro Santilli, who can't see why he should upgrade beyond PostgreSQL 8.4 and is very suspicious of anything as magical looking as CREATE EXTENSION, you might marvel at this wonderful freedom we offer the end user. Sorry for picking on you Sandro :). Many users don't like freedom, especially if they are casual users of a piece of software.

If you are a package maintainer, you like things that have no dependencies, or at least things you can isolate so that they don't screw up any other software people might be running. Gathering from the frustrations people have had packaging, I suspect this kind of isolation is easier under Windows than on most other platforms. I was under the assumption that packaging under Windows was the hardest task, and that if we, as Windows package maintainers, can do it, then any maintainer can. Watching Devrim's frustration made me realize this may not be the case. On Windows we had no issue isolating GDAL dependencies by changing our path variables during compile and leaving out extra raster drivers that required additional dependencies. While we have separate 64-bit and 32-bit binaries for various versions of PostgreSQL, the same set of binaries works from Windows 2003/Windows XP through Windows 2008/Windows 7. It seems other maintainers, especially on Mac, have to worry about different versions of the OS and the like.

The balancing act

In any project, particularly open source, you have four key user groups, and it's really hard to satisfy one without pissing another off.

For a project, the two main groups you have to satisfy are Package Maintainers and Newbies. Package Maintainers are the lifeblood of your project and Newbies are your next generation. Maintainers are your manufacturing department. If your software requires compilation, then without package maintainers all you have is useless code for the large majority of users. Maintainers can make your software easy to install for the Newbies and provide an expedient and pleasurable upgrade path for your Intermediate and Power Users. As I said before, you want everyone using the latest version of your software just for simplicity of maintenance. Although I think Package Maintainers are the most important group, and their contributions are sadly often overlooked, you can't listen to them blindly. You want their packaging to have the bells and whistles you expect a lot of users to demand. Sometimes that additional packaging is hard and requires guidance from the development group and user community. You don't want a user saying, for example, "Hey, I got raster support on Windows, why don't I have that when I deploy on CentOS or Ubuntu?" or "How come I have this function on Windows and not on my RedHat install?" On the other side of the fence, you have those people who complain: "I only use geometry and geography, why do I have to put up with these raster functions confusing me? In fact I don't need anything but geography because I just want simple location search."

I admit that as a PostGIS project, I think we've pretty much sucked at making package maintainers' lives easier. We haven't guided them on topics such as requisite dependencies, consistent user experience, things to watch out for with dependencies, or how important we feel a feature is to package in, and MOST IMPORTANTLY we haven't broadcast their package availability. As Paul said, we've essentially dismissed it as someone else's problem: users should bug their maintainers if they want newer PostGIS supported. This especially depresses me since that's the group I feel most a part of.

Your software has too many features

The best kind of software is software that doesn't need explanation or documentation. The more features you pile on, the harder it is to hide those features until the user is ready to digest them. You'll gain some new converts with more features, but you'll also get a lot of people leaving you or dismissing you because your software is too complicated to operate or carries far too much baggage for the little they want to do.

To a large extent, as much as we try to make PostgreSQL and PostGIS easy for newcomers to use and understand, they are just going to be too complicated for a large majority of users, who will pick something else that doesn't require reading 200 pages to wrap their minds around, or that is more aligned with the mode of thinking they are used to. The only indisputable feature is speed. Speed may not be the sexiest thing, but it sells if you can showcase it well. You'll win converts because they will be willing to put up with too many features and a somewhat cumbersome install if you can convince them they'll get more speed than they can with any other product.