PostgreSQL Array: The ANY and Contains trick

One of the main features I love about PostgreSQL is its array support. This is a feature you won't find in most relational databases, and even databases that support some variant of it, don't allow you to use it as easily. It is one of the features that makes building aggregate functions wicked easy in PostgreSQL with no messy compiling required. Aside from building aggregate functions, it has some other common day uses. In this article, I'll cover two common ways we use them which I will refer to as the ANY and Contains tricks.

I like to think of this approach as YeSQL programming style: how SQL can be augmented by more complex data types and index retrieval mechanisms. Arrays and many other data types (spatial types, keyvalue (hstore), ltree etc) are far from relational structures, yet we can query them easily with SQL and can even relate them.

Using comma separated items in an SQL ANY clause

Ever have a piece of text like this: apple,cherry apple,avocado or a set of integer ids like this 1,5,6 which perhaps you got from a checkbox picklist? You need to find out the details of the chosen products.

Well the first steps is to convert your string to an array like so:

SELECT '{apple,cherry apple, avocado}'::text[];

If it's a set of integers you would do.

SELECT '{1,4,5}'::int[];

Which converts your list to an array of: apple,"cherry apple",avocado or 1,4,5

Now you combine it with your detail query

Let's say your data looked something like this:

CREATE TABLE products(
    product_id serial PRIMARY KEY 
    ,product_name varchar(150)
    ,price numeric(10,2) ) ;


INSERT INTO products(product_name, price)
    VALUES 
        ('apple', 0.5)
        ,('cherry apple', 1.25)
        ,('avocado', 1.5),('octopus',20.50)
        ,('watermelon',2.00);

Now to get the details about the products the user selected

SELECT product_name, price
    FROM products 
    WHERE 
        product_name = ANY('{apple,cherry apple,avocado}'::text[]);
or if you were using ids
SELECT product_name,price
    FROM products
    WHERE 
        product_id = ANY('{1,4,5}'::int[]);

Let us say we needed to convert the id array back to a comma delimeted list. This will work in pretty much any version of PostgreSQL

        
SELECT array_to_string(ARRAY(SELECT product_name
    FROM products
    WHERE 
        product_id = ANY('{1,4,5}'::int[]) ), ',') As prod_list;

Which will give you an output: apple,octopus,watermelon

Now if you are using PostgreSQL 9.0 or higher. You can take advantage of the very cool string_agg function and combine that with the even cooler ORDER BY of aggregate functions if you want your list alphabetized. The string_agg approach comes in particularly handy if you are dealing with not one user request but a whole table of user requests. The string_agg equivalent looks like below --NOTE: it's not much shorter, but is if you are dealing with sets and can employ the GROUP BY person_id or some such clause. It will also be faster in many cases since it can scan the data in one step instead of relying on sub queries:

SELECT string_agg(product_name, ',' ORDER BY product_name) As prod_list
    FROM products
    WHERE product_id = ANY('{1,4,5}'::int[]) ;

Using arrays in column definitions and Contains @> operator

Of course for casual lists, arrays are great for data storage as well. We use them with caution when portability to other databases is a concern or referential integrity is a concern. Still they have their place.

Let's say we have a restaurant matching system where the user selects from a menu of foods they'd like to eat and we try to find a restaurant that has all those foods. Our very casual database system looks like this. Our objective is to write the shortest application code we can get away with that is still fairly efficient. The foods table would be used just as a lookup for our pick list:

CREATE TABLE food(food_name varchar(150) PRIMARY KEY);
INSERT INTO food (food_name)
    VALUES ('beef burger'),
        ('veggie burger'), ('french fries'), 
        ('steak'), ('pizza');
CREATE TABLE restaurants(id serial PRIMARY KEY, 
    restaurant varchar(100), foods varchar(150)[]);

--yes you can index arrays  
CREATE INDEX idx_restaurants_foods_gin
   ON restaurants USING gin (foods);

INSERT INTO restaurants(restaurant, foods)
    VALUES 
        ('Charlie''s', '{beef burger,french fries,pizza}'::varchar[]),
        ('Rinky Rink', '{beef burger,veggie burger}'::varchar[]),
        ('International', '{beef burger,spring rolls,egg rolls,pizza}'::varchar[]);

Our application allows users to pick foods they want for a meal and matches them up with all restaurants that have those items. For this trick we use the array contains operator to return all restaurants that contain in their food list all the foods the user wants to partake of in this meal. Our resulting query would look something like:

SELECT restaurant
    FROM restaurants 
    WHERE foods @> '{beef burger, pizza}'::varchar[];

Which outputs the below. Our dataset is so small that the index doesn't kick in, but if we have a 10000 or more restaurants we'd see the GIN index doing its magic.

  restaurant
---------------
 Charlie's
 International

As a final food for thought, PostgreSQL supports multi-dimensional arrays as well which has some interesting uses as well. Perhaps we'll delve into some examples of multi-dimensional arrays in another article.