Problem You are developing a consultant search application where by a user looking for a consultant can check all the items in the list they require a consultant to have. You have 2 tables. consultant and consultant_qual. The consultant_qual has primary key formed by 2 fields consultant_id and qual_id. qual_id contains a constrained list with things like 'OpenLayers', 'PostGIS', 'Python', 'C#' etc.
How do you devise a query that given a list of checked options by the user, only returns consultants with not 1 but all of those qualifications?
Solutions
When looking at a problem like this, I am always amazed at the number of ways you can solve this in SQL. Some very convoluted and slow, some very convoluted looking and fast, and some pretty fast and pretty simple. How can that be when SQL is a declarative language? It all boils down to what angle you look at the problem or rather how you restate the problem and also the fact that like in other languages there are a myriad of ways to say the same thing and some databases and dataset patterns do better with one than another.
To demonstrate:
-- This will work in databases that support ANSI INTERSECT (e.g. PostgreSQL and SQL Server 2005/2008)
--but not in MySQL, SQL Server 2000 for example
SELECT C.consultant_id, C.consultant_name
FROM consultant AS C
INNER JOIN consultant_qual AS Q ON C.consultant_id = Q.consultant_id
WHERE Q.qual_id = 'OpenLayers'
INTERSECT
SELECT C.consultant_id, C.consultant_name
FROM consultant AS C
INNER JOIN consultant_qual AS Q ON C.consultant_id = Q.consultant_id
WHERE Q.qual_id = 'PostGIS';
--This will work in any database that supports EXISTS (which is a lot)
-- and uses several correlated subqueries (that is just variant -- I can count at least one more)
SELECT C.consultant_id, C.consultant_name
FROM consultant AS C
WHERE EXISTS(SELECT Q.consultant_id
FROM consultant_qual As Q
WHERE Q.qual_id = 'OpenLayers' AND C.consultant_id = Q.consultant_id)
AND EXISTS(SELECT Q.consultant_id
FROM consultant_qual As Q
WHERE Q.qual_id = 'PostGIS' AND C.consultant_id = Q.consultant_id);
--This will work in any database that supports IN (which is even more than EXISTS)
-- and uses several non-correlated subqueries
SELECT C.consultant_id, C.consultant_name
FROM consultant AS C
WHERE C.consultant_id IN(SELECT Q.consultant_id
FROM consultant_qual As Q
WHERE Q.qual_id = 'OpenLayers')
AND C.consultant_id IN (SELECT Q.consultant_id
FROM consultant_qual As Q
WHERE Q.qual_id = 'PostGIS');
-- This will work in any database that supports subselects in JOINS
-- which is a lot
SELECT C.consultant_id, C.consultant_name
FROM consultant AS C
INNER JOIN
(SELECT Q.consultant_id
FROM consultant_qual As Q
WHERE Q.qual_id = 'OpenLayers') AS Q1
ON C.consultant_id = Q1.consultant_id
INNER JOIN
(SELECT Q.consultant_id
FROM consultant_qual As Q
WHERE Q.qual_id = 'PostGIS') AS Q2
ON C.consultant_id = Q2.consultant_id ;
The nice thing about the gulp solutions is that regardless of what the user checks, your query is more or less the same and always has the same number of sub queries. There are lots of permutations we can think of here too, but these are just 3 to demonstrate.
-- This will work in pretty much any relational database I can think of
SELECT C.consultant_id, C.consultant_name
FROM consultant AS C INNER JOIN consultant_qual As Q
ON C.consultant_id = Q.consultant_id
WHERE Q.qual_id IN('OpenLayers', 'PostGIS')
GROUP BY C.consultant_id, C.consultant_name
HAVING COUNT(Q.qual_id) = 2;
-- This will work in pretty much any relational database I can think of
-- If it supports subselects in WHERE it will work
SELECT C.consultant_id, C.consultant_name
FROM consultant AS C
WHERE 2 = (SELECT COUNT(Q.qual_id)
FROM consultant_qual As Q
WHERE Q.consultant_id = C.consultant_id
AND Q.consultant_qual IN('OpenLayers', 'PostGIS')) ;
--This on first glance looks redundant,
-- but takes advantage of the fact that the first condition is easy for most relational dbs to optimize
-- and the second will never be run if the first check fails
SELECT C.consultant_id, C.consultant_name
FROM consultant AS C
WHERE EXISTS(SELECT Q.qual_id FROM consultant_qual As Q
WHERE Q.consultant_qual IN('OpenLayers', 'PostGIS')
AND Q.consultant_id = C.consultant_id) AND
(2 = (SELECT COUNT(Q.qual_id)
FROM consultant_qual As Q
WHERE Q.consultant_id = C.consultant_id
AND Q.qual_id IN('OpenLayers', 'PostGIS') ) ) ;