Sunday, February 14. 2010Regular Expressions in PostgreSQLPrinter FriendlyRecommended Books: Regular Expressions Pocket Reference Mastering Regular Expressions Sed and Awk
Comments
Display comments as
(Linear | Threaded)
Great post!
Modern regex engines, including PostgreSQL's, allow for an expanded regex syntax like: SELECT 'foo123' ~ '(?x) f o{2} [[:digit:]]{3} '; can make complex expressions less confusing :)
. Regular expressions are simpler than the longer and, sometimes, more resource-intensive operators LIKE and SIMILAR TO
I've been surprised by the interaction of ~* with UTF-8. This is supposed to work like ~ on a case-insensitive basis. It does not appear to work at least for German strings. It cannot convert umlaut-u for example:
> select name from ti_titles where name ~* 'FRÜHER'; name ------ (0 rows) But using the lowercase character: >select name from ti_titles where name ~* 'FRüHER'; name ------------------------------------ Wer früher stirbt, ist länger tot (1 row) Anyone else have problems with it? I've decided to explicitly upper() and lower() and use like '%...%' rather than ~*.
Yup!
And the other problem is the Constraint Escapes "\m" and "\M" which is "matches only at the beginning/end of a word" when working with UTF-8 data. For an example: when you made an query: > select name from ti_titles where name ~* E'\\mher\\M'; name ------------------------------ Wer früher stirbt, ist länger tot I love her so (2 row) It'll so great if it support for more Encoding.
Hi,
i have the problem with SOUNDEX() in postgreSQL with german umlaut letters too. Did you solve this problem? Can you give me a hint how? Bests & thx
Thank you very much! That is very important for me that you have posted this tutorial. The big advantage using regular expressions in PostgreSQL, besides the fact that you already know about regular expressions, are two words: power and speed. Regular expressions offer the DBA the ability to compose queries using highly complex criteria in a compact and sophisticated manner. Better, they can make your statements execute faster. Regular expressions are simpler than the longer and, sometimes, more resource-intensive operators LIKE and SIMILAR TO. Many SQL statements also need to couple traditional operators with user-defined functions to accomplish the same query as a single statement using regular expressions. That is of a great value, I believe)))
I thought of another handy scenario for using regular expressions in Postgres. This time not within the conditions, but in the returned results. Lets say for instance you have a column with URL, (maybe blog comments or something) and you would like to get a list of all the domain names, not the full paths trailing it nor the http://, how would you do that? Without regular expressions you would be forced to use a combination of various string functions to look for the domain name, and you will also aquire a headache free of charge. However with a simple regular expression you could do a lot of things.
Thank you for this excellent resource. SQL can be tricky and this article has certainly done its job of helping ease that.
Regular expressions is very powerful for sure. But I am more concerned with the performance part of it. Regex matching could not enjoy the power of indexes, that is the dreadful disadvantage. Still this article seems very informative and useful!
One of the new features in 9.3 is index support for regular expressions. We haven't tried it yet but will try to give an example once we try it out.
http://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.3 |
Entry's LinksQuicksearchCalendar
Categories
Blog Administration |
This post was mentioned on Twitter by planetpostgres: Leo Hsu and Regina Obe: Regular Expressions in PostgreSQL http://tr.im/O7p9
Tracked: Feb 16, 21:31
You have found the 179th edition of Log Buffer, the weekly review of database blogs. Welcome. Enjoy your stay. We begin with . . . SQL Server Merrill Alrich gets going with a fresh juxtaposition–his thoughts on motorcycles a...
Tracked: Feb 19, 14:47
Tracked: Jul 24, 00:52