Sunday, August 31. 2008
How to determine if text phrase exists in a table column
Comments
Does this take into account that the email address might be in someone else's data?
Say you were storing all emails in pg. John wants all references to him deleted, but Jack and John have been talking. You can't just delete all of Jack's emails that have John's address in their headers.
What it really doesn't take into account is that in many countries it is now illegal to delete any emails that have passed through your system within the last 5-10 years. (Yeah, that's a business issue, not so much a code issue.)
One solution we developed at OmniTI was to use plperl and some Perl modules to ingest, parse, and then store a breakdown of all email messages. One example portion of that searches all parts of the email (envelope, headers, and body) and stores any email addresses found in a table linked back to the original message. Hmm... I wonder if that code is hiding on labs...
Caleb,
Actually, this script just returns the names of the table fields in your database in which the email address (or search phrase) appears, and it limits the search to the pattern of tables, schemas, and fields you specify. By the way, it also doesn't do a case-insensitive search, so it would miss JOHN@hotmail. That would be easy enough to change by using ILIKE, an upper/lower check, or a regular expression check, at a potentially significant speed penalty. As Robert said, it is illegal to delete emails in many countries, and it gets even messier because each government agency or company has its own rules too. I guess the main point of this exercise for us was to determine whether we are culpable (e.g. is this person simply blaming us because someone is using our address to spam people, which annoyingly happens a lot, and a lot of mail servers ignore SPF rules). If we find him in our system then we are likely to blame, but if not, we should investigate further or kindly tell him he is wrong.
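To make the case-sensitivity point concrete, here is a rough sketch of that kind of search. It is not the script from the post: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the function name `find_phrase`, the `contacts` and `messages` tables, and the sample rows are all made up for illustration. It also uses SQLite's `instr` with `lower` rather than LIKE/ILIKE, so treat it as an outline of the idea under those assumptions.

```python
import sqlite3
from fnmatch import fnmatchcase


def find_phrase(conn, phrase, table_pattern="*", column_pattern="*",
                ignore_case=False):
    """Return sorted (table, column) pairs whose values contain phrase.

    Hypothetical helper: SQLite stands in for PostgreSQL here, so the
    catalog lookups (sqlite_master, PRAGMA table_info) are SQLite-specific.
    """
    hits = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()]
    for table in tables:
        # Limit the search to tables matching the shell-style pattern.
        if not fnmatchcase(table, table_pattern):
            continue
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        for column_info in columns:
            column = column_info[1]  # second field is the column name
            if not fnmatchcase(column, column_pattern):
                continue
            if ignore_case:
                # Lower-case both sides, similar in spirit to ILIKE.
                sql = (f'SELECT 1 FROM "{table}" '
                       f'WHERE instr(lower("{column}"), lower(?)) > 0 LIMIT 1')
            else:
                # instr() is case-sensitive in SQLite.
                sql = (f'SELECT 1 FROM "{table}" '
                       f'WHERE instr("{column}", ?) > 0 LIMIT 1')
            if conn.execute(sql, (phrase,)).fetchone():
                hits.append((table, column))
    return sorted(hits)


# Made-up sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER, email TEXT, notes TEXT)")
conn.execute("CREATE TABLE messages (id INTEGER, sender TEXT, body TEXT)")
conn.execute("INSERT INTO contacts VALUES (1, 'JOHN@hotmail.example', 'met once')")
conn.execute("INSERT INTO messages VALUES (1, 'jack@example.com', 'cc john@hotmail.example')")

case_sensitive_hits = find_phrase(conn, "john@hotmail")
case_insensitive_hits = find_phrase(conn, "john@hotmail", ignore_case=True)
email_only_hits = find_phrase(conn, "john@hotmail", column_pattern="email",
                              ignore_case=True)
```

With this sample data the case-sensitive search only finds the lower-case address in `messages.body`, while the case-insensitive one also picks up `contacts.email`. Restricting `column_pattern` shows how the search can be limited to the short fields you actually care about.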
I know it's not very databasey, but why not just grep the output of pg_dump, if you're trying to find whether something exists in your database?
David,
You would say that, wouldn't you? It's a very fetterish thing to say :). I think people are all caught up in the stupid example I provided. The reason I wouldn't grep my pg_dump is because 1) I only have grep on my Linux box, not my Windows box. 2) I'm not a grep hacker and I only want to search certain specific fields in my database. For example, I may not care to check bodies of messages and only check short fields we use to spam people with. 3) I'm not really using this for email that much. I just thought that was an example people would be more likely to relate to than my real sinister intentions :).