dist-git help wanted: write me a regex!

James Cassell fedoraproject at cyberpear.com
Mon Dec 21 16:07:40 UTC 2009


On Mon, 21 Dec 2009 01:53:42 -0500, Bruno Wolff III <bruno at wolff.to> wrote:

>
>> /((Mon|Tues?|Wed|Thu(rs?)?|Fri|Sat|Sun)\s+(Jan|Feb|Mar|Apr|May|June?|July?|Aug|Sep|Oct|Nov|Dec)\s+[0-3]?[0-9]\s+(19|20)[0-9][0-9]\s+[A-Za-z0-9\s]+<[^\s@]+@[^\s@>]+>\s+2.[4-6].[0-9.-]+\s*)/
> I don't think this will catch a period in the comment part of the email
> address (as people often do after initials). Also if anyone is using
> hyphenated names, I don't think those will get picked up. Since those
> entries are utf-8, you need to worry about nonascii letters in the name.
> I am not sure how those collate compared to ascii letters, but it might
> be safer to use [^<]+ (instead of [A-Za-z0-9\s]+)

You are correct.  Here's the improved version:
/((Mon|Tues?|Wed|Thu(rs?)?|Fri|Sat|Sun)\s+(Jan|Feb|Mar|Apr|May|June?|July?|Aug|Sep|Oct|Nov|Dec)\s+[0-3]?[0-9]\s+(19|20)[0-9][0-9]\s+[^<]+<[^\s@]+@[^\s@>]+>\s+2.[4-6].[0-9.-]+\s*)/
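As a quick sanity check, here is a minimal Python sketch comparing the two patterns. It assumes Python's `re` module handles these constructs the same way a Perl-style engine would; the sample lines, names, and addresses are made up for illustration, and the `matches` helper is hypothetical.

```python
import re

# Shared pieces of both patterns, copied from the regexes above
# (without the surrounding / delimiters).
DATE = (r"(Mon|Tues?|Wed|Thu(rs?)?|Fri|Sat|Sun)\s+"
        r"(Jan|Feb|Mar|Apr|May|June?|July?|Aug|Sep|Oct|Nov|Dec)\s+"
        r"[0-3]?[0-9]\s+(19|20)[0-9][0-9]\s+")
TAIL = r"<[^\s@]+@[^\s@>]+>\s+2.[4-6].[0-9.-]+\s*"

# Only the "name" part differs between the two versions.
OLD = re.compile("(" + DATE + r"[A-Za-z0-9\s]+" + TAIL + ")")
NEW = re.compile("(" + DATE + r"[^<]+" + TAIL + ")")


def matches(pattern, line):
    """Return True if the pattern is found anywhere in the line."""
    return pattern.search(line) is not None


# Hypothetical sample entries, one per case raised in the quoted reply.
SAMPLES = [
    "Mon Dec 21 2009 Jane Doe <jdoe@example.com> 2.6.32-1",        # plain ASCII name
    "Mon Dec 21 2009 J. R. Doe <jdoe@example.com> 2.6.32-1",       # periods after initials
    "Tue Dec 22 2009 Anne-Marie Doe <am@example.com> 2.6.32-2",    # hyphenated name
    "Wed Dec 23 2009 Zo\u00eb M\u00fcller <zm@example.com> 2.6.32-3",  # non-ASCII letters
]

for line in SAMPLES:
    print(matches(OLD, line), matches(NEW, line), line)
```

One thing I noticed while testing: the dots in `2.[4-6].` are unescaped, so they match any character, not just a literal period. Writing `2\.[4-6]\.` would be stricter, if that matters for the data.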

-- 
James Cassell



