New subject: Riddle me this: grep / regx experts

Friday, 2 February 2018

...
> On Fri, Feb 02, 2018 at 11:04:01AM -0500, R. G. Newbury wrote:
> A bug in regx handling???
> 
> I am cleaning up some html code .....
...
> # grep -h '[0-9]*s[0-9]*">' temp
> Returns the example line with the 's[0-9]">' highlighted. 
...
> Can anyone explain what is happening? This isn't politics so the group
> [0-9] should not equal [0-9"#]. Or even [0-9\"\#].
...
 Fri, 2 Feb 2018 10:14:37 -0600 From: Chris Adams <linux(a)cmadams.net>
...
 A * in a regex is "0 or more of the previous", so basically you are just
 matching 's[0-9]*">' (because there will always be at least 0 of the
 [0-9] part at the start).

 If you really mean "1 or more", you can use an extended regex (the -E
 argument to grep/sed) and use + instead of *, so '[0-9]+s[0-9]*">'.
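
 To see the difference at a prompt, here is a minimal sketch. The sample
 line is made up (the original example line is elided above), and it
 assumes a grep that supports -o to print only the matched text, which
 GNU and BSD grep both do.

```shell
# Hypothetical sample line; not the poster's actual HTML.
line='<a href="#s5">note</a>'

# Basic regex: [0-9]* is allowed to match zero digits,
# so the match simply starts at the "s".
star_match=$(printf '%s\n' "$line" | grep -o '[0-9]*s[0-9]*">')
echo "star: $star_match"

# Extended regex: [0-9]+ needs at least one digit before the "s".
# This line has none, so grep finds nothing (|| true hides the exit status 1).
plus_match=$(printf '%s\n' "$line" | grep -oE '[0-9]+s[0-9]*">' || true)
echo "plus: ${plus_match:-<no match>}"
```

 The first command should report the part starting at the s, the second
 should report no match, since no digit sits directly before the s.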

 Fri, 02 Feb 2018 16:15:37 +0000 From: Patrick O'Callaghan 
 In grep, * matches any number of instances, including 0. You want to
 use + rather than * to guarantee at least one digit. 
...
 Fri, 2 Feb 2018 11:26:02 -0500 From: Jon LaBadie <jonfu(a)jgcomp.com>
...
 You are misunderstanding the "*".  It means any sequence of the
 associated character including a ZERO length sequence.

 So [0-9]*s matches "s (actually just the s), as there is a zero length
 sequence of digits followed by an s.  When you grep for [0-9]s, there
 must be at least one digit before the s (but any extra digits are not
 part of the match).  Sometimes the sequence [0-9][0-9]*s is useful to
 say "one or more digits before the s".
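
 The three patterns mentioned here can be compared side by side. This is
 only a sketch on a made-up input string, again assuming grep -o is
 available to show exactly which characters matched.

```shell
# Hypothetical input: one "s" preceded by digits, one preceded by a quote.
input='123s and "s'

# [0-9]s: a digit must precede the s, but only one digit is part of the match.
one=$(printf '%s\n' "$input" | grep -o '[0-9]s')

# [0-9][0-9]*s: one or more digits before the s, in plain basic-regex syntax.
many=$(printf '%s\n' "$input" | grep -o '[0-9][0-9]*s')

# [0-9]*s: zero or more digits, so the bare s after the quote matches as well.
# grep -o prints each match on its own line.
any=$(printf '%s\n' "$input" | grep -o '[0-9]*s')

echo "one:  $one"
echo "many: $many"
echo "any:"
echo "$any"
```

 Only the last pattern picks up the s that has no digit in front of it,
 which is the behaviour that looked like a bug in the original question.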

 jl
 Thanks to all for the quick responses. I *tried* to RTFM but that was not clear,
 even on a re-read.  I took [0-9]* as multiple instances of [0-9] but NOT zero
 instances.

Geoff
