regular expression (User to User help)

Tips on writing regular expressions for searching the post list


Postby kewakl » Sun May 20, 2001 2:36 am

Contiguous and non-contiguous ranges of articles can be found with
regular expressions like the following:


Suppose that you are looking for several files
of a RAR set named 'zzz'
and you are missing RAR files .r39-.r43 and file .r51.
In the 'Find In Subject:' box enter one of the following
Regular Expressions:

zzz\.r(39|40|41|42|43|51)
or
zzz[.]r(39|40|41|42|43|51)
or the FULL, bulky expression
zzz.r39|zzz.r40|zzz.r41|zzz.r42|zzz.r43|zzz.r51
All three expressions are functionally equivalent.
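
For anyone who would rather build these than type them, here is a minimal Python sketch. Assumptions: Python's 're' module stands in for the NewsBin engine, and build_part_pattern is a hypothetical helper name, not anything NewsBin provides. It only produces the compact, escaped-dot form shown first above.

```python
import re

def build_part_pattern(set_name, letter, numbers):
    # Two-digit part numbers joined with '|' form the alternation.
    parts = "|".join(f"{n:02d}" for n in sorted(set(numbers)))
    # re.escape protects any regex metacharacters in the set name itself;
    # the dot before the extension is escaped explicitly.
    return rf"{re.escape(set_name)}\.{letter}({parts})"

missing = [39, 40, 41, 42, 43, 51]
pattern = build_part_pattern("zzz", "r", missing)
print(pattern)  # zzz\.r(39|40|41|42|43|51)
```

Paste the printed string into the 'Find In Subject:' box.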

-------------------ALSO----------------------

Assume that you are looking for the following files from
RAR set named 'yyy'
.r20-.r24 and .r31-.r35 and .s00-.s04 and .s11-.s15
Use the following expression.

-------------------
yyy\.((r((2[0-4])|(3[1-5])))|(s((0[0-4])|(1[1-5]))))
**Note that the 'r' and the 's' are in ().**

-------------------
xxx\.r(09|1[0-6])
would find all _existing_ RAR set 'xxx' files .r09 - .r16

-------------------
vvv\.((r|s)((0|1)[0-9]))
would find .r00 - .r19 and .s00 - .s19
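
To see exactly which extensions an expression covers before using it, something like the following Python sketch can help. Assumptions: Python's 're' module is used in place of the NewsBin engine, and matched_extensions is a hypothetical helper that simply tries every two-digit .rNN and .sNN name.

```python
import re

def matched_extensions(pattern, set_name, letters="rs"):
    # Try every two-digit extension for each letter and keep those the pattern accepts.
    rx = re.compile(pattern)
    hits = []
    for letter in letters:
        for n in range(100):
            ext = f".{letter}{n:02d}"
            if rx.search(set_name + ext):
                hits.append(ext)
    return hits

hits = matched_extensions(r"vvv\.((r|s)((0|1)[0-9]))", "vvv")
print(hits[0], hits[19], hits[20], hits[-1], len(hits))  # .r00 .r19 .s00 .s19 40
```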

-------------------
BE SURE: in cases where the file/extension separating period
is used in a distributive expression, you must ESCAPE '\.' or
SQUARE-BRACKET '[.]' the dot '.'; otherwise it is treated as the
REGEXP operator that matches any single character.

zzz\.r(39|40|41|42|43|51) is 'distributive': the prefix 'zzz\.r' is
_distributed_ (according to the distributive law) across each of the
alternatives in the (39|...|51) portion of the expression.

more on the distributive law -
http://kids.infoplease.lycos.com/ce6/sci/A0815650.html

** For readability and ease of future understanding I
recommend escaping the dot '\.' to avoid confusion with SETs
of expression strings which are enclosed in square brackets. **
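
A quick Python check that the two literal-dot spellings agree. This is only a sketch, and it assumes Python's 're' module treats '\.' and '[.]' the same way the NewsBin engine does; the sample subjects are made up for illustration.

```python
import re

escaped = re.compile(r"zzz\.r(39|40|41|42|43|51)")
bracketed = re.compile(r"zzz[.]r(39|40|41|42|43|51)")

# Hypothetical subject fragments: two wanted parts, one unwanted part,
# and two names where something other than a dot follows 'zzz'.
subjects = ["zzz.r39", "zzz.r51", "zzz.r44", "zzz_r39", "zzzXr40"]
for s in subjects:
    print(s, bool(escaped.search(s)), bool(bracketed.search(s)))
```

Both columns come out the same for every subject: only the first two match.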

** In cases where multiple partial expressions are used, check
your parenthesis nesting levels if your expression does not
work the first time! **
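
One way to catch a nesting mistake before pasting an expression in is to let a regex compiler complain about it. A minimal Python sketch follows; check_expression is a hypothetical helper, and Python's 're' module is an assumption standing in for the NewsBin engine, so the wording of the error will differ.

```python
import re

def check_expression(expr):
    # Returns None if the expression compiles, else the compiler's error message.
    try:
        re.compile(expr)
        return None
    except re.error as exc:
        return str(exc)

good = r"yyy\.((r((2[0-4])|(3[1-5])))|(s((0[0-4])|(1[1-5]))))"
bad = good[:-1]  # drop the final ')' to simulate a nesting mistake

print(check_expression(good))  # None
print(check_expression(bad))   # an error message about the unclosed group
```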

More regexp info *used* to be at
http://www.newsbin.com/regexp.htm
I guess that they haven't moved that page yet. Give 'em time.

I like the forum!
This post is a bit lengthy, but this subject CANNOT be explained in a few words!

--If I have misstated something, someone please post a revision--
kewakl
 

RE: regular expression (User to User help)

Postby NewsBinLover » Sun May 20, 2001 3:55 am

Very impressive that you can remember such complicated expressions!
NewsBinLover
 

RE: regular expression (User to User help)

Postby kewakl » Sun May 20, 2001 7:04 am

Remember: No
Build: Yes
kewakl
 

RE: regular expression (User to User help)

Postby dexter » Sun May 20, 2001 12:50 pm

Great info! Thanks for the reminder about the RegEx page, I just hooked up again. It is available again at http://www.newsbin.com/regexp.htm. Hopefully this forum will cut down our email technical support so we have more time to update the help pages!
dexter

comments on your post

Postby Brian Caulfield » Sat Jun 09, 2001 1:19 pm

You wrote:
---------------------
Suppose that you are looking for several files
of a RAR set named 'zzz'
and you are missing RAR files .r39-.r43 and file .r51.
In the 'Find In Subject:' box enter one of the following
Regular Expressions:
zzz\.r(39|40|41|42|43|51)
or
zzz[.]r(39|40|41|42|43|51)
or the FULL, bulky expression
zzz.r39|zzz.r40|zzz.r41|zzz.r42|zzz.r43|zzz.r51
All three expressions are functionally equivalent.
-----------------------

You forgot to escape your . characters in last example. Should be:
zzz\.r39|zzz\.r40|zzz\.r41|zzz\.r42|zzz\.r43|zzz\.r51

And the shortest version of this is:
zzz\.r(39|4[0-3]|51)

Also there are some unnecessary uses of parentheses in your examples, either in places where precedence makes them unnecessary, or where the more efficient and concise character class would suffice. You wrote:
-----------------
.r20-.r24 and .r31-.r35 and .s00-.s04 and .s11-.s15

yyy\.((r((2[0-4])|(3[1-5])))|(s((0[0-4])|(1[1-5]))))
-----------------

Simplify this to:

yyy\.(r(2[0-4]|3[1-5])|s(0[0-4]|1[1-5]))

And simplify:
vvv\.((r|s)((0|1)[0-9]))
would find .r00 - .r19 and .s00 - .s19

to:
vvv\.[rs][01][0-9]
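
If you want to convince yourself that the short forms match the same files as the long ones, a brute-force comparison is easy to sketch in Python. Assumptions: Python's 're' module stands in for the NewsBin engine, same_matches is a hypothetical helper, and the check covers only two-digit .rNN, .sNN and .tNN names, so it is a sanity check rather than a proof.

```python
import re
from itertools import product

# Each entry pairs the original expression with its simplified form.
pairs = {
    "zzz": (r"zzz\.r(39|40|41|42|43|51)", r"zzz\.r(39|4[0-3]|51)"),
    "yyy": (r"yyy\.((r((2[0-4])|(3[1-5])))|(s((0[0-4])|(1[1-5]))))",
            r"yyy\.(r(2[0-4]|3[1-5])|s(0[0-4]|1[1-5]))"),
    "vvv": (r"vvv\.((r|s)((0|1)[0-9]))", r"vvv\.[rs][01][0-9]"),
}

def same_matches(name, long_form, short_form):
    a, b = re.compile(long_form), re.compile(short_form)
    # Compare both expressions on every candidate name.r00 .. name.t99.
    for letter, n in product("rst", range(100)):
        subject = f"{name}.{letter}{n:02d}"
        if bool(a.search(subject)) != bool(b.search(subject)):
            return False
    return True

for name, (long_form, short_form) in pairs.items():
    print(name, same_matches(name, long_form, short_form))
```

All three lines print True.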
Brian Caulfield
 

RE: comments on your post

Postby kewakl » Sun Jun 10, 2001 10:55 am

Thank you for spotting this!
-----------------------
zzz.r39|zzz.r40|zzz.r41|zzz.r42|zzz.r43|zzz.r51
All three expressions are functionally equivalent.

You forgot to escape your . characters in last example. Should be:
zzz\.r39|zzz\.r40|zzz\.r41|zzz\.r42|zzz\.r43|zzz\.r51
-----------------------
It *did* work although I *did* forget my own instruction!

Thanks for the *concise* listings...


From the small number of posts in this *conference* I think that many people do NOT use RegEx.
If we could post more *examples/uses* maybe others could realize the potential!

Do you know of a good in-depth *online* tutorial? We could benefit from it.
kewakl
 

RE: comments on your post

Postby dexter » Mon Jun 11, 2001 10:09 am

I agree that only a small percentage of NewsBin users actually use the RegEx capabilities. This is pretty much a power-user feature. There do seem to be many reads of these posts, so maybe people are at least learning it exists.

Most of what I learned about RegEx was from an excellent O'Reilly book, "Mastering Regular Expressions" (the Owl book).

As for online resources, try http://www.perldoc.com/perl5.6/pod/perlre.html. Perl's implementation of RE is pretty much the same as what we use in NewsBin.

Your expression worked even though you did not backslash the "." because it matched "any single character" instead of just a dot.
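
Here is a small Python sketch of that difference. It assumes Python's 're' module handles the dot the same way as the NewsBin engine, and the second and third subjects are invented to show the over-matching.

```python
import re

unescaped = re.compile(r"zzz.r39|zzz.r40|zzz.r41|zzz.r42|zzz.r43|zzz.r51")
escaped = re.compile(r"zzz\.r39|zzz\.r40|zzz\.r41|zzz\.r42|zzz\.r43|zzz\.r51")

for subject in ["zzz.r39", "zzz-r39", "zzzar51"]:
    print(subject, bool(unescaped.search(subject)), bool(escaped.search(subject)))
# zzz.r39 True True
# zzz-r39 True False
# zzzar51 True False
```

So the unescaped version finds the wanted files, but it would also pick up any subject that has some other character where the dot should be.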
dexter

Re: RE: regular expression (User to User help)

Postby kewakl » Sun Jul 06, 2003 11:05 am

dexter wrote: Great info! Thanks for the reminder about the RegEx page, I just hooked up again. It is available again at http://www.newsbin.com/regexp.htm. Hopefully this forum will cut down our email technical support so we have more time to update the help pages!


That page is now at http://www.newsbin.com/nb33help/regexp.htm
kewakl