special characters don't really understand

Posted: Thu Mar 20, 2008 7:04 pm
by PNPman
I'm trying to create a reject filter to reject certain words, along with any special character used as a spacer between them, for example:

mikeslideshow
mike.slide.show
mike_slide_show
mike_slideshow
etc

Will either of these filters work and what is the difference?

mike.?slide.?show or mike[.]?slide[.]?show
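One way to see the difference is to run both patterns over the sample names. This is a minimal sketch that uses Python's re module as a stand-in; the thread does not say which regex engine the filter uses, so the flavor is an assumption.

```python
import re

# Sample names from the post above.
names = ["mikeslideshow", "mike.slide.show", "mike_slide_show", "mike_slideshow"]

# A bare dot matches any single character; "?" makes it optional.
any_char = re.compile(r"mike.?slide.?show")
# Inside brackets the dot is a literal dot, so only "." (or nothing) is allowed.
literal_dot = re.compile(r"mike[.]?slide[.]?show")

any_char_hits = [n for n in names if any_char.search(n)]
literal_dot_hits = [n for n in names if literal_dot.search(n)]

print(any_char_hits)
print(literal_dot_hits)
```

Under these assumptions the first pattern catches all four names, while the second catches only mikeslideshow and mike.slide.show, because an underscore is not a literal dot.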

Posted: Thu Mar 20, 2008 9:18 pm
by Quade
"mike.*slide.*show" will catch all of those. It'll catch more than that though.

How about trying that and seeing if it does what you want?
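To illustrate the "catches more" part, here is a small sketch, again assuming Python's re module behaves like the filter's engine. The sample sentence is made up for the example.

```python
import re

loose = re.compile(r"mike.*slide.*show")  # any run of characters between the words
tight = re.compile(r"mike.?slide.?show")  # at most one character between the words

# Hypothetical text that merely contains the three words in order.
unrelated = "mike took a slide to the show"

loose_hit = loose.search(unrelated) is not None
tight_hit = tight.search(unrelated) is not None
print(loose_hit, tight_hit)
```

The loose pattern hits the unrelated sentence and the tight one does not, which is the trade-off to weigh before using .* in a reject filter.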

Posted: Thu Jul 10, 2008 10:05 am
by bobkoure
If it's a single spacer char (or nothing), maybe use ? rather than *
So...
mike.?slide.?show

Or if you know what the spacer characters might be, put them in brackets rather than using the dot. So, if the spacer chars were - _ and space
mike[-_ ]?slide[-_ ]?show
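The bracket approach can be sketched as a small pattern builder. build_filter is a hypothetical helper written for this illustration, not part of any filter program, and Python's re module is assumed as the engine.

```python
import re

def build_filter(words, spacers="-_ "):
    """Hypothetical helper: join words with an optional single spacer character."""
    # re.escape keeps "-" from being read as a range inside the brackets.
    sep = "[" + re.escape(spacers) + "]?"
    return re.compile(sep.join(re.escape(w) for w in words))

reject = build_filter(["mike", "slide", "show"])

for name in ["mikeslideshow", "mike-slide show", "mike_slideshow", "mike.slide.show"]:
    print(name, bool(reject.search(name)))
```

With the default spacers, the first three names match and mike.slide.show does not, since the dot was not listed as a spacer. Passing spacers="-_ ." would add it.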

Posted: Mon Jul 28, 2008 7:50 pm
by Kiltme
You're mixing regex and DOS wildcard filename matching, which is part of what's confusing.

mike.slide.show
is the regex equivalent of the dos
mike?slide?show
file name search.

mike.+slide.+show
is the regex equivalent of the dos
mike*slide*show
file name search.

mike[ \._-]slide[ \._-]show

is the filter for specific separators space, dot, underline, hyphen.
The dot needs to be escaped with \ for it to be a dot and not a wildcard

Note that none of these will match mikeslideshow (neither the regex or the dos wildcard filters).
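The point about mikeslideshow can be checked directly. This sketch assumes Python's re module as the engine and adds a variant with "?" for comparison.

```python
import re

required = re.compile(r"mike[ \._-]slide[ \._-]show")    # exactly one separator each time
optional = re.compile(r"mike[ \._-]?slide[ \._-]?show")  # one separator or nothing
one_or_more = re.compile(r"mike.+slide.+show")           # at least one character each time

name = "mikeslideshow"
results = {
    "required": required.search(name) is not None,
    "optional": optional.search(name) is not None,
    "one_or_more": one_or_more.search(name) is not None,
}
print(results)
```

Only the variant with "?" after each bracket matches the unseparated name; the patterns that demand a separator do not.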

Posted: Mon Aug 18, 2008 10:12 am
by bobkoure
Kiltme wrote:
mike.+slide.+show
is the regex equivalent of the dos
mike*slide*show
file name search.

Actually, no. In DOS (or at least in the current iterations of what used to be the DOS command shell), the character '*' means "zero to any number of any character", and '?' means "one of any character".
Open a command window and try it with dir. I think you'll find that dir f*oo.txt will indeed match foo.txt.
That may be something new-ish and not in the old DOS command shell. It seems to me that the actual DOS shell was incapable of using *o*.txt to find foo.txt, where '*' had meant 'any number of any characters', but would then "swallow" them all. I think that this is the way that many flavors of CP/M worked as well, but they varied a lot. Intel came out with a version that used bits of something like PL/1 as a shell language.
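For anyone without a command window handy, the wildcard behaviour can be approximated in Python. Note the assumption: the fnmatch module implements Unix shell-style wildcards, not the DOS or Windows command shell itself, so this is only a rough stand-in for the dir test.

```python
import fnmatch

# "*" matches zero or more characters, so f*oo.txt matches foo.txt.
star_zero = fnmatch.fnmatchcase("foo.txt", "f*oo.txt")
# "?" needs exactly one character, so it cannot match an empty gap.
question_empty = fnmatch.fnmatchcase("mikeslideshow", "mike?slide?show")
question_one = fnmatch.fnmatchcase("mike_slide_show", "mike?slide?show")
# "*" is happy with an empty gap.
star_empty = fnmatch.fnmatchcase("mikeslideshow", "mike*slide*show")

print(star_zero, question_empty, question_one, star_empty)
```

In this approximation the wildcard "*" lines up with regex ".*" rather than ".+", and the wildcard "?" lines up with a single regex ".".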

The dot needs to be escaped with \ for it to be a dot and not a wildcard

Again, no. Not inside brackets. The only characters that need escaping inside brackets are ']' (for obvious reasons) and '\' itself, plus '^' if it comes first and '-' if it sits between two other characters, since those have special meaning there. And I think in some flavors of regex you don't need to escape '\' unless it's directly to the left of one of the characters that, outside of brackets, would have needed escaping.
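A quick check of the dot-inside-brackets point, assuming Python's re module as the engine (other flavors may differ in the details):

```python
import re

bare = re.compile(r"a.c")               # dot outside brackets: any single character
bracketed = re.compile(r"a[.]c")        # dot inside brackets: a literal dot
escaped_bracketed = re.compile(r"a[\.]c")  # escaping it inside brackets changes nothing

bare_matches_letter = bare.fullmatch("abc") is not None
bracketed_matches_dot = bracketed.fullmatch("a.c") is not None
bracketed_matches_letter = bracketed.fullmatch("abc") is not None
escaped_matches_dot = escaped_bracketed.fullmatch("a.c") is not None
escaped_matches_letter = escaped_bracketed.fullmatch("abc") is not None

print(bare_matches_letter, bracketed_matches_dot, bracketed_matches_letter,
      escaped_matches_dot, escaped_matches_letter)
```

Here [.] and [\.] behave identically, so the backslash inside brackets is harmless but unnecessary.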

Note that none of these will match mikeslideshow (neither the regex or the dos wildcard filters).

Actually, any of the regex expressions using ? or * will match that.
? = zero or one of whatever precedes it (so .? is zero or one character)
* = any number of whatever precedes it, including zero (so .* is any run of characters)

Posted: Mon Aug 18, 2008 11:46 am
by Quade
You will have to use brackets to escape explicit spaces in the next rev. Spaces are going to mean AND instead of space. I use brackets over \ a lot. Since I program, slashes mean more to me, so brackets keep things clearer.