How to filter out filenames containing "index" etc

Posted: Sat Jun 14, 2008 3:10 am
by andy
I set filename reject filters thus:
index
idx
ndx
!index
... and several other variations ...

and I still get hundreds of filenames containing those words.
I read the FAQs and went to the abysmally unhelpful regularexpressions.crap without a clue. How about a simple answer to what should be a fair description of my simple problem? Thanks.


ps
I can't find a button to add my avatar (maybe automatic?)

Posted: Sat Jun 14, 2008 5:09 am
by itimpi
You did not make it clear whether these files were actually being downloaded or merely displayed. You DO realise that filename reject filters are only applied AFTER the download starts, because that is the only point at which the filename is known for certain? If you want filtering to happen before the download, you need to use Subject filters.

Files downloaded

Posted: Tue Jul 01, 2008 1:53 am
by andy
Pardon my inexactitude.

I set the filters to filter out filenames with "index" or parts thereof. Hundreds of unwanted files are actually downloaded. I'm not trying to keep them off the list of available files; I'm trying to NOT download them.

There isn't enough time in a day to individually reject what amounts to thousands of "index" files in the 50 or so image groups I want to monitor.

I don't want ANY of the damned "index" files to download.

thanks.

Posted: Tue Jul 01, 2008 4:37 am
by itimpi
I am not sure what to suggest :(

I just did a test where I set up a reject filter for "index" as both a subject and a filename reject filter. I then assigned the filter as a profile to my selected group. No files with "index" in the name were downloaded, and quite a few were rejected, so I know the filter was working.

subject AND filename filter together is very bad

Posted: Wed Jul 02, 2008 12:51 am
by andy
In image groups, the subject line often contains the phrase "see index", so filtering using the subject line is a very bad approach. That eliminates many desirable files.

I reinstalled the program, and on today's download of 33,000 images I got none of the index files, so I suppose the reinstall did the trick.

Problem solved. Thanks for the help.

Posted: Thu Jul 10, 2008 10:00 am
by bobkoure
You could use
(?<!see.{1,4})index
to match "index" but not "see index"
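
For anyone who wants to try the lookbehind idea outside the program, here is a minimal sketch in Python. It is only a stand-in: the thread does not say which regex engine the program uses, and Python's re module rejects variable-width lookbehinds such as (?<!see.{1,4}), so the sketch assumes exactly one space between "see" and "index". The subject lines are made up for illustration.

```python
import re

# Fixed-width stand-in for the pattern above: match "index" unless it is
# immediately preceded by "see " (assumption: exactly one space).
NOT_AFTER_SEE = re.compile(r"(?<!see )index", re.IGNORECASE)

# Hypothetical subject lines, invented for this example.
subjects = [
    "Holiday pics - see index - file 03 of 40",
    "Holiday pics - index.jpg",
    "Holiday pics - beach01.jpg",
]

# Lines the pattern matches are the ones a reject filter would drop.
rejected = [s for s in subjects if NOT_AFTER_SEE.search(s)]

for s in subjects:
    verdict = "reject" if s in rejected else "keep"
    print(f"{verdict}: {s}")
```

In this sketch only the second line is rejected; the "see index" line is kept because the lookbehind blocks the match.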

Or you could use
\.index
to match ".index"
Note that, without a backslash, "." means "any character", so if you were using
.index
that was matching any single character followed by "index", so, yeah, lines with "see index" were being matched (the space before "index" counts as a character).
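
The difference between the escaped and unescaped dot can be checked with a short Python sketch. Python's re module is used here only as a convenient stand-in (the program's own regex engine may differ in other details), and the sample strings are made up.

```python
import re

unescaped = re.compile(r".index")   # any one character, then "index"
escaped = re.compile(r"\.index")    # a literal dot, then "index"

# Hypothetical sample strings, invented for this example.
samples = ["see index", "pics.index", "index"]

unescaped_hits = [s for s in samples if unescaped.search(s)]
escaped_hits = [s for s in samples if escaped.search(s)]

print("unescaped:", unescaped_hits)
print("escaped:  ", escaped_hits)
```

The unescaped pattern matches both "see index" and "pics.index", while the escaped one matches only "pics.index". Neither matches a bare "index", since both need a character in front of it.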

There are a lot of resources on the web. If you don't like one, try another. Different teaching approaches work better for different folks.
To be fair, the (?< stuff is advanced, but \. is not. It's just a matter of knowing which characters are "special" and need to be preceded by a \

Posted: Thu Jul 10, 2008 10:46 am
by andy
I'll try your suggestion. Thanks!