post/reject spam filters?

Tips on writing regular expressions for searching the post list

Moderators: Quade, dexter

Postby corsairius » Fri Jan 11, 2002 4:39 pm

LAST EDITED ON 01-11-02 AT 03:21 PM (EST)

Here is my reject filter for multimedia.babylon5:

(gathering)|([Tt][Gg][123])|(itb)|(beginning)
(s?vcd)?.*(10[23457]|11[01234789]|12[12]|20[12345789]|21[123567]|30[34567]|31[01234589]|32[012]|40[12]|416).*(s?vcd)?

Here is my accept filter:

bat
divx
nfo
par

Right now, only the reject filter is checked, since I don't know what the action would be if I activated both filters.

In what order are the filters applied to the subject headers?

I.e., does Newsbin reject everything that matches the reject filter unless an entry in the accept filter also matches?

Are the two selections mutually exclusive, or is one the exception of the other when both are selected?

Does case matter in the filter expression?

If so, will an "i" appended to a filter grouping make it ignore case?
Never mind appending the "i"; I think that is used in modperl.


Thanks!
--Joe.
corsairius
Active Participant
Posts: 57
Joined: Mon Jan 07, 2002 1:59 pm

Registered Newsbin User since: 04/15/03

RE: post/reject spam filters?

Postby dexter » Sat Jan 12, 2002 2:51 am

LAST EDITED ON 01-12-02 AT 00:53 AM (EST)

I don't know all the answers here but I think I can help.

First off, I think you may be better off using the "Find in Subject" feature on the main screen. It bypasses all filters and will allow you to adjust your RegEx to show the headers you want so you can mark them for download.

I'd have to play with the accept/reject stuff to remember the order of precedence. Maybe Quade will know off the top of his head. The file filters are applied when the post is actually downloaded, because we don't know the true name of the file until we download the first part. I filter on .exe, .scr, .vbs, stuff like that which may contain viruses. The subject filters are applied during header download, but you can toggle the "Show Filtered Posts" checkbox to see the filtered posts and check whether the filters are doing the job.

Matching is case insensitive. In fact, a RegEx entered in the "Find in Subject" field is forced to upper case after you enter it, but case is ignored when matching.
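To illustrate what case-insensitive matching means for the patterns in this thread, here is a minimal sketch in Python. It is not Newsbin's own engine: the use of Python's `re` module with `re.IGNORECASE`, and the sample subject lines, are assumptions made purely for illustration.

```python
import re

# Hypothetical subject lines, invented for this example.
subjects = [
    "B5 TG1 part 01",
    "b5 tg2 part 01",
    "B5 Tg3 part 01",
    "B5 episode 101",
]

# Spelling out both cases by hand, as in the reject filter above.
explicit = re.compile(r"[Tt][Gg][123]")

# The same intent when the matcher ignores case.
simple = re.compile(r"tg[123]", re.IGNORECASE)

explicit_hits = [s for s in subjects if explicit.search(s)]
simple_hits = [s for s in subjects if simple.search(s)]

# Both patterns select the same subjects.
print(explicit_hits == simple_hits)
```

If the matcher ignores case, as described above, character classes like `[Tt][Gg]` should not be needed.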

BTW, we do support character ranges so you can use [1-5] instead of [12345]. Could save some typing.
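As a rough check that a range really is equivalent to the enumerated form, here is a small Python sketch. It assumes Python's `re` module behaves like the Newsbin matcher for character classes, which is an assumption, and it rewrites one fragment of the reject filter above (`11[01234789]`) with ranges as an example.

```python
import re

# Enumerated classes, as in the original reject filter.
enumerated = re.compile(r"10[12345]|11[01234789]")

# The same sets written with ranges: [1-5] and [0-47-9].
ranged = re.compile(r"10[1-5]|11[0-47-9]")

# Compare both patterns over every three-digit episode number from 100 to 119.
candidates = [str(n) for n in range(100, 120)]
enumerated_hits = [c for c in candidates if enumerated.fullmatch(c)]
ranged_hits = [c for c in candidates if ranged.fullmatch(c)]

print(enumerated_hits == ranged_hits)
```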

Hope this helps.
dexter
Site Admin
Posts: 9511
Joined: Fri May 18, 2001 3:50 pm
Location: Northern Virginia, US

Registered Newsbin User since: 10/24/97

RE: post/reject spam filters?

Postby corsairius » Sat Jan 12, 2002 3:20 am

> First off, I think you may be better off using the "Find in Subject" feature on the main screen. It bypasses all filters and will allow you to adjust your RegEx to show the headers you want so you can mark them for download.

I was wondering if there was a facility to check regex patterns ... thanks!

> I'd have to play with the accept/reject stuff to remember the order of precedence. Maybe Quade will know off the top of his head. The file filters are applied when the post is actually downloaded because we don't know the true name of the file until we download the first part. I filter on .exe, .scr, .vbs, stuff like that which may contain viruses. The subject filters are applied during header download but you can toggle by clicking the "Show Filtered Posts" checkbox just to see if they are doing the job.
>
> Matching is case insensitive, in fact a RegEx entered in the "Find in Subject" field will be forced to upper case after entered but case is ignored.

Much easier when it's case insensitive. I actually thought that NB _was_ case sensitive, but my problem was just an ungrouped filter pattern: "xyz|abc" instead of "(xyz)|(abc)". The RegEx example page on the Newsbin site shows parentheses in an unescaped form, which I took to be a pattern for literal parentheses rather than grouping.

> BTW, we do support character ranges so you can use [1-5] instead of [12345]. Could save some typing.

It makes sense to do so. I just listed out the eps already downloaded, and enumerating them made it easier to check them off. I also had to go straight to the FILTERS.XML file to edit the pattern, since the GUI wouldn't let me enter a trailing asterisk, etc. Well, I really DON'T know exactly what it wouldn't let me enter, but I couldn't add a mask that was similar to other entries and differed only by the enumeration described above.


> Hope this helps.

It did. Thank You. Still curious to find out how the reject/accept filters interact.
corsairius
Active Participant
Posts: 57
Joined: Mon Jan 07, 2002 1:59 pm

Registered Newsbin User since: 04/15/03

RE: post/reject spam filters?

Postby Quade » Sat Jan 12, 2002 11:50 am

LAST EDITED ON 01-12-02 AT 09:52 AM (EST)
Here's the sequence. It does 1 then 2 in sequence

The spaces were pulled out

1) Accept Filters?
->a) Yes
->->Matched Accept Filters?
->->->No, Drop the post.
2) Reject filters?
->a) Yes,
->->Matched Reject Filters?
->->->Yes, drop the post
Post is OK by me!
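The sequence above can be sketched in Python. This only illustrates the order described, and is not Newsbin's actual code: the function name `post_is_kept`, the sample subjects, and the use of `re.search` with `re.IGNORECASE` are all assumptions.

```python
import re

def post_is_kept(subject, accept_filters, reject_filters):
    """Return True if the post survives both filter stages (illustrative only)."""
    # Step 1: if accept filters are enabled, the subject must match at least one.
    if accept_filters:
        if not any(re.search(p, subject, re.IGNORECASE) for p in accept_filters):
            return False  # no accept filter matched: drop the post
    # Step 2: if reject filters are enabled, any match drops the post.
    if reject_filters:
        if any(re.search(p, subject, re.IGNORECASE) for p in reject_filters):
            return False  # a reject filter matched: drop the post
    return True  # post is OK

accept = ["divx", "nfo"]
reject = ["(gathering)|(itb)"]

print(post_is_kept("B5 ep 105 divx", accept, reject))         # passes both steps
print(post_is_kept("B5 The Gathering divx", accept, reject))  # dropped at step 2
print(post_is_kept("B5 ep 105 mpg", accept, reject))          # dropped at step 1
```

Read this way, the accept filters are not an exception to the reject filters. With both enabled, a post has to match an accept filter and also avoid every reject filter.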
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

