How to Optimize Filter DB for Faster Processing

Technical support and discussion of Newsbin Version 6 series.


Postby insolution » Mon Apr 22, 2019 11:31 am

Hi Quade,

So I monkeyed around with the filters.db3 file using SQLite Studio and now filter processing is *really* slow. I realize I have screwed with perfection, but my aim is worthy :-)

I have a pretty extensive list of filters that I've massaged over the years. In frustration at not being able to easily edit the filters inline (now fixed in 6.81+), I resorted to exporting the contents of the filters.db3 file with SQLite Studio into Excel, sorting and splitting the filter text into individual SQLite records (versus using the pipe character to combine terms), and then re-importing them back into the filters.db3 file. That worked OK but was a lot of work.

So a list of filters that looked like:

Subject Accept filter3|filter1|filter2
Subject Reject filter5|filter4


Becomes:

Subject Accept filter1
Subject Accept filter2
Subject Accept filter3
Subject Reject filter4
Subject Reject filter5
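
For anyone wanting to script the split rather than do it by hand in Excel, here is a minimal sketch of the transformation shown above. The (field, action, text) tuple layout is only an assumption for illustration and is not the actual filters.db3 schema. Splitting on every pipe is also naive: it would break filter text that uses a pipe inside a regular expression group.

```python
def split_filters(rows):
    """Split pipe-combined filter text into one sorted record per term.

    `rows` is a list of (field, action, text) tuples. This layout is a
    hypothetical stand-in, not the real filters.db3 table structure.
    """
    out = []
    for field, action, text in rows:
        for term in text.split("|"):
            term = term.strip()
            if term:  # skip empty terms left by stray pipes
                out.append((field, action, term))
    # Sort by field, then action, then term so related entries sit together.
    return sorted(out)


combined = [
    ("Subject", "Accept", "filter3|filter1|filter2"),
    ("Subject", "Reject", "filter5|filter4"),
]

for row in split_filters(combined):
    print(*row)
```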

The reason I did this is to make troubleshooting long filter lists easier. I can enable/disable 50% of the filter entries at a time and winnow out which filter is causing undesirable results. That has worked well. It is almost impossible to locate a filter term using the "as entered" list of filter text when you have more than, say, 20 entries. Having the entries sorted (at least initially) makes finding specific entries much easier.
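
The "enable/disable 50% at a time" approach is a binary search. A rough sketch of the logic, assuming a single filter is the culprit: `causes_problem` is a hypothetical stand-in for the manual step of enabling a subset in Newsbin and looking at the results.

```python
def find_bad_filter(filters, causes_problem):
    """Locate one problem filter by repeatedly halving the candidate list.

    Assumes exactly one filter is responsible and that
    causes_problem(subset) returns True when the subset contains it.
    """
    candidates = list(filters)
    if not candidates or not causes_problem(candidates):
        return None  # nothing in this list triggers the problem
    while len(candidates) > 1:
        half = len(candidates) // 2
        first = candidates[:half]
        # Keep whichever half still shows the problem.
        candidates = first if causes_problem(first) else candidates[half:]
    return candidates[0]


# Simulated check: pretend "filter27" is the entry causing bad results.
filters = [f"filter{i}" for i in range(1, 41)]
culprit = find_bad_filter(filters, lambda subset: "filter27" in subset)
print(culprit)
```

With 40 entries this takes about six rounds of enabling and disabling instead of 40 one-by-one checks.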

Now Newsbin processes really slowly when loading a GOG from the Groups tab, or selecting filters using the drop-down menu. I suspect this has to do with the index in the filters.db3 file not being optimized?

So, my questions:

1. Am I correct that this is an indexing issue? If so, can you offer a suggestion as to how to re-create the proper index to speed up filter processing? Is there a SQL command to recreate the index properly?

2. Am I better off using the pipe to combine terms, thus shortening the list of entries for a given filter as I had before, or does it not matter that the entries are now split into individual records in the filters.db3 file? Does this matter in terms of filter processing performance?
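
On the first question, SQLite itself does have generic maintenance commands: REINDEX rebuilds indexes, ANALYZE refreshes the query planner statistics, and VACUUM rewrites the file compactly. Below is a minimal sketch using Python's built-in sqlite3 module. Whether filters.db3 has any indexes at all, and whether this changes anything in Newsbin, is an assumption I have not verified. Run it on a copy of the file with Newsbin closed.

```python
import sqlite3


def rebuild_sqlite_db(path):
    """Run SQLite's generic maintenance commands on the database at `path`.

    Returns the result of PRAGMA integrity_check ("ok" when healthy).
    """
    # isolation_level=None means autocommit; VACUUM cannot run inside
    # an open transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("REINDEX")  # rebuild every index in the database
        conn.execute("ANALYZE")  # refresh query planner statistics
        conn.execute("VACUUM")   # rewrite the file without free pages
        return conn.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        conn.close()


# Example (hypothetical path to a copy of the file):
# print(rebuild_sqlite_db("filters-copy.db3"))
```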

I'd like to get back to speedy filters!

Thanks, I appreciate the help. I really do.

Scott
insolution
Active Participant
Posts: 65
Joined: Fri Aug 08, 2003 11:06 am

Registered Newsbin User since: 09/01/02

Re: How to Optimize Filter DB for Faster Processing

Postby Quade » Mon Apr 22, 2019 12:51 pm

It might have nothing to do with the database. The filters get converted into a regular expression engine. The more complicated the filters, the slower they'll filter. It's possible that spreading the filters out into more entries has made the RE engine more complicated. I'm not sure.
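
To illustrate the two shapes being compared, here is a small sketch using Python's re module. This is not Newsbin's RE engine and says nothing about how Newsbin builds its patterns internally; it only shows that one pattern per record and one pipe-combined pattern accept the same subjects, while the work done per subject differs. The subject strings are made up.

```python
import re

terms = ["filter1", "filter2", "filter3"]

# One pattern per record: each subject is tested against every pattern in turn.
separate = [re.compile(re.escape(t), re.IGNORECASE) for t in terms]

# One combined pattern: the pipe-separated form, compiled once.
combined = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)


def matches_separate(subject):
    return any(p.search(subject) for p in separate)


def matches_combined(subject):
    return combined.search(subject) is not None


subjects = ["some post filter2 part01.rar", "unrelated subject"]
for s in subjects:
    print(s, matches_separate(s), matches_combined(s))
```

Which form is faster depends on the engine and the number of terms, so it is worth timing both (for example with the standard timeit module) rather than guessing.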

If you really want to speed things up, I'd suggest reducing the "display age" so it doesn't load up as many days' worth of headers. Only "Load all" when you really want to look at historical records. Using "Delete all posts by poster" to actually delete entries from the header database, and using size filters to pre-filter before the subject filters apply, would also help.

You could probably troubleshoot this by loading up an empty group and then applying filters to the group. If it seems to stall the GUI while dropping down and selecting new filters, it's probably the DB or the actual generation of the RE engine. If dropping the new filters onto the empty list is still fast, then the slowdown is probably in actually filtering the files, not in loading and generating the filters. I might experiment with your normal groups by only loading a couple of days' worth of headers with "load special" and see how it acts.
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: How to Optimize Filter DB for Faster Processing

Postby insolution » Mon Apr 22, 2019 5:29 pm

I check the groups (that is, download the latest headers, and clear out anything unwanted) almost every day. I don't think it's the post age/range.

The performance issue cropped up when I messed with the filters DB, so that must be where the issue is. I just closed NB, moved my filters.db3 file to another directory, and then restarted NB. Back to normal (speedy) performance. So it seems the DB is the issue. If you want to take a look, let me know where to post the filters.db3 file and I'll send it along.

It affects not only filtering but selecting a GOG in the Groups tab as well. There is a 5-10 second delay between double-clicking one GOG to load it and NB allowing me to select the next GOG. Removing my filters.db3 file fixed the delay.

Just to be clear, I don't think this is a NB issue at all. I think the filters I have are messing up NB's normal speedy processing.

If you don't want to look at my filters.db3 file "as is" I'll try disabling some of the filters and see if I can find a particular area where the issue stems from. My gut tells me that the RE engine is choking on the filters when there are hundreds of entries in a particular filter.

Scott
insolution
Active Participant
Posts: 65
Joined: Fri Aug 08, 2003 11:06 am

Registered Newsbin User since: 09/01/02

Re: How to Optimize Filter DB for Faster Processing

Postby Quade » Tue Apr 23, 2019 1:11 pm

I explained how to troubleshoot it. Just moving the filters DB doesn't tell you much. It's a given that no filters will always be faster than some filters.

If you're trying to header filter (during header download) with complex filter sets, it's expected to be slow. Filtering a post list might process 1/60th or less of the actual post count, because header filters run on every post and a single RAR might end up being 60 posts.

It's not really clear to me what you're filtering, header downloads or the display of headers.
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: How to Optimize Filter DB for Faster Processing

Postby insolution » Tue Apr 23, 2019 2:08 pm

I'm filtering the display of headers. My only download header filters are for size.
insolution
Active Participant
Posts: 65
Joined: Fri Aug 08, 2003 11:06 am

Registered Newsbin User since: 09/01/02

Re: How to Optimize Filter DB for Faster Processing

Postby Quade » Tue Apr 23, 2019 3:53 pm

"My only download header filters are for size."

Hopefully this is pretty small. Because header filters operate on "post" sizes, you typically never see a post bigger than 600K to 1 MB. You have to be careful you're not filtering out the end of a file. The last chunk of a file has an arbitrary size and can be as little as a couple of lines.

Post 1: [600K], Post 2: [600K], Post 3: [1K]
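
A quick sketch of how a minimum-size header filter could clip that last post. The 10,000-byte threshold is a hypothetical value chosen for the example, and the post sizes are rounded from the three-post layout above.

```python
def apply_min_size(post_sizes, min_bytes):
    """Keep only posts whose size is at least min_bytes."""
    return [size for size in post_sizes if size >= min_bytes]


# Three posts making up one file; the last chunk is tiny.
posts = [600_000, 600_000, 1_000]

# Hypothetical per-post minimum size filter of 10,000 bytes.
kept = apply_min_size(posts, 10_000)
print(len(kept), "of", len(posts), "posts kept")  # prints "2 of 3 posts kept"
```

The two large posts survive but the 1K tail is dropped, so the file can never be assembled completely.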
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97
