External database server

Make suggestions for new features.

Moderator: Quade

External database server

Postby Rith » Sun Mar 29, 2020 10:30 pm

Any chance of getting support for an (optional) external database server in lieu of sqlite? Sqlite does not do well at all with a few hundred million records and a 10GB database file. Meanwhile, PostgreSQL could handle this far better. Assuming you're using sqlite in a somewhat standard fashion, should be relatively minor incremental development effort. I've spent the last 3 days importing headers I downloaded in a few hours, and that just seems out of whack.
User avatar
Rith
Occasional Contributor
Occasional Contributor
 
Posts: 18
Joined: Thu Jan 22, 2004 9:05 pm

Registered Newsbin User since: 01/13/04

Re: External database server

Postby Quade » Mon Mar 30, 2020 11:32 am

I could but I doubt you'd get the speed benefit you think you would. First the way Newsbin does header storage is "a database per group" so, it's not like there's a single monolithic Sqlite file that contains all the records. The reason I do it that way is so that corruption in one group database file doesn't trash all the groups.

What's causing database issues today with the groups, is that some groups simply don't work for headers anymore. The posters are posting randomly named chunks of posts with no organization so, each chunk 750k post of a file is a single entry in the database.

The way it works in a proper groups is:

20000 Posts-> 30 Files-> 1 Group of Files - single entry in the database.

With the new postings it's:

20000 Posts->Single Entry in the database per post.

So the database is getting flooded with garbage posts. If you load a group and see a bunch of random 300-700K posts, that's what I'm talking about. I'm working on a new beta which will simply drop these random posts before before they hit the database.

I've toyed with the idea of adding MySQL support since, I already have code that supports MySQL. In my experiments though, it's really not much if any faster than local Sqlite database for regular files.
User avatar
Quade
Eternal n00b
Eternal n00b
 
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: External database server

Postby Rith » Mon Mar 30, 2020 11:40 am

I'm aware of the per-group sqlite database today. The numbers a mentioned in terms of scalability were for the sqlite db for a single group; I was not aware of the specifics of how the obfuscated post names are impacting the DB storage. That makes a lot of sense. I would certainly welcome MySQL/MariaDB support (even though I think postgres has some advantages). In your testing, did you wrap all of the inserts for a given header import file in a transaction block for MySQL, or was each insert/update auto-committing? I think deferring and aggregating some of the disk I/O by grouping the activity into transaction blocks *should* provide a notable speed boost over sqlite, especially if MySQL has sufficient heap memory made available to it.

I would welcome the opportunity to assist with beta testing either or both of the mentioned improvements (dropping the randomized names, external MySQL support).
User avatar
Rith
Occasional Contributor
Occasional Contributor
 
Posts: 18
Joined: Thu Jan 22, 2004 9:05 pm

Registered Newsbin User since: 01/13/04


Return to Feature Requests

Who is online

Users browsing this forum: No registered users and 2 guests