6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sat Mar 17, 2018 10:31 am
by Calahan
I'm now testing RC7 and saw that instead of creating *.gz files in the import folder, RC7 now creates *.txt files.
I only updated from B11 to RC7, no settings changed. Is that by (new) design or is the header download not compressed anymore?

Here's a part of my import folder:
old files:
group.name-Newsservername-000039034397-000039548100.gz
group.name-Newsservername-000039547100-000039548144.gz
new file after updating:
group.name-Newsservername-000039547144-000039548156.txt
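
To make the naming pattern in the listing above concrete, here is a minimal Python sketch that splits such a file name into its parts. It assumes the layout is group-server-first-last.ext, which is only inferred from the three examples shown; `ImportFile` and `parse_import_name` are hypothetical names and not part of Newsbin.

```python
from typing import NamedTuple


class ImportFile(NamedTuple):
    group: str
    server: str
    first: int   # first number in the name (assumed start of the range)
    last: int    # second number in the name (assumed end of the range)
    ext: str     # "gz" for the old files, "txt" for the new ones


def parse_import_name(name: str) -> ImportFile:
    """Split 'group-server-first-last.ext' (assumed layout) into fields."""
    stem, _, ext = name.rpartition(".")
    # Split from the right so that hyphens inside a group name survive.
    group, server, first, last = stem.rsplit("-", 3)
    return ImportFile(group, server, int(first), int(last), ext)


old = parse_import_name("group.name-Newsservername-000039547100-000039548144.gz")
new = parse_import_name("group.name-Newsservername-000039547144-000039548156.txt")
print(old)
print(new)
```

Splitting from the right is a deliberate choice: group names contain dots and may contain hyphens, while the last three fields look fixed.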

Re: 6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sat Mar 17, 2018 10:56 am
by Quade
Yeah, we believe it imports faster as TXT than as GZ. The files get deleted after import, so they don't stick around. You could turn on folder compression for the import folder.
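
For reference, folder compression can also be switched on from a script rather than through the folder's properties dialog. The sketch below builds a call to the Windows `compact` tool; the import folder path is a hypothetical example, and `compact_command` and `enable_compression` are illustrative helper names. It only prints the command unless `enable_compression` is called on Windows.

```python
import platform
import subprocess


def compact_command(folder: str) -> list[str]:
    """Build the compact.exe call that enables NTFS compression on a folder."""
    # /C compresses, /S:<dir> applies to the folder and its subfolders,
    # /I keeps going if an individual file fails.
    return ["compact", "/C", f"/S:{folder}", "/I"]


def enable_compression(folder: str) -> None:
    """Run the command; compact.exe only exists on Windows."""
    if platform.system() != "Windows":
        raise RuntimeError("NTFS folder compression requires Windows")
    subprocess.run(compact_command(folder), check=True)


# Hypothetical location of the import folder; adjust to your installation.
cmd = compact_command(r"N:\Newsbin\Import")
print(" ".join(cmd))
```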

Re: 6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sat Mar 17, 2018 11:42 am
by Calahan
Good to know. Thanks for the info, Quade.

Re: 6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sat Mar 17, 2018 2:06 pm
by stavros
Hi Quade,

Something I have noticed since the switch to *.txt is that the disk seems to be getting fragmented a lot more (and more quickly) now.

I have the whole Newsbin directory on a single 2TB hard drive (with the data folder on another physical drive and no activity other than Newsbin), which puts the Import and Spool_v6 directories on the same drive.
Given the huge increase in size of the *.txt files in the import directory, I suspect they are the cause of the fragmentation of both the *.txt files and the spool *.db3 files.

I took the time to completely defragment the drive, but within a day nearly half of the storage*.db3 files (over 150 of them) had more than 100 fragments each. This is with an automatic header download of 360 groups every hour. The -wal files are also getting fragmented during import processing, especially for the larger, more volatile groups. And, of course, the fragmentation occurs at the hot spots of the *.db3 files, where subsequent search and update activity will take place once the import is completed.

This may be less of an issue for installations on an SSD, but for a hard drive it is not so good.

Do you have any ideas on how to improve the situation (other than defragging daily)?

regards
Stavros.

Re: 6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sat Mar 17, 2018 4:17 pm
by Quade
DB3s will fragment. They get extended in dribs and drabs, so you're not going to get contiguous runs of disk space. It's just the nature of the beast.

As I mentioned before, you could turn on folder compression for the import folder if you want some compression of these files.

Re: 6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sat Mar 17, 2018 7:14 pm
by stavros
Thanks, I'll try compressing the import folder and see if that helps.

Is there a variable I can set in the NBI that will let me move the spool_v6 directory tree onto a different drive from the rest of the Newsbin installation, similar to the data and download paths?

I've seen a SpoolPath variable, but it points to my Newsbin directory, while the data path points to newsbin\data, which in turn contains spool_v6:

[SETTINGS]
...
DataPath=N:\Newsbin\Data\
...
DownloadPath=F:\DL\$(GROUP)\
...
PosterFile=N:\Newsbin\POSTERS.DAT
...
SpoolPath=N:\Newsbin\

Then, of course, there are the [AUTORAR] bits as well, so I'm not sure whether things will break if I change SpoolPath.
Moving the spool would limit fragmentation to *.db3 growth and the -wal and -journal temporary storage (I don't think it is possible to relocate these anywhere other than the *.db3 directory?).

regards
Stavros.
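
To see at a glance which drive each path setting points to, the excerpt above can be read with Python's standard `configparser`. This is a minimal sketch, assuming the NBI file is INI-like as the excerpt suggests; `SAMPLE_NBI` contains only the four keys quoted in the post (the elided lines are left out), and `path_drives` is a hypothetical helper. It only reads the values and says nothing about whether changing SpoolPath is safe.

```python
import configparser
import ntpath

# Only the keys quoted in the post; a real NBI file contains much more.
SAMPLE_NBI = r"""
[SETTINGS]
DataPath=N:\Newsbin\Data\
DownloadPath=F:\DL\$(GROUP)\
PosterFile=N:\Newsbin\POSTERS.DAT
SpoolPath=N:\Newsbin\
"""


def path_drives(text: str) -> dict[str, str]:
    """Map each *Path / *File setting in [SETTINGS] to its drive letter."""
    # interpolation=None keeps values such as $(GROUP) untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names in their original case
    parser.read_string(text)
    return {
        key: ntpath.splitdrive(value)[0]
        for key, value in parser["SETTINGS"].items()
        if key.endswith(("Path", "File"))
    }


drives = path_drives(SAMPLE_NBI)
for key, drive in drives.items():
    print(f"{key:<13} {drive}")
```

`ntpath` is used instead of `os.path` so that Windows-style paths are split correctly on any platform.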

Re: 6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sat Mar 17, 2018 7:23 pm
by dexter
You can move the entire Newsbin Data folder, but not just the spool folder. That's the largest part anyway. The procedure is here: http://help.newsbin.com/index.php/V660- ... r_location

Re: 6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sat Mar 17, 2018 7:31 pm
by Quade
stavros wrote:That would limit fragmentation to *.db3 growth and -wal & -journal temporary storage (I don't think it is possible to relocate these elsewhere than the *.db3 directory?).


Alternatively, you could just ignore this issue, since it has always been this way ever since DB3 files first started being used (which is a long time ago now).

Re: 6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sat Mar 17, 2018 7:48 pm
by stavros
True about the -wal & -journal files, but the volume of data on Usenet has changed a lot since 'the old days' :-)

I can't currently compress the import folder, as the drive has a cluster size of 64K (folder compression is only available with cluster sizes of 4K or smaller).

I'll try the suggested info, thanks.
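
The cluster-size limit mentioned above can be checked before trying to enable compression. Below is a minimal sketch: `allows_ntfs_compression` applies the 4K rule from the post, and `cluster_size` is a hypothetical helper that asks Windows for the value through the Win32 `GetDiskFreeSpaceW` call (Windows only, so it is defined here but not called).

```python
import ctypes

NTFS_COMPRESSION_MAX_CLUSTER = 4096  # bytes; the 4K limit mentioned in the post


def allows_ntfs_compression(cluster_bytes: int) -> bool:
    """True if a volume with this cluster size can use folder compression."""
    return cluster_bytes <= NTFS_COMPRESSION_MAX_CLUSTER


def cluster_size(root: str) -> int:
    """Return bytes per cluster for a drive root such as 'N:\\' (Windows only)."""
    sectors_per_cluster = ctypes.c_ulong()
    bytes_per_sector = ctypes.c_ulong()
    free_clusters = ctypes.c_ulong()
    total_clusters = ctypes.c_ulong()
    ok = ctypes.windll.kernel32.GetDiskFreeSpaceW(
        ctypes.c_wchar_p(root),
        ctypes.byref(sectors_per_cluster),
        ctypes.byref(bytes_per_sector),
        ctypes.byref(free_clusters),
        ctypes.byref(total_clusters),
    )
    if not ok:
        raise ctypes.WinError()
    return sectors_per_cluster.value * bytes_per_sector.value


print(allows_ntfs_compression(64 * 1024))  # the 64K drive from the post: False
print(allows_ntfs_compression(4096))       # a 4K-cluster volume: True
```

On Windows, `allows_ntfs_compression(cluster_size("N:\\"))` would combine the two, with the drive letter adjusted to wherever the import folder lives.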

Re: 6.80 RC7 - Header download - *.txt files instead of *.gz

Posted: Sun Mar 18, 2018 1:58 am
by syshog
Quade wrote:DB3s will fragment. They get extended in dribs and drabs, so you're not going to get contiguous runs of disk space. It's just the nature of the beast.

As I mentioned before, you could turn on folder compression for the import folder if you want some compression of these files.

I wanted to say I didn't even think of enabling folder compression; that's a good one for the import folder. I haven't experienced any weird processor usage with RC6 like I did with RC5. On to RC7. I'll bitch and moan if I run into a problem ;D Good work with RC6.