NewsBin generates duplicates many times over


Postby RayMark » Sun Dec 22, 2019 11:22 pm

Encountered with 6.82 beta 3, but probably the same in RC4 (which has not shown up for testing yet).

PROBABLY IGNORE ALL THIS:

Imagine this:
NewsBin is extracting a rar and it keeps getting these messages:
Verifying authenticity information ... Failed
but files inside the rar are OK, their CRCs are OK.
So NewsBin creates those extracted files and does not delete them (because their CRCs are OK), but it still begins extracting the rar again (because verifying authenticity failed), and then it creates duplicates with -(0001),
and then it starts again and creates duplicates with -(0002), and then -(0003), and then -(0004), etc.
I don't even know how many times it does that (is it 10?), but if I remember correctly, it once made perhaps 300 GB of output files out of a 50 GB rar, until I moved those rar parts to a different folder and deleted the post from the download list.
Then, when extracting manually I saw those
Verifying authenticity information ... Failed
and CRC OK messages.

A variation of this situation is when a rar contains multiple files and some of the files pass the CRC check and others don't.
Then also multiple copies of files are created in multiple attempts to extract the rar.
This happens more frequently but it is less unpleasant, because usually the OK files are the small ones, such as .nfo files, and the large failed files are deleted because they failed the CRC check.
So the result is many -(000n) small files, but not much hard drive space consumed. But if the OK files are big and some tiny file fails - then a big problem again.

I don't know if it is difficult to fix those problems, but a workaround seems to be not to allow multiple attempts to extract the same rar file.
Perhaps I myself can control that by some flag in the NewsBin settings?
If I can't, then either fix the issue (do not extract again those files that had OK CRCs in the first attempt), or please add such a flag to allow only one rar extraction attempt.
Frankly, I don't even understand why multiple extraction attempts are needed. How can you expect a different result?
Isn't it the famous definition of insanity?

UPDATE:
Oh sorry, I now think that in the first case, with the failed authenticity verification, one of multiple big files inside the rar failed the CRC check after all.
So the authenticity verification perhaps is just a red herring.
There is only one case then - the second one. But sometimes it manifests itself in the more unpleasant form, when big files are created many times over, not just small ones.

END OF PROBABLY IGNORE ALL THIS

DO NOT IGNORE THIS:

If a big rar has multiple big files inside it and (at least) one of them (or some small file in the same rar) fails the CRC check (even if par2 check passes OK), all the good big files are extracted multiple times consuming more and more of the hard drive space (when auto-rename is on).

A less dramatic case, this one occurs rather frequently:
If a rar contains multiple files and one of them (usually the biggest one) fails the CRC check, then the rest of the files (usually the small ones) are extracted and created multiple times (when auto-rename is on).

So, the general case is:
If a rar contains multiple files and at least one of them fails the CRC check, the ones that pass the CRC check are extracted and created multiple times, assuming that auto-rename is on.
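
To illustrate the general case, here is a minimal sketch, not NewsBin's actual code. It models an archive as an in-memory dict and the output folder as another dict; the names `auto_rename`, `extract` and `already_ok` are made up for the example, and the -(000n) suffix is simply appended to the file name. Without tracking, each attempt re-creates the good files under a new name; remembering which members already passed the CRC check avoids that.

```python
import zlib

def auto_rename(name, disk):
    """Return name, or name-(000n) if that name is already taken (hypothetical scheme)."""
    candidate, n = name, 0
    while candidate in disk:
        n += 1
        candidate = f"{name}-({n:04d})"
    return candidate

def extract(archive, disk, already_ok=None):
    """One extraction attempt over archive = {name: (data, expected_crc)}.

    Members that pass the CRC check are written to disk with auto-rename;
    members that fail are discarded. If already_ok is given, members that
    passed in an earlier attempt are skipped. Returns True if nothing failed.
    """
    all_ok = True
    for name, (data, expected_crc) in archive.items():
        if already_ok is not None and name in already_ok:
            continue  # extracted fine before, do not create a duplicate
        if zlib.crc32(data) != expected_crc:
            all_ok = False  # bad member: nothing is kept for it
            continue
        disk[auto_rename(name, disk)] = data
        if already_ok is not None:
            already_ok.add(name)
    return all_ok

good = b"x" * 1000
archive = {
    "big.bin": (good, zlib.crc32(good)),
    "bad.bin": (b"damaged", zlib.crc32(b"intact")),  # CRC mismatch on purpose
}

# Current behaviour as described: every attempt duplicates the good file.
naive_disk = {}
for _ in range(3):
    extract(archive, naive_disk)

# Suggested behaviour: remember what already passed and skip it.
tracked_disk, ok_members = {}, set()
for _ in range(3):
    extract(archive, tracked_disk, ok_members)

print(sorted(naive_disk))
print(sorted(tracked_disk))
```

After three attempts the first folder holds three copies of big.bin, the second only one.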
RayMark
Seasoned User
 
Posts: 468
Joined: Sat Jul 21, 2007 10:40 pm

Registered Newsbin User since: 07/21/07

Re: NewsBin generates duplicates many times over

Postby Quade » Mon Dec 23, 2019 10:02 am

It does it once for each PAR and repair attempt. This is basically a situation where someone generated a PAR set over a set of bad RAR files, so there's no way it can ever complete properly. Newsbin just keeps trying.

It's too late to do anything about this in 6.82. I wrote a note to look at this next time around.

Renaming the existing files out of the way is desirable, so I'm not going to change that. I guess the only question is whether it's better to save a partial unrar that ultimately fails or not, and whether an unrar failure when you still have PARs to potentially repair should be handled differently.

The question ultimately is whether these rare cases are significant enough to potentially break the good features that we normally want.
Quade
Eternal n00b
 
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: NewsBin generates duplicates many times over

Postby RayMark » Mon Dec 23, 2019 9:06 pm

I also did not think it was important when only multiple small files were accumulating, but when it happened with big files and the hard disk space was consumed, it looked different.
There is no need to keep trying to repair with more and more par2 files if par check/repair already returned OK. Unless you are expecting that the next par2 file will be from a different, fixed par2 set - yes, then I see why you keep trying.
But I am not sure if I ever saw such a case. If somebody reposts fixed pars, the subject is probably different. Again, unless you are able to aggregate par2s together even with different subjects.
What I am asking for is a flag in the settings or in the .nbi file so that I can switch off such retrying, because I never saw a case where it was useful. Although, again, I don't notice it if it is successful. Only failure is visible.
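
Roughly what I mean by such a flag, as a sketch only. I don't know how NewsBin is structured internally, so `process_with_retries`, `max_unrar_attempts` and the "one unrar per PAR/repair event" loop are my assumptions, not real settings or functions.

```python
def process_with_retries(attempt_unrar, repair_events, max_unrar_attempts=None):
    """Run attempt_unrar() once per PAR/repair event.

    max_unrar_attempts is a hypothetical setting: None means no limit
    (the behaviour described in this thread), 1 means extract only once.
    Returns the number of unrar attempts actually made.
    """
    attempts = 0
    for _ in repair_events:
        if max_unrar_attempts is not None and attempts >= max_unrar_attempts:
            break  # limit reached, stop re-extracting
        attempts += 1
        if attempt_unrar():
            break  # unrar succeeded, nothing more to do
    return attempts

def always_fails():
    # Stand-in for an unrar that can never succeed (PARs built over bad RARs).
    return False

unlimited = process_with_retries(always_fails, range(10))
limited = process_with_retries(always_fails, range(10), max_unrar_attempts=1)
print(unlimited, limited)
```

With ten repair events the unlimited run tries ten times, the limited one only once.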

