Problem with -p for large files (> 2 GB) #97

Open
simsong opened this issue Dec 7, 2012 · 0 comments
simsong commented Dec 7, 2012

Converted from SourceForge issue 3526071, submitted by jessekornblum

On WinXP (using the v3.9 binary from sf.net) with: "md5deep -p 2M < 8GB_file" I'm getting

HASH-VALUE... stdin offset 2139095040-2141192191
HASH-VALUE... stdin offset 2141192192-2143289343
HASH-VALUE... stdin offset 2143289344-2145386495
HASH-VALUE... stdin offset 2145386496-2147483647
HASH-VALUE... stdin offset 18446744073709551615-2097150
HASH-VALUE... stdin offset 18446744073709551615-2097150
HASH-VALUE... stdin offset 18446744073709551615-2097150
HASH-VALUE... stdin offset 18446744073709551615-2097150

(and so it continues until the end of the file). BTW, the same was not observed on a Linux system.

Furthermore there seems to be some other difference between Windows and Linux which might be related:
on WinXP (using the Windows "dd" tool from www.chrysocome.net/dd):
C:\>dd if=/dev/zero bs=2M count=1 2> nul: | md5deep -p 2M
b2d1236c286a3c0704224fe4105eca49 stdin offset 0-2097151
d41d8cd98f00b204e9800998ecf8427e stdin offset 0-18446744073709551615
on Linux (e.g. TinyCoreLinux v3.6):
tc@box:~/md5deep-3.9$ dd if=/dev/zero bs=2M count=1 2> /dev/null | md5deep -p 2M
b2d1236c286a3c0704224fe4105eca49 stdin offset 18446744073709551615-2097150
d41d8cd98f00b204e9800998ecf8427e stdin offset 18446744073709551615-18446744073709551614
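Editorial note: the odd numbers above are consistent with a failed position query (-1) being stored in an unsigned 64-bit variable, and with the end offset being computed as start + length - 1. That is an assumption about how the offsets are produced, not something confirmed from the md5deep source. A minimal sketch, with hypothetical helper names piece_end and print_piece:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: last byte offset of a piece, computed naively.
 * Unsigned arithmetic wraps modulo 2^64, so a zero-length piece or a
 * start of 2^64 - 1 yields a surprising result rather than an error. */
static uint64_t piece_end(uint64_t start, uint64_t length)
{
    return start + length - 1;
}

/* Hypothetical helper: format one line in the "offset START-END" style.
 * tell_result stands in for whatever the position query returned;
 * -1 (failure) becomes 18446744073709551615 once converted to uint64_t. */
static void print_piece(int64_t tell_result, uint64_t length)
{
    uint64_t start = (uint64_t)tell_result;
    printf("stdin offset %" PRIu64 "-%" PRIu64 "\n",
           start, piece_end(start, length));
}
```

Under these assumptions, print_piece(-1, 2097152) gives the range 18446744073709551615-2097150, print_piece(-1, 0) gives 18446744073709551615-18446744073709551614, and print_piece(0, 0) gives 0-18446744073709551615, which are the three unexpected patterns in the report.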

Date: 2011-05-19 11:45:37 PDT
Sender: jessekornblum

Ok. Glad to know we're at least partially there. There is a code freeze on
right now as part of a total re-write, but I will readdress this version
before putting out version 4.0. It may be a little while though! Sadly
md5deep is not my day job. [grin]. Sorry you're still having problems.

I think the problem is that, on Windows, ftello, the function that reports
where each read started, is actually a 32-bit function. It may be working
incorrectly when dealing with a 64-bit value.
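Editorial note: if the position query really is limited to 32 bits on Windows, the last offset it can report is 2147483647, which is exactly where the output in this report goes wrong. One possible way around it is a small wrapper that always returns a 64-bit position. The sketch below is not the fix that went into md5deep; tell64 and offset_fits_32bit are hypothetical names, and the use of _ftelli64 on Windows is an assumption about the toolchain.

```c
/* Ask for a 64-bit off_t on platforms that honour these macros. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: can this offset be represented by a 32-bit
 * signed position, i.e. is it at most 2147483647? */
static int offset_fits_32bit(int64_t offset)
{
    return offset >= 0 && offset <= INT32_MAX;
}

/* Hypothetical wrapper: current position of fp as a 64-bit value,
 * or -1 on failure (for example when fp is not seekable). */
static int64_t tell64(FILE *fp)
{
#if defined(_WIN32)
    return (int64_t)_ftelli64(fp);   /* 64-bit variant in the Microsoft CRT */
#else
    return (int64_t)ftello(fp);      /* off_t is 64-bit with the macros above */
#endif
}
```

A caller would still have to check for a negative return value before converting the result to an unsigned type.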

Date: 2011-05-18 17:52:00 PDT
Sender: nobody

Thanks for the quick turnaround.

I've done a re-test with v3.9.1-002 on WinXP and found that using STDIN
(i.e. input redirection or via a pipe) does now report correct piece
offsets.

BUT using the large file directly (i.e. "md5deep -b -p 2M 8GB_file") still
leads to
HASH-VALUE... FILE-NAME... offset 2143289344-2145386495
HASH-VALUE... FILE-NAME... offset 2145386496-2147483647
HASH-VALUE... FILE-NAME... offset 18446744073709551615-2097150
HASH-VALUE... FILE-NAME... offset 18446744073709551615-2097150
I'm sorry that I did not mention this in the original bug report, as it
was already happening then. Running the same test on Linux still looks
good.

Likewise, repeating the other test (i.e. "dd if=/dev/zero bs=2M count=2 2>
/dev/null | md5deep -b -p 2M") on both WinXP and Linux now results in
correct offsets in both cases.

So all in all a clear improvement, but there is still one more bug left.

Date: 2011-05-17 11:37:04 PDT
Sender: jessekornblum

This problem is related to a bug with calling ftell() on standard input.
I've patched it in SVN, but have not released a new version just yet. Check
out the beta versions at http://jessekornblum.com/tmp/ and see if they fix
the problem?
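Editorial note: ftell() cannot report a meaningful position on a pipe, so one common way to get piece offsets for standard input is to count the bytes actually read instead of asking the stream. The sketch below illustrates that idea only; it is not the patch committed to SVN, and struct piece_reader and read_piece are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical reader state: the stream plus a running 64-bit byte count. */
struct piece_reader {
    FILE    *fp;
    uint64_t next_offset;   /* offset of the next byte to be read */
};

/* Read up to piece_size bytes into buf. The piece's starting offset is
 * taken from the running counter, never from ftell(), so it works on
 * pipes and is not limited to 32 bits. Returns the number of bytes read;
 * 0 means end of input (or an error, which ferror() can distinguish). */
static size_t read_piece(struct piece_reader *r, unsigned char *buf,
                         size_t piece_size, uint64_t *start)
{
    size_t n;

    *start = r->next_offset;
    n = fread(buf, 1, piece_size, r->fp);
    r->next_offset += n;
    return n;
}
```

With this approach the end offset of a piece is start + n - 1 and should only be printed when n is greater than zero, which also avoids reporting a range for an empty final read.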
