I have been using locar to synchronize and manipulate data on a huge VAST cluster, which can only be driven at full I/O speed with lots of parallel transactions. By quickly parallelizing over my deep and wide directory structure, I was able to get speedups just as you advertise...
BUT I just discovered that locar sometimes misses 30%-90% of the files.
It turns out that the VAST NFS implementation appears to fill in d_type (the DT_* file type) in the dirent structure only for the first ~10,000 entries in a directory; later entries come back as DT_UNKNOWN. locar does not handle this case correctly: for those entries it unfortunately needs to fall back to stat(), at the cost of an extra syscall. See https://stackoverflow.com/a/39430337/381313 for a description of the caveats of relying on d_type in dirents.
This means locar may also be failing to descend into some directories, though at least in my case all these DT_UNKNOWN entries seem to be regular files, so --all works as a workaround for me.