Match by hash in addition to filename #8

Closed
ryanfb opened this issue Mar 13, 2020 · 7 comments

Comments

@ryanfb

ryanfb commented Mar 13, 2020

The first time I tried this utility, I was curious as to why it didn't find any matches, since DAT files have hash information—it seems like it only uses filenames to find matches in the DAT file. I looked briefly at just trying to make a pull request for this, but I'm not sure if you think it would be out of scope. Python does have SHA1 functionality in hashlib, so it shouldn't introduce an external dependency.
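For reference, hashing a file with the standard library can be sketched as below. This is a minimal illustration, not code from this utility; the function name `sha1_of_file` and the chunk size are assumptions made for the example.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large ROM images are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest can then be compared (case-insensitively) with the hash recorded for an entry in the DAT file.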

In the meantime, I've written a very simple companion utility that will use the SHA1 hashes in a DAT to copy files into a new directory with the correct filenames that this utility expects: https://github.com/ryanfb/copydatrom
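The idea behind such a companion step can be sketched roughly as follows. This is not the actual copydatrom code; it assumes an XML DAT whose `<rom>` elements carry `name` and `sha1` attributes, and the function names `load_sha1_index` and `copy_by_hash` are made up for illustration.

```python
import hashlib
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def load_sha1_index(dat_path):
    """Map lowercase SHA1 -> ROM file name (assumes <rom name=... sha1=...>)."""
    index = {}
    for rom in ET.parse(dat_path).getroot().iter("rom"):
        sha1, name = rom.get("sha1"), rom.get("name")
        if sha1 and name:
            index[sha1.lower()] = name
    return index


def copy_by_hash(input_dir, output_dir, index):
    """Copy each file whose SHA1 is in `index` to `output_dir` under its DAT name."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(input_dir).rglob("*")):
        if not path.is_file():
            continue
        # read_bytes() loads the whole file; acceptable for a sketch only.
        digest = hashlib.sha1(path.read_bytes()).hexdigest()
        name = index.get(digest)
        if name is not None:
            shutil.copy2(path, output_dir / name)
            copied.append(name)
    return copied
```

The sketch expects `output_dir` to live outside `input_dir`; otherwise the copies themselves would be scanned again on a later run.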

@andrebrait
Owner

First things first: your utility's name is awesome 😎

Yes, you're right. It only matches file names and that's all. It's not out of scope per se, but it would indeed be an expansion over what this tool aimed to achieve in the first place, which is not bad at all (feel free to open a PR if you want 😄).

Now, yes, I have thought about matching with the hash (and I even attempted something). The only issue is that there's a huge variety of ways people can organize and archive ROMs and I found it to be a bit hard to code something that would allow most people to use it. As far as I could see, it would have to:

  1. Be able to deal with:
    1. ZIP files (ClrMamePro style)
      1. With one ROM file inside
      2. With multiple ROM files inside (for games with multiple files, like PSX ones)
    2. Uncompressed ROMs inside a single directory
      1. Possibility of the user keeping games with multiple files in the same directory
      2. Possibility of the user keeping games with multiple files in folders
    3. One-folder-per-game (ClrMamePro style)
      1. With one uncompressed ROM file inside
      2. With multiple uncompressed ROM files inside (for games with multiple files, like PSX ones)
  2. Still be relatively fast (so no O(n^2) scanning, I think)

While this is all easy to do (except maybe ZIP files), it takes some testing to get it right, and I lack the time right now.
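A rough sketch of how the layouts above could be covered in a single pass is shown here. It is only an illustration under stated assumptions, not what this tool does: `iter_hashes` and `match_hashes` are hypothetical names, and `sha1_to_name` is assumed to be a dict built beforehand from the DAT (lowercase SHA1 to expected name).

```python
import hashlib
import os
import zipfile


def _sha1_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks and return the hex SHA1 digest."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def iter_hashes(input_dir):
    """Yield (file_path, zip_member_or_None, sha1) for loose files and ZIP members."""
    for root, _dirs, files in os.walk(input_dir):  # also covers one-folder-per-game
        for file_name in sorted(files):
            path = os.path.join(root, file_name)
            if zipfile.is_zipfile(path):
                with zipfile.ZipFile(path) as archive:
                    for info in archive.infolist():
                        if info.is_dir():
                            continue
                        with archive.open(info) as member:
                            yield path, info.filename, _sha1_stream(member)
            else:
                with open(path, "rb") as handle:
                    yield path, None, _sha1_stream(handle)


def match_hashes(input_dir, sha1_to_name):
    """Single pass over the input; each lookup is a dict access, so no O(n^2) scan."""
    matches = {}
    for path, member, sha1 in iter_hashes(input_dir):
        name = sha1_to_name.get(sha1)
        if name is not None:
            matches.setdefault(name, []).append((path, member))
    return matches
```

Every file is read exactly once, so the cost is dominated by I/O rather than by comparing each file against each DAT entry. Grouping the members of multi-file games back together is left out of the sketch.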

@andrebrait
Owner

Well, I kinda just did it.

It was easier than I thought, tbh.
I'll commit the changes in a bit.

@andrebrait
Owner

Well, would you do the honors of testing what I made? 😉
You should use the --use-hashes option.

@ryanfb
Author

ryanfb commented Mar 13, 2020

I get the following error:

Traceback (most recent call last):
  File "generate.py", line 1224, in <module>
    main(sys.argv[1:])
  File "generate.py", line 832, in main
    file = file_relative_to_input(file, input_dir)
  File "generate.py", line 939, in file_relative_to_input
    return file.replace(input_dir, '', count=1).lstrip(os.path.sep)
TypeError: replace() takes no keyword arguments

Running under Python 3.8.0.
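The traceback points at `count=1` being passed by keyword: on Python 3.8, `str.replace()` only accepts its arguments positionally. A minimal sketch of one possible fix follows, reusing the function name and expression from the traceback; it is not necessarily the change that was eventually committed.

```python
import os


def file_relative_to_input(file, input_dir):
    # Pass the count positionally: str.replace() on Python 3.8 rejects count=1.
    return file.replace(input_dir, '', 1).lstrip(os.path.sep)
```

For example, with `input_dir` set to `roms`, a path such as `roms/snes/a.sfc` comes back as `snes/a.sfc` on a POSIX system.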

@andrebrait
Owner

@ryanfb it should work now

@ryanfb
Author

ryanfb commented Mar 13, 2020

Thanks! Seems to work fine now.

@ryanfb ryanfb closed this as completed Mar 13, 2020
@andrebrait
Owner

I have refined the hash processing and sped it up too. I also fixed a couple of issues with the copied files' names.

I released it as 1.6.0
