Sherlock Restructure (Take 2) #590

Merged
merged 51 commits into from
May 7, 2020

hoadlck
Contributor

@hoadlck hoadlck commented Apr 23, 2020

This is take 2 for the Sherlock restructure.

Because this restructure has been going on for so long, I think it would be wise to merge it now. There are some improvements, and it would be good to have this on the main line. Also, since it is incrementally closer to the final design, it would be easier for others to help.

I may have a couple of minor changes to make after this, but a logical set of the changes is done. The main aspect that still has to be developed is the abstraction for the queries. However, the notify abstraction is done, and that is a big piece.

Some areas to note:

  • The command to invoke Sherlock from the main directory has changed.
    The command is now python3 sherlock instead of python3 sherlock.py. This is part of making
    sherlock a proper module.
    You can still cd into the "sherlock" directory and run the old command.
  • When sherlock is installed as a proper module, the "sherlock" sub-directory will be
    what is actually packaged up. This will make the structure of the packaged and the
    unpackaged content similar enough that people will understand what is happening.
  • Tests require that you cd into the "sherlock" directory.
  • The data.json file is now located in a lower directory.
    It is in the sherlock/resources directory.
  • The new QueryNotifyPrint() object still needs some work.
    To make the changes more incremental, I moved the loose functions from the main
    sherlock.py file to the notify.py file. But, these functions will be removed and the
    content pulled into QueryNotifyPrint().
    The only thing not supported in this refactor is the exception text for an error status
    when we are in verbose mode. This is an area of future work: I think exception
    information like this would more properly be handled by the logging module.
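To illustrate the new layout described in the list above, here is a minimal sketch of locating data.json relative to the package directory instead of the current working directory. The helper name `default_data_path` is hypothetical and not taken from this PR. Note that Python only runs a directory (as in python3 sherlock) when that directory contains a `__main__.py`, so the sketch assumes one exists.

```python
from pathlib import Path


def default_data_path(package_dir):
    """Return the expected location of data.json under the package directory.

    Hypothetical helper: the name and signature are assumptions for
    illustration only.
    """
    return Path(package_dir) / "resources" / "data.json"


# Inside the package, the module's own directory would typically be passed:
#     default_data_path(Path(__file__).resolve().parent)
print(default_data_path("sherlock").as_posix())
```

Resolving the path from the module location means the same lookup would work both from a checkout and from an installed copy under site-packages.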

I expect that the structure changes are going to cause a wave of chaos, as people who were used to using sherlock in a certain way will now need to invoke it differently. And, if anyone is referencing the data.json file remotely, their link will no longer work as the file has moved around. But...it has to happen some time. The future changes will not be noticeable to users on the outside.

…ry will look very much like what the packaged version of Sherlock will look like when it is installed in the site-packages area.

No real restructuring of the code has happened.  This just gives a view of the directory structure.
…reating new one for Tor requests. This just wastes time.
…t timing information. Fix problem where timing hook would not be installed properly if the hooks for the request were already filled out with a tuple. I am not sure if that is even possible, but if it does happen, then I just convert the tuple to a list and go on from there.
…aky things during leap seconds or daylight saving time jumps.
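The two timing commits above can be sketched roughly as follows. This is not the code from the PR: `add_timing_hook` and `make_timer` are hypothetical names, and the sketch only assumes the requests-style convention that `hooks["response"]` may be a single callable or a list or tuple of callables. It uses `time.monotonic()`, which is not affected by wall-clock adjustments.

```python
from time import monotonic


def add_timing_hook(hooks, timing_hook):
    """Return a copy of a hooks dict with timing_hook first under 'response'.

    Hypothetical helper: accepts a missing entry, a single callable,
    a list, or a tuple, and always stores a new list.
    """
    hooks = dict(hooks or {})
    existing = hooks.get("response")
    if existing is None:
        existing = []
    elif callable(existing):
        existing = [existing]
    else:
        existing = list(existing)  # a tuple (or list) becomes a fresh list
    hooks["response"] = [timing_hook] + existing
    return hooks


def make_timer():
    """Build a response hook that records elapsed time on the response."""
    start = monotonic()

    def record_elapsed(response, *args, **kwargs):
        # monotonic() never jumps backwards, unlike wall-clock time.
        response.elapsed_seconds = monotonic() - start
        return response

    return record_elapsed
```

Putting the timing hook first means the measurement is taken before any other hook gets a chance to do slow work on the response.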
… detection. Remove special check for GitHub: everything works fine without it.
While doing the restructuring, I am testing in more depth as I change the code. And, I am trying to grok how the proxy options work. Specifically, how the proxy list works. Or, does not work.

There is code in the main function that randomly selects proxies from a list, but it does not actually use the result. This was noticed in #292. It looks like the only place where the proxy list is used is when there is a proxy error during get_response()...in that case a new random proxy is chosen. But, there is no care taken to ensure that we do not get the same proxy that just errored out. It seems like problematic proxies should be blacklisted if there is that type of failure.

Moreover, there is a check earlier in the code that does not allow the proxy list and proxy command line option to be used simultaneously. So, I can see no way that the proxy list has any functionality: if you do define the proxy list, then there is no way to kick off the general request with a proxy.

I also noticed that the recursive get_response() call does not pass its return tuples back up the call chain. The existing code would never benefit from the switchover to an alternate proxy (even if the other problems mentioned above were resolved).

For now, I am removing the support.  This feature may be looked at after the restructuring is done.
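If the proxy list is revisited later, the two problems described above (re-picking a proxy that just failed, and losing the result of the retry) could be avoided along these lines. This is only a sketch under stated assumptions: `pick_proxy`, `get_response_with_failover`, and the `fetch` callable are hypothetical, a loop replaces the recursive call, and `ConnectionError` stands in for whatever proxy error the real request raises.

```python
import random


def pick_proxy(proxies, failed, rng=random):
    """Pick a random proxy that has not failed yet, or None if none remain."""
    candidates = [p for p in proxies if p not in failed]
    if not candidates:
        return None
    return rng.choice(candidates)


def get_response_with_failover(fetch, proxies, rng=random):
    """Try proxies until one works; return a (response, error) tuple.

    fetch is a hypothetical callable taking a proxy and returning a
    response, raising ConnectionError when the proxy is unusable.
    """
    failed = set()
    last_error = None
    while True:
        proxy = pick_proxy(proxies, failed, rng)
        if proxy is None:
            # Every proxy failed (or the list was empty).
            return None, last_error
        try:
            # The successful result is returned straight to the caller.
            return fetch(proxy), None
        except ConnectionError as err:
            failed.add(proxy)  # never retry a proxy that just errored out
            last_error = err
```

Because the retry happens in a loop, the tuple from the successful attempt is what the caller receives, which is exactly what the recursive version failed to do.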
…so, print out social network for error messages.
…object contains an enumeration for the possible status about a given username on a site, and additional error information that might be handy. Rework all code to use this object instead of the "exists" key in the result dictionary that was used previously.
…eption will be thrown, instead of using the previous site's results.
… the information loaded from the JSON file. For now, use the new SitesInformation() object to calculate the original JSON dictionary: the rest of the code will be updated in the future.
# Conflicts:
#	sherlock/resources/data.json
…e with statement for results file, as that is more graceful on errors. Use try block for result directory creation: this has a smaller window for a race condition.
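The pattern this commit message describes might look like the sketch below. The function name `write_results` and its parameters are hypothetical. Attempting the creation and catching `FileExistsError` avoids the gap between an "exists?" check and the create call, and the with statement closes the file even if a write fails.

```python
import os


def write_results(directory, filename, lines):
    """Create the result directory if needed, then write one line per entry."""
    try:
        os.makedirs(directory)
    except FileExistsError:
        # Already present, possibly created by another process in the meantime.
        pass

    path = os.path.join(directory, filename)
    # The with statement guarantees the file is closed on errors too.
    with open(path, "w", encoding="utf-8") as results_file:
        for line in lines:
            results_file.write(line + "\n")
    return path
```

`os.makedirs(directory, exist_ok=True)` is a shorter way to get the same effect as the try block.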
…e list of names of the sites (sorted by alphabetical or popularity rank).
# Conflicts:
#	README.md
#	sherlock/sherlock.py
… (previous order was confusing). Also use "claimed" and "available" terms when reporting test results.
…llow whoever defines a Query Notify object to have all of the context required to do their notifications.
Added start and finish methods to base QueryNotify() object in order to get the same type of output.

The only thing not supported in this refactor is the exception text for an error status when we are in verbose mode.  This is an area of future work: I think exception information like this would more properly be handled by the logging module.
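As a rough picture of the notify abstraction with start and finish methods, here is a minimal sketch. Only the QueryNotify name comes from this PR; the `update` method, all signatures, the `QueryNotifyLines` subclass, and `run_queries` are assumptions made for illustration.

```python
class QueryNotify:
    """Base notifier; each method is a no-op hook for subclasses to override."""

    def start(self, username):
        """Called once before any site is queried."""

    def update(self, site_name, status):
        """Called once per site with that site's result."""

    def finish(self):
        """Called once after every site has been queried."""


class QueryNotifyLines(QueryNotify):
    """Hypothetical subclass that collects one line of text per event."""

    def __init__(self):
        self.lines = []

    def start(self, username):
        self.lines.append(f"Checking username {username}")

    def update(self, site_name, status):
        self.lines.append(f"{site_name}: {status}")

    def finish(self):
        self.lines.append("Done")


def run_queries(username, results, notifier):
    """Drive a notifier through start, one update per site, then finish."""
    notifier.start(username)
    for site_name, status in results:
        notifier.update(site_name, status)
    notifier.finish()


notifier = QueryNotifyLines()
run_queries("alice", [("SiteA", "claimed"), ("SiteB", "available")], notifier)
print("\n".join(notifier.lines))
```

Keeping the base class as a set of no-ops lets the query code call every hook unconditionally, while each notifier decides what, if anything, to output.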
@sdushantha
Member

@hoadlck Sorry for the delay, I thought this PR still had the "do not merge" tag, that's why I did not look at it for a while.

I will merge it now 👍

@sdushantha sdushantha merged commit 7a0047b into master May 7, 2020
@sdushantha sdushantha deleted the restructure_take2 branch August 5, 2020 11:45
Labels
enhancement New feature or request

2 participants