[FEATURE REQUEST] Add robots.txt to complement -e #137

Closed
mzpqnxow opened this issue Nov 24, 2020 · 2 comments · Fixed by #163

Comments

@mzpqnxow
Sponsor

You already support -e, which is a really nice way to dynamically produce the search list. Might it be worth grabbing robots.txt, parsing it, and either just making sure its paths are in the wordlist or maybe even giving them some special priority?

@mzpqnxow mzpqnxow added the enhancement New feature or request label Nov 24, 2020
@epi052
Owner

epi052 commented Nov 25, 2020

i like it.

leaving this here for now, will give this more thought and a proper response a bit later.

(?:^User-agent: (?<UserAgent>.*?)$)|(?<Permission>^(?:Allow|Disallow)): (?<Url>.*?)$
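For illustration, here is a minimal Python sketch of how a pattern like the one above could feed the wordlist. It is not feroxbuster's implementation: `robots_paths` and `merge_wordlist` are hypothetical names, the pattern is adapted to Python's `(?P<name>...)` group syntax with the `^` anchor applied to both directives, and the merge step assumes wordlist entries carry no leading slash.

```python
import re

# Adapted from the pattern in the thread: Python spells named groups
# (?P<name>...), and ^ is applied to both Allow and Disallow lines.
ROBOTS_LINE = re.compile(
    r"^User-agent: (?P<UserAgent>.*?)$"
    r"|^(?P<Permission>Allow|Disallow): (?P<Url>.*?)$",
    re.MULTILINE,
)


def robots_paths(body):
    """Return unique, non-empty Allow/Disallow paths in first-seen order."""
    seen = {}
    for match in ROBOTS_LINE.finditer(body):
        url = (match.group("Url") or "").strip()  # strip() also drops a stray \r
        if url:
            seen.setdefault(url, None)
    return list(seen)


def merge_wordlist(paths, wordlist):
    """Hypothetical merge: robots.txt entries first, then the existing words.

    Assumes wordlist entries have no leading or trailing slash.
    """
    merged = dict.fromkeys(p.strip("/") for p in paths)
    merged.update(dict.fromkeys(wordlist))  # existing keys keep their position
    merged.pop("", None)  # a bare "/" entry strips down to nothing
    return list(merged)
```

Putting the robots.txt entries at the front is one simple way to express the "priority" idea from the original request; a real scanner might instead keep them in a separate queue.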

@mzpqnxow
Sponsor Author

To add to this one: sitemap.xml is a possible input as well. Nobody likes to parse XML (let alone XML that references external XML, as many sitemap.xml files do), but it could be of tremendous value compared with robots.txt.

It's a much bigger and more annoying effort than robots.txt, though, since, as you pointed out, robots.txt can be "parsed" using a basic regex, while sitemap.xml is both XML and references additional XML files. Depending on how you like to plan and track work, it might be better as a separate issue.

Or, depending on priorities, maybe it should be indefinitely deferred. I realize you have your work cut out for you :)

I very much regret not having picked up Rust, or else I would be pitching in :/
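To make the "XML that references additional XML" point concrete, here is a rough Python sketch of walking a sitemap that may be a sitemap index. It is an illustration only: `sitemap_urls` and the caller-supplied `fetch` callable are hypothetical, the depth limit is an arbitrary guard, and the standard-library parser used here is not hardened against malicious XML.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def sitemap_urls(url, fetch, seen=None, max_depth=3):
    """Collect page URLs from a sitemap, following nested sitemap indexes.

    fetch is a hypothetical callable mapping a URL to its XML text;
    seen and max_depth guard against reference loops and deep nesting.
    """
    seen = set() if seen is None else seen
    if url in seen or max_depth < 0:
        return []
    seen.add(url)

    root = ET.fromstring(fetch(url))
    # Both <urlset> and <sitemapindex> hold entries with a direct <loc> child.
    locs = [
        child.text.strip()
        for entry in root
        for child in entry
        if _local(child.tag) == "loc" and child.text
    ]

    if _local(root.tag) == "sitemapindex":
        found = []
        for nested in locs:  # each <loc> points at another sitemap file
            found.extend(sitemap_urls(nested, fetch, seen, max_depth - 1))
        return found
    return locs
```

Passing `fetch` in keeps the sketch independent of any HTTP client; the recursion on `sitemapindex` documents is the part that makes this more work than the robots.txt case.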
