Set default user agent #201

Merged
merged 1 commit into from
Apr 20, 2015

Conversation

benbalter
Contributor

This pull request sets the default user agent to Mozilla/5.0 (compatible; HTML Proofer/2.1.0; +https://github.com/gjtorikian/html-proofer). The user agent string is based on GoogleBot's user agent format. This should allow us to bypass Typhoeus-specific blocks on WordPress and Drupal.
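For illustration, here is a minimal sketch of how such a default could be built and still overridden by callers. It is not the code from this pull request: `PROOFER_VERSION`, `PROJECT_URL`, `default_user_agent`, and `request_headers` are hypothetical names, and the sketch assumes the resulting hash would be passed to Typhoeus through its `headers:` request option.

```ruby
# Hypothetical constants for this sketch; the gem defines its own version constant.
PROOFER_VERSION = "2.1.0"
PROJECT_URL = "https://github.com/gjtorikian/html-proofer"

# Build a GoogleBot-style user agent that still identifies the tool by name,
# so a site owner who wants to block it can do so.
def default_user_agent(version = PROOFER_VERSION)
  "Mozilla/5.0 (compatible; HTML Proofer/#{version}; +#{PROJECT_URL})"
end

# Merge caller-supplied headers over the default so the user agent
# stays configurable. The result is intended for a `headers:` option.
def request_headers(overrides = {})
  { "User-Agent" => default_user_agent }.merge(overrides)
end

puts request_headers["User-Agent"]
```

Keeping the tool name and project URL in the string is the point of the design: the request no longer looks like a bare Typhoeus client, but it remains honest about what is making it.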

Via @gjtorikian over in #200 (comment):

What are the ethics involved in my changing the User-Agent to something non-static to avoid being blacklisted again?

I believe as long as we're honest, and identify as HTML-Proofer (which can be blocked if that is in fact the desired behavior), we're fine. I don't think we should e.g., have a random string as a user agent hash. I suspect the services are blocking Typhoeus to prevent scraping, which is not what we're doing here.

Fixes #197. Fixes #200.

gjtorikian
Owner

Thanks man. I'll talk about the test failure over in #202.

gjtorikian added a commit that referenced this pull request Apr 20, 2015
gjtorikian merged commit c6f5ab3 into gjtorikian:master Apr 20, 2015
benbalter deleted the user-agent branch April 20, 2015 18:33
Successfully merging this pull request may close these issues.

failed: 0 Server returned nothing (no headers, no data)
403 error for drupal.org domain