
This code file: crawl4ai/web_crawler.py needs minor changes #133

Open
vignesh1507 opened this issue Oct 4, 2024 · 1 comment

Comments

@vignesh1507

  1. Redundant kwargs in fetch_pages: The kwargs passed through executor.map look redundant, since the same values are repeated unchanged for every call. This can be simplified by passing **kwargs directly to fetch_page_wrapper.

  2. Possible None value in process_html: If html is None when process_html is called (for instance, because the crawl failed), the call may fail. Check that html is valid before passing it to process_html.

  3. Missing import json: The code calls json.dumps but never imports the json module. Add this import at the top of the file:

import json
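
To make points 1 and 2 concrete, here is a rough sketch of what I mean. It is not the actual crawl4ai code: fetch_page and process_html below are hypothetical stand-ins with made-up signatures, and I am assuming the pool is a ThreadPoolExecutor. It uses functools.partial to bind the shared kwargs once, and checks for None before doing any processing.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def fetch_page(url, **kwargs):
    """Hypothetical stand-in for the real crawl; returns None on failure."""
    if "fail" in url:
        return None
    return f"<html><body>{url}</body></html>"


def process_html(url, html, **kwargs):
    """Hypothetical stand-in; reports an error instead of processing None."""
    if html is None:
        return {"url": url, "success": False, "error": "no HTML returned"}
    return {"url": url, "success": True, "length": len(html)}


def fetch_pages(urls, **kwargs):
    # Bind the shared kwargs once rather than repeating them per URL.
    fetch = partial(fetch_page, **kwargs)
    with ThreadPoolExecutor() as executor:
        # executor.map yields results in the same order as the input URLs.
        pages = list(executor.map(fetch, urls))
    return [process_html(url, html, **kwargs) for url, html in zip(urls, pages)]


results = fetch_pages(
    ["https://example.com/a", "https://example.com/fail"],
    timeout=5,  # illustrative keyword argument, ignored by the stand-ins
)
print(json.dumps(results, indent=2))
```

The failed URL comes back as a result with success set to False instead of raising inside process_html, and json.dumps works because json is imported at the top.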

@vignesh1507
Author

#134 fixed the code.
