Stars
The official Roboflow Python package. Manage your datasets, models, and deployments. Roboflow has everything you need to build a computer vision application.
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
🍰 Desktop utility to download images/videos/music/text from various websites, and more.
Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
FUSE-based file system backed by Amazon S3
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…
A UiAutomator on Android that does not need root access (a JavaScript automation tool for the Android platform)
Hydro - Next-generation, high-performance online judge platform for informatics: efficient and powerful (a.k.a. vj5)
The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis
curl-impersonate: A special build of curl that can impersonate Chrome & Firefox
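Extracts the contents of github-code-zip files and saves them in JSONL format
MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
Free downloads of English-language magazines, including The Economist (with audio), The New Yorker, The Guardian, Wired, and The Atlantic, in EPUB, MOBI, and PDF formats, updated weekly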
Megvii FILE Library - Working with files in Python the same way as with the standard library
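A series of articles and companion tools on antivirus evasion for remote-access tools. It collects and tests dozens of evasion tools found on the internet, 113 whitelist-based evasion methods, 8 code-compilation evasion methods, and several practical evasion techniques, testing the effectiveness of each one, as a reference for remote-access tool evasion and for antivirus countermeasures against evasion.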
Distributed web crawler admin platform for managing spiders, regardless of language or framework.
Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
Python version of the Playwright testing and automation library.
Python binding for curl-impersonate via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.