CNB docs spider
This workspace contains a small async spider that checks links on https://docs.cnb.cool/zh/. It uses httpx with HTTP/2 and sends the specific headers requested.
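To illustrate the general approach, here is a minimal sketch of how a link checker of this kind could work. It is not the code in crawl.py: the names LinkParser, extract_links, and check_links are hypothetical, the rule of keeping only links under the start URL is an assumption, and the headers argument is a placeholder for whatever headers the real script sends. HTTP/2 support in httpx requires installing the `httpx[http2]` extra.

```python
import asyncio
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

START_URL = "https://docs.cnb.cool/zh/"


class LinkParser(HTMLParser):
    """Collects the raw href values of all <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url, scope=START_URL):
    """Return absolute, fragment-free, de-duplicated links under `scope`."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        # Assumption: only links inside the docs site are of interest.
        if absolute.startswith(scope) and absolute not in links:
            links.append(absolute)
    return links


async def check_links(urls, headers=None):
    """Fetch each URL over HTTP/2 and map it to a status code or error text."""
    # Imported here so the parsing helpers above work without httpx installed.
    import httpx

    async with httpx.AsyncClient(
        http2=True, headers=headers, follow_redirects=True, timeout=10.0
    ) as client:

        async def check(url):
            try:
                response = await client.get(url)
                return url, response.status_code
            except httpx.HTTPError as exc:
                return url, repr(exc)

        return dict(await asyncio.gather(*(check(url) for url in urls)))
```

A call such as `asyncio.run(check_links(extract_links(page_html, START_URL)))` would then check every in-scope link found on one page; a real crawler would also bound concurrency and track visited pages.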
Files:
crawl.py - the crawler script. Produces report.json by default.
requirements.txt - Python dependencies.

Quick start (Linux / zsh):
python3 -m pip install -r requirements.txt
python3 crawl.py --start-url https://docs.cnb.cool/zh/ --output report.json
The script will print a short summary and write a JSON report containing checked links and broken links.
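The report can be post-processed with a few lines of Python. The sketch below is only an illustration: the exact layout of report.json is not documented here, so the top-level keys "checked" and "broken" (each assumed to hold a list) are hypothetical and may need adjusting to match the real file. The function names load_report and summarize_report are likewise made up for this example.

```python
import json


def load_report(path="report.json"):
    """Read the JSON report written by the crawler."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_report(report):
    """Count checked and broken links.

    Assumption: the report is a dict with list-valued "checked" and
    "broken" keys. These key names are hypothetical.
    """
    checked = report.get("checked", [])
    broken = report.get("broken", [])
    total = len(checked)
    ratio = len(broken) / total if total else 0.0
    return {"checked": total, "broken": len(broken), "broken_ratio": ratio}
```

For example, `summarize_report(load_report())` would give the counts for the default output file, provided the assumed keys match.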