💻 Epigrams on Web Scraping
First Published: 2023-04-20
- Never climb the tree when you can shoot down its apples
- Web scraping is a cat-and-mouse game
- A status sidecar is the only reasonable way to monitor continued effectiveness
- Regex will never parse all of HTML
- Regex will always parse a subset of HTML
- In web scraping, all invariance is ephemeral. Embrace it
- Don't build too high on fragile grounds
- When you talk over the network, algorithmic speed becomes irrelevant
- Web scraping is the combination of web requests and HTML data extraction
- Web drivers are a sign of weakness, spiders are a sign of strength
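
To make the two regex epigrams concrete, here is a minimal sketch of the extraction half of scraping, using only Python's standard `re` module. The sample markup, the `LINK_RE` pattern, and the `extract_links` helper are all hypothetical names invented for illustration; in practice the HTML would come from a web request rather than a string literal.

```python
import re

# Hypothetical sample page; a real scraper would get this from an HTTP response.
SAMPLE_HTML = """
<ul>
  <li><a href="/item/1">First</a></li>
  <li><a href="/item/2">Second</a></li>
</ul>
"""

# Deliberately narrow: matches only <a> tags whose href is double-quoted.
# It handles a subset of HTML, not all of it.
LINK_RE = re.compile(
    r'<a\s+[^>]*href="([^"]*)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

def extract_links(html):
    """Return (href, text) pairs for the anchors the pattern recognizes."""
    return [(href, text.strip()) for href, text in LINK_RE.findall(html)]

links = extract_links(SAMPLE_HTML)
print(links)
```

The pattern works on the subset it was written for and silently returns nothing for markup outside it, such as a single-quoted `href`. That is the trade the epigrams describe: cheap to write, fine for one known page layout, and fragile the moment the site changes.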