Webscraper 1 (2024)
Requirements, basic design decisions, and technology choices.
Webscraper 2 (2024)
Create first gRPC service; will just echo for now.
Webscraper 3 (2024)
Start logging and reporting.
Webscraper 4 (2024)
Create a Linux service for the scraper. Shop for DBs. Set up Prometheus.
Webscraper 5 (2024)
Change the DB flow to write-then-read through a cache before returning the response. Later, a separate worker will do this.
Webscraper 6 (2024)
Pause implementation to consider different API patterns.
Webscraper 7 (2024)
After a break, added some extra documentation and tweaked some existing commands.
Webscraper 8 (2024)
Create backend worker. Played more with logging, trying fluent-bit. Switched from LevelDB to SQLite to fix parallel access.
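One likely reason SQLite fixed the parallel-access problem is WAL (write-ahead logging) mode, which lets reader connections proceed while a writer is active, whereas LevelDB only permits a single process to hold the database. A minimal sketch (path and schema are illustrative, not from the project):

```python
import os
import sqlite3
import tempfile

# WAL mode allows concurrent readers alongside one writer, which plain
# rollback-journal mode does not.
path = os.path.join(tempfile.mkdtemp(), "scrapes.db")

writer = sqlite3.connect(path)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT)")
writer.execute("INSERT INTO pages VALUES (?, ?)", ("https://example.com", "<html>ok</html>"))
writer.commit()

# A second connection, e.g. from the backend worker, reads concurrently.
reader = sqlite3.connect(path)
row = reader.execute(
    "SELECT body FROM pages WHERE url = ?", ("https://example.com",)
).fetchone()
```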
Webscraper 9 (2024)
Implemented proxy pools. Switched from an expiration to a freshness pattern.
Webscraper 10 (2024)
Migrate everything over to Docker. Use an ELK stack.
Webscraper 11 (2024)
Switched from SQLite to a flat file directory. Added compression.