Webscraper 1

Requirements, basic design decisions, and technology choices.

2024

Webscraper 2

Create the first gRPC service; it just echoes requests for now.

2024

Webscraper 3

Start logging and reporting.

2024

Webscraper 4

Create a Linux service for the scraper. Shop for DBs. Set up Prometheus.

2024

Webscraper 5

Change the DB flow to write-then-read, with a cache, before returning the response. Later, a different worker will do this.

2024
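One possible reading of the write-then-read flow from Webscraper 5, as a minimal sketch. The class name `WriteThenReadStore`, its methods, and the plain dict standing in for the database are all assumptions for illustration, not the code from the post.

```python
class WriteThenReadStore:
    """Toy store: a dict stands in for the real database (assumption)."""

    def __init__(self):
        self.db = {}     # hypothetical stand-in for the database
        self.cache = {}  # in-memory cache in front of it

    def handle(self, url, body):
        # 1. Write the scraped body to the database first.
        self.db[url] = body
        # 2. Read it back, so the response reflects what was actually stored.
        stored = self.db[url]
        # 3. Populate the cache before returning the response.
        self.cache[url] = stored
        return stored

    def lookup(self, url):
        # Serve from the cache when possible, else fall back to the database.
        if url in self.cache:
            return self.cache[url]
        stored = self.db.get(url)
        if stored is not None:
            self.cache[url] = stored
        return stored
```

Moving steps 1 to 3 into a separate worker, as the summary plans, would leave the request path with only the `lookup` side.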

Webscraper 6

Pause implementation to consider different API patterns.

2024

Webscraper 7

After a break, added some extra documentation and tweaked some existing commands.

2024

Webscraper 8

Create a backend worker. Played more with logging, trying fluent-bit. Switched from LevelDB to SQLite to fix parallel access.

2024

Webscraper 9

Implemented proxy pools. Switched from an expiration pattern to a freshness pattern.

2024
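To make the expiration versus freshness distinction from Webscraper 9 concrete, here is a rough sketch under one common interpretation: instead of the store deciding when an entry expires, each entry records when it was fetched and the caller states how old a result it will accept. `CachedPage`, `is_fresh`, and `max_age_seconds` are hypothetical names, and the post itself may define the pattern differently.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedPage:
    body: str
    fetched_at: float  # Unix timestamp of when the page was scraped


def is_fresh(entry, max_age_seconds, now=None):
    """Return True if the entry is young enough for this particular caller."""
    # `now` is injectable so the check can be tested without sleeping.
    now = time.time() if now is None else now
    return (now - entry.fetched_at) <= max_age_seconds


page = CachedPage(body="<html>hi</html>", fetched_at=100.0)
# The same stored entry can satisfy one caller and not another.
relaxed = is_fresh(page, max_age_seconds=60, now=150.0)
strict = is_fresh(page, max_age_seconds=30, now=150.0)
```

With this shape nothing has to be deleted on a timer; a stale entry simply triggers a re-scrape for callers that ask for something newer.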

Webscraper 10

Migrate everything over to Docker. Use an ELK stack.

2024

Webscraper 11

Switched from SQLite to a file directory. Added compression.

2024
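A file directory with compression, as in Webscraper 11, could look roughly like the sketch below. The layout (SHA-256 of the URL as the file name, a two-character shard directory, gzip as the codec) and the helper names `page_path`, `save_page`, and `load_page` are assumptions for illustration; the post may use a different scheme.

```python
import gzip
import hashlib
from pathlib import Path


def page_path(root, url):
    # Hash the URL so any URL maps to a safe, fixed-length file name.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    # Shard by the first two hex characters to keep directories small.
    return Path(root) / digest[:2] / f"{digest}.html.gz"


def save_page(root, url, body):
    path = page_path(root, url)
    path.parent.mkdir(parents=True, exist_ok=True)
    # gzip.open in text mode compresses transparently on write.
    with gzip.open(path, "wt", encoding="utf-8") as f:
        f.write(body)
    return path


def load_page(root, url):
    path = page_path(root, url)
    if not path.exists():
        return None
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return f.read()
```

Because each page lives in its own file, concurrent workers writing different URLs never contend for a shared database lock.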