How I improved consistency and performance in a Go crawler with retry logic and network tuning

Introduction

wfind is a simple web crawler for files and folders in web page hierarchies. Its goal is essentially the same as that of GNU find for file systems. It is also inspired by GNU wget, and it applies find-style features to files and directories exposed as HTML web resources. In this post we’ll go through how I improved consistency in this crawler by implementing retry logic and tuning the network and transport settings of the HTTP client....

September 4, 2023 · 12 min