@aartaka I take the Mozilla "Readability" approach in @agregore
It also respects the user's preferences for width, color scheme, and font, which get set browser-wide.
Readability isn't ideal though; in my experience it sometimes misses content.
Been thinking of making my own based on this web scraper tool I made recently.
@aartaka I'm currently doing everything in JS with the built-in DOMParser API. 😅
I think if I come up with a decent algo, it'd be easy enough to convert to other formats.
At one point I had code to convert from HTML to Markdown for a TUI browser, which was silly and fun. 😸
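(For illustration only: a minimal sketch of what an HTML-to-Markdown pass could look like. This is not the TUI browser code mentioned above, and it uses Python's standard-library `html.parser` instead of the JS DOMParser so it runs outside a browser. The names `MarkdownSketch` and `html_to_markdown` are hypothetical, and only headings, paragraphs, emphasis, and links are handled.)

```python
import re
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Hypothetical converter covering h1-h3, p, em/i, strong/b, and a."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []      # Markdown fragments, joined at the end
        self.href = None   # href of the link currently being emitted

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.out.append("#" * int(tag[1]) + " ")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            self.out.append("\n\n")  # blank line after block elements
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a":
            self.out.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        # Collapse whitespace runs the way a browser would render them.
        text = re.sub(r"\s+", " ", data)
        # Drop leading whitespace at the start of a block.
        if not self.out or self.out[-1].endswith("\n"):
            text = text.lstrip()
        if text:
            self.out.append(text)


def html_to_markdown(html: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()
```

Anything outside that small tag set (lists, code blocks, images) just falls through as plain text, so a real converter would need more cases.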
@mauve yes, making a portable algorithm / recipe for page debloating is the priority in that. Implementations come second.
@mauve yes, Readability is imperfect due to its focus on plain long-form articles. Needs remixing if we’re to do something more generic with it.
There is a niche for a “website cleanup” scraper / simplifier, and I keep stumbling into it. Maybe make a C library doing HTML simplification? 🤔
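(A rough sketch of one possible "debloating" recipe, to make the idea concrete. It is not Readability's algorithm and not anyone's existing tool: it just drops whole subtrees for tags that are usually page chrome, keeps a small whitelist of content tags with their attributes stripped, and unwraps everything else. The tag sets `DROP` and `KEEP` and the names `Simplifier` and `simplify` are assumptions made for this example. It is written in Python with the standard-library `html.parser` so it runs as-is, but the recipe itself is meant to be portable.)

```python
from html import escape
from html.parser import HTMLParser

# Assumed tag sets for this sketch; a real recipe would tune these.
DROP = {"script", "style", "nav", "header", "footer", "aside", "form", "noscript"}
KEEP = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "a", "ul", "ol", "li",
        "pre", "code", "blockquote", "em", "strong"}


class Simplifier(HTMLParser):
    """Hypothetical simplifier: drop chrome, keep content tags, unwrap the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped subtree

    def handle_starttag(self, tag, attrs):
        if tag in DROP:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in KEEP:
            # Strip all attributes except href on links.
            href = dict(attrs).get("href") if tag == "a" else None
            attr = f' href="{escape(href)}"' if href else ""
            self.out.append(f"<{tag}{attr}>")
        # Any other tag (div, span, ...) is unwrapped: its text is kept.

    def handle_endtag(self, tag):
        if tag in DROP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in KEEP:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Keep text outside dropped subtrees; skip whitespace-only runs.
        if self.skip_depth == 0 and data.strip():
            self.out.append(escape(data, quote=False))


def simplify(html: str) -> str:
    parser = Simplifier()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Known limitation of the sketch: an unclosed tag from `DROP` swallows the rest of the page, and there is no scoring of which block is the "main" content, which is the part Readability-style tools spend most of their effort on.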