The Pew article apparently included, at the least, sites cited as sources on Wikipedia pages, news sites, tweets, and various government websites, including those of city- or county-level governments. https://archive.ph/mFAJQ https://ghostarchive.org/archive/8istC
Here's part of their method:
To conduct this part of our analysis, we collected a random sample of just under 1 million webpages from the archives of Common Crawl, an internet archive service that periodically collects snapshots of the internet as it exists at different points in time. We sampled pages collected by Common Crawl each year from 2013 through 2023 (approximately 90,000 pages per year) and checked to see if those pages still exist today.
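For anyone who wants to picture what that method amounts to, here is a rough Python sketch of "sample pages per year, then check whether they still exist" (11 years at roughly 90,000 pages each is about 990,000, hence "just under 1 million"). This is not Pew's code. The function names, the `urls_by_year` input, and the rule that a 404 or no response at all counts as "gone" are my own assumptions for illustration; the quoted passage doesn't say how they decided a page no longer exists.

```python
import random
import urllib.error
import urllib.request

YEARS = range(2013, 2024)  # 2013 through 2023, as in the quoted method


def sample_per_year(urls_by_year, per_year, seed=0):
    """Draw up to `per_year` random URLs from each year's snapshot (hypothetical input)."""
    rng = random.Random(seed)
    sample = {}
    for year in YEARS:
        pool = urls_by_year.get(year, [])
        sample[year] = rng.sample(pool, min(per_year, len(pool)))
    return sample


def fetch_status(url, timeout=10):
    """Return the HTTP status code for `url`, or None if the request fails outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, malformed URL, etc.
        return None


def is_gone(status):
    # Assumption: no response at all, or a 404, counts as "no longer exists".
    return status is None or status == 404


def share_gone(sample, fetch=fetch_status):
    """Fraction of sampled URLs per year that no longer resolve."""
    result = {}
    for year, urls in sample.items():
        if urls:
            result[year] = sum(is_gone(fetch(u)) for u in urls) / len(urls)
    return result
```

`share_gone` takes the fetcher as a parameter so the logic can be tried with a stand-in function instead of hitting the network. A real run would also need to get the per-year URL lists out of Common Crawl, which this sketch leaves out.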