Probing the third-party infrastructure of digital news on the Web
Authors: Ido Sivan-Sevilla (The University of Maryland), Parthav Poudel (The University of Maryland)
Year: 2025
Issue: 2
Pages: 74–81
Abstract: The wide spread of disinformation across news websites makes fast, efficient, and reliable detection of untrustworthy content more important than ever. Conventional methods for detecting fraudulent content on the Web rely on content- or social-network-based analysis. In contrast, we build on previous work to further explore whether the features and attributes of websites' third-party request structures can be used at scale to distinguish between fake and real news content on the Web. We crawled 5,478 real and fake news websites, already labeled by NewsGuard, daily over the course of seven months and collected data on their changing third-party structure, extracting static and temporal structural features. We show promising accuracy results for our Random Forest prediction model, which is based solely on structural features. We also reveal several key indicators in websites' structural trees that are likely to indicate trustworthy content, including (1) a higher 7-day average of resource requests per node and (2) a greater maximum breadth of resource request trees. Our method can be used to complement current content- and social-network-related prediction methods when they are indecisive about fake news content on the Web.
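The two structural indicators named in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the request tree is a toy adjacency mapping with hypothetical hostnames, "resource requests per node" is taken to mean the number of parent-to-child request edges divided by the number of nodes, and the 7-day average is a plain mean over the last seven daily values. The paper's actual feature definitions and crawler output may differ.

```python
def max_breadth(tree, root):
    """Largest number of nodes found at any single depth of the tree."""
    widest = 0
    level = [root]
    while level:
        widest = max(widest, len(level))
        # Move one level down: gather the children of every node in this level.
        level = [child for node in level for child in tree.get(node, [])]
    return widest


def requests_per_node(tree, root):
    """Request edges divided by nodes (an assumed reading of the feature)."""
    nodes = 0
    edges = 0
    stack = [root]
    while stack:
        node = stack.pop()
        nodes += 1
        children = tree.get(node, [])
        edges += len(children)
        stack.extend(children)
    return edges / nodes


def trailing_mean(values, window=7):
    """Mean of the last `window` daily values (fewer if not yet available)."""
    recent = values[-window:]
    return sum(recent) / len(recent)


# Hypothetical one-day snapshot: each key requests the resources in its list.
tree = {
    "news.example": ["cdn.example", "ads.example", "analytics.example"],
    "ads.example": ["bidder-a.example", "bidder-b.example"],
}

breadth = max_breadth(tree, "news.example")
ratio = requests_per_node(tree, "news.example")

# Hypothetical daily ratios from eight consecutive crawls of the same site.
daily_ratios = [0.5, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8]
weekly_ratio = trailing_mean(daily_ratios)
```

In this toy snapshot the widest level holds the three resources requested directly by the page, so the maximum breadth is 3. Features of this kind, computed per site and per day, are the sort of static and temporal inputs a Random Forest classifier could consume.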
Copyright in FOCI articles is held by their authors. This article is published under a Creative Commons Attribution 4.0 license.
