SilhouetteTell: Practical Video Identification Leveraging Blurred Recordings of Video Subtitles

Authors: Guanchong Huang (University of Oklahoma), Song Fang (University of Oklahoma)

Volume: 2026
Issue: 1
Pages: 470–485
DOI: https://doi.org/10.56553/popets-2026-0024

Abstract: Video identification attacks pose a significant privacy threat: they can reveal which videos victims watch, which in turn may disclose their hobbies, religious beliefs, political leanings, sexual orientation, and health status. Video watching history can also be used for user profiling or advertising and may result in cyberbullying, discrimination, or blackmail. Existing video inference techniques usually depend on analyzing the network traffic generated by streaming online videos. In this work, we observe that the content of a subtitle determines its silhouette displayed on the screen, and that identifying each subtitle silhouette also yields the time difference between two consecutive subtitles. We then propose SilhouetteTell, a novel video identification attack that combines this spatial and temporal information into a spatiotemporal feature of subtitle silhouettes. SilhouetteTell explores the spatiotemporal correlation between the recorded subtitle silhouettes of a video and its subtitle file. It can infer both online and offline videos. Comprehensive experiments on off-the-shelf smartphones confirm the high efficacy of SilhouetteTell for inferring video titles and clips under various settings, including from a distance of up to 40 meters.
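
The idea of matching recorded silhouettes against a subtitle file can be illustrated with a toy sketch. This is not the authors' implementation: the names `Cue`, `features`, and `best_match` are hypothetical, the character count of a subtitle is used as a crude stand-in for its on-screen silhouette width, and the distance measure is a simple assumption chosen for readability.

```python
from dataclasses import dataclass
from math import inf


@dataclass
class Cue:
    start: float  # time the subtitle appears, in seconds
    text: str     # subtitle content


def features(cues):
    """Pair each cue's width proxy with the gap to the next cue's start.

    Assumption: character count approximates silhouette width.
    """
    return [(len(cur.text), nxt.start - cur.start)
            for cur, nxt in zip(cues, cues[1:])]


def normalize(values):
    """Scale values to sum to 1 so pixel widths and character counts compare."""
    total = sum(values)
    return [v / total for v in values] if total else list(values)


def window_distance(observed, candidate):
    """Sum of absolute differences over normalized widths and gaps."""
    dist = 0.0
    for column in (0, 1):  # 0 = width, 1 = time gap
        obs = normalize([f[column] for f in observed])
        cand = normalize([f[column] for f in candidate])
        dist += sum(abs(a - b) for a, b in zip(obs, cand))
    return dist


def best_match(observed, cues):
    """Slide the observed (width, gap) sequence over the subtitle file.

    Returns (distance, index of the first matching cue); index is -1
    if the observation is longer than the available feature sequence.
    """
    feats = features(cues)
    n = len(observed)
    best = (inf, -1)
    for i in range(len(feats) - n + 1):
        d = window_distance(observed, feats[i:i + n])
        if d < best[0]:
            best = (d, i)
    return best
```

An observation here would be a list of `(silhouette_width_in_pixels, seconds_until_next_subtitle)` pairs measured from a recording. Normalizing within each window makes the comparison independent of recording distance and zoom, which is one plausible reason a spatiotemporal feature can survive heavy blur.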

Keywords: Video inference, Subtitle analysis, Spatiotemporal feature extraction

Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution 4.0 license.