Fueling AI with public displays? A feasibility study of collecting biometrically tagged consensual data on a university campus

Simo Hosio¹, Andy Alorwu¹, Niels van Berkel², Miguel Bordallo López¹˒³,
Mahalakshmy Seetharaman¹, Jonas Oppenlaender¹, Jorge Goncalves⁴

¹ University of Oulu
² Aalborg University
³ VTT Technical Research Centre of Finland
⁴ University of Melbourne

Abstract

Interactive public displays have matured into highly capable two-way interfaces. They can be used for efficiently delivering information to people as well as for collecting insights from their users. While displays have been used for harvesting opinions and other content from users, surprisingly little work has looked into exploiting such screens for the consensual collection of tagged data that might be useful beyond one application. We present a field study where we collected biometrically tagged data using public kiosk-sized interactive screens. During 61 days of deployment time, we collected 199 selfie videos, cost-efficiently and with consent to leverage the videos in any non-profit research. 78 of the videos also had metadata attached to them. Overall, our studies indicate that people are willing to donate even highly sensitive data about themselves in public but that, at the same time, the participants had specific ethical and privacy concerns over the future of their data. Our study paves the way forward toward a future where volunteers can ethically help advance innovations in computer vision research across a variety of exciting application domains, such as health monitoring and care.

Key Contributions

  • A dynamic, easy-to-install setup to collect media files that are tagged with biometric metadata.
  • A feasibility study that analyses the collected material and highlights important contextual aspects that must be considered in future deployments.
  • Commentary and analysis of perceived ethical issues and potential new consent models that may be necessary in the future digital research ecosystems that exploit public displays as citizen-facing data collection interfaces.

System Design

We used a made-to-order desk with adjustable height and a circular wooden tabletop that hosts three Android tablet mounts. This setup makes it possible for 1–3 people to use the desk at the same time. Users cannot, however, easily see the screens of other users without consciously making an effort to peek by moving aside. We purchased a prepaid SIM card with an unlimited data plan and used our own router, so that the deployment depended only on access to power and would not suffer from WiFi outages or poor connection quality.

VideoSourcing Application

We designed an Android application, VideoSourcing, to facilitate the data collection. VideoSourcing is designed to run on the tablet devices that act as our public kiosk-sized displays.

Results

Over the course of 61 days (a 4-day pilot study followed by a 57-day field study), we collected a total of 199 selfie videos, corresponding to roughly 3.3 videos per day. Further, we received 78 metadata submissions to supplement the videos. Of those submissions, 63 included an email address, and 22 of these participants went on to answer the online questionnaire (a 35% conversion ratio).
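The rates above follow directly from the reported counts. As a quick check, the arithmetic can be reproduced with a minimal sketch; the variable names are illustrative only and are not taken from the paper or the VideoSourcing application:

```python
# Counts as reported in the text above.
DEPLOYMENT_DAYS = 4 + 57          # pilot study + field study
VIDEOS = 199                      # selfie videos collected
METADATA_SUBMISSIONS = 78         # videos supplemented with metadata
EMAILS_LEFT = 63                  # metadata submissions with an email address
QUESTIONNAIRE_RESPONSES = 22      # online questionnaire responses

# Derived rates.
videos_per_day = VIDEOS / DEPLOYMENT_DAYS
metadata_share = METADATA_SUBMISSIONS / VIDEOS
questionnaire_conversion = QUESTIONNAIRE_RESPONSES / EMAILS_LEFT

print(f"Videos per day:           {videos_per_day:.2f}")
print(f"Videos with metadata:     {metadata_share:.0%}")
print(f"Questionnaire conversion: {questionnaire_conversion:.0%}")
```

The share of videos with metadata is not stated as a percentage in the text; it is derived here from the two reported counts.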

To statistically assess the quality of the faces captured in the videos, and their usability for face biometrics and computer vision in general, we analysed the videos using a state-of-the-art face detector based on the SSD framework and ResNet, as implemented in OpenCV.
The results of the automatic analysis show that:

  • 179 videos (90% of the total) show a detected face for at least one full second and are thus considered useful as training data for several machine learning tasks
  • 113 videos contain a detected face for 100% of their duration, 145 for over 90%, and 155 for over 80%
  • 20 videos do not contain a single detected face and could be discarded from a possible face database built from our results
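To illustrate how per-video figures of this kind can be derived, the sketch below summarises a sequence of per-frame detection flags. It is not the analysis code used in the study. It assumes that a face detector has already produced one boolean per frame, and it interprets "at least one full second" as an uninterrupted run of detections. The names `FaceCoverage`, `face_coverage`, and `is_useful` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class FaceCoverage:
    fraction: float       # share of frames in which a face was detected
    longest_run_s: float  # longest uninterrupted detection, in seconds


def face_coverage(frame_has_face: Sequence[bool], fps: float) -> FaceCoverage:
    """Summarise per-frame face detections for a single video."""
    if not frame_has_face or fps <= 0:
        return FaceCoverage(0.0, 0.0)
    longest = current = 0
    for has_face in frame_has_face:
        # Extend the current run on a detection, reset it otherwise.
        current = current + 1 if has_face else 0
        longest = max(longest, current)
    fraction = sum(frame_has_face) / len(frame_has_face)
    return FaceCoverage(fraction, longest / fps)


def is_useful(coverage: FaceCoverage, min_seconds: float = 1.0) -> bool:
    """Assumed criterion: a face is visible for at least `min_seconds` in a row."""
    return coverage.longest_run_s >= min_seconds


# Example: a 3-second clip at 30 fps with a face in the first 45 frames.
flags = [True] * 45 + [False] * 45
coverage = face_coverage(flags, fps=30.0)
print(coverage, is_useful(coverage))
```

Under these assumptions, a video counts toward the "100% of the duration" group when `fraction` equals 1.0, and would be discarded when `fraction` is 0.0.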

Bibtex

@inproceedings{10.1145/3321335.3324943,
author = {Hosio, Simo and Alorwu, Andy and van Berkel, Niels and L\'{o}pez, Miguel Bordallo and Seetharaman, Mahalakshmy and Oppenlaender, Jonas and Goncalves, Jorge},
title = {Fueling AI with Public Displays? A Feasibility Study of Collecting Biometrically Tagged Consensual Data on a University Campus},
year = {2019},
isbn = {9781450367516},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3321335.3324943},
doi = {10.1145/3321335.3324943},
booktitle = {Proceedings of the 8th ACM International Symposium on Pervasive Displays},
articleno = {14},
numpages = {7},
keywords = {ethics, computer vision, field study, public displays},
location = {Palermo, Italy},
series = {PerDis '19}
}