New article: governments working with one hand tied when it comes to data on vulnerable groups

A new discussion paper in Policy Sciences by Leiden University researchers Annemarie Samuels and Sarah Giest, the latter a core member of the Centre for BOLD Cities network, argues that governments are working with one hand tied when it comes to data on vulnerable groups. At the core of the paper is the idea that, even though the volume of data has increased in recent years, data quality issues combined with known or unknown data gaps limit governments' ability to create inclusive policies. Simply put, having a lot of data does not necessarily mean that the data are representative and reliable, or that governments are able to utilise them.

The paper distinguishes between primary, secondary and hidden data gaps. The primary data gap describes a scenario in which governments are aware that data are missing but have limited opportunities to fill the gap because appropriate data are lacking. The paper illustrates this by showing that the outputs of machine learning and other artificial intelligence analyses are limited by the accuracy of the available data, which can have real-life effects on decision-making and public service delivery. The secondary data gap arises where data are available, but in different formats, such as social media data.

Sarah Giest (Leiden University) wrote the article with her colleague Annemarie Samuels.

Giest and Samuels point to issues with data quality and population representativeness when such datasets are used, which can exacerbate potential biases. Finally, hidden data gaps occur when datasets used for policymaking contain misrepresentation, bias or missing data without governments being aware of it. This is especially relevant in the context of machine learning outputs and artificial intelligence analyses. Given that vulnerable groups, such as ethnic minorities and elderly people, tend to produce less data and prove harder to access, they are especially affected when policymakers are unaware of data gaps.

Based on this, the paper highlights the danger that big data architectures reproduce existing prejudices, given the nature of the gaps and governments' level of awareness of them. This implies that, to foster inclusive policymaking, governments need to understand the existing gaps in the data, as well as what they obscure and why, so that they can find ways to add knowledge through both innovative and traditional forms of data collection.

The full paper is open access and can be found here.

This article was originally published on the Leiden University website.