Ethical Data Design for Good Systems


We conceptualize data and its handling as the core elements in building good systems, recognizing that data and the systems that manage it are not neutral but rather implicated in a chain of decisions, organizational priorities, and social and professional norms.

Conventional practices of data and computer scientists do not always align with the expectations of the people and institutions that handle data and make decisions based on personal or institutional priorities. Designing and building good systems is an ongoing process fundamentally intertwined with what we call ethical data management.

Ethical values may mean different things to data producers, to consumers, and to organizations implementing algorithmically driven systems. The ethical frameworks that should guide data gathering (the “grist for the mill” of machine processes) and the systems that manage it are spotty, with little oversight, few guidelines, and uneven monitoring and enforcement. Moreover, the complexities of large-scale data aggregation, transformation, distribution, and reuse, together with the limited capacity to validate the ethical implications embedded in routine data practices, make ethical breaches difficult to track and prevent. We will investigate how data ethics can serve as a point of departure for designing and evaluating good systems.

By highlighting the contradictions and pressure points in data practices from organizational, data management, and systems perspectives, our research charts new directions for ethical considerations. The research effort will yield concrete measurements of ethics-relevant evolution in a modern production code base.
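As one illustration of what such "concrete measurements" might look like, the sketch below counts commits per year whose messages mention ethics-related terms. The keyword list and the sample commit log are illustrative assumptions for this sketch, not part of the project; an actual study would derive terms from a validated ethics taxonomy and mine a real repository history.

```python
from collections import Counter

# Illustrative keyword list (an assumption, not a validated taxonomy).
ETHICS_TERMS = {"privacy", "consent", "anonymize", "bias", "fairness", "gdpr"}

def ethics_relevant(commit_message: str) -> bool:
    """Return True if the commit message mentions any ethics-related term."""
    words = {w.strip(".,:;()").lower() for w in commit_message.split()}
    return bool(words & ETHICS_TERMS)

def yearly_counts(commits):
    """Count ethics-relevant commits per year from (year, message) pairs."""
    counts = Counter()
    for year, message in commits:
        if ethics_relevant(message):
            counts[year] += 1
    return dict(counts)

# Fabricated commit log for demonstration only, not real project data.
sample = [
    (2018, "Refactor storage layer"),
    (2018, "Add consent banner before data collection"),
    (2019, "Anonymize user IDs in exported logs (GDPR)"),
    (2019, "Fix crash in parser"),
]
print(yearly_counts(sample))  # {2018: 1, 2019: 1}
```

A measurement like this could be tracked release over release to show how ethics-relevant activity in a code base evolves over time.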

Expected Outputs:

  1. A set of methods (toolkit) to identify ethical risks/components in data in relation to stakeholders.
  2. A set of methods (toolkit) to analyze ethical risks/components in open source data systems.
  3. A framework for ethical data design in AI applications, including a complete AI Ethical Data Lifecycle Model and guidance on applying it, together with the above methods, across the lifecycle of an AI project.
  4. A research proposal building on our findings to extend these analyses to other data sites.

Disinformation in Context: AI, Platforms and Policy


Our group will describe and evaluate aspects of social media systems involved in disinformation campaigns, aiming to provide a deep understanding of the historical, contextual, and international processes by which disinformation moves through AI and platforms such as Facebook and Twitter.

Our approach defines “good systems” as not only technological but also social, organizational, and political. This notion of the “system” invokes an information environment with many moving parts. Artificial intelligence and machine-driven content creation and circulation are significant components of contemporary disinformation efforts.

Specific projects will include:

  1. two complementary studies (one quantitative, one qualitative) offering a deep analysis of the persuasive content and mimetic qualities of the Facebook ads released by the US Congress in 2018. Our early research suggests these ads have little to do with “fake news” and everything to do with evoking emotional responses that are amplified by social media platforms;
  2. a study that uses fieldwork to examine message circulation and reception from the perspectives of users who read Facebook ads in 2016-2017. The benefit of this study will be a better understanding of the triggers and environments that prompt people to act in response to social media ads; and
  3. an investigation of sampled Twitter data, also linked to Russian accounts. Using a one-year time frame, this study will investigate the emotional and mimetic qualities of these tweets and explore the algorithmic qualities contributing to their possible effectiveness.
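To give a minimal sense of how the emotional qualities of tweets might be quantified, the sketch below tallies emotion-bearing words against a small lexicon. The lexicon and sample text are illustrative assumptions; an actual study would use a validated emotion lexicon and real sampled tweet data.

```python
# Illustrative emotion lexicon (an assumption for this sketch only).
EMOTION_LEXICON = {
    "outrage": "anger", "furious": "anger",
    "terrified": "fear", "threat": "fear",
    "proud": "pride", "hero": "pride",
}

def emotion_profile(text: str) -> dict:
    """Tally emotion-bearing words in a tweet against the lexicon."""
    profile = {}
    for word in text.lower().split():
        word = word.strip(".,!?#@")
        if word in EMOTION_LEXICON:
            emotion = EMOTION_LEXICON[word]
            profile[emotion] = profile.get(emotion, 0) + 1
    return profile

print(emotion_profile("Furious at this threat to our community!"))
# {'anger': 1, 'fear': 1}
```

Profiles like this, aggregated over a year of sampled tweets, would provide the kind of descriptive measure of emotional content the third study describes.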

Expected Outputs:

  • publishable conference and research papers from our studies
  • a semester-long course for undergraduates and graduate students
  • a campus-wide Disinformation Network that will enable a cadre of researchers to get to know one another and to share work
  • a workshop/conference to share our results more widely and to contribute to the broader academic and industry discourse around AI-assisted disinformation
  • grant proposals to undertake work that can build on our findings

Finally, we note that Facebook, as one significant social media platform, is keenly interested in enlisting academic researchers to help it address problematic content and its circulation, and now works with the Social Science Research Council (SSRC) to provide access to its data. We plan to build on our contacts with Facebook to ensure our research yields usable, reliable, and valid results that link to the SSRC’s efforts.