Issues in Ethical Data Management - Extended Abstract


HAL Id: hal-01621687

https://hal.inria.fr/hal-01621687

Submitted on 25 Oct 2017

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Serge Abiteboul

To cite this version:

Serge Abiteboul. Issues in Ethical Data Management - Extended Abstract. PPDP 2017 - 19th International Symposium on Principles and Practice of Declarative Programming, Oct 2017, Namur, Belgium. ⟨hal-01621687⟩

Issues in Ethical Data Management - Extended Abstract

Serge Abiteboul

Inria, École Normale Supérieure, Paris, France

October 23, 2017

Abstract

Data science holds incredible promise of improving people’s lives, accelerating scientific discovery and innovation, and bringing about positive societal change. Yet, if not used responsibly, this technology can generate economic inequality, destabilize global markets and worsen systemic bias. We consider issues such as bias and violation of data privacy in data analysis. We discuss desirable properties of data analysis such as fairness, transparency, neutrality, and diversity. Our goal is to draw the attention of the computer science community to the important emerging subject of responsible data management and analysis. We present our perspective on the issue, and motivate research directions.

The data management research field has traditionally been driven primarily by enterprise data and focused on developing more and more sophisticated data models, and tackling issues such as performance and reliability. We believe that, in the future, the field will be increasingly driven by personal and social data, and will need to deal with challenging ethical issues. We will have to design concepts and principles to guide us in determining which behaviors, in data management, help us and which are harmful.

To move computer science towards more responsible data analysis, the first issue is that of specifying the desired properties. For a policy to become operational in a computer's automatic decisions, it has to be made precise, that is, formally stated in technical language. The computer's role is also to help control the properties of data analysis by providing tools to collect and analyze data responsibly, and tools to verify that data analysis was performed responsibly. To check the behavior of a program, one can either analyze its code (in the style of proving mathematical theorems) or test its effect (in the style of the experimental study of phenomena such as climate or the human heart). Both approaches have been intensely investigated, e.g., for guaranteeing security or performance, but much less for enforcing ethical properties such as fairness.

The research on data provenance is certainly relevant, beyond transparency. Also, the use of open data and open software (when possible) simplifies the task of verification.
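As an illustration of checking a program by testing its effect rather than analyzing its code, here is a minimal sketch in Python. It assumes one possible formalization of fairness, the gap in positive-decision rates between groups (often called demographic parity), which is our choice for the example and not a definition given in this abstract. The decision procedure `toy_decide`, the `applicants` records and the field names `group` and `score` are all hypothetical.

```python
from collections import defaultdict


def positive_rates(decide, records, group_key):
    """Black-box test: run `decide` on each record and return the
    fraction of positive decisions observed for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if decide(record):
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Largest difference in positive-decision rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical decision procedure under test; only its outputs are observed.
def toy_decide(record):
    return record["score"] >= 50


# Hypothetical test data, invented for this example.
applicants = [
    {"group": "A", "score": 70},
    {"group": "A", "score": 40},
    {"group": "B", "score": 55},
    {"group": "B", "score": 65},
]

rates = positive_rates(toy_decide, applicants, "group")
gap = parity_gap(rates)
print(rates, gap)
```

Such a test treats the program as an opaque function, so it can be applied without access to the source code. What counts as an acceptable gap is a policy question that the test itself cannot answer.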

Many societal and political disputes today are related to computer science and in particular to the management of data. Governments can ban improper behaviors and encourage ethical approaches via laws and regulations. If properly educated, users can leverage the power they have as consumers to choose the software they want to use, individually or via associations. An important role of computer scientists who understand the issues is to explain them to the general population, be they lawmakers, decision makers, or ordinary citizens.

The computer science research community has brought fantastic new tools. Now is the time to learn how to use them in ethical ways, so that they can best benefit society as a whole.

Acknowledgments

The ideas presented here originate primarily from work with Julia Stoyanovich. In particular, they emerged from discussions with her and Gerome Miklau while preparing a tutorial for EDBT [1], as well as with Gerhard Weikum while organizing a Dagstuhl workshop [2]. Finally, they have been influenced by work around the Fides Platform [3] with Julia, Gerhard and Gerome, together with Bill Howe and Arnaud Sahuguet. I would also like to thank Amélie Marian, Benjamin André and Daniel Kaplan for discussions on Personal Data Management, and the members of the French Digital Council, in particular Valérie Peugeot, for lively discussions on topics such as neutrality, computer science education, and digital inclusion.

[1] Data Responsibly: Fairness, Neutrality and Transparency in Data Analysis. EDBT 2016.

[2] Data, Responsibly. Dagstuhl Seminar 16291.

[3] Fides: Towards a Platform for Responsible Data Science. SSDBM 2017.
