Schedule | Crossroads of AI & Society Workshop

Workshop Schedule

The workshop runs over two days, July 15–16, 2026. All times are local (Paris, CEST).

Day 1

Wednesday, July 15

Time	Activity	Location
08:15 – 09:00	Breakfast & Registration	Club de la Chasse ↗
09:00 – 09:15	Opening Remarks	Club de la Chasse ↗
09:15 – 10:00	Talk: Juba Ziani, "Data Sharing with Endogenous Choices over Differential Privacy Levels". Abstract: Motivated by the rapid push to decentralize sharing of data, we study whether large-scale data sharing coalitions can form in a decentralized manner under differential privacy when players have heterogeneous privacy preferences. We first consider a fully decentralized data-sharing mechanism in which each player decides whether to participate and how much privacy noise to add locally to their sensitive data before sharing. Privacy choices induce a fundamental trade-off: higher privacy lowers individual privacy costs but reduces data utility and statistical accuracy for the coalition. These choices generate externalities across players, making both participation and privacy levels strategic. Our goal is to understand which coalitions are stable, how privacy choices shape equilibrium outcomes, and how fully decentralized data-sharing compares to a centralized, socially optimal benchmark when the number of players is large. We provide a comprehensive analysis across multiple privacy-cost regimes corresponding to different attack/observation models in differential privacy, showing that full decentralization is highly inefficient in terms of both social welfare and estimator accuracy. Surprisingly, we find that a simple partially decentralized mechanism (where players still retain participation agency, but a central designer chooses a fixed privacy noise level for everyone) closes this efficiency gap down to constant factors across all privacy-cost regimes.	Club de la Chasse ↗
10:00 – 10:45	Talk: Patrick Loiseau, "Geometry of Relaxed Fair Regression: A Unified Framework for Aware and Unaware Settings". Abstract: Fairness-accuracy trade-offs are a central concern in the deployment of fairness-aware machine learning methods. When sensitive attributes are unavailable at inference time (the so-called unawareness setting), principled methods for obtaining accurate predictions under relaxed fairness constraints are largely missing. In this work, we address this gap by formulating regression under a demographic parity penalty as an optimal transport problem. Our framework unifies both the aware and unaware settings and characterizes optimal prediction functions via optimal transport maps, under both squared Wasserstein-2 and Total Variation penalties. These results reveal that the choice of penalty reflects fundamentally different fairness philosophies: the Wasserstein penalty induces a smooth, population-wide compromise, while Total Variation enforces exact parity for a subset of individuals. Building on these theoretical characterizations, we propose an algorithm that is simple to implement, computationally efficient, and consistently matches or outperforms state-of-the-art baselines on real-world benchmarks. Joint work with Marie Generali Lince, Vincent Divol, Rémi Flamary, Solenne Gaucher.	Club de la Chasse ↗
10:45 – 11:15	Morning Coffee Break	Club de la Chasse ↗
11:15 – 12:00	Talk: Raul Castro Fernandez, "Data Ecology: Understanding and Designing Data Ecosystems". Abstract: Data shapes our world not only through personal data, but through every dataflow that determines what governments see, what firms model, what AI systems learn, and what organizations can do. Yet we lack a systematic account of what data does in these complex ecosystems. Without one, interventions remain partial or poorly targeted, while beneficial arrangements (rare-disease consortia, accountable government data access, compensation for data contributors) often fail to form. This talk introduces data ecology: a research program that studies and designs data ecosystems as systems of agents interacting through dataflows, drawing on tools from computer science, economics, law, and philosophy. The first part asks what data does, presenting the potential-effect function and the structural properties it implies: correlated spillovers, integration hubs, and dataflow dependence. The second part turns to design, focusing on three classes of data ecosystems: intra-organizational systems, illustrated by Pneuma, an agentic system for relational data work; cross-organizational systems, illustrated by a data escrow that makes data sharing programmable; and data-sharing markets, illustrated by consortia protocols for multi-party pooling. I will close by sketching the larger program and several pieces of in-progress work.	Club de la Chasse ↗
12:00 – 13:45	Lunch Break (on your own)	—
13:45 – 14:30	Talk: Federico Echenique, "Social choice, aggregation, and learning". Abstract: I will describe three results related to social choice, interpersonal utility comparisons, and their implications for learning and alignment. The first result avoids interpersonal comparisons by introducing utilitarian aggregation of random choice models. I will state a choice-theoretic analogue of the Harsanyi utilitarian aggregation theorem. The second result proposes a principled common utility yardstick for making fair social choices. The common yardstick is, we argue, a practical solution when interpersonal utility comparisons are needed. The third uses response times to learn a utilitarian utility representation. By assuming the drift-diffusion model (DDM), a behavioral model that gives response time an explicit role in decision making, we obtain the discipline needed to estimate the utilitarian average.	Club de la Chasse ↗
14:30 – 15:15	Talk: Kai Hao Yang, "Non-Discriminatory Personalized Pricing". Abstract: A monopolist offers personalized prices to consumers with unit demand. Consumers differ in their values, costs, and protected characteristics, such as race or gender. The seller is subject to a non-discrimination constraint: consumers with the same cost but different protected characteristics must face identical price distributions. Such regulations are present in markets like credit or insurance. We characterize the optimal pricing rule. Under this rule, surplus accrues to both protected groups, but only to those with intermediate values. Strengthening the constraint to cover transaction prices redistributes surplus, harming the low-value group and benefiting the high-value group. Meanwhile, prohibiting the use of protected characteristics as pricing inputs instead of regulating outputs harms the low-value group.	Club de la Chasse ↗
15:15 – 15:45	Afternoon Coffee Break	Club de la Chasse ↗
15:45 – 16:30	Talk: Ali Makhdoumi, "Data Markets in the Age of AI: Protecting Users Through Regulation and Agency". Abstract: Modern AI runs on data, which is increasingly bought and sold as platforms collect information from their users and provide it to advertisers and other third-party buyers. This trade powers valuable services and fuels innovation, but it also imposes various costs on users, including privacy costs. This talk takes a game-theoretic and mechanism-design view of that tension and presents a few problems in data economics. I will then focus on two complementary routes to aligning data markets with user welfare. The first models a three-layer data market in which users, platforms, and data buyers interact in a multi-stage game. We establish that platform competition tends to benefit the buyers rather than the users, so the usual intuition that competition helps consumers may not hold here. We also establish that the regulation best for users is often not a complete sharing ban but a non-uniform privacy mandate. The second gives users agency directly: a mechanism-design approach in which privacy-sensitive users report how much they value their privacy, receive (differential) privacy guarantees tailored to that preference, and are compensated accordingly. Here, we characterize the optimal mechanism that finds both the optimal minimax estimator (which delivers the promised privacy guarantees while minimizing estimation error) and a scheme that incentivizes truthful reporting on the user side. We then compare central and local architectures to deliver privacy. Based on joint work with Daron Acemoglu, Saeed Alaei, Alireza Fallah, Michael I. Jordan, Azarakhsh Malekian, Ali Daei Naby, and Asuman Ozdaglar.	Club de la Chasse ↗
16:30 – 17:15	Talk: Ruta Mehta, "Fairness and Incentives in Federated Learning". Abstract: With the advent of generative AI, the paradigms of data sharing become crucially important for both economic and welfare reasons. Federated learning (FL) offers an effective paradigm for sharing rich, distributed data while protecting data privacy. Nonetheless, the heterogeneous nature of distributed data makes it challenging to define and ensure fairness among local agents, creating incentive issues. For instance, intuitively, if not compensated properly, an agent with high-quality data may not be incentivized to participate if the data of others is of low quality. Furthermore, on the one hand, agents benefit from the global model trained on shared data. On the other hand, by participating in federated learning, they may also incur costs (related to privacy and communication) due to data sharing. In this talk, I will attempt to take a social choice and game-theoretic perspective to address these fairness and incentive issues. In this process, I will show how FL and social choice theory (SCT) can inform each other, leading to new insights and avenues.	Club de la Chasse ↗
17:15 – 17:30	Walk to Rice Global Paris Center	—
17:30 – 19:00	Poster Session & Reception (Note: change of location)	Rice Global Paris Center ↗

Day 2

Thursday, July 16

Time	Activity	Location
08:15 – 09:00	Breakfast	Club de la Chasse ↗
09:00 – 09:45	Talk: Rasmus Pagh, "Consistent Release of Hierarchical Data Under Differential Privacy". Abstract: Statistical data such as census data is often released in hierarchical form, with counts reported at multiple geographic levels such as blocks, municipalities, regions, states, and the nation as a whole. Differential privacy provides strong protection for individuals represented in such data by adding random noise to released counts. However, independent noise addition leads to inconsistencies: the reported count for a region need not equal the sum of the reported counts for its subregions. In joint work with Lebeda and Sejer, we show that consistency and improved accuracy can be achieved simultaneously through a direct recursive approach based on optimal matrix factorizations. Compared with the Gaussian mechanism at the same privacy guarantee, our method can reduce variance by up to a factor of three while producing consistent releases by construction. This improves upon previous “consistency by post-processing” approaches. We also present lower bounds suggesting that the method is optimal at least for some hierarchies, and show how it extends efficiently to sparse vector data. Finally, we discuss implications for practical deployments of differential privacy, including the release of U.S. Census statistics.	Club de la Chasse ↗
09:45 – 10:30	Talk: Edwige Cyffers, "Various Ways to Set Privacy Budgets". Abstract: In Differential Privacy (DP), the privacy budget quantifies how strong the protection provided by the mechanism is. A recurring criticism is that the privacy budget is not easily interpretable, and that it is therefore hard to communicate and understand what is effectively guaranteed by DP. We argue that this limitation is not specific to DP, and that quantifying the risk does not lie only in the privacy budget, but also in the careful definition of the output and the adjacency relationship. A popular first mitigation is to consider personalized privacy budgets depending on user expectations. We show that, for mean estimation, the gain is limited in comparison with a unique, carefully chosen privacy budget. We then propose a new way to interpret the privacy budget by considering the long-term effect of privacy leakage on users' trust: we show that if data leakage induces users to drop out of training, a non-trivial amount of privacy may lead to higher accuracy in the long term.	Club de la Chasse ↗
10:30 – 11:00	Morning Coffee Break	Club de la Chasse ↗
11:00 – 11:45	Talk: Antti Honkela, "Towards Privacy Standards for AI in Health". Abstract: Privacy is widely regarded as an essential requirement for AI in health applications, yet implementing it in practice at scale is far from trivial. In my talk, I will discuss challenges in the implementation of privacy in the context of secondary use of health data in the European Health Data Space (EHDS) as well as steps toward possible solutions. These include the calibration of the privacy–utility tradeoff, quantifying and communicating the privacy of common algorithms, as well as external verification of privacy promises.	Club de la Chasse ↗
11:45 – 13:45	Lunch Break (on your own)	—
13:45 – 14:30	Talk: Ravi Kumar (title and abstract TBD)	Club de la Chasse ↗
14:30 – 15:15	Talk: Eric Mazumdar, "A Behavioral Foundation for Multi-Agent Learning". Abstract: Emerging applications in AI are fundamentally multi-agent, yet little guidance exists for designing agents for strategic settings. Indeed, classical game-theoretic concepts like Nash equilibria are provably intractable and have failed to yield the algorithmic foundations needed for scalable multi-agent learning. In this talk, I will show, using both theory and experiments, that introducing strategic robustness into games provides principled foundations for multi-agent learning. First, I will show how strategically robust agents give rise to new game-theoretic equilibria that are provably robust and computationally tractable across all games. This allows us to develop principled and scalable multi-agent learning algorithms. Beyond the computational benefits, these approaches also yield surprising free lunches. In cooperative settings, our agents can achieve outcomes strictly better than Nash, exhibit less free-riding, and collaborate more consistently with new partners, with empirical gains extending even to preliminary experiments on collaborative tasks between LLMs. These results suggest that moving beyond classical game-theoretic concepts can provide new foundations for principled and scalable strategic decision-making.	Club de la Chasse ↗
15:15 – 15:45	Afternoon Coffee Break	Club de la Chasse ↗
15:45 – 16:30	Talk: Steven Wu, "The Agentic Garden of Forking Paths". Abstract: AI agents are reshaping data science by automating both machine learning research and data analysis in empirical research. This talk examines what happens when agents enter two classical sources of statistical concern: adaptive reuse of held-out data and the garden of forking paths. In the first part, we use autonomous research agents to explore why repeated benchmark use often does not lead to severe overfitting, showing that successful agent-discovered strategies can be compressed into very short prompts and reproduced by fresh agents. In the second part, we use many AI analysts to explore how different defensible analytical choices can lead to different conclusions, while also revealing how easily agentic analysis can be steered toward selective reporting. Together, these results suggest that agents can help map and stress-test scientific workflows, but also demand new tools, audits, and standards for validity in the agentic era.	Club de la Chasse ↗
16:30 – 17:15	Talk: Clément Canonne, "Verification of Statistical Properties from Sensitive Data". Abstract: Analyzing (very) large datasets to build accurate models is the workhorse of machine learning and underlies most of the advances in AI/ML over the past decades. These datasets are increasingly seen as valuable assets, e.g., due to the difficulty in obtaining them (sensitive, regulated, or carefully curated user data), generating them (compute-heavy processes), or trusting them (poisoning attacks). For companies owning such datasets, this leads to a thorny issue: how to convince interested customers that a dataset is reliable and has the application-specific statistical properties they need, while revealing as little data as possible? In this talk, I will discuss a recent line of work, from the theoretical computer science community, aimed at designing principled approaches to address this problem.	Club de la Chasse ↗
17:15 – 17:30	Closing Remarks	Club de la Chasse ↗