2 Open Data
In the European Union, the EU High-Level Expert Group on Artificial Intelligence defined artificial intelligence for policy purposes and laid out its suggested ethical guidelines7. It has also defined “high-risk” applications of AI, which will be regulated at EU level by the proposed Artificial Intelligence Act (AI Act). The Fundamental Rights Agency of the European Union published an insightful report on the challenges of putting the regulation into practice in a way that protects fundamental rights. This document is perhaps the first hands-on guide for regulators and courts on how to actually implement AI regulations8.
The EU’s concept of ethically applied and regulated AI is trustworthy AI. Because AI applications train algorithms from data, a complementary concept is trustworthy data. If an AI system learns from bad data, its actions will be eccentric and potentially harmful. The Data Governance Act was published less than a month ago, and it has both sector-agnostic and sector-specific rules. Another important measure, the Open Data Directive, is transposed into various Lithuanian laws9.
The European Union, unlike the United States, chose to create general rather than sector-specific AI regulation. Because cultural policy is regulated at the national level, and because the EU High-Level Expert Group’s questionable definition of high-risk AI excludes copyrights (although they are important personal rights protected by the EU treaties), there is currently no comprehensive EU policy on AI in the cultural and creative sectors and industries, to which the Music Tech and the broader Art Tech scenes belong.
In the United States, where regulation is less wide-ranging but very specific, some industries, like banking, already have very powerful tools to combat gender or racial biases of algorithms. We can learn a lot from the American experience, which is more problem-focused. Our demo project with the Slovak music industry, supported by the Slovak Arts Council, was heavily influenced by two books: Cathy O’Neil’s Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, which started similar dialogues on both sides of the Atlantic Ocean, and Data Feminism by Catherine D’Ignazio and Lauren F. Klein, which applies a critical approach to data ethics informed by the ideas of intersectional feminism and can be easily adapted to a small-enterprise, small-country, or ethnic-bias scenario10. These very popular and critical books show how earlier inequalities in the treatment of vulnerable groups (women or racial/ethnic minorities) can be amplified by data-driven and AI applications.
Their most fundamental insight is that big data creates inequalities. Only very strong global corporations, leading universities and powerful governments can organize and finance systematic, large, high-quality data collections. Small states that host no global corporations or world-leading, well-endowed universities are left, together with their citizens, with less power to collect data and train algorithms. Lithuanian music is mainly sold on the U.S.-based and U.S.-regulated global platforms of YouTube (Alphabet/Google), Apple or Spotify. Their AI systems are trained on various global data resources and on their own, mainly American and Western European, audiences. Currently there are no means to check whether they treat Lithuanian artists similarly to U.S.-based artists who are supported by the global databases of major artists. Nor can we be sure that they treat women or non-binary artists similarly to male artists.
The EU data policies and the EU innovation and science policies try to balance these inequalities by promoting data interoperability. Data interoperability means that various data assets can be assembled and integrated into larger data assets, such as databases or datasets, which can be used, among other things, to better train algorithms with machine learning and deep learning, or to better test existing commercial systems for potentially harmful biases. A critically important, interrelated concept is metadata: information about how data should be used.
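To make the role of metadata concrete, the following is a minimal Python sketch of how two small data assets could be integrated into a larger one. Everything in it is hypothetical and only for illustration: the field names (`track_id`, `country`, `plays`), the `plays_scale` metadata key, and the sample rows are our assumptions, not an existing standard or a real dataset. The point is only that each dataset carries metadata describing how its columns and units should be read, and that this description is what allows the two to be combined.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    rows: list      # list of dict records, in the publisher's own layout
    metadata: dict  # describes how the columns should be interpreted


def harmonize(ds):
    """Translate a dataset into a shared schema using its metadata."""
    columns = ds.metadata["columns"]          # shared name -> local column name
    scale = ds.metadata.get("plays_scale", 1)  # unit of the play counts
    return [
        {
            "track_id": row[columns["track_id"]],
            "country": row[columns["country"]],
            "plays": row[columns["plays"]] * scale,
        }
        for row in ds.rows
    ]


def integrate(*datasets):
    """Assemble several harmonized datasets into one larger data asset."""
    totals = {}
    for ds in datasets:
        for rec in harmonize(ds):
            key = (rec["track_id"], rec["country"])
            totals[key] = totals.get(key, 0) + rec["plays"]
    return totals


# Hypothetical dataset A: already uses the shared column names, raw counts.
dataset_a = Dataset(
    rows=[
        {"track_id": "T1", "country": "LT", "plays": 1200},
        {"track_id": "T2", "country": "LT", "plays": 300},
    ],
    metadata={
        "columns": {"track_id": "track_id", "country": "country", "plays": "plays"},
    },
)

# Hypothetical dataset B: different column names, counts in thousands.
dataset_b = Dataset(
    rows=[{"id": "T1", "market": "LT", "plays_k": 2}],
    metadata={
        "columns": {"track_id": "id", "country": "market", "plays": "plays_k"},
        "plays_scale": 1000,
    },
)

combined = integrate(dataset_a, dataset_b)
```

Without the metadata of the second dataset, its value of 2 would be added as two plays rather than two thousand, and the combined figure would be silently wrong. This is why metadata is treated here as inseparable from interoperability.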
The Open Data Directive aims to foster the re-use of data that the private sector cannot collect (i.e. taxpayer-funded data) for any use in an interoperable way. The Data Governance Act tries to encourage data altruism, where people voluntarily share their data for ethical purposes. Using open and voluntary data sharing to assemble large enough datasets in Lithuania for ethical, trustworthy use is critical to level the playing field for the country, its cultural and creative sectors, and its fledgling tech industries.