Our solutions use open source scientific software. Together with our contributors, we bring the methodology and ethos of open collaboration from the open source developer community, because we believe that it results in high quality and fast development, and that it facilitates cooperation among small and large entities.
Our observatories are ideally suited to provide the evidence for evidence-based policy design and control. We aim to apply the Open Policy Analysis Guidelines, originally developed in the United States2, in a way that incorporates the best practices of open and reproducible science within the European Union3 and in the United Kingdom. The OPA Guidelines grew out of open science initiatives, such as the Berkeley Initiative for Transparency in the Social Sciences, the Data Access and Research Transparency (DA-RT) group, the Center for Open Science and its TOP Guidelines, and the Meta-Research Innovation Center at Stanford University.
Our OpenMuse project is the first large-scale European application of Open Policy Analysis in the policy context of Music Moves Europe. It aims to fill in as much as possible of the 7 data gap areas identified in the Feasibility study for the establishment of a European Music Observatory (in short: EMO Feasibility Study)4 and, at the same time, to provide a cutting-edge prototype for this future, EU-recognized observatory. We invite all representative stakeholders into this project so that they can take ownership after the prototyping phase.
Not all our users have to adhere to the Open Policy Analysis Guidelines. We believe that even partial compliance with these guidelines, and the use of our tools that help their application, will lead to higher-quality policy design and more evidence-based, factual policy control.
We are experts on open data. In the European context, open data means taxpayer-funded resources that can be reused for commercial or non-commercial purposes. Its reuse is governed by the Open Data Directive, which is transposed into various Lithuanian laws5. In the United States, such data is governed by the Freedom of Information Act.
Open data often cannot simply be ‘downloaded’, because it was not originally intended for public release. Our data observatories build often complicated data pipelines and navigate legal environments to bring valuable data that has never been exploited commercially or academically into the sunlight.
Open data is not always cheaper than private data, but sometimes it is the only available source, because only governments have the ability or the mandate to collect it. Often its use results in lower data acquisition costs or higher data quality. Combining proprietary data with open data leads to better information and increases the usability and value of your proprietary data.
Using our experience in the area, we can tap into data from Europe’s taxpayer-funded satellites, international social science surveys, and even anonymized tax data.
The EU’s concept of ethically applied and regulated AI is trustworthy AI. Because AI applications train algorithms on data, a complementary concept is trustworthy data. If an AI system learns from bad data, its actions will be erratic and potentially harmful. The Data Governance Act of 2022 has both sector-agnostic and sector-specific rules6.
We go to great lengths to maintain the highest quality of our data and to preserve its integrity in dissemination. All our research artifacts, including datasets, visualizations, codebooks, tutorials, and long-form documentation, are placed into independent repositories. They are versioned, have a globally unique DOI, and can always be compared with an authoritative copy.
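One common way to compare a local file with an authoritative copy is a cryptographic checksum. The sketch below is only an illustration of that idea, not a description of how our repositories work: the sample content, the function names, and the assumption that a SHA-256 digest is published alongside each release are all hypothetical.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of a byte string as lowercase hex."""
    return hashlib.sha256(payload).hexdigest()


def matches_authoritative(local_copy: bytes, published_digest: str) -> bool:
    """True if the local content hashes to the published digest."""
    return sha256_hex(local_copy) == published_digest.strip().lower()


# Hypothetical file content standing in for a dataset release.
authoritative = b"geo,time,value\nLT,2020,12.5\nNL,2020,48.1\n"
# Assumption: a digest like this one is published together with the release.
published_digest = sha256_hex(authoritative)

intact_copy = authoritative
altered_copy = authoritative.replace(b"12.5", b"21.5")  # two digits swapped

print(matches_authoritative(intact_copy, published_digest))   # True
print(matches_authoritative(altered_copy, published_digest))  # False
```

Even a two-digit change in one cell produces a different digest, so a user can detect a modified copy without inspecting the data cell by cell.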
Our research products aim for the highest quality through reviewability and replicability. This means that we provide our users with documentation of the data lifecycle, from source to release, and we also give access to the algorithms and the actual software code that processed the data. Not only our datasets, but also our data processing code and our mathematical or statistical algorithms are open for peer review. In fact, we are organizing an ecosystem of open source statistical and scientific software around our observatories to facilitate continuous improvements in the timeliness, quality, (re-)usability and presentation of our data.
The benefits of our open source software are less dependency on vendors, quicker solutions to problems, very high documentation standards, and verified quality.
Our users want to remain competitive in commercial innovation or in the fierce competition for new scientific discoveries. For us, openness is an important, inclusive quality control measure, not a self-serving goal. We are not open data activists and we are not evangelists of open source software. We want to make sure that if you use IBM’s SPSS, Microsoft’s Excel, or Apple’s Numbers, you will get a seamless data use experience. Our open source software is continuously tested on Windows, Mac/BSD and various Linux distributions. We release our datasets, visualizations and documents in various portable formats.
We aim to synchronize our data with reliable statistical services, and to maintain human- and machine-centered APIs. Therefore, our data products follow the tidy data principles and the data cube model of the Statistical Data and Metadata eXchange (SDMX) and of the W3C, the consortium that maintains the main World Wide Web infrastructure. Such data has very high usability: it imports logically and easily into spreadsheet or statistical applications.
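To illustrate what the tidy data principles mean in practice, the sketch below reshapes a spreadsheet-style "wide" table into tidy rows, with one observation per row and one variable per column. The table content, the column names and the `to_tidy` helper are hypothetical and use only the Python standard library; they are not taken from our data products.

```python
import csv
import io

# Hypothetical wide table: one column per year, as often found in spreadsheets.
WIDE_CSV = """geo,indicator,2019,2020
LT,concert_visits,1200,400
NL,concert_visits,5300,1900
"""


def to_tidy(wide_text: str, id_columns: tuple) -> list:
    """Reshape a wide CSV into tidy rows: one observation per row."""
    tidy_rows = []
    for row in csv.DictReader(io.StringIO(wide_text)):
        for column, cell in row.items():
            if column in id_columns:
                continue  # identifier columns are copied, not unpivoted
            tidy_row = {key: row[key] for key in id_columns}
            tidy_row["time"] = column      # the former column header
            tidy_row["value"] = float(cell)  # the measured value
            tidy_rows.append(tidy_row)
    return tidy_rows


tidy = to_tidy(WIDE_CSV, id_columns=("geo", "indicator"))
for observation in tidy:
    print(observation)
```

In this layout `geo`, `indicator` and `time` play a role similar to the dimensions of a data cube and `value` to its measure, which is why such rows import predictably into spreadsheet or statistical applications.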
In our visualizations we aim to provide similarly seamless reusability with our users’ favorite software suites. We disseminate text in various easy-to-read formats, including HTML and EPUB, and upon request we can even export automatically into Word documents.
One of our goals is that our data products can be easily imported into any database, but we go beyond this goal. We aim to participate in the web 3.0, or the web of data, where our databases can automatically connect to other, trusted databases of statistical authorities, global libraries, or national heritage centers, and synchronize with the latest sources. Furthermore, we want to synchronize the research products of our partners with the global knowledge system for greater impact.
We are developing two types of APIs for our data.
Our APIs designed for heavy human users offer CSV downloads and direct SQL or dbplyr (R) queries.
Our APIs designed for automated connection with other knowledge graphs and databases will be open to SPARQL queries and will use RDF.
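As a rough illustration of the human-oriented route, the sketch below loads a CSV download into a local SQLite database and runs a plain SQL query on it. The CSV content, the table name and the column names are hypothetical, and SQLite merely stands in for whichever SQL backend a user prefers; this is not the interface of our API.

```python
import csv
import io
import sqlite3

# Hypothetical CSV download; the column names are assumptions for illustration.
CSV_TEXT = """geo,time,value
LT,2019,1200
LT,2020,400
NL,2019,5300
NL,2020,1900
"""

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE observations (geo TEXT, time INTEGER, value REAL)"
)

# Parse the CSV text and convert each field to the column type declared above.
rows = [
    (record["geo"], int(record["time"]), float(record["value"]))
    for record in csv.DictReader(io.StringIO(CSV_TEXT))
]
connection.executemany("INSERT INTO observations VALUES (?, ?, ?)", rows)

# Average value per country, the kind of query an analyst might run.
averages = connection.execute(
    "SELECT geo, AVG(value) FROM observations GROUP BY geo ORDER BY geo"
).fetchall()
print(averages)
connection.close()
```

Because the downloaded table is already tidy, no reshaping step is needed between the download and the query.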
Our RDF services are under development and will be released throughout the second half of 2022. As they are intended to synchronize with any well-designed and connected database, they are not tied to any particular database schema. This, in turn, makes our data particularly easy to import into any pre-existing relational database.
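The point about schema independence can be sketched with subject-predicate-object triples, the basic shape of RDF statements. The example below is a simplified stand-in and not an RDF implementation: the `example.org` identifiers, the helper functions and the target table are hypothetical, and a real workflow would use an RDF parser and SPARQL rather than hand-written tuples.

```python
import sqlite3

# Hypothetical triples (subject, predicate, object); all URIs are made up.
TRIPLES = [
    ("http://example.org/obs/1", "http://example.org/dim/geo", "LT"),
    ("http://example.org/obs/1", "http://example.org/dim/time", "2020"),
    ("http://example.org/obs/1", "http://example.org/measure/value", "400"),
    ("http://example.org/obs/2", "http://example.org/dim/geo", "NL"),
    ("http://example.org/obs/2", "http://example.org/dim/time", "2020"),
    ("http://example.org/obs/2", "http://example.org/measure/value", "1900"),
]


def local_name(uri: str) -> str:
    """Return the last path segment of a URI, e.g. 'geo'."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


def triples_to_records(triples: list) -> dict:
    """Group triples by subject into one property dictionary per subject."""
    records = {}
    for subject, predicate, obj in triples:
        records.setdefault(subject, {})[local_name(predicate)] = obj
    return records


records = triples_to_records(TRIPLES)

# The target schema is chosen by the importer, not dictated by the triples.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE music_stats (country TEXT, year INTEGER, amount REAL)"
)
connection.executemany(
    "INSERT INTO music_stats VALUES (?, ?, ?)",
    [(r["geo"], int(r["time"]), float(r["value"])) for r in records.values()],
)
imported = connection.execute(
    "SELECT country, year, amount FROM music_stats ORDER BY country"
).fetchall()
print(imported)
connection.close()
```

The triples carry no table or column layout of their own, so the importing side is free to map them onto whatever column names its existing database already uses.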