European Commission white paper on artificial intelligence: our response

Our response to the European Commission’s white paper on artificial intelligence, including its proposals for future regulation of AI products and services in the European Union.


Training data and algorithms

We agree that training data is fundamental to how AI and machine learning applications perform, and that "measures should therefore be taken to ensure that, where it comes to the data used to train AI systems, the EU's values and rules are respected, specifically in relation to safety and existing legislative rules for the protection of fundamental rights". We also agree on the importance of the three key considerations listed in the White Paper: safety, bias and discrimination, and privacy.

We note, however, that assessing the quality, diversity, and general fitness for purpose of a training dataset is a complex task, with implications for the skills that will be required of regulators and developers. SMEs are at a double disadvantage: they have less access to large, high-quality datasets than large incumbents such as Google and Facebook, and less access to the skills and resources required to critically assess those datasets and take remedial measures (such as correcting for bias).

It is also intrinsically difficult to assess whether a training dataset is "good enough", in terms of how its characteristics will translate into an AI application making decisions that are safe and protect fundamental rights. Clearly, a biased training dataset is unlikely to lead to good real-world performance or adequate protection of those rights. But apparent "diversity" in the training data is no guarantee that the resulting model (a neural network, for instance) makes non-discriminatory decisions, or is generally fit for purpose. It would also be difficult to ascertain that a dataset covers all potential "dangerous" scenarios, as these typically come to light only after the event. As pointed out in the White Paper, there will also be a need to verify the "relevant programming and training methodologies, processes and techniques used to build, test and validate AI systems" – not only the training data in isolation. This will place further demands on the skills of regulators, particularly if they are required to inspect the details of algorithms and the underlying mathematics.

The implementation of remedies for training data will also pose challenges. The Commission suggests that a possible remedy is "re-training the system in the EU in such a way as to ensure that all applicable requirements are met." However, it is unlikely that training a system in the EU would be either necessary or sufficient to meet those requirements. Such a remedy also raises the question of how developers would access appropriate EU datasets for their AI applications. Would such data be made available through the Commission, or would firms need to procure datasets from third-party companies or collect the data independently? Either of the latter options might put certain businesses, such as SMEs, at a disadvantage.

Contact

Email: ai@gov.scot
