Realizing Trustworthy AI – Third Requirement: Privacy and data governance

Privacy and data protection

Today we are talking about the third requirement: privacy and data governance. Privacy is closely linked to the previous requirement, “Technical robustness and safety”, while data governance focuses on the quality and integrity of the data used. The true power of artificial intelligence lies in inferring traits, behaviors, sexual orientation or political views from sometimes unrelated data. Privacy protection therefore needs to extend not only to the data provided, but also to the information inferred from it.

Quality and integrity of data

It cannot be said often enough: if you feed a smart system with bad, false or outdated data, no AI system will be able to produce a correct answer from it. “Bad data in – bad data out” is a standard saying in the data science community. For that reason I have created a project management framework in which data quality and data engineering are discussed prominently to avoid these problems. If you have questions in this regard, please let me know.
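To make the idea of bad, false or outdated data a little more concrete, here is a minimal sketch of a record-level quality check. It is only an illustration under stated assumptions: the field names, the one-year freshness limit and the function name `find_quality_issues` are hypothetical and are not taken from the framework mentioned above.

```python
from datetime import date, timedelta

# Hypothetical rule set: the field names and the one-year limit are assumptions.
REQUIRED_FIELDS = ("customer_id", "email", "last_updated")
MAX_AGE = timedelta(days=365)


def find_quality_issues(record, today):
    """Return a list of human-readable problems found in one record."""
    issues = []

    # Missing data: a required field is absent or empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing {field}")

    # False data: a very rough plausibility check on the email address.
    email = record.get("email")
    if email and "@" not in email:
        issues.append("malformed email")

    # Outdated data: the record has not been updated within MAX_AGE.
    updated = record.get("last_updated")
    if updated is not None and today - updated > MAX_AGE:
        issues.append("outdated record")

    return issues
```

A record that returns an empty list passes these three checks; anything else could be held back for review before it reaches a model.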

Access to data

This topic is also part of a company's governance strategy. Most companies focus on access to folders and groups rather than access to the data itself. At first sight this might seem the same, but in practice data is easily moved from one folder to another, linked in the backend, or shared via local storage or private cloud storage. To prevent this, a company-wide data storage policy needs to be in place first. Second, all existing data needs to be reviewed and grouped according to its criticality. It is surprising how little companies know about their existing data and its level of criticality.
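Grouping data by criticality can be sketched in a few lines. The three levels, the field-to-level mapping and the function name `classify_dataset` below are assumptions made for illustration; a real policy would define its own levels and fields.

```python
from enum import IntEnum


class Criticality(IntEnum):
    """Hypothetical criticality levels, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2


# Hypothetical mapping from field name to criticality level.
FIELD_CRITICALITY = {
    "press_release": Criticality.PUBLIC,
    "name": Criticality.INTERNAL,
    "email": Criticality.INTERNAL,
    "salary": Criticality.CONFIDENTIAL,
    "iban": Criticality.CONFIDENTIAL,
}


def classify_dataset(field_names):
    """A dataset is treated as being as critical as its most sensitive field.

    Unknown fields default to INTERNAL so that nothing is public by accident.
    An empty dataset is classified as PUBLIC.
    """
    levels = (
        FIELD_CRITICALITY.get(name.lower(), Criticality.INTERNAL)
        for name in field_names
    )
    return max(levels, default=Criticality.PUBLIC)
```

Classifying by the most sensitive field means the label stays with the data even when a file is moved to a different folder.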

The General Data Protection Regulation (GDPR) started a valuable discussion and forced companies to look into the data they have, create and store.

However, we see that companies which do not follow a unified communication or unified-communication-as-a-service approach often also have trouble pinpointing all the sources that create data. Focusing on a unified communication approach first will therefore help to solve the problem of data accessibility later.

Let me give an example of what I mean. Imagine an employee is about to send a list of all employees' bank information to an external client via a non-encrypted email channel. According to internal guidelines this is forbidden. But since you have no unified communication protocol in place, no system detects this error, and damage could be done. With a unified communication system in place, however, you would always know:

  1. Where your data is created
  2. Where your data is stored
  3. Who can access this data
  4. How this data can be used
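The email scenario above can be sketched as a simple outbound check. This is a rough illustration, not a description of any particular unified communication product: the internal domain, the simplified IBAN pattern and the names `OutboundMessage` and `should_block` are all assumptions.

```python
import re
from dataclasses import dataclass

# Assumption: every address under this domain counts as internal.
INTERNAL_DOMAIN = "example.com"

# Simplified, hypothetical IBAN pattern: country code, two check digits, then
# groups of up to four characters, optionally separated by single spaces.
IBAN_PATTERN = re.compile(
    r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,4})?\b"
)


@dataclass
class OutboundMessage:
    recipient: str
    body: str
    encrypted: bool


def should_block(message):
    """Block unencrypted messages to external recipients that contain bank data."""
    is_external = not message.recipient.lower().endswith("@" + INTERNAL_DOMAIN)
    contains_bank_data = IBAN_PATTERN.search(message.body) is not None
    return is_external and contains_bank_data and not message.encrypted
```

Such a check only works if every outbound channel passes through it, which is the practical argument for a unified approach.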

We will talk about unified communication as a means of data governance in more detail in a later chapter.

In the next chapter we will discuss the fourth requirement: transparency.

Goldblum's Services