The challenge

Accessing user data safely

More and more, organisations are collecting data about their users and customers. This data is then fed into sophisticated analytics, including machine learning algorithms, to unlock insightful information leading to higher value services and products.

The question is how organisations can then provide safe access to this data internally, or even share the data externally for societal or commercial benefit. The same question arises when different organisations want to share data safely with one another, which they have a strong incentive to do.

Most data custodians recognise the privacy and confidentiality risks in using and sharing their data both within and outside their organisations.  However, there is no consistent and repeatable methodology or related tool for data custodians to confidently measure and understand the level of such risks in their data for the purpose of sharing or releasing it.

Our response

Re-identification Risk Ready Reckoner (R4)

We have designed quantitative and qualitative privacy and confidentiality risk methodologies, with appropriate assessment metrics and frameworks, to understand the risks of sharing or releasing data, or even just providing access to a wider internal audience. These tools draw on information theory and stochastic models to estimate the residual risks associated with sharing sensitive data.

For example, one of our metrics measures the re-identification risk for an individual event or transaction, based on factors such as uniqueness, uniformity and/or linkability. Another of our metrics quantifies the risk of deducing a non-reported value in an aggregated data report.
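To make the idea of a uniqueness-based metric concrete, here is a minimal sketch. It is not R4's actual metric, which is not described here. It assumes a hypothetical list of records and a hypothetical choice of quasi-identifier columns (age, postcode, sex), and it scores each record as one over the number of records sharing the same quasi-identifier values.

```python
from collections import Counter


def uniqueness_risk(records, quasi_identifiers):
    """Illustrative only: score records by how rare their quasi-identifier values are.

    Returns (fraction of records that are unique, per-record risk),
    where per-record risk is 1 / (size of the group sharing the same values).
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    per_record = [1.0 / counts[k] for k in keys]
    unique_fraction = sum(1 for k in keys if counts[k] == 1) / len(keys)
    return unique_fraction, per_record


# Hypothetical toy data, invented for this example.
records = [
    {"age": 34, "postcode": "2000", "sex": "F"},
    {"age": 34, "postcode": "2000", "sex": "F"},
    {"age": 51, "postcode": "2600", "sex": "M"},
    {"age": 29, "postcode": "3000", "sex": "F"},
]

unique_fraction, per_record = uniqueness_risk(records, ["age", "postcode", "sex"])
print(unique_fraction)  # 0.5: two of the four records are unique
print(per_record)       # [0.5, 0.5, 1.0, 1.0]
```

A record with a risk of 1.0 is the only one with its combination of values, so anyone who knows those values from another source could single it out.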

We have also developed software, such as our Re-identification Risk Ready Reckoner (R4), to implement these metrics and methodologies. R4 generates quantifiable risk assessments and displays them on a working dashboard. It also provides data treatment options, such as binning and perturbation, to help data custodians mitigate these risks, and then re-assesses the risk in the treated data.
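The assess, treat and re-assess loop can be sketched in a few lines. This is an illustrative example under stated assumptions, not R4's implementation: the rows, the 10-year age bands, and the helper names `unique_fraction` and `bin_age` are all hypothetical, and the risk measure is simply the share of rows that are unique.

```python
from collections import Counter


def unique_fraction(rows):
    """Share of rows whose value combination appears exactly once."""
    counts = Counter(rows)
    return sum(1 for r in rows if counts[r] == 1) / len(rows)


def bin_age(age, width=10):
    """Binning treatment: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Hypothetical (age, postcode) rows, invented for this example.
rows = [
    (31, "2000"), (34, "2000"), (38, "2000"),
    (52, "2600"), (57, "2600"),
    (45, "3000"),
]

before = unique_fraction(rows)                     # every row is unique
treated = [(bin_age(age), pc) for age, pc in rows]
after = unique_fraction(treated)                   # only the 40-49 row stays unique

print(f"unique before: {before:.2f}, after binning: {after:.2f}")
```

Binning lowers the risk at the cost of precision, which is why the treated data needs to be checked for both residual risk and remaining utility.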

The results

Improving awareness of privacy and confidentiality risk

Our work is improving awareness of privacy and confidentiality risk in data and helping in the management of that risk across the data ecosystem.

Our privacy and confidentiality risk frameworks and R4 software have been used extensively in several commercial engagements, identifying and measuring re-identification risks in so-called de-identified data pending release (or in some cases already released), as well as inference risks of not-reported data in confidential financial reports.

ISP Dashboard

Dashboard results from running ISP’s Re-identification Risk Ready Reckoner (R4) on a publicly available census data set (i.e. the “adult” dataset from the UCI Machine Learning Repository at https://archive.ics.uci.edu/ml/datasets/adult)

Through these engagements, we have observed cases where data custodians adjusted their approach to making data available because they better appreciated the risk it carries. In other cases, guided by our framework, data custodians applied targeted transformations to the data to reduce the residual risks, while still maintaining an acceptable level of utility, before releasing it.


Find out more: Information Security and Privacy

Do business with us to help your organisation thrive

We partner with small and large companies, government and industry in Australia and around the world.

Contact us now to start doing business
