And They’ll Take Their Money Elsewhere If They Don’t Get It
In the past three years, customer awareness of privacy has grown dramatically. Many customers are now rethinking how they buy, taking their business elsewhere if they don’t trust a company’s data practices. At the same time, steadily increasing fines for privacy non-compliance mean that ignoring privacy concerns is a costly mistake. As data leaks continue and major data scandals break, companies are beginning to cater to a growing number of privacy-aware consumers by highlighting their privacy practices as a value-add.
While many organizations care about privacy, the unfortunate reality is that there are few easy, efficient, and effective ways to integrate privacy into business systems and processes. Current solutions either carry multiple inherent risks or are so complex that compliance costs become a major deterrent to action. Businesses need a new solution that takes into account how data is shared and leveraged, and that does not require them to forfeit control of the data to any third party.
The economics of privacy and de-identification
According to a 2019 Cisco study, customers are now making purchase decisions based on privacy: 61 percent of people said they won’t buy from a company if they don’t trust how that company handles their data. In addition, 48 percent of people have switched companies or providers due to data privacy concerns.
Even if customers don’t take their money elsewhere, revenue still depends on responsible privacy practices. As more companies move to either sell data or make business decisions based on data, keeping that data secure, and out of the hands of competitors, is critical to maintaining a competitive advantage. On the cost side, privacy is becoming expensive to ignore. With GDPR and the advent of CCPA, fines for non-compliance regularly reach millions, as can the cost of lost business from reputational damage.
As a result, businesses have turned towards a variety of data privacy technologies in order to keep their data secure. Now that permissioning and encryption have become standard, a critical remaining weak link is the risk of data being exposed by a malicious employee, by an outside hacker, or when data is decrypted for analysis, such as when addressing a customer support ticket. De-identification is often considered as a way to keep data secure and private, because it allows even more precise control over data access. An ideal de-identification solution draws on both advances in edge machine learning and re-identification risk scoring, specialized skills that are unreasonable to expect from the average developer. Businesses have typically used rule-based systems (such as regular expressions) to get around some of the complexity. Unfortunately, rule-based systems are inadequate for securely de-identifying sensitive data such as faces, license plate numbers, or personally identifiable text within unstructured data sources.
When expertly built with artificial intelligence, a de-identification system can reach over 99% accuracy, so the likelihood of a user’s information being compromised is dramatically reduced while the data’s value for data-driven decision making is maintained. When used in tandem with encryption and permissioning, de-identification provides the next level of confidence in data privacy and serves as a competitive differentiator as customers increasingly demand privacy.
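To make the limitation of rule-based systems concrete, here is a minimal sketch of a regular-expression redactor. The two patterns and the sample support ticket are illustrative assumptions, not a rule set taken from any real product. The rules catch a well-formed email address and phone number, but miss the customer's name and a phone number that has been spelled out in words.

```python
import re

# Hypothetical rule set: each pattern is an illustrative assumption,
# not a complete definition of personally identifiable information.
RULES = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact_with_rules(text: str) -> str:
    """Replace every match of each rule with a [LABEL] placeholder."""
    for label, pattern in RULES.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Made-up support ticket used only for illustration.
ticket = (
    "Hi, this is Maria Lopez. Reach me at maria.lopez@example.com or 416-555-0199. "
    "My other number is four one six, five five five, zero one four two."
)

redacted = redact_with_rules(ticket)
print(redacted)
```

The name and the spelled-out number pass through untouched because no pattern anticipates them. Adding more rules narrows the gap but cannot enumerate every way personal information appears in free text.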
The conundrum of current de-identification solutions
Right now, organizations usually tackle privacy and data security in one of three ways:
- Relying on piecemeal systems to meet regulatory compliance.
- Implementing in-house solutions.
- Leveraging third-party cloud solutions.
Some businesses lean on piecemeal systems to meet regulatory compliance, or aren’t convinced that they’ll be hit with non-compliance fines. In reality, fines are already large, they continue to grow, and regulatory bodies are only getting started. These businesses also face a groundswell of customers moving away from companies that don’t take privacy seriously. Based on current trends, such companies will be met with higher non-compliance fines and higher customer churn in the near future, which should convince them to pay attention to privacy.
Building in-house might seem like the most reliable option on the surface. However, a system that is accurate enough to meet stringent regulatory requirements, and that is also scalable and reliable, requires in-depth knowledge of machine learning and re-identification risk. That means a team of engineers and researchers to build and manage it, assuming you can recruit and afford them in the first place.
The reality is that custom in-house solutions hold plenty of promise, but rarely deliver a functional product that holds up to the high accuracy, precision, and recall standards that are required to properly protect consumer data. And even if companies can hire the right team, in-house solutions take a significant amount of time to build, further delaying true customer privacy, and do not leverage years of learnings from numerous projects and technological environments.
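As a rough illustration of the precision and recall standards mentioned above, the sketch below scores a de-identifier's output against hand-labeled spans. The span offsets and labels are made-up values chosen only to show the arithmetic. In this setting, recall is the privacy-critical number (every missed entity is a potential leak), while precision reflects how much useful, non-sensitive data is needlessly removed.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, entity label)


def precision_recall(true_spans: Set[Span], predicted_spans: Set[Span]) -> Tuple[float, float]:
    """Exact-match precision and recall over labeled spans."""
    true_positives = len(true_spans & predicted_spans)
    precision = true_positives / len(predicted_spans) if predicted_spans else 1.0
    recall = true_positives / len(true_spans) if true_spans else 1.0
    return precision, recall


# Hypothetical annotations for one document (values are illustrative only).
true_spans = {(12, 23, "NAME"), (36, 59, "EMAIL"), (63, 75, "PHONE"), (90, 101, "ADDRESS")}
# Hypothetical system output: finds the email and phone, misses the name and
# address, and flags one span that is not actually sensitive.
predicted_spans = {(36, 59, "EMAIL"), (63, 75, "PHONE"), (80, 85, "NAME")}

precision, recall = precision_recall(true_spans, predicted_spans)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Exact span matching is the strictest scoring choice. A real evaluation might also credit partial overlaps, which is a design decision this sketch leaves out.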
Third-party cloud privacy solutions seem to offer a middle ground. They are scalable, affordable, and quick to set up. However, the idea that they are a silver bullet is a misconception. Cloud solutions are less flexible in terms of deployment, and they open up a new vulnerability because private data is transferred to, stored on, and manipulated on an external server. The very act of moving that data to a third party is a significant liability and can count as a sensitive data transfer under regulations like GDPR, increasing the compliance burden. Additionally, if that third party is compromised, you’ve created a litany of new privacy problems to deal with.
A new solution
Beyond the complexities of building an in-house solution or bringing in a cloud provider, it has also become increasingly difficult to pin data down. Between employee sharing and businesses using multiple apps or platforms, customer data is in motion more than ever.
In order to properly protect customer data wherever it is, the next generation of privacy requires de-identification at the source. In particular, new solutions need to be:
Controllable: Cloud solutions are risky because data is transmitted to a third-party server. De-identification technology needs to work on the edge, de-identifying data at the source prior to transit, whether in an app, on the web, on-prem, or in any other environment, while being cheaper, faster, and more reliable than the cloud. Businesses should have control of personal data at all times instead of handing it to an outside platform, which means adding the new technology to their own core systems and workflows.
Customizable: Every organization has a different appetite for risk or might face different customer expectations. As such, a privacy solution should be completely customizable so each business can choose how and where information is de-identified.
Efficient: A usable de-identification solution must be built with the expectation that hundreds of thousands, if not billions, of documents will be processed every month. Scalability and speed are non-negotiable.
Facilitating compliance: A new solution should lower the cost of compliance and improve ease-of-use. For example: with adequate de-identification at source, many datasets would fall outside of GDPR requirements, lowering GDPR compliance costs.
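The requirements above can be sketched as a small pipeline in which free text is de-identified locally, under a per-entity policy, before a record is serialized for any outside system. Everything here is a hypothetical illustration: `toy_detector`, `deidentify`, `build_outbound_payload`, the policy format, and the sample record are assumptions, and the detector is a deliberately simple stand-in for the machine-learning model the article argues is actually needed.

```python
import json
import re
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, entity label)


def toy_detector(text: str) -> List[Span]:
    """Placeholder detector (assumption): finds only emails and long digit runs."""
    patterns = {
        "EMAIL": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+",
        "ACCOUNT_NUMBER": r"\b\d{8,}\b",
    }
    spans = []
    for label, pattern in patterns.items():
        for match in re.finditer(pattern, text):
            spans.append((match.start(), match.end(), label))
    return spans


def deidentify(text: str, detect: Callable[[str], List[Span]], policy: Dict[str, str]) -> str:
    """Apply the policy locally; labels missing from the policy are redacted by default."""
    # Replace from right to left so earlier offsets stay valid.
    # Assumes the detector returns non-overlapping spans.
    for start, end, label in sorted(detect(text), reverse=True):
        if policy.get(label, "redact") == "redact":
            text = text[:start] + f"[{label}]" + text[end:]
    return text


def build_outbound_payload(record: dict, detect: Callable[[str], List[Span]],
                           policy: Dict[str, str]) -> str:
    """Serialize a record for transit only after its free text has been de-identified."""
    safe = dict(record, note=deidentify(record["note"], detect, policy))
    return json.dumps(safe)


# Made-up record and two example policies reflecting different risk appetites.
record = {"ticket_id": 1042, "note": "Refund to account 12345678, confirm via jo@example.com"}
strict = build_outbound_payload(record, toy_detector, {"EMAIL": "redact", "ACCOUNT_NUMBER": "redact"})
lenient = build_outbound_payload(record, toy_detector, {"EMAIL": "keep", "ACCOUNT_NUMBER": "redact"})
print(strict)
print(lenient)
```

Passing the detector in as a function keeps the policy and transport code independent of how entities are found, so a stronger model could be swapped in without changing where de-identification happens.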
Privacy is a customer strategy
Helping businesses unlock datasets for analysis while strengthening employee and consumer privacy protections is why we built Private AI. Our proprietary AI-based technology lets organizations add privacy-preserving de-identification to almost any data workflow quickly and easily: all it takes is a few lines of code. What’s more, since our solution operates on the edge, we never touch, store, or see any of the data flowing through it, meaning no weak links in the privacy protection chain. It’s built for scale and runs efficiently without the need for expensive custom hardware like GPUs.
As more customers demand privacy — and are willing to spend elsewhere if they don’t get it — the business opportunity of privacy is clear. What used to be nearly as expensive to manage as it was to ignore is now simple, customizable, and scalable. Private AI enables businesses to do more with data — without putting it at risk.