As any data scientist will tell you, datasets are the lifeblood of artificial intelligence (AI). That poses an inherent challenge for industries dealing in personally identifiable information (e.g., health care), but fortunately, encouraging progress has been made toward an anonymized, encrypted approach to model training.
Today at the NeurIPS 2018 conference in Montreal, Canada, Intel announced that it has open-sourced HE-Transformer, a tool that allows AI systems to operate on sensitive data. It’s a backend for nGraph, Intel’s neural network compiler, and is based on the Simple Encrypted Arithmetic Library (SEAL), an encryption library that Microsoft Research also released as open source this week.
The two companies characterized HE-Transformer as an example of “privacy-preserving” machine learning.
“HE allows computation on encrypted data. This capability, when applied to machine learning, allows data owners to gain valuable insights without exposing the underlying data; alternatively, it can enable model owners to protect their models by deploying them in encrypted form,” Fabian Boemer, a research scientist at Intel, and Casimir Wierzynski, Intel’s senior director of research, wrote in a blog post.
The “HE” in HE-Transformer is short for homomorphic encryption, a form of cryptography that enables computation on ciphertexts, that is, plaintext (such as file contents) that has been encrypted using an algorithm. The computation produces an encrypted result that, when decrypted, matches the result of performing the same operations on the unencrypted data.
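To make that property concrete, here is a minimal sketch using a toy version of the Paillier cryptosystem, which is homomorphic for addition only. It is not the CKKS scheme or the SEAL library that HE-Transformer builds on; it was chosen only because it fits in a few lines. The primes are tiny placeholders, and the function names (`encrypt`, `decrypt`, `add_encrypted`, `scale_encrypted`) are illustrative assumptions, not part of any Intel or Microsoft API.

```python
import math
import secrets

# Toy key material: tiny primes chosen for readability only.
# Real deployments use primes of 1024 bits or more.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
G = N + 1                                              # standard simple generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)      # lcm(P-1, Q-1), private
MU = pow(LAM, -1, N)                                   # inverse of LAM mod N, private


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N with fresh randomness."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Recover the plaintext using the private values LAM and MU."""
    return ((pow(c, LAM, N_SQ) - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod N)."""
    return (c1 * c2) % N_SQ


def scale_encrypted(c: int, k: int) -> int:
    """Raising a ciphertext to a plaintext power k multiplies the plaintext by k."""
    return pow(c, k, N_SQ)


if __name__ == "__main__":
    total = add_encrypted(encrypt(20), encrypt(22))
    print(decrypt(total))                              # prints 42
    print(decrypt(scale_encrypted(encrypt(7), 6)))     # prints 42
```

The party doing the arithmetic only ever calls `add_encrypted` and `scale_encrypted`, which need no private key. Only the data owner, who holds `LAM` and `MU`, can read the result.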
HE is a relatively new technique: IBM researcher Craig Gentry developed the first fully homomorphic encryption scheme in 2009. And as Boemer and Wierzynski note, designing AI models that use it requires expertise not only in machine learning but also in encryption and software engineering.
HE-Transformer aids in the development process by providing an abstraction layer that can be applied to neural networks on open source frameworks such as Google’s TensorFlow, Facebook’s PyTorch, and MXNet. Effectively, it obviates the need to manually integrate models into HE cryptographic libraries.
HE-Transformer incorporates the Cheon-Kim-Kim-Song (CKKS) encryption scheme and supports operations built from addition and multiplication, such as add, broadcast, constant, convolution, dot, multiply, negate, pad, reshape, result, slice, and subtract. Additionally, it supports HE-specific techniques like plaintext value bypass, SIMD packing, OpenMP parallelization, and plaintext operations.
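As a rough illustration of the idea behind SIMD packing, the sketch below stores one feature from every input in a batch inside a single slot vector, so that one add or one multiply by a plaintext weight acts on the whole batch at once. No encryption takes place: `PackedSlots` is a hypothetical stand-in for a packed ciphertext, and `pack_batch` and `packed_dot` are illustrative names rather than HE-Transformer functions. How HE-Transformer actually lays out its slots is not described here, so the batch-axis layout is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class PackedSlots:
    """Hypothetical stand-in for one ciphertext holding several slot values.

    Nothing is encrypted; the class only models slot-wise arithmetic.
    """
    slots: Tuple[int, ...]

    def __add__(self, other: "PackedSlots") -> "PackedSlots":
        # One "ciphertext" addition updates every slot at once.
        return PackedSlots(tuple(a + b for a, b in zip(self.slots, other.slots)))

    def scale(self, weight: int) -> "PackedSlots":
        # Multiplication by a plaintext weight, applied to every slot.
        return PackedSlots(tuple(weight * a for a in self.slots))


def pack_batch(batch: Sequence[Sequence[int]]) -> List[PackedSlots]:
    """Pack feature j of every input into one slot vector (assumed batch-axis layout)."""
    return [PackedSlots(tuple(column)) for column in zip(*batch)]


def packed_dot(features: Sequence[PackedSlots], weights: Sequence[int]) -> PackedSlots:
    """Dot product with plaintext weights, computed for the whole batch at once."""
    acc = features[0].scale(weights[0])
    for feature, weight in zip(features[1:], weights[1:]):
        acc = acc + feature.scale(weight)
    return acc


if __name__ == "__main__":
    batch = [[1, 2, 3], [4, 5, 6]]        # two inputs, three features each
    result = packed_dot(pack_batch(batch), [2, 1, 0])
    print(result.slots)                   # prints (4, 13)
```

The number of slot-wise operations depends on the number of features, not on the batch size, which is why packing improves throughput when many inputs are processed together.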
Thanks to those and other optimizations, Intel claims that HE-Transformer delivers state-of-the-art performance on cryptonets — learned neural networks that can be applied to encrypted data — using a floating-point model trained in TensorFlow.
“We are excited to work with Intel to help bring homomorphic encryption to a wider audience of data scientists and developers of privacy-protecting machine learning systems,” said Kristin Lauter, principal researcher and research manager of cryptography at Microsoft Research.
Currently, HE-Transformer directly integrates with the nGraph compiler and runtime for TensorFlow, with support for PyTorch forthcoming. Deep learning frameworks that are able to export neural networks to ONNX, such as PyTorch, CNTK, and MXNet, can be used by importing their models into nGraph in ONNX format and exporting them in a serialized format.
Boemer and Wierzynski said that future versions of HE-Transformer will support a wider variety of neural network models.
“Recent advances in the field have now made HE viable for deep learning,” they wrote. “Researchers can leverage TensorFlow to rapidly develop new HE-friendly deep learning topologies.”