We recently announced our partnership with Snowflake (you can read more here), and are proud of our unique approach to data de-identification. This article and video cover how we’ve approached this integration to help address privacy concerns and accelerate customers’ move to Snowflake’s Data Cloud.
Baffle’s integration with Snowflake is the only solution that de-identifies data end-to-end in the data pipeline. What makes the solution unique is our ability to de-identify data on the fly as it moves to the cloud. Most teams tackling this problem either create clones and transform the data, or migrate the data in the clear and then try to figure out how to de-identify petabytes of data after it has already landed in the cloud. Have you ever tried to tokenize or encrypt a petabyte of data?
At a high level, our Data Protection Services for Snowflake perform two functions: (1) de-identification of data on the fly, and (2) selective re-identification of data in Snowflake based on authorized roles. Below is a diagram that depicts a data pipeline in which Baffle de-identifies data before staging in S3 and re-identifies it for Snowflake.
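To make the two functions concrete, here is a minimal, illustrative sketch of the pattern — tokenize a sensitive field before it reaches staging, then re-identify it only for an authorized role. This is a toy vault-based tokenizer written for illustration; it is not Baffle’s implementation, and the role name and record fields are invented for the example.

```python
import secrets

class Tokenizer:
    """Toy vault-based tokenizer illustrating de-/re-identification.
    Not Baffle's implementation -- a sketch of the general pattern."""

    def __init__(self, authorized_roles):
        self.vault = {}                      # token -> clear value
        self.authorized = set(authorized_roles)

    def de_identify(self, value):
        # Replace the clear value with an opaque token before staging.
        token = "tok_" + secrets.token_hex(8)
        self.vault[token] = value
        return token

    def re_identify(self, token, role):
        # Only authorized roles see the clear value; others see the token.
        if role not in self.authorized:
            return token
        return self.vault.get(token, token)

# Example: de-identify on the way to staging, re-identify per role.
tk = Tokenizer(authorized_roles={"ANALYST_PII"})     # hypothetical role
record = {"ssn": "123-45-6789", "city": "Austin"}
staged = {**record, "ssn": tk.de_identify(record["ssn"])}

print(tk.re_identify(staged["ssn"], "ANALYST_PII"))  # clear value
print(tk.re_identify(staged["ssn"], "MARKETING"))    # opaque token
```

In a real pipeline the vault lookup would be replaced by encryption with keys the customer holds, and the role check would be enforced at query time in Snowflake rather than in application code.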
This may seem rudimentary, but it lets customers move data from any source to any destination and de-identify it on the fly. In the example above, an on-premises database continually pushes data into elastic cloud staging, from which it is eventually pushed into Snowflake.
The five-minute video below demonstrates how we do this on the fly.
Learn about our supported encryption modes here.
Schedule a time to discuss your data pipeline security requirements with us here.
Download our white paper, “A Technical Overview of Baffle Hold Your Own Key (HYOK) and Record Level Encryption (RLE).”
*** This is a Security Bloggers Network syndicated blog from Baffle authored by Harold Byun, VP Products. Read the original post at: https://baffle.io/blog/de-identifying-data-into-snowflake/