Build safer, better AI by preserving data privacy

Granica Screen is a data privacy service. It helps data teams detect, classify and protect sensitive information contained within cloud data lake files - without sampling - mitigating breach risk and safely unlocking more data for use with in-house training and external generative AI services.

Elevate data security posture

Accurately classify and de-identify sensitive information and PII hidden in structured, semi-structured and unstructured text-based training files.

Protect and unlock more data

Broadly apply high-efficiency, ML-powered scanning algorithms to process and safely unlock 5-10X more data vs. traditional approaches at similar cost. Stop sampling and start protecting.

Safely train on fresh data

Classify and de-identify incoming data in real time to supply training pipelines with fresh, safe information for improved performance and accuracy.

Enhance privacy where you’re most vulnerable

Confidently handle the accelerating scale of data in your cloud data lakes to mitigate privacy, security and compliance risk while enabling teams to build better AI, faster.

Discover & classify with leading accuracy

Generate reports for sensitive information at the field level inside files in your cloud data lake, especially data sets with high potential value for ML and AI.

Apply privacy within your data pipelines

Mask and de-identify at the field level inside structured, semi-structured and unstructured data, creating safe file copies ready for downstream processing.

Safely train on de-identified copies

Improve model performance and accuracy with the addition of more training data and information, while maintaining data privacy and compliance.

Data types and classifiers supported

Clickstream, Logs, Tabular, and more

Granica supports a wide range of data types and classifiers (e.g. phone number, SSN, VIN) for AI/ML/analytics.
Bring us your unique requirements and we can further customize for your use case.
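
To make the idea of a classifier concrete, here is a minimal sketch of pattern-based detection for the three example types named above. It is an illustration only: the `PATTERNS` table and the `classify` function are hypothetical names, the regular expressions are simplified US-style formats, and none of this represents how Granica Screen's ML-powered classifiers actually work.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# These are NOT Granica's detection rules; production classifiers
# typically add validation, context and ML-based scoring.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    # VINs are 17 characters and never contain the letters I, O or Q.
    "VIN": re.compile(r"\b[A-HJ-NPR-Z0-9]{17}\b"),
}


def classify(text):
    """Return a sorted list of (label, matched_text) pairs found in text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return sorted(hits)


log_line = "user=42 ssn=123-45-6789 phone=415-555-0134 vin=1HGCM82633A004352"
for label, value in classify(log_line):
    print(label, value)
```

Simple patterns like these tend to produce both false positives and false negatives on real data, which is why classifier accuracy matters when scanning at scale.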

Cost-effectively unlock more data

Granica Screen delivers 5-10X higher compute efficiency, and thus lower infrastructure cost per byte scanned, vs. traditional approaches, enabling cost-effective scanning of broad data sets rather than limited sampling. Similarly, ultra-efficient masking and de-identification unlock 5-10X more data for safe use in model training at a total cost similar to alternatives. Use Screen in real time in the application write path to further mitigate breach risk and make data immediately available for training or other needs.

Unlock 5-10X more data to power AI/ML.
Safely train on fresh data as it lands.
Stay budget-neutral vs. AWS Macie and Google DLP.

Frequently Asked Questions:

Does Granica Screen modify the private source files?

No. Granica Screen is typically granted read-only access to private files. It reads those files, transforms sensitive information such as PII using various de-identification techniques, and then stores safe-for-use copies in a separate target bucket.
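
The read-then-copy flow described above can be sketched roughly as follows. This is a hedged illustration under stated assumptions, not Granica Screen's implementation: buckets are modeled as plain Python dictionaries, the record fields (`user_email`, `note`) are invented for the example, and `mask_ssn`, `pseudonymize` and `make_safe_copies` are hypothetical helper names. Masking and salted hashing stand in for the "various de-identification techniques" mentioned in the answer.

```python
import hashlib
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_ssn(text):
    """Replace anything that looks like a US SSN with a fixed placeholder."""
    return SSN.sub("[SSN]", text)


def pseudonymize(value, salt="demo-salt"):
    """Map a value to a stable token using a salted SHA-256 hash.

    The hard-coded salt is an assumption for this demo; a real system
    would manage it as a secret.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "tok_" + digest[:12]


def make_safe_copies(source_bucket):
    """Build de-identified copies; source_bucket is only read, never written."""
    target_bucket = {}
    for key, records in source_bucket.items():
        target_bucket[key] = [
            {
                "user_email": pseudonymize(record["user_email"]),
                "note": mask_ssn(record["note"]),
            }
            for record in records
        ]
    return target_bucket


source = {
    "events/part-0.json": [
        {"user_email": "ana@example.com", "note": "SSN on file: 123-45-6789"}
    ]
}
target = make_safe_copies(source)
print(target["events/part-0.json"][0]["note"])  # SSN on file: [SSN]
```

Because the function builds new records instead of editing existing ones, the source data stays untouched, which mirrors the read-only access model.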

How does Granica Screen compare to alternatives?

Unlike traditional data privacy solutions, Granica Screen is highly compute-efficient, lowering the cost to side-scan data by 10X and thus increasing the volume of data you can unlock for training by 10X at comparable cost. Our classification engine also provides both high precision (to mitigate false positives) and high recall (to mitigate false negatives), enabling safe and compliant use in ML and generative AI. Finally, Granica Screen can be integrated inline into source systems and applications, enabling it to detect and protect new, incoming sensitive data before it is ever persisted into your data lake.
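
For readers less familiar with the two metrics, here is a small worked example of how precision and recall are computed for a classifier. The function name `precision_recall` and the sample field IDs are made up for illustration; the numbers say nothing about Granica Screen's measured accuracy.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall from two sets of flagged item IDs.

    predicted: items the classifier flagged as sensitive
    actual:    items that really are sensitive (ground truth)
    """
    true_pos = len(predicted & actual)    # correctly flagged
    false_pos = len(predicted - actual)   # flagged, but not sensitive
    false_neg = len(actual - predicted)   # sensitive, but missed
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if actual else 0.0
    return precision, recall


# Hypothetical field IDs: the classifier flags 4 fields, 5 are truly sensitive.
flagged = {1, 2, 3, 4}
sensitive = {2, 3, 4, 5, 6}
print(precision_recall(flagged, sensitive))  # (0.75, 0.6)
```

Low precision means safe data is masked unnecessarily, which reduces its value for training; low recall means sensitive data slips through, which is the breach and compliance risk.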

Can I use Granica Screen and Granica Crunch together on the same data?

Yes, both products are built on the Granica platform and are fully compatible with one another. You can maximize your benefits by using both together. For example, a common pattern is to first use Granica Screen to generate safe-for-use file copies in a target bucket, then use Granica Crunch on that target bucket to minimize the cost of storing and accessing those copies.