Debunking Cybersecurity Jargon Part Four – What is Optical Character Recognition?


As part of our ongoing blog series explaining some of the many acronyms and pieces of jargon that bedevil the cybersecurity industry, we turn our attention to optical character recognition technology, or OCR as it is commonly known.

Clearswift added OCR functionality to all of its core email and web security products to help organizations combat the risk posed by the millions of image files that are shared in, out and around corporate networks, as well as those uploaded and downloaded to cloud file sharing apps or the web.

Images: A Major Data Loss Risk to Organizations

Under data privacy laws, organizations are required to protect the sensitive information belonging to their customers, employees and partners so that the data does not end up in the wrong hands. To help remain compliant, organizations deploy data loss prevention (DLP) solutions that scan content being shared on the network to ensure that sensitive data is not included and, if it is included and authorized, that the data is encrypted.
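To make the idea of content scanning concrete, here is a minimal sketch of one check a DLP scan might perform on text: finding candidate payment card numbers and confirming them with the Luhn checksum. The pattern, the function names and the choice of card numbers as the example are our own assumptions for illustration, not a description of how Clearswift's products work.

```python
import re

# Assumed pattern: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return substrings that look like card numbers and pass the Luhn check."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(match.group())
    return hits


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test card number, not a real account.
    print(find_card_numbers("Please refund 4111 1111 1111 1111 today."))
```

The checksum step matters because a bare digit pattern would flag order numbers and phone numbers as well; a real policy would typically combine several such checks.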

Today there are many ways in which sensitive information or confidential documents can enter or leave the organization: for example, a screenshot of a customer record sent as an attached image, or a PDF created by scanning physical documents on a multi-function printer. DLP solutions need to ensure that all content travelling on the network is inspected, yet very few scan image files to monitor this threat. Hosted security platforms in particular tend not to, because of the per-unit cost overhead of scanning.

OCR: A Technology to Aid Compliance

OCR enables the analysis of everyday image files such as PDFs, JPGs, PNGs, GIFs and BMPs, allowing them to be processed using DLP functionality. Just as Clearswift’s Data Redaction option removes sensitive text from Microsoft Word or Excel files, OCR identifies sensitive data in images so that any found can be automatically masked (black boxed).
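The masking step can be sketched as follows, assuming an OCR engine has already returned each recognized word together with its bounding box. The `OcrWord` structure, the identifier pattern and the list-of-lists pixel grid are all hypothetical stand-ins chosen to keep the example self-contained; a real deployment would use an actual OCR engine and image library.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """Hypothetical OCR result: recognized text plus its bounding box in pixels."""
    text: str
    x: int
    y: int
    width: int
    height: int


# Assumed example pattern for a sensitive identifier (three, two, then four digits).
SENSITIVE = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def words_to_mask(words):
    """Select the OCR words whose text matches the sensitive pattern."""
    return [word for word in words if SENSITIVE.match(word.text)]


def black_box(pixels, words):
    """Return a copy of the pixel grid with each word's box set to 0 (black)."""
    masked = [row[:] for row in pixels]
    for word in words:
        for row in range(word.y, min(word.y + word.height, len(masked))):
            for col in range(word.x, min(word.x + word.width, len(masked[row]))):
                masked[row][col] = 0
    return masked


if __name__ == "__main__":
    image = [[255] * 6 for _ in range(4)]  # tiny all-white "image"
    ocr_words = [OcrWord("ID:", 0, 0, 2, 2), OcrWord("123-45-6789", 2, 1, 3, 2)]
    for row in black_box(image, words_to_mask(ocr_words)):
        print(row)
```

Only the box around the matching word is blacked out, so the rest of the image remains readable, which is the point of redacting rather than blocking the file.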

Due to the depth of content inspection provided, even an image embedded in an Excel spreadsheet, which is embedded in a Word document, which is scanned to PDF and shared as a ZIP file attached to an email, is detected and analysed, and any sensitive information is removed, allowing the safe file to continue to its destination. Clearswift’s OCR functionality supports 20 different file formats and 48 languages, providing a comprehensive level of capability.
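As a rough illustration of what inspecting nested content involves, the sketch below walks ZIP-based containers recursively and reports any images it finds by their file signatures. It relies on the fact that modern Word and Excel files (.docx, .xlsx) are themselves ZIP archives. The function names and the depth limit are our own assumptions, and PDF contents are not unpacked here because that would need a dedicated PDF parser.

```python
import io
import zipfile

# Leading bytes ("magic numbers") of common image formats.
# The two-byte BMP signature is weak, so a real scanner would validate further.
IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
}


def sniff_image(data: bytes):
    """Return the image type suggested by the leading bytes, or None."""
    for signature, kind in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return kind
    return None


def find_images(data: bytes, path: str, depth: int = 0, max_depth: int = 10):
    """Recursively list (path, type) for images inside nested ZIP containers."""
    kind = sniff_image(data)
    if kind:
        return [(path, kind)]
    found = []
    # The depth limit is a simple guard against maliciously deep nesting.
    if depth < max_depth and zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                member_path = f"{path}/{info.filename}"
                found += find_images(archive.read(info), member_path, depth + 1, max_depth)
    return found
```

Each image located this way could then be handed to OCR and a redaction step; calling `find_images(attachment_bytes, "mail.zip")` on an attachment returns the full nested path of every image, which is useful for audit logs.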

This automated bi-directional redaction capability not only protects the organization from employees accidentally sharing sensitive or confidential image files and from malicious insiders attempting to leak data, but also protects it from any unwanted data it receives. This might include images of credit cards sent by customers keen for a refund, or unauthorized content shared by third parties.

Protect Against Data Loss Through Images

Optical Character Recognition (OCR) is an option for the Clearswift Secure Email Gateway, Secure Exchange Gateway, ARgon for Email, and Secure Web and ICAP Gateway products. For more information on how it works, ask for a demo from the team.

Related Resources

Datasheet: OCR

On-Demand Webinar: How Images and Scanned Documents Present a Cybersecurity Risk for Organizations


*** This is a Security Bloggers Network syndicated blog from Clearswift Blog authored by Rachel.Woodford. Read the original post at: