Metadata scrubber: Every law office, staff included, must be aware of the risks posed by metadata. Ali Moinuddin lays out steps to reduce the dangers and ensure the sharing of clean documents.
Most lawyers have probably heard the word “metadata” and, at a minimum, know that it has something to do with their documents. But it’s vital that every law office, staff included, is aware of the risks posed by metadata, and what steps can reduce the dangers.
A popular, if slightly ambiguous, definition of metadata is “data about data.” In fact, metadata is the hidden information that can be found embedded in electronic files. It may include transactional details about the document, such as author, software used, the date it was written, and even edits made along the way. It occurs in many different forms and in a vast number of locations. Modern document management systems (DMS) can facilitate the management of digital documents and metadata cleaning processes, improving workflow efficiency for law firms. In a Word document, for example, metadata can include tracked changes, comments, text smaller than 5 points, white text on any background, and previous authors. The most potentially damaging of these can be the tracked changes or comments made in earlier versions.
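To make the Word example concrete, here is a minimal Python sketch that lists some of the hidden data in a document. It assumes the file is a modern .docx (an Office Open XML package, which is a ZIP archive of XML parts) saved in the default "transitional" format; older binary .doc files are not covered. The function name `summarize_docx_metadata` is hypothetical, and the counts are of raw revision and comment elements, not a full audit.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by transitional Office Open XML (.docx) packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
}


def summarize_docx_metadata(path):
    """Return a dict describing some hidden data found in a .docx file.

    `path` may be a filename or a binary file-like object.
    """
    summary = {
        "author": None,
        "last_modified_by": None,
        "tracked_insertions": 0,
        "tracked_deletions": 0,
        "comments": 0,
    }
    with zipfile.ZipFile(path) as package:
        names = set(package.namelist())

        # Core properties hold the author and the last person to save.
        if "docProps/core.xml" in names:
            core = ET.fromstring(package.read("docProps/core.xml"))
            summary["author"] = core.findtext("dc:creator", namespaces=NS)
            summary["last_modified_by"] = core.findtext(
                "cp:lastModifiedBy", namespaces=NS
            )

        # Tracked changes are stored as w:ins and w:del elements in the body.
        if "word/document.xml" in names:
            body = ET.fromstring(package.read("word/document.xml"))
            summary["tracked_insertions"] = len(body.findall(".//w:ins", NS))
            summary["tracked_deletions"] = len(body.findall(".//w:del", NS))

        # Comments live in their own part, one w:comment element each.
        if "word/comments.xml" in names:
            comments = ET.fromstring(package.read("word/comments.xml"))
            summary["comments"] = len(comments.findall("w:comment", NS))
    return summary
```

Calling `summarize_docx_metadata("draft.docx")` on a file of your own (the filename here is only a placeholder) returns a dictionary you can review before the document leaves the firm. A non-zero count for tracked changes or comments is a prompt to look more closely, not proof of a problem.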
Emailing documents adds another layer to the problem. In addition to metadata contained within the files you attach, information will also be available about the email message itself. Then there is the email manager, such as Outlook, and the metadata created at the other end upon arrival of the email. This proliferation of metadata from an individual document being emailed back and forth can begin to have a “Russian Doll” effect.
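As a rough illustration of the metadata that travels with the message itself, the sketch below uses Python's standard `email` package to list the headers of a raw message. The sample message, its addresses, and the `X-Mailer` value are invented for the example, and `list_header_metadata` is a hypothetical helper name. Real messages usually carry many more headers, added by each server along the way.

```python
from email import message_from_string

# An invented raw message; real ones typically carry several Received lines.
SAMPLE = """\
Received: from mail.firm.example (mail.firm.example [192.0.2.10])
\tby mx.client.example; Mon, 01 Jan 2024 09:00:05 +0000
From: Associate <associate@firm.example>
To: Counsel <counsel@client.example>
Subject: Draft agreement
Date: Mon, 01 Jan 2024 09:00:00 +0000
X-Mailer: ExampleMail 1.0

Please see the attached draft.
"""


def list_header_metadata(raw_message):
    """Return the (name, value) header pairs of a raw email message."""
    message = message_from_string(raw_message)
    return list(message.items())


if __name__ == "__main__":
    for name, value in list_header_metadata(SAMPLE):
        print(f"{name}: {value}")
```

Even this small sample shows the sending server, the mail software, and precise timestamps, none of which appear in the body the sender typed.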
What is Metadata?
Metadata is often called “data about data.” In the world of digital files, metadata is the hidden information inside a file that describes the file, such as the author, creation date, and file size. While metadata is useful for file management, organization, and search, it can be a serious security risk if not managed properly.
Metadata Definition
Metadata is a set of attributes or descriptive information about a digital file. These attributes include:
- Author and creation date: Who created the file and when.
- File size and type: How big the file is and what format it is in.
- Keywords and tags: Terms to categorize and search the file.
- Geolocation: Where the file was created or edited.
- Edit history and revisions: Changes made to the file over time.
Metadata Types
Metadata can be broken down into several types:
- Descriptive metadata: Tells us about the content of the file, like the title, author and keywords. This type of metadata helps us find and discover the file.
- Structural metadata: Tells us about the file’s layout and organization, like the format. This is important to understand how the file is put together.
- Administrative metadata: Tells us about the file’s management and preservation, like the creation date, file size and usage rights.
Metadata in Legal Documents
In the legal world, metadata is important because it can contain sensitive information that’s relevant to a case. For example:
- Authorship and ownership: Who created or owns the document.
- Creation and modification dates: When the document was created and when it was changed.
- File history and revisions: How the document evolved through its changes.
- Geolocation data: Location-based information that may be relevant to the case.
Metadata must be removed from documents to prevent data leaks and maintain confidentiality. Client trust depends on it.
More on Metadata’s Uses — And Concerns for Law Firms
Metadata serves a number of useful purposes, most notably to facilitate resource discovery, digital identification, electronic organization, and archiving. Metadata is information that may be instrumental in providing a context for other information, and it can be found in various file types. It can also be viewed as evidence — some crucial, some less so. While metadata is usually harmless, depending on the context it can sometimes contain privileged, confidential, or sensitive information. Therefore, it represents a significant client confidentiality concern for attorneys.
Metadata “mining” tools can reveal tracked changes and notes on a document that could contain confidential information about a case or a client, such as their willingness to settle. Metadata scrubbing tools, by contrast, remove hidden data from documents and email attachments, helping to prevent data breaches. Metadata may also contain elements of a document that had been deleted, perhaps because on reflection they were considered too revealing. This, of course, could have disastrous consequences, making metadata cleaning essential.
Also, even though the majority of hidden metadata can only be uncovered with specialized software, there are some circumstances where metadata can be accidentally revealed — for example, where a file is corrupted or not converted properly and the metadata is displayed on the screen instead of the intended information. In the wrong hands, this potentially sensitive information could prove damaging.
In addition, many law firms use existing client documents as templates to create new client documents. This can certainly save time. However, deleted text may remain within the document as metadata. The risk is that clients may be able to access confidential information about other clients represented by the firm, or that opposing counsel will have access to information that was set out in a document while still in the drafting stages.
It’s likely that any firm that has ever emailed a Microsoft Word or Corel WordPerfect document to opposing counsel or clients has unknowingly sent information that is confidential. As this ABA chart of various states’ ethics opinions shows, there is ambiguity around the ethics of sending and receiving metadata. But clearly, it’s in every firm’s best interests to stay on top of any risks involved, especially for clients.
Lawyer Risks
Metadata poses a significant risk for lawyers because it can breach client confidentiality and expose sensitive information. Knowing the risks is the first step to mitigating them.
Data Breaches and Security Risks
Metadata can act as a backdoor into law firms, compromising client data and confidentiality. For example:
- Revealing secrets: Metadata can reveal client information, such as identity, location, and financial details, that should be confidential.
- Compromising communications: Metadata can be used to track and monitor client communications and breach attorney-client privilege.
- Enabling cyber attacks: Metadata can be used to launch phishing attacks and other types of cyber attacks and put both the firm and its clients at risk.
To prevent data breaches and maintain robust data loss prevention, law firms must take proactive steps to manage metadata. This means using metadata removal tools, such as BigHand Metadata Management, and having a comprehensive data loss prevention policy. By doing so, firms can protect sensitive information and maintain client confidentiality.
Metadata Scrubber: Ways to Minimize the Risks Metadata Presents through Metadata Cleaning
While reverting to paper copies and paper mail is one way to guarantee metadata is not shared, this is rarely viable, not least because tools such as Microsoft Outlook allow users to manage attachments efficiently without interrupting their workflow. To minimize the chances of a confidentiality breach, the best strategy is to make sure you and your staff are aware of the risks and then implement procedures to safeguard information before it leaves the firm.
What steps should you take?
You must eliminate, or at least reduce, the metadata contained within documents wherever possible, including removing hidden data. Even a simple understanding of the kind of metadata that can be created in common computer files, like a Word document, can go a long way toward building awareness. Some metadata scrubbers also support multimedia files, ensuring comprehensive metadata removal.
If you are sharing documents that need to be edited and returned to you via email, you could use “metadata scrubber” software to remove metadata from multiple files before they are sent. A number of metadata scrubbers are available, including Metadata Assistant, Out-of-Sight, iScrub, Workshare Protect (my firm’s product), and ezClean.
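To give a sense of what a scrubber does under the hood, here is a deliberately narrow Python sketch. It is not how any of the products named above work. It assumes a .docx file in the default transitional Office Open XML format and only blanks the author and last-modified-by fields in the core properties part. It does not touch tracked changes, comments, or other hidden data. The name `scrub_core_properties` is hypothetical, and anything like this should be tried on copies, never on originals.

```python
import zipfile
import xml.etree.ElementTree as ET

CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"

# Register the usual prefixes so the rewritten XML keeps readable names and
# attribute values such as xsi:type="dcterms:W3CDTF" still resolve.
ET.register_namespace("cp", CP)
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", "http://purl.org/dc/terms/")
ET.register_namespace("xsi", "http://www.w3.org/2001/XMLSchema-instance")

# Fully qualified tag names of the fields this sketch blanks.
FIELDS_TO_BLANK = (f"{{{DC}}}creator", f"{{{CP}}}lastModifiedBy")


def scrub_core_properties(source, destination):
    """Copy a .docx package, blanking the author fields in core.xml.

    `source` and `destination` may be filenames or binary file-like objects.
    Every other part of the package is copied unchanged.
    """
    with zipfile.ZipFile(source) as src, zipfile.ZipFile(
        destination, "w", zipfile.ZIP_DEFLATED
    ) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "docProps/core.xml":
                root = ET.fromstring(data)
                for tag in FIELDS_TO_BLANK:
                    for element in root.iter(tag):
                        element.text = ""
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            dst.writestr(item.filename, data)
```

Writing to a new file instead of overwriting the source is a deliberate choice: the original stays available if the cleaned copy turns out to be damaged or incomplete.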
Some tools focus on removing only metadata, ensuring that the rest of the document remains intact. Before emailing documents outside the firm, you may also want to consider:
- Switching off “Fast Save” functions when using Microsoft Word, PowerPoint, and Excel.
- Familiarizing yourself with the security settings in Microsoft Office products — in Office XP and Office 2003, for example, you can use the Security Tab to specify that some metadata not be saved.
- Adopting a policy that Adobe Acrobat is always used to convert documents to a “locked” PDF format before a file is sent. You can adjust Acrobat’s Security Options settings to add restrictions regarding certain metadata.
Ali Moinuddin is Chief Marketing Officer at Workshare, which produces document comparison and review software for the legal profession. Ali has over 15 years of experience in supporting high-growth companies. Previously he was CMO at SkyDox, Director of Marketing at Interxion, Director of Marketing EMEA for SPL WorldGroup (now a part of Oracle), and Marketing Manager EMEA and Asia-Pacific at Kana.
Illustration ©iStockPhoto.com