Mining Document Metadata: Pointers

By Luigi Benetton | May.03.17 | Daily Dispatch, Document Management, Law Practice Management, Legal Technology, Productivity

Each time an electronic document comes into being, metadata is created along with it. People often add their own, too. From the obvious (like page numbers) to the obscure (like dates of creation and author names), every piece of metadata serves some purpose.

During discovery, document metadata can prove just as important as the visible contents of the document. That’s why legal teams need to comb every document for metadata. (It’s also why many use metadata “scrubbers” on documents they share with other people.)

Things can get interesting when one legal team unwittingly sends opposing counsel documents containing metadata the team wasn’t aware of. The following tips cover some of the most common “accidental” metadata disclosures. These disclosures don’t always happen, but they’re worth checking for in the files you receive. And if you find such metadata, disclose your finds to opposing counsel. (Be sure to check for related ethics guidance in your jurisdiction; the American Bar Association maintains a handy resource for checking state metadata ethics opinions.)

Turn on Track Changes in Microsoft Office Documents

Modern document revision often happens in the margins. Well, the right margin of Word documents. That’s where authors place comments and questions, each one labeled with their Office user ID. (Other applications may give their equivalent of “Track Changes” a different name and put it in a different place.)

When authors choose to view a document in its “final” form, that margin and its notes disappear. Doing so means they might forget to get rid of Track Changes content before sending the document on to others. If that happens, that document will travel with all commentary and suggested additions and deletions. These threads can reveal deeper thinking behind a given document and offer hints to a legal team that wasn’t supposed to see those threads.
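For reviewers who would rather not click through every file, the same check can be scripted. The following is a minimal sketch, not a forensic tool: it assumes the file is a .docx (which is a ZIP archive of XML parts), that tracked insertions and deletions are stored as w:ins and w:del elements in word/document.xml, and that comments sit in word/comments.xml. The function name list_tracked_changes is made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_tracked_changes(docx_path):
    """Return (kind, author, text) tuples for tracked insertions, deletions and comments."""
    found = []
    with zipfile.ZipFile(docx_path) as z:
        body = ET.fromstring(z.read("word/document.xml"))
        # Inserted text sits in w:t inside w:ins; deleted text in w:delText inside w:del.
        for kind, tag, text_tag in (("insertion", "ins", "t"), ("deletion", "del", "delText")):
            for el in body.iter(f"{{{W}}}{tag}"):
                text = "".join(t.text or "" for t in el.iter(f"{{{W}}}{text_tag}"))
                found.append((kind, el.get(f"{{{W}}}author"), text))
        # The comments part exists only when the document has comments.
        if "word/comments.xml" in z.namelist():
            comments = ET.fromstring(z.read("word/comments.xml"))
            for el in comments.iter(f"{{{W}}}comment"):
                text = "".join(t.text or "" for t in el.iter(f"{{{W}}}t"))
                found.append(("comment", el.get(f"{{{W}}}author"), text))
    return found
```

An empty list suggests the file carries no tracked changes or comments. Entries with empty text can appear when the tracked change is a paragraph mark or formatting rather than words.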

Check Document Properties

Ever wonder who the actual author of a document happens to be? How about when the document was created? If you’re checking a photo, was it ever opened using Photoshop? You can learn all this and more by checking a document’s properties.
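For Office files, the properties dialog reads from a small XML part inside the file, so the same fields can be pulled out in bulk. Here is a rough sketch that assumes an Office Open XML file (.docx, .xlsx or .pptx) with its core properties in the conventional docProps/core.xml location; the function name core_properties and the dictionary keys are invented for this example. Photo metadata is stored differently and is not covered here.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def core_properties(path):
    """Return author, last editor and created/modified timestamps (None when absent)."""
    with zipfile.ZipFile(path) as z:
        root = ET.fromstring(z.read("docProps/core.xml"))
    wanted = {
        "author": "dc:creator",
        "last_modified_by": "cp:lastModifiedBy",
        "created": "dcterms:created",
        "modified": "dcterms:modified",
    }
    return {key: root.findtext(tag, default=None, namespaces=NS) for key, tag in wanted.items()}
```

A scrubbed file will typically return None or generic values for most of these fields, which is itself worth noting.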

Look for Hidden Rows and Columns in Spreadsheets

Sometimes an author will hide rows or columns to conceal information they contain. You won’t know until you unhide them.

Spotting hidden columns or rows should be straightforward. For instance, if one column is labeled F and the next one you see is J, that means three columns have been hidden.
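The F-to-J arithmetic is easy to automate when a sheet has many columns. This small sketch takes the column labels you can see, in order, and reports the labels missing between them. The function names are illustrative only, and it cannot detect columns hidden before the first visible column or after the last one.

```python
def column_number(label):
    """Convert a column label (A, B, ..., Z, AA, ...) to a 1-based number."""
    n = 0
    for ch in label.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def column_label(n):
    """Convert a 1-based column number back to its label."""
    label = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        label = chr(ord("A") + rem) + label
    return label


def hidden_between(visible_labels):
    """Return the labels missing between consecutive visible column labels."""
    numbers = [column_number(label) for label in visible_labels]
    missing = []
    for left, right in zip(numbers, numbers[1:]):
        missing.extend(column_label(n) for n in range(left + 1, right))
    return missing


print(hidden_between(["E", "F", "J", "K"]))  # ['G', 'H', 'I']
```

The same gap check works for row numbers without any conversion, since rows are already numeric.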

Check Document Headers and Footers

From author names to Bates stamps to file paths, headers and footers can contain plenty of useful information.

For instance, sometimes authors redact documents by “cutting” them down once they reach the PDF stage. You may be holding a four-page document that says, in the bottom-right corner, “Page 7 of 19.” If you see something like this, you should go “hmm …”
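Across a large production, that “hmm” can be delegated to a script. The sketch below assumes you have already extracted the text of each page into a list of strings (how you do that depends on your tools) and that the footer uses the “Page X of Y” wording. The name pagination_flags is hypothetical.

```python
import re

# Matches stamps such as "Page 7 of 19" (case-insensitive).
PAGE_STAMP = re.compile(r"Page\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def pagination_flags(page_texts):
    """Compare 'Page X of Y' stamps against the pages actually received."""
    stamps = []
    for text in page_texts:
        match = PAGE_STAMP.search(text)
        if match:
            stamps.append((int(match.group(1)), int(match.group(2))))

    flags = []
    received = len(page_texts)
    # A stamped total that differs from the page count suggests missing pages.
    for total in sorted({total for _, total in stamps}):
        if total != received:
            flags.append(f"stamp claims {total} pages, received {received}")
    numbers = [number for number, _ in stamps]
    if numbers and numbers[0] != 1:
        flags.append(f"first stamped page is {numbers[0]}, not 1")
    # Gaps between consecutive stamps point to pages removed from the middle.
    for a, b in zip(numbers, numbers[1:]):
        if b != a + 1:
            flags.append(f"page numbering jumps from {a} to {b}")
    return flags
```

For the four-page example above, stamped pages 7 through 10 of 19, this returns two flags: the stamped total does not match the four pages received, and the first stamped page is 7 rather than 1.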

Look for Speaker’s Notes in PowerPoint Documents

Sometimes presentation files contain entire scripts on crammed slides in fonts only bald eagles could read. And sometimes content is kept to a minimum, interspersed with tastefully chosen images that engage the audience.

That’s what the audience sees. What the audience doesn’t see is any speaker’s notes created in the spot reserved for such notes with each slide. Short of a video or audio recording of the presentation, this user-generated metadata may provide the most insight available into what was said.
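Speaker’s notes can also be pulled out of a whole deck at once. This is a sketch under some assumptions: the file is a .pptx, notes are stored as XML parts under ppt/notesSlides/, and the notes text lives in the shape whose placeholder type is “body.” The function name speaker_notes is invented here, and the part names it returns (notesSlide1.xml and so on) do not necessarily match slide order, so confirm against the deck itself.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
}


def speaker_notes(pptx_path):
    """Return {notes part name: notes text} for every non-empty notes slide."""
    notes = {}
    with zipfile.ZipFile(pptx_path) as z:
        for name in sorted(z.namelist()):
            if not (name.startswith("ppt/notesSlides/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(z.read(name))
            paragraphs = []
            for shape in root.iter(f"{{{NS['p']}}}sp"):
                ph = shape.find("p:nvSpPr/p:nvPr/p:ph", NS)
                # Only the "body" placeholder holds the notes text;
                # other placeholders hold the slide image or slide number.
                if ph is None or ph.get("type") != "body":
                    continue
                for para in shape.iterfind("p:txBody/a:p", NS):
                    paragraphs.append("".join(t.text or "" for t in para.iter(f"{{{NS['a']}}}t")))
            text = "\n".join(paragraphs).strip()
            if text:
                notes[name] = text
    return notes
```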

Does a PDF Contain Redacted Parts?

If you receive a redacted PDF, try this:

  • Select and copy the redacted passages, along with the text around them.
  • Paste them into a text editor like Notepad, TextEdit or Microsoft Word.

If you don’t see anything “under” the redacted parts, the PDF’s creator knows how to redact a PDF properly.

But if all the creator did was draw black lines on top of the material to be redacted, those black lines don’t carry over when the material is pasted into a text editor. Who knows what fun stuff you might learn?

Mining for More

Expect disclosures such as the ones outlined here to become less frequent as more lawyers learn how to prevent them. That said, they haven’t all learned yet, so the above tips are certainly worth trying whenever you review electronic documents during discovery.

These aren’t the only ways to look for valuable metadata in documents. Did I neglect to mention your favorite tactics? Share them in the comments below.

Luigi Benetton is a journalist and freelance business and technical writer based in Toronto. He blogs about technology and the auto industry at Technozen. Follow him on Twitter @LuigiBenetton.
