Mining Document Metadata: Pointers

By Luigi Benetton

Each time an electronic document comes into being, metadata is created along with it. People often add their own, too. From the obvious (like page numbers) to the obscure (like dates of creation and author names), every piece of metadata serves some purpose.

During discovery, document metadata can prove just as important as the visible contents in the document. That’s why legal teams need to comb every document for metadata. (It’s also why many use metadata “scrubbers” on documents they share with other people.)

Things can get interesting when one legal team unwittingly sends documents to opposing counsel that contain metadata they weren’t aware of. The following tips cover some of the most common “accidental” metadata disclosures. These types of disclosures don’t always happen, but they’re worth checking for in files. And if you find such metadata, disclose your finds to opposing counsel. (Be sure to check for related ethics guidance in your jurisdiction; the American Bar Association has a handy resource for checking state metadata ethics opinions.)

Turn on Track Changes in Microsoft Office Documents

Modern document revision often happens in the margins. Well, the right margin of Word documents. That’s where authors place comments and questions, each one labeled with their Office user ID. (The “Track Changes” feature can be called different things and be in different places in other types of documents.)

When authors choose to view a document in its “final” form, that margin and its notes disappear. Doing so means they might forget to get rid of Track Changes content before sending the document on to others. If that happens, that document will travel with all commentary and suggested additions and deletions. These threads can reveal deeper thinking behind a given document and offer hints to a legal team that wasn’t supposed to see those threads.
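
For reviewers comfortable with a little scripting, leftover Track Changes content can also be listed without opening Word. A .docx file is a zip archive of XML parts, and tracked insertions, deletions and comments are stored there along with their authors. What follows is a minimal Python sketch, not a definitive tool: the function name tracked_changes is made up for illustration, and it assumes a standard .docx file rather than the older .doc format.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx files
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def tracked_changes(docx_file):
    """Return (kind, author, text) tuples for insertions, deletions and comments.

    Hypothetical helper for illustration; docx_file is a path or file-like object.
    """
    results = []
    with zipfile.ZipFile(docx_file) as archive:
        body = ET.fromstring(archive.read("word/document.xml"))
        # Insertions keep their text in w:t, deletions in w:delText.
        for kind, tag, text_tag in (("inserted", "ins", "t"),
                                    ("deleted", "del", "delText")):
            for change in body.iter(f"{{{W}}}{tag}"):
                text = "".join(t.text or "" for t in change.iter(f"{{{W}}}{text_tag}"))
                results.append((kind, change.get(f"{{{W}}}author"), text))
        # Comments live in their own part, which exists only if comments do.
        if "word/comments.xml" in archive.namelist():
            comments = ET.fromstring(archive.read("word/comments.xml"))
            for comment in comments.iter(f"{{{W}}}comment"):
                text = "".join(t.text or "" for t in comment.iter(f"{{{W}}}t"))
                results.append(("comment", comment.get(f"{{{W}}}author"), text))
    return results
```

An empty list suggests the document was cleaned (or never had tracked changes); anything else is worth reading in Word with all markup showing.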

Check Document Properties

Ever wonder who the actual author of a document happens to be? How about when the document was created? If you’re checking a photo, was it ever opened using Photoshop? You can learn all this and more by checking a document’s properties.
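
When there is a whole folder of files to check, properties can be pulled out with a short script instead of a right-click on each one. The Python sketch below reads the core properties stored inside Office files (.docx, .xlsx, .pptx). It assumes the file follows the usual Office Open XML layout, the name core_properties is hypothetical, and it does not cover photos, whose metadata is stored differently and needs other tools.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core-properties part of Office Open XML files
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly name -> element path inside docProps/core.xml
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def core_properties(office_file):
    """Return author and date properties; a missing property comes back as None."""
    with zipfile.ZipFile(office_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {name: root.findtext(path, default=None, namespaces=NS)
            for name, path in FIELDS.items()}
```

A mismatch between the author and the person who last modified the file is often the first thread worth pulling.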

Look for Hidden Rows and Columns in Spreadsheets

Sometimes an author will hide rows or columns to conceal information they contain. You won’t know until you unhide them.

Spotting hidden columns or rows should be straightforward. For instance, if one column is labeled F and the next one you see is J, that means three columns have been hidden.
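
The same check can be scripted for large spreadsheets, where a gap in the column letters is easy to miss. This Python sketch lists hidden columns and rows in one worksheet of an .xlsx file. It is an illustration under assumptions: the helper names are invented, the default sheet path is simply where Excel usually stores the first worksheet, and rows hidden by other means (such as filters) may not show up.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML namespace used by .xlsx worksheets
S = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def column_letter(index):
    """Convert a 1-based column index to its label (1 -> A, 27 -> AA)."""
    letters = ""
    while index:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def is_hidden(element):
    """True when the element carries hidden="1" (or the equivalent "true")."""
    return element.get("hidden") in ("1", "true")


def hidden_cells(xlsx_file, sheet="xl/worksheets/sheet1.xml"):
    """Return (hidden column letters, hidden row numbers) for one worksheet.

    The sheet path is an assumption; other worksheets have other part names.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read(sheet))
    columns = []
    for col in root.iter(f"{{{S}}}col"):
        if is_hidden(col):
            # One <col> element can describe a whole range of columns.
            first, last = int(col.get("min")), int(col.get("max"))
            columns += [column_letter(i) for i in range(first, last + 1)]
    rows = [int(row.get("r")) for row in root.iter(f"{{{S}}}row")
            if is_hidden(row) and row.get("r")]
    return columns, rows
```

For the F-to-J example above, the function would report columns G, H and I as hidden.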

Check Document Headers and Footers

From author names to Bates stamps to file paths, headers and footers can contain plenty of useful information.

For instance, sometimes authors redact documents by “cutting” them down once they reach the PDF stage. You may be holding a four-page document that says, in the bottom-right corner, “Page 7 of 19.” If you see something like this, you should go “hmm …”
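
On a large production, that “hmm” moment can be automated. The Python sketch below assumes the text of each page has already been extracted by some other tool and that footers follow the “Page X of Y” pattern; both are assumptions, and the function name page_label_report is invented for illustration.

```python
import re

# Matches footers such as "Page 7 of 19" (case-insensitive)
PAGE_LABEL = re.compile(r"Page\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def page_label_report(page_texts):
    """Compare 'Page X of Y' labels with the pages actually received.

    page_texts is a list with one string of extracted text per page.
    """
    seen, totals = set(), set()
    for text in page_texts:
        match = PAGE_LABEL.search(text)
        if match:
            seen.add(int(match.group(1)))
            totals.add(int(match.group(2)))
    # With no labels at all, there is nothing to compare against.
    claimed = max(totals, default=len(page_texts))
    missing = [n for n in range(1, claimed + 1) if n not in seen] if totals else []
    return {"received": len(page_texts), "claimed": claimed, "missing": missing}


# A four-page file whose footers say it came from a 19-page document
pages = [f"Exhibit text\nPage {n} of 19" for n in (5, 6, 7, 8)]
report = page_label_report(pages)
print(report["received"], "pages received,", report["claimed"], "claimed")
print("Missing page numbers:", report["missing"])
```

A non-empty list of missing pages does not prove anything was withheld improperly, but it tells you which pages to ask about.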

Look for Speaker’s Notes in PowerPoint Documents

Sometimes presentation files contain entire scripts on crammed slides in fonts only bald eagles could read. And sometimes content is kept to a minimum, interspersed with tastefully chosen images that engage the audience.

That’s what the audience sees. What the audience doesn’t see is any speaker’s notes created in the spot reserved for such notes with each slide. Short of a video or audio recording of the presentation, this user-generated metadata may provide the most insight available into what was said.
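
Speaker’s notes can be pulled out of a long deck in one pass rather than slide by slide. This Python sketch reads the notes parts stored inside a .pptx file. Treat it as a rough illustration: the name speaker_notes is hypothetical, and the number in each notes part usually, but not necessarily, matches the slide order, so confirm against the deck itself before relying on it.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace, which holds the text runs inside PowerPoint files
A = "http://schemas.openxmlformats.org/drawingml/2006/main"
NOTES_PART = re.compile(r"ppt/notesSlides/notesSlide(\d+)\.xml$")


def speaker_notes(pptx_file):
    """Return {notes part number: notes text} for every notes page with text."""
    notes = {}
    with zipfile.ZipFile(pptx_file) as archive:
        for name in archive.namelist():
            match = NOTES_PART.match(name)
            if not match:
                continue
            root = ET.fromstring(archive.read(name))
            paragraphs = []
            for paragraph in root.iter(f"{{{A}}}p"):
                # Only ordinary runs (a:r) are read, which skips the
                # slide-number field (a:fld) that notes pages also contain.
                runs = paragraph.findall(f"{{{A}}}r/{{{A}}}t")
                text = "".join(t.text or "" for t in runs)
                if text.strip():
                    paragraphs.append(text)
            if paragraphs:
                notes[int(match.group(1))] = "\n".join(paragraphs)
    return notes
```

Decks with no notes at all simply return an empty dictionary.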

Does a PDF Contain Redacted Parts?

If you receive a redacted PDF, try this:

  • Select and copy the redacted passages, along with a little of the text around them.
  • Paste them into a text editor like Notepad, TextEdit or Microsoft Word.

If you don’t see anything “under” the redacted parts, the PDF’s creator knows how to redact a PDF properly.

But if all the creator did was draw black lines on top of the material to be redacted, those black lines don’t carry over when the material is pasted into a text editor. Who knows what fun stuff you might learn?
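
The copy-and-paste test can be roughly approximated in code. The Python sketch below is a crude check, and the assumptions matter: it only finds text stored as plain literal strings in uncompressed or Flate-compressed content streams. Many real PDFs use font encodings, object streams or encryption that it will not read, so an empty result proves nothing, and a dedicated PDF library is the more reliable route. The function name text_in_streams is hypothetical.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of a PDF object
STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A literal string such as (some text) followed by the Tj text-showing operator
SHOWN_TEXT = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj", re.DOTALL)


def text_in_streams(pdf_bytes):
    """Return literal text strings found in a PDF's content streams.

    Backslash escapes inside the strings are left as they appear in the file.
    """
    found = []
    for match in STREAM.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the newline that precedes "endstream"
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            data = raw  # the stream was not Flate-compressed
        for literal in SHOWN_TEXT.findall(data):
            found.append(literal.decode("latin-1"))
    return found
```

If a passage that appears blacked out on screen turns up in the returned list, the black box was only drawn on top of the text.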

Mining for More

Expect disclosures such as the ones outlined here to become less frequent as more lawyers learn how to prevent them. That said, they haven’t all learned yet, so the above tips are certainly worth trying whenever you review electronic documents during discovery.

These aren’t the only ways to look for valuable metadata in documents. Did I neglect to mention your favorite tactics? Share them in the comments below.

Illustration ©iStockPhoto.com

Luigi Benetton is a business and technical writer specializing in a wide range of information technology and business topics. He blogs about tech and his passion for cars at TechnoZen and is fluent in several languages. Follow him @LuigiBenetton and contact him at LuigiBenetton.com.