Breaking Up the Family: Reviewing on a Document Level Is More Efficient

Lawyers have been reviewing document families as a collective unit since well before the advent of technology-assisted review (TAR).

They typically look at every document in the family to decide whether the family as a whole (or any part of it) is responsive and must be produced, or should be withheld in its entirety as privileged. Most lawyers believe that is the most efficient way to conduct a review.

What’s the Problem?

There are two problems with reviewing on a family level when you are using a TAR tool. The first is that TAR tools operate at a document level, and only at a document level. Coding decisions therefore have to be made on the text within the four corners of each document. Extraneous text, including the text of attachments or parents, no matter how overwhelmingly responsive, simply cannot be allowed to influence the responsiveness decision for the document under consideration.

Second, reviewing and tagging document families together actually impairs the efficiency of a continuous active learning (CAL) tool. A TAR review will be much more efficient if you batch and review documents individually, i.e., on a document level. In fact, our research has shown that review efficiency is optimized (at least for a CAL review) when documents are batched individually, coded for responsiveness only, and then passed with family members for any further full-family review (assuming production, or privilege withholding, on the basis of entire families).

The “Four Corners”

The most critical point in a TAR review is that coding decisions must be made solely on the basis of the text within the individual document being reviewed. Every coding decision essentially tells the TAR algorithm that the features (text) of the document being reviewed are, collectively, either responsive or nonresponsive. If you make coding decisions on the basis of extraneous text outside the four corners of the document, e.g., on the basis of an attachment to the email currently being considered, those decisions may mischaracterize the document and make it more difficult for the TAR algorithm to efficiently rank the remainder of the collection.

The easiest way to implement the four-corners approach from the TAR perspective is to ask this simple question: Do I really want to see more documents that have text like this? If the answer is yes, the document should be coded as responsive, which will inform the TAR tool (at least a CAL tool) to look for more of them. If the answer is no, the document should be coded as non-responsive, so no further time will be wasted on similar documents.

The best example of this four-corners decision-making process is the review of a parent email that says only “Please see attached.” Suppose the attachment turns out to be one of the most critical documents in the case. While the attachment is certainly responsive, there is no need to see more documents that simply say “Please see attached.” That language has nothing whatsoever to do with the case, and there is no guarantee that the next such attachment will be anything of consequence. Accordingly, the email should be coded on a document level as nonresponsive.1

There are certainly some instances where referring to family members to determine responsiveness of the underlying document is important, but those are few and far between. Typically, that arises when the text within the four corners of the document is ambiguous, and the family members add context. Building on the previous example, a document that says “Please see attached concerning Project X” is ambiguous on its face. If the attachments suggest that Project X is pertinent to the litigation, then there is indeed a need for the TAR tool to find more documents about Project X. That email would be coded as responsive, even on a document level.

Family Tagging

The second problem stems from reviewing on a family level. Intuition should tell us that batching document families together for a TAR review is not the most efficient approach. The reason is that you will be required to review a lot of nonresponsive families (and not just individual documents) during the process.

Every TAR ranking will include both relevant and nonrelevant documents. In many cases, the ratio is one nonrelevant document for every relevant one. The goal in a TAR review is to review as few nonrelevant documents as possible. Even if you ultimately review the families of every document identified as responsive, that number will be smaller than if you also review the families of the documents put forth by the TAR tool that prove not to be responsive.

Simulation Research: Comparing the Approaches

Our research has confirmed that a document-level Catalyst Insight Predict review will be more efficient than a family-batched review, even if you eventually review all the family members of the responsive documents. And if you do need to review all of those family members, the best workflow is to first do a document level Predict review for responsiveness only, and then conduct a comprehensive review of the entire family of every responsive document (i.e., a review for responsiveness, as well as privilege, issue coding, etc.).

To evaluate the benefit of batching and reviewing on a document level, we conducted a simulation of an actual family-batched TAR review. The case had roughly 250,000 documents, and nearly 30,000 of them were responsive (for a richness of about 12%). The initial TAR review was conducted on a family-batched basis, and we used the final coding judgments to simulate both a family-batched review and a document-level review, from the same starting point.

1. Family-batch review

When we simulated a family-batched review, it was necessary to review about 70,000 total documents to achieve 80% recall. This included every responsive and non-responsive document prioritized for review by Predict, as well as the family members of each of those documents.

2. Document-level review

We then simulated a document-level review to see how many individual documents would have to be reviewed to achieve the same 80% level of recall, before moving on to the family member review. That required the review of only about 36,000 documents, as shown in the figure below.

This figure shows the yield curves for the family-batched (red line) and document-level (blue line) reviews. The solid green line reflects the number of documents that must be reviewed to achieve 80% recall on a family-batched basis, and the dashed green line shows the number to be reviewed with a document-level review.
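For readers curious how a yield curve like this is computed, the sketch below (in Python, with invented 0/1 labels standing in for reviewers' responsiveness calls) walks documents in ranked order, accumulates recall, and reports how many documents must be reviewed to reach a target recall. The label sequence and function names are illustrative only, not part of any Predict implementation.

```python
# Sketch: computing a yield curve from a ranked review order.
# The 0/1 labels below are invented for illustration; in practice they come
# from reviewers' responsiveness calls as the review proceeds.

def yield_curve(labels):
    """Cumulative recall after reviewing the first k documents, for every k."""
    total_responsive = sum(labels)
    curve, found = [], 0
    for is_responsive in labels:
        found += is_responsive
        curve.append(found / total_responsive)
    return curve

def docs_for_recall(labels, target=0.80):
    """Documents that must be reviewed, in ranked order, to hit the target recall."""
    for k, recall in enumerate(yield_curve(labels), start=1):
        if recall >= target:
            return k
    return len(labels)

# A toy ranked collection: responsive documents concentrated near the top,
# as a CAL tool tends to produce.
ranked = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
print(docs_for_recall(ranked, 0.80))  # → 13
```

The steeper the curve climbs at the left edge, the fewer nonresponsive documents the review touches on the way to its recall target, which is exactly the difference the two curves in the figure illustrate.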

3. Two-phase review

Once we knew how many, and which, documents had to be reviewed to reach 80% recall on a document-level, we expanded the simulation to account for the review of the family members of responsive documents. At that point, we only needed to review an additional 17,000 family members, for a total of roughly 53,000 documents, representing a 24% improvement over a family-batched review. The chart below shows the previous yield curves for the family-batched and document-level reviews. The review of additional family members is then shown as a branch extending from the blue (document-level) curve. The solid and dashed green lines illustrate the 24% savings when family members of responsive documents are reviewed.2
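The 24% figure is simple arithmetic on the rounded document counts reported above; this small check does nothing more than restate those published numbers:

```python
# Arithmetic check on the rounded figures reported in the text.
family_batched_total = 70_000   # docs reviewed to reach 80% recall, family-batched
document_level_phase = 36_000   # docs reviewed to reach 80% recall, document-level
family_members_added = 17_000   # family members of responsive docs, reviewed afterward

two_phase_total = document_level_phase + family_members_added   # 53,000
savings = 1 - two_phase_total / family_batched_total            # ~0.243

print(two_phase_total, round(savings * 100))  # → 53000 24
```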

4. Refined two-phase workflow

We have also determined that even greater levels of efficiency can be achieved by following a refined two-phase workflow. The first phase would be a document-level Predict review solely for the purpose of identifying responsive documents. When a CAL algorithm is continuously ranking the entire collection on the basis of relevance or responsiveness, a responsiveness-only review can proceed much more quickly than a comprehensive document review.3 Responsive documents would then be passed, together with all family members, to a comprehensive family review for production.

The family review can proceed either simultaneously with the responsiveness review or sequentially, i.e., at the conclusion of the responsiveness review.

To evaluate the effectiveness of this workflow, we ran post-hoc simulations on several different collections, each of which had been fully coded for production. These simulations were slightly different, however, because we had to estimate the time it would take to review documents, in addition to the order in which they were reviewed.

As a baseline, we simulated a typical document-level Predict review, followed by a comprehensive review of any remaining family members. In a typical review, every document is reviewed only once, for every coding issue (responsiveness, privilege, etc.). So every review decision was estimated to take the same average time, regardless of whether it was made during the Predict review or the family review.

For comparison, we simulated the refined two-phase workflow: an expedited first-phase Predict responsiveness-only review, followed by a comprehensive review of the entire responsive family (including the document coded in the first phase). Since the family review is essentially the same type of review as the baseline, we estimated the family review at the same average review time as the baseline review. And, since we have seen responsiveness-only reviews proceed at a much faster pace than comprehensive reviews, we evaluated Predict review rates of one to five times the rate of a comprehensive review.

The chart below shows the time-based gain curves for just one of the collections that we evaluated. On the x-axis, we plotted the total time it would take to review sufficient documents to achieve the recall levels that are plotted on the y-axis. The thick, dark red line is the baseline, which assumes the same average review rate for every document. The other red lines represent the refined workflow at five different Predict responsiveness-only review rates, which are indicated for each curve.

As this chart shows, the refined two-phase workflow is as efficient as, or more efficient than, a typical document-level Predict review whenever the responsiveness-only review proceeds at least twice as fast as the average rate of the comprehensive review. The results of the other simulations were very consistent.
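To see the shape of the trade-off, here is a deliberately naive time model. It assumes a responsiveness-only pass over all phase-one documents at some multiple of the comprehensive rate, followed by a comprehensive pass over every responsive document (re-reviewed) plus its remaining family members. The responsive count of 24,000 is an assumption (roughly 80% of the earlier collection's ~30,000 responsive documents), and the model omits factors the real simulations captured, so its break-even point is not expected to reproduce the 2x figure reported above.

```python
# Toy model of total review time for the two workflows.
# t is the average time for one comprehensive review decision (the baseline unit).

def baseline_time(phase1_docs, extra_family, t=1.0):
    # Typical workflow: every document is reviewed once, comprehensively.
    return (phase1_docs + extra_family) * t

def two_phase_time(phase1_docs, responsive, extra_family, speedup, t=1.0):
    # Refined workflow: a responsiveness-only pass over the phase-one documents
    # at `speedup` times the comprehensive rate, then a comprehensive review of
    # every responsive document plus its remaining family members.
    return phase1_docs * t / speedup + (responsive + extra_family) * t

# Illustrative counts: 36,000 phase-one docs, of which 24,000 are assumed
# responsive, with 17,000 additional family members to review.
for speedup in (1, 2, 3, 5):
    print(speedup, two_phase_time(36_000, 24_000, 17_000, speedup))
print("baseline", baseline_time(36_000, 17_000))
```

As the speedup grows, the cost of touching responsive documents twice is overwhelmed by the savings on the first pass; at a 1x rate the two-phase workflow is strictly slower, which matches the intuition that the refinement only pays off when the responsiveness-only review is genuinely faster.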


Ultimately, our simulations show that reviewing on a document level will be more efficient than family batching. And by refining the workflow to focus first on responsiveness alone, review rates and efficiency can improve even further.

This post is a chapter from TAR from Smart People, Third Edition, the authoritative book on TAR and the new continuous active learning protocol. Download your free copy.


1. In reality, the likelihood that a TAR tool would elevate that particular email as a potentially responsive document is low, for exactly the reason it must be coded as non-responsive—the text within the four corners of the document simply has no bearing on the case. (Conversely, the attachment would likely be elevated for review much earlier in the TAR process.)

2. It should be noted that reviewing the family members actually increased recall to 85%.

3. See, e.g., Maura R. Grossman & Gordon V. Cormack, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, XVII RICH. J.L. & TECH. 32-33 (2011) (discussing prioritized responsiveness-only reviews exceeding 180 documents per hour). This is consistent with our experience.


About Jeremy Pickens

Jeremy Pickens is one of the world’s leading information retrieval scientists and a pioneer in the field of collaborative exploratory search, a form of information seeking in which a group of people who share a common information need actively collaborate to achieve it. Dr. Pickens has seven patents and patents pending in the field of search and information retrieval. As Chief Scientist at Catalyst, Dr. Pickens has spearheaded the development of Insight Predict. His ongoing research and development focuses on methods for continuous learning, and the variety of real world technology assisted review workflows that are only possible with this approach. Dr. Pickens earned his doctoral degree at the University of Massachusetts, Amherst, Center for Intelligent Information Retrieval. He conducted his post-doctoral work at King’s College, London. Before joining Catalyst, he spent five years as a research scientist at FX Palo Alto Lab, Inc. In addition to his Catalyst responsibilities, he continues to organize research workshops and speak at scientific conferences around the world.


About Thomas Gricks

Managing Director, Professional Services, Catalyst. A prominent e-discovery lawyer and one of the nation's leading authorities on the use of TAR in litigation, Tom advises corporations and law firms on best practices for applying Catalyst's TAR technology, Insight Predict, to reduce the time and cost of discovery. He has more than 25 years’ experience as a trial lawyer and in-house counsel, most recently with the law firm Schnader Harrison Segal & Lewis, where he was a partner and chair of the e-Discovery Practice Group.


About John Tredennick

A nationally known trial lawyer and longtime litigation partner at Holland & Hart, John founded Catalyst in 2000. Over the past four decades he has written or edited eight books and countless articles on legal technology topics, including two American Bar Association best sellers on using computers in litigation, a book (supplemented annually) on deposition techniques and several other widely read books on legal analytics and technology. He served as Chair of the ABA’s Law Practice Section and edited its flagship magazine for six years. John’s legal and technology acumen has earned him numerous awards, including being named by the American Lawyer as one of the top six “E-Discovery Trailblazers,” named to the FastCase 50 as a legal visionary, and named one of the “Top 100 Global Technology Leaders” by London Citytech magazine. He has also been named the Ernst & Young Entrepreneur of the Year for Technology in the Rocky Mountain Region, and Top Technology Entrepreneur by the Colorado Software and Internet Association. John regularly speaks on legal technology to audiences across the globe. In his spare time, you will find him competing on the national equestrian show jumping circuit or playing drums and singing in a classic rock jam band.


About Andrew Bye

Andrew is the director of machine learning and analytics at Catalyst, and a search and information retrieval expert. Throughout his career, Andrew has developed search practices for e-discovery, and has worked closely with clients to implement effective workflows from data delivery through statistical validation. Before joining Catalyst, Andrew was a data scientist at Recommind. He has also worked as an independent data consultant, advising legal professionals on workflow and search needs. Andrew has a bachelor’s degree in linguistics from the University of California, Berkeley and a master’s in linguistics from the University of California, Los Angeles.