Using TAR to Find Relevant Documents

Traditional e-discovery review typically began with a list of keywords compiled by an attorney or through agreement between the parties. Often the list would be over-inclusive, for fear that something might be missed. That meant a lot of irrelevant documents had to be reviewed, at a substantial cost. Equally often, the search would be under-inclusive, meaning that relevant documents would not be retrieved.

Either way, the process was less than ideal. Reviewers had to click through thousands of irrelevant documents to find the relevant few. Productions ended up being deficient because many other relevant documents were missed in the culling process.

Using TAR to Make Review More Efficient

Ultimately, the goal of review is to find relevant documents as quickly and efficiently as possible. For years, people believed linear review was the only proper way to conduct the process. We now know this isn’t true. Linear review is expensive, time-consuming and not nearly as reliable as people once thought.

Insight Predict℠ provides a powerful and cost-saving alternative to linear review, even if you plan to review all of the documents in your collection. Predict orders documents by relevance, placing the documents most likely to be relevant first. Review teams proceed from high to low, with the option of cutting off the review once the number of relevant documents drops to the point where further review doesn’t make sense.

In almost every case, reviewing documents in relevance order will reduce the cost of human review and improve review quality. Even if you choose to review all of your documents, the review team is more efficient because documents are grouped by topic. Once you get through the relevant documents, it becomes simple to click through the rest.
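
To make the cutoff idea concrete, here is a minimal Python sketch of reviewing in relevance order. It is an illustration only, not Insight Predict's implementation: the function name, the batch size and the stopping threshold are hypothetical, and the scores are assumed to come from whatever ranking engine you use.

```python
from typing import Callable, List, Sequence, Tuple


def review_in_rank_order(
    scored_docs: Sequence[Tuple[str, float]],
    is_relevant: Callable[[str], bool],
    batch_size: int = 4,
    min_batch_richness: float = 0.25,
) -> Tuple[List[str], int]:
    """Review documents from highest to lowest score.

    Stops after the first batch whose share of relevant documents falls
    below min_batch_richness (a hypothetical stopping rule).
    Returns (relevant document ids found, number of documents reviewed).
    """
    # Highest-scoring documents go to the front of the review queue.
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)

    found: List[str] = []
    reviewed = 0
    for start in range(0, len(ranked), batch_size):
        batch = ranked[start:start + batch_size]
        # is_relevant stands in for the human reviewer's judgment.
        hits = [doc_id for doc_id, _ in batch if is_relevant(doc_id)]
        found.extend(hits)
        reviewed += len(batch)
        # Cut off once a batch yields too few relevant documents.
        if len(hits) / len(batch) < min_batch_richness:
            break
    return found, reviewed
```

In practice the cutoff decision would rest on a validated sample rather than on a single weak batch, but the sketch shows why ordering matters: the relevant documents arrive early, so the review can stop before the long tail of irrelevant ones.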

Predictive Ranking

Predictive Ranking orders a document collection by relevance to your inquiry, enabling review teams to focus their efforts on the most important documents.

The process is deceptively simple. Reviewers interact with Predict to train it to find relevant documents. If you have ever tried Pandora® Internet Radio, you already know how it works. Look at a sample document and give it a "thumbs up" or a "thumbs down."

Here are the key steps in the process:

  • Collect: Identify and collect a population of documents. Our system will then analyze the text and build a relationship graph of all the terms in the documents.
  • Train: Find as many relevant documents (“seeds”) as possible through search, analytics or old-fashioned witness interviews. Tag each document for relevance.
  • Rank: Based on your tagging, Insight Predict ranks your documents by likely relevance. Unlike first-generation systems, we rank all documents every time.
  • Review: Send your top-ranked documents to the review team. As they review, continue the learning process by feeding their tags back to the system. QC the review set to ensure consistency.
  • Test: Check your progress by taking a systematic sample of the document set. Predict will generate a yield curve so you can confirm that the ranking process is effective.
  • Produce: Produce documents to the requesting party or use them to prepare for depositions, trial or an administrative hearing. Ordering your review means the trial team gets access to the most important documents first.

Insight Predict is integrated, iterative and continuous. Add new documents to the system as you collect them. Feed review judgments back to the system to make the algorithm smarter. Send newly ranked documents back to the review team as they continue their review. The end result is a more efficient review: the team finds relevant documents more quickly, with fewer total documents to review.
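
The train, rank and review loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the algorithm Insight Predict uses: the scoring model here is a simple smoothed log-odds of terms, and the names `score_all`, `continuous_review` and `reviewer` are hypothetical. What it does show is the shape of the process: every document is re-scored each round, the top-ranked unreviewed documents go to the reviewer, and the reviewer's tags feed the next ranking.

```python
import math
from collections import Counter
from typing import Callable, Dict


def score_all(docs: Dict[str, str], tags: Dict[str, bool]) -> Dict[str, float]:
    """Score every document from the tags collected so far.

    Toy stand-in model: sum of smoothed log-odds for each distinct term,
    based on how often the term appears in relevant vs. non-relevant
    tagged documents.
    """
    rel_terms, non_terms = Counter(), Counter()
    for doc_id, relevant in tags.items():
        terms = set(docs[doc_id].lower().split())
        (rel_terms if relevant else non_terms).update(terms)
    n_rel = sum(1 for flag in tags.values() if flag)
    n_non = len(tags) - n_rel

    scores = {}
    for doc_id, text in docs.items():
        scores[doc_id] = sum(
            math.log((rel_terms[t] + 1) / (n_rel + 2))
            - math.log((non_terms[t] + 1) / (n_non + 2))
            for t in set(text.lower().split())
        )
    return scores


def continuous_review(
    docs: Dict[str, str],
    seed_tags: Dict[str, bool],
    reviewer: Callable[[str], bool],
    batch_size: int = 2,
    max_rounds: int = 10,
) -> Dict[str, bool]:
    """Rank all documents, review the top unreviewed batch, and repeat."""
    tags = dict(seed_tags)
    for _ in range(max_rounds):
        unreviewed = [d for d in docs if d not in tags]
        if not unreviewed:
            break
        # Re-rank the whole collection using every tag gathered so far.
        scores = score_all(docs, tags)
        unreviewed.sort(key=lambda d: scores[d], reverse=True)
        # Send the top-ranked batch to review and feed the tags back.
        for doc_id in unreviewed[:batch_size]:
            tags[doc_id] = reviewer(doc_id)
    return tags
```

Newly collected documents can be added to `docs` between rounds; because the whole collection is re-scored each time, they are ranked alongside everything else on the next pass.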

A Smarter Way to Review

By combining Predictive Ranking with Catalyst's powerful, hosted e-discovery platform, law firms and corporations can reduce the cost and time involved in early case assessment, search and document review:

  • Predictive Ranking techniques have been approved by the courts and major government entities.
  • Our continuous active learning techniques can reduce review by more than 50%, and often by even more.
  • The process is defensible. Review cutoffs are justified through a systematic sample that measures richness (prevalence) both above and below your cutoff.
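
As a rough illustration of how a systematic sample can measure richness on either side of a cutoff, consider the Python sketch below. It assumes a ranked list of document ids and a reviewer judgment for each sampled document; the function names, the sampling interval and the cutoff position are hypothetical, and the sketch omits the confidence intervals a real validation protocol would report.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def systematic_sample(
    ranked_ids: Sequence[str], interval: int, start: int = 0
) -> List[Tuple[int, str]]:
    """Take every interval-th document from the ranked list.

    Returns (rank position, document id) pairs. In practice `start`
    would be chosen at random between 0 and interval - 1.
    """
    if interval < 1 or not 0 <= start < interval:
        raise ValueError("need interval >= 1 and 0 <= start < interval")
    return [(pos, ranked_ids[pos]) for pos in range(start, len(ranked_ids), interval)]


def richness_around_cutoff(
    sample: Sequence[Tuple[int, str]],
    is_relevant: Callable[[str], bool],
    cutoff: int,
) -> Tuple[Optional[float], Optional[float]]:
    """Estimate richness (share of relevant documents) above and below a cutoff.

    Positions before `cutoff` count as above the cutoff (to be reviewed);
    positions at or after it count as below (left unreviewed).
    """
    above = [is_relevant(doc_id) for pos, doc_id in sample if pos < cutoff]
    below = [is_relevant(doc_id) for pos, doc_id in sample if pos >= cutoff]

    def rate(flags: List[bool]) -> Optional[float]:
        # None signals that no sampled documents fell on this side.
        return sum(flags) / len(flags) if flags else None

    return rate(above), rate(below)
```

A high richness above the cutoff combined with a low richness below it is the pattern that supports stopping the review at that point.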

Catalyst scientists and senior attorney consultants have been helping clients and their law firms create defensible predictive ranking methodologies for years. We will stand beside you in hearings to explain the methodology, technology and sampling procedures.

About Catalyst

Catalyst designs, builds and hosts the world’s fastest and most powerful document repositories for large-scale discovery and regulatory compliance. We back our technology with a highly skilled Professional Services team and a global partner network to ensure the best e-discovery experience possible.
Catalyst Repository Systems

1860 Blake Street, 7th Floor
Denver, CO 80202

Phone: 303.824.0900 | Toll Free: 877.557.4273
Fax: 303.293.9073

