Technology-assisted review is relatively new to the legal field, with the first judicial opinion approving the process issued in 2012. While many have tried predictive analytics with great success, many others have not yet taken the plunge. We often get questions from our clients and partners about the Predictive Ranking process and our methods. Here are some of the most common ones. Feel free to follow up with additional questions to your Catalyst representative.

What is Predictive Ranking℠?

Predictive Ranking is Catalyst's proprietary implementation of technology-assisted review (TAR). Through our Insight Predict engine, we integrate the Predictive Ranking process into Catalyst Insight, using sophisticated algorithms we developed especially for electronic discovery. The system ranks documents according to their likely relevance based on interaction (training) with human reviewers.

Why should I care about Predictive Ranking?

The huge volume of electronically stored information makes manual search and review impractical. It is simply too expensive to review every document in large cases. Keyword searching is quicker, but typically finds a much lower percentage of relevant documents and often finds a higher percentage of not-relevant documents. Predictive Ranking can help reduce the cost of review by providing better results than keyword search and by allowing you to ignore likely irrelevant documents.

How does it work?

The process is deceptively simple. Reviewers interact with Predict to train it to find relevant documents. If you have ever tried Pandora® Internet Radio, you already know how it works. Look at a document and give it a "thumbs up" or a "thumbs down."

Here are the key steps in the process:

Catalyst Predictive Ranking
  • Collect: Identify and collect a population of documents. Our system will then analyze the text and build a relationship graph of all the terms in the documents.
  • Train: Find as many relevant documents (“seeds”) as possible through search, analytics or old-fashioned witness interviews. Tag each document for relevance.
  • Rank: Based on your tagging, Insight Predict ranks your documents by likely relevance. Unlike first-generation systems, we rank all documents every time.
  • Review: Send your top-ranked documents to the review team. As they review, continue the learning process by feeding back their tags to the system. QC the review set to ensure consistency.
  • Test: Check your progress by taking a systematic sample of the document set. Predict will generate a yield curve so you can confirm that the ranking process is effective.
  • Produce: Ultimately, produce documents to the requesting party or use them to prepare for depositions, trial or an administrative hearing. Ordering your review means the trial team gets access to the most important documents first.

Insight Predict is integrated, iterative and continuous. Add new documents to the system as you collect them. Feed review judgments back to the system to make the algorithm smarter. Send newly ranked documents back to the review team as they continue their review. The end result is a more efficient review: the team finds relevant documents more quickly, with fewer total documents to review.
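For readers who want to see the shape of that loop, here is a minimal sketch in Python. It is a toy illustration only, not Insight Predict's algorithm: the term-weight scorer, the function names and the sample documents are all assumptions made up for this example. It shows the general idea of ranking, reviewing a batch, feeding the tags back and re-ranking everything that remains.

```python
from collections import Counter

def score(text, weights):
    """Sum the learned weights of the distinct terms in a document."""
    return sum(weights[term] for term in set(text.split()))

def continuous_review(docs, seeds, reviewer, batch_size=1):
    """Toy continuous-learning loop (hypothetical, for illustration only).

    docs:     dict of doc_id -> text
    seeds:    dict of doc_id -> bool, the initial relevance tags
    reviewer: callable doc_id -> bool, standing in for a human reviewer
    Returns the doc ids in the order they were sent to review.
    """
    tags = dict(seeds)
    weights = Counter()  # missing terms count as weight 0
    order = []

    def learn(doc_id, relevant):
        # Relevant documents raise their terms' weights; others lower them.
        for term in set(docs[doc_id].split()):
            weights[term] += 1 if relevant else -1

    for doc_id, relevant in seeds.items():
        learn(doc_id, relevant)

    while len(tags) < len(docs):
        # Re-rank every unreviewed document on each pass.
        unreviewed = [d for d in docs if d not in tags]
        unreviewed.sort(key=lambda d: score(docs[d], weights), reverse=True)
        for doc_id in unreviewed[:batch_size]:
            relevant = reviewer(doc_id)   # human judgment
            tags[doc_id] = relevant
            learn(doc_id, relevant)       # feed the tag back
            order.append(doc_id)
    return order

# Made-up documents and "true" tags, used only to exercise the loop.
docs = {
    "a": "merger price negotiation",
    "b": "lunch menu friday",
    "c": "merger timeline draft",
    "d": "parking garage notice",
    "e": "price merger approval",
}
truth = {"a": True, "b": False, "c": True, "d": False, "e": True}
order = continuous_review(docs, {"a": True, "b": False}, truth.get)
print(order)
```

In this toy run the two remaining relevant documents reach the reviewer before the irrelevant one, which is the behavior the ranking step is meant to produce.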

How long does it take to train the system?

Predict is a second-generation TAR product in which training distinctly differs from TAR 1.0 applications. In our system, training and learning are continuous, starting from the submission of the first documents for ranking and continuing until review is complete. With Predict, training becomes a more efficient form of review with active learning continuing from beginning to end. You can do traditional TAR 1.0 training with Predict but it is less efficient and we don’t recommend it.

Does Predictive Ranking work in every situation?

Yes. Insight Predict is optimized for all sorts of review projects, including document populations with low richness (prevalence). It also works well for small cases because there is no wasted time spent on training. Just start reviewing and benefit from the ranking process.

How much time and money does it save?

We have seen substantial savings over traditional search and review techniques as well as reduced review time. While time and cost savings will differ from case to case, they are often more than 50% compared to traditional review.

How much faster is Predictive Ranking?

Once the system has ranked your documents, review of the top 20% of documents may reveal 75% to 80% of all relevant documents in the population. The resulting time savings are obvious. Cut your review time by focusing on the most relevant documents first. Then, take the output from the review and feed it back into the system. In most cases the ranking curve will get better, allowing you to reduce the review population even further.
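To make the arithmetic behind a figure like "top 20% finds 80%" concrete, the short Python sketch below computes the share of relevant documents found at a given review depth. The function name and the 20-document ranking are hypothetical and exist only to illustrate the calculation. They are not output from Insight Predict.

```python
def recall_at_depth(ranked_relevance, fraction):
    """Share of all relevant documents found in the top `fraction` of a ranking.

    ranked_relevance: list of bools, best-ranked document first.
    """
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return 0.0
    cutoff = round(len(ranked_relevance) * fraction)
    return sum(ranked_relevance[:cutoff]) / total_relevant

# Made-up ranking of 20 documents, 5 of them relevant.
ranked = [True] * 4 + [False] * 10 + [True] + [False] * 5

# Reviewing the top 20% (4 documents) finds 4 of the 5 relevant ones.
print(recall_at_depth(ranked, 0.2))

# A simple yield curve: recall at each 10% step of review depth.
yield_curve = [recall_at_depth(ranked, step / 10) for step in range(11)]
print(yield_curve)
```

A steep early rise in the yield curve is what lets a team stop well before reviewing the whole population.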

How do I know that the process is accurate?

A key part of the process is to test against the lower-ranked documents. You do this through a systematic sample of these files based on your decision about the proper confidence level and margin of error. If you check the files and find a higher than expected number of relevant documents included, you can return to the training process. Otherwise, you have basic proof to back your assertion that the lower-ranked documents are not candidates for review.
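As a rough illustration of how confidence level and margin of error translate into a sample, the Python sketch below uses the standard textbook sample-size formula for a proportion, with a finite population correction, and then draws an evenly spaced (systematic) sample. These are general statistical techniques and are assumed here for illustration. The function names are hypothetical, and this is not necessarily how Insight Predict draws or sizes its samples.

```python
import math
from statistics import NormalDist

def sample_size(confidence, margin, population, p=0.5):
    """Textbook sample size for estimating a proportion.

    confidence: e.g. 0.95; margin: e.g. 0.05 for plus or minus 5%.
    p=0.5 is the most conservative assumption about the true proportion.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z * z * p * (1 - p) / (margin * margin)
    # Finite population correction for a known collection size.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

def systematic_sample(items, n):
    """Pick n evenly spaced items, starting with the first."""
    n = min(n, len(items))
    step = len(items) / n
    return [items[int(i * step)] for i in range(n)]

# Example: 100,000 lower-ranked documents, 95% confidence, 5% margin.
n = sample_size(0.95, 0.05, 100_000)
print(n)

low_ranked_ids = list(range(100_000))
sample = systematic_sample(low_ranked_ids, n)
print(len(sample), sample[:3])
```

Reviewers would then tag the sampled documents, and the fraction found relevant gives an estimate of how many relevant documents remain in the unreviewed set.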

Is Predictive Ranking defensible in court?

Without question. While the techniques are relatively new, early decisions from federal courts support the process, as do academicians and most legal experts. In February 2012, U.S. Magistrate Judge Andrew Peck, of the Southern District of New York, issued an opinion endorsing computer-assisted review:

"This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases."

Da Silva Moore v. Publicis Groupe,
287 F.R.D. 182, 183 (S.D.N.Y. June 15, 2012) (Peck, M.J.).

Other courts have followed suit or at least recommended that the parties consider the technique. Defensibility is a function of both the Predictive Ranking methodology and the application. Our consultants have assisted clients and partners in creating defensible Predictive Ranking methodologies for more than four years. We will stand beside you in hearings to explain the methodology, technology, audit trail and sampling to both opposing counsel and the court.

Why Use Insight Predict for Predictive Ranking?

Insight Predict is the first second-generation TAR system optimized for continuous active learning. As Grossman and Cormack showed in their peer-reviewed study for the Association for Computing Machinery, continuous active learning is the most efficient and effective protocol for finding relevant documents. You can learn a lot more about what makes Predict unique through these articles:

What kind of training and support can I expect?

Success with Insight Predict comes both from smart technology and smart people. Our Professional Services team will work closely with you to analyze the data set, create an appropriate seed set (if needed) and develop a review strategy. To assure defensibility, they can help you sample data at different levels of relevance to verify the process to the court and opposing side, if needed.

What's next?

It just gets better from here—better algorithms and improved processes. Insight Predict reflects our long-standing investment in intuitive analytics to enhance review accuracy and cost effectiveness. Our research staff includes some of the world's leading search scientists and experts. Their mission is to keep enhancing our analytical tools to help you make review more efficient and effective, at a lower cost.

About Catalyst

Catalyst designs, builds and hosts the world’s fastest and most powerful document repositories for large-scale discovery and regulatory compliance. We back our technology with a highly skilled Professional Services team and a global partner network to ensure the best e-discovery experience possible.
Catalyst Repository Systems

1860 Blake Street, 7th Floor
Denver, CO 80202

Phone: 303.824.0900 | Toll Free: 877.557.4273
Fax: 303.293.9073

