The TAR Challenge: How One Client Could Have Cut Review By More Than 57%

How much can you save using TAR 2.0, the advanced form of technology assisted review used by Catalyst’s Insight Predict? That is a question many of our clients ask, until they try it and see for themselves.

Perhaps you’ve wondered about this. You’ve read articles or websites talking about TAR’s ability to lower review costs by reducing the number of documents requiring review. You might even have read about the even greater gains in efficiency delivered by second-generation TAR 2.0 platforms that use the continuous active learning protocol. But still you’ve held out, maybe uncertain of the technology or wondering whether it is right for your cases.

Now you can find out for yourself, using your own case, at no cost. This week, Catalyst launched the TAR Challenge. Give us an actual case of yours in which you’ve completed a manual review, and we will run a simulation showing you how the review would have gone – and what savings you would have achieved – had you used Insight Predict, Catalyst’s award-winning TAR 2.0 platform.

To illustrate what you could expect to see, we’re providing this anonymized report of a simulation we ran for a client in May 2017. The purpose of the simulation was primarily to see whether, and to what extent, Predict could have enhanced the efficiency and effectiveness of the previously completed review for responsive documents. The answer was impressive: Had the client used Predict, it would have improved the efficiency of its review by more than 57 percent.

What follows is a summary of the information from the full report.

A Summary of the Simulation

In this case, the full collection contained 4,938 documents. In its actual review, the client had used analytic culling techniques to eliminate 1,319 documents, leaving a culled collection of 3,619 documents. Against that culled collection, the client conducted a full manual, linear review, finding 619 documents that were responsive.

For our experiment, we simulated a review using Insight Predict’s continuous active learning (CAL) protocol on the document population. Specifically, we used Predict to continuously rank and batch the documents based on what Predict considered to be the most likely responsive documents.

For each batch, we used the previously tagged judgments as an artificial review team, applying those prior judgments to the documents in the simulated Predict-ranked order. We then plotted the simulated results on a gain curve showing the order in which responsive and not-responsive documents would have been reviewed with Predict, providing an easy comparison to a linear review.
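
Insight Predict’s ranking engine is proprietary, but the mechanics of this kind of simulated CAL loop can be sketched with a generic text classifier. The Python below is a minimal, hypothetical illustration: the TF-IDF features, logistic-regression ranker, seed size and batch size are our assumptions rather than Predict’s actual implementation, and the prior coding decisions stand in for the review team.

```python
# Hypothetical sketch of a simulated CAL review (not Insight Predict's actual model).
# texts:  the document collection
# labels: responsiveness calls from the prior manual review (1 = responsive, 0 = not)
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def simulate_cal(texts, labels, seed_size=50, batch_size=50, seed=42):
    """Return the order in which documents would be reviewed under a simple CAL loop."""
    X = TfidfVectorizer(sublinear_tf=True).fit_transform(texts)
    y = np.asarray(labels)
    rng = np.random.default_rng(seed)

    # Start with a small random seed batch, "reviewed" by replaying the prior judgments.
    review_order = list(rng.choice(len(y), size=min(seed_size, len(y)), replace=False))
    unreviewed = set(range(len(y))) - set(review_order)

    while unreviewed:
        remaining = np.array(sorted(unreviewed))
        if len(np.unique(y[review_order])) < 2:
            # No responsive (or no not-responsive) examples seen yet: take another random batch.
            batch = rng.choice(remaining, size=min(batch_size, len(remaining)), replace=False)
        else:
            # Retrain on everything "reviewed" so far, then batch the top-ranked remaining docs.
            model = LogisticRegression(max_iter=1000).fit(X[review_order], y[review_order])
            scores = model.predict_proba(X[remaining])[:, 1]
            batch = remaining[np.argsort(-scores)][:batch_size]
        review_order.extend(int(i) for i in batch)
        unreviewed -= set(int(i) for i in batch)
    return review_order
```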

(A gain curve provides a simple but effective means to compare the results from a technology assisted review and a linear review because it shows the cumulative number of positive documents that would be found throughout the entire review process.)
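
To see what that comparison looks like in practice, here is one way a gain curve could be computed and plotted from a simulated review ordering. It assumes the review_order and labels variables from the sketch above, and draws reference lines for a linear and a perfect review; the plotting details are purely illustrative.

```python
# Hypothetical gain-curve plot for a simulated review ordering.
import numpy as np
import matplotlib.pyplot as plt

def plot_gain_curve(review_order, labels):
    y = np.asarray(labels)[review_order]      # prior judgments, in simulated review order
    found = np.cumsum(y)                      # responsive documents found so far
    reviewed = np.arange(1, len(y) + 1)       # documents reviewed so far

    plt.plot(reviewed, found, label="Simulated CAL review")
    plt.plot([0, len(y)], [0, found[-1]], "k:", label="Linear review (random order)")
    plt.plot([0, found[-1], len(y)], [0, found[-1], found[-1]], "k--", label="Perfect review")
    plt.xlabel("Documents reviewed")
    plt.ylabel("Responsive documents found")
    plt.legend()
    plt.show()
```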

To illustrate the results of the simulation, we used two methods. First, we provided a table showing the number of documents the client would have had to review to achieve various recall levels (70 percent, 80 percent and 90 percent), both through a linear review and using Predict. Second, we plotted the results on a gain curve.

Culled Collection Recall

Our primary objective in this simulation was to measure Predict’s performance against the previous linear review of the culled collection. As you can see from Table 1 below, the Predict review was demonstrably more efficient than the previous linear review at recall levels that, in our experience, are typically associated with document production in civil litigation. At 80 percent recall, a level that has been widely accepted in both state and federal jurisdictions, Predict was 66.2 percent more efficient than a full linear review, eliminating the need to review 2,394 documents.

 

Recall   Linear Review (Documents)   Predict Review (Documents)   Documents Eliminated From Review   Increased Review Efficiency
70%      3,619                       948                          2,671                              73.8%
80%      3,619                       1,225                        2,394                              66.2%
90%      3,619                       1,835                        1,784                              49.3%

Table 1: Simulated Responsive Review, Culled Collection
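
As a quick sanity check, the efficiency figures in Table 1 appear to follow directly from the document counts: the documents eliminated are the culled collection (3,619 documents) minus the documents Predict would have required, and the efficiency gain is that savings divided by the size of the culled collection. A few lines of Python reproduce the table.

```python
# Reproduce the Table 1 figures from the reported document counts.
culled_collection = 3_619                               # documents in the client's actual linear review
predict_reviewed = {"70%": 948, "80%": 1_225, "90%": 1_835}  # simulated Predict review counts

for recall, reviewed in predict_reviewed.items():
    eliminated = culled_collection - reviewed
    efficiency = eliminated / culled_collection
    print(f"{recall} recall: {eliminated:,} documents eliminated ({efficiency:.1%} more efficient)")
```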

The gain curve in Figure 2 provides a more comprehensive picture of the comparison between the Predict review and the previous linear review. The solid diagonal black line represents a linear review, the rate at which one would expect to find responsive documents if one were to review the collection in random order. The dashed black line to the left of the chart represents a theoretical perfect review, in which every responsive document is reviewed before a single not-responsive document; no review process can do better, so it marks the outer limit for any TAR process. Finally, the blue line represents the results of our simulation, showing how many documents (x-axis) it would have taken to achieve a target document count or recall (y-axis), had Predict been used.

Figure 2: Insight Predict (blue); Linear Review (solid black); Perfect (dashed black)

Full Collection Recall

As a secondary objective in this simulation, we measured Predict’s performance against an assumed linear review of the full collection. We did this to evaluate the performance of Predict against the original collection, in the absence of any culling. For this comparison, we assumed that the documents eliminated by culling, which were never actually reviewed, were all not responsive.

As you can see from Table 2 below, the Predict review was much more efficient than a linear review of the entire collection would have been. At 80 percent recall, Predict would have been 71.5 percent more efficient than a linear review, eliminating the need to review 3,529 documents.

 

Recall   Linear Review (Documents)   Predict Review (Documents)   Documents Eliminated From Review   Increased Review Efficiency
70%      4,938                       1,015                        3,923                              79.5%
80%      4,938                       1,409                        3,529                              71.5%
90%      4,938                       2,144                        2,794                              56.6%

Table 2: Simulated Responsive Review, Full Collection

The gain curve in Figure 3 again provides a comprehensive picture of the comparison between the Predict review and a linear review of the full collection. The diagonal black line represents a linear review; the dashed black line to the left represents the theoretical perfect review; and the red line represents the results of our simulation.

Figure 3: Insight Predict (red); Linear Review (solid black); Perfect (dashed black)

Culling Comparison

Finally, we combined the two gain curves from the simulation of the full collection and the culled collection to provide a graphic representation of the relative difference between the two reviews. For this gain curve, we used actual cumulative document counts rather than percentages, because we were dealing with collections of different sizes and, therefore, showing the percentages would have been confusing.

What is apparent from the gain curves in Figure 4 is that there is relatively little difference between the Predict review of the full collection and the Predict review of the culled collection. Ultimately, culling the collection did not significantly change the already limited number of documents that had to be reviewed to achieve reasonable recall of the responsive documents.

Figure 4: Insight Predict Full Collection (red); Insight Predict Culled Collection (blue); Linear Review on Full Collection (dotted black); Linear Review on Culled Collection (solid black); Perfect (dashed black)

In this blog post, I’ve omitted many of the details of the simulation. For those, please read the full simulation report. But now you have an idea of what a simulation can show and of what it might show for your case. So if you’ve been wondering whether TAR is right for you, now’s your chance to find out by taking the TAR Challenge.
