Machine learning is an area of artificial intelligence that enables computers to learn without being explicitly programmed. In e-discovery, machine-learning technologies such as technology assisted review (TAR) are helping legal teams dramatically speed document review and thereby reduce its cost. TAR learns which documents are most likely relevant and feeds those first to reviewers, typically eliminating the need to review 50 to 90 percent of a collection.
Lawyers are getting it, as evidenced by their expanding use of TAR. At Catalyst, 50 percent of matters now routinely use TAR, and none have been challenged in court.
For some time now, critics of technology assisted review have opposed using general recall as a measure of its effectiveness. Overall recall, they argue, does not account for the fact that general responsiveness covers an array of more-specific issues. And the documents relating to each of those issues exist within the collection in different numbers that could represent a wide range of levels of prevalence.
Since general recall measures effectiveness across the entire collection, the critics’ concern is that you will find a lot of documents from the larger groups and only a few from the smaller groups, yet overall recall may still be very high. Using overall recall as a measure of effectiveness can theoretically mask a disproportionate and selective review and production. In other words, you may find a lot of documents about several underlying issues, but you might find very few about others.
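To make the critics' concern concrete, here is a minimal sketch that computes overall recall alongside recall for each issue. The collection sizes, issue labels, and the `recall_by_issue` helper are hypothetical, chosen only to illustrate the arithmetic; they are not drawn from any actual review.

```python
from collections import defaultdict


def recall_by_issue(relevant, found):
    """Return (overall_recall, {issue: recall}).

    relevant: dict mapping doc_id -> issue label for every relevant document
    found:    set of doc_ids the review actually retrieved
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for doc_id, issue in relevant.items():
        totals[issue] += 1
        if doc_id in found:
            hits[issue] += 1
    per_issue = {issue: hits[issue] / totals[issue] for issue in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, per_issue


# Hypothetical collection: 900 relevant documents on issue A, 100 on issue B.
relevant = {f"A{i}": "A" for i in range(900)}
relevant.update({f"B{i}": "B" for i in range(100)})

# Hypothetical review outcome: 810 of the A documents found, only 10 of the B documents.
found = {f"A{i}" for i in range(810)} | {f"B{i}" for i in range(10)}

overall, per_issue = recall_by_issue(relevant, found)
print(f"overall recall: {overall:.2f}")
for issue, value in sorted(per_issue.items()):
    print(f"issue {issue} recall: {value:.2f}")
```

In this made-up case, overall recall is 82 percent even though only 10 percent of the documents on the smaller issue were found, which is the masking effect the critics describe.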
Citing research on the efficacy of technology assisted review over human review, a federal court has approved a party’s request to respond to discovery using random sampling.
Despite a tight discovery timeline in the case, the plaintiff had sought to compel the defendant hospital to manually review nearly 16,000 patient records.
Hot off the press is a new, complimentary book from Catalyst that answers your questions about technology assisted review.
The new book, Ask Catalyst: A User’s Guide to TAR, provides detailed answers to 20 basic and advanced questions about TAR, and particularly about advanced TAR 2.0 using continuous active learning.
The questions all came from you – our clients, blog readers and webinar attendees. We receive a lot of good questions about e-discovery technology and specifically about TAR, and we answer every question we get.
In this research we answer two main questions: (1) What is the efficiency of a TAR 2.0 family-level document review versus a TAR 2.0 individual document review, and (2) How useful is expert-only (aka TAR 1.0 with expert) training, relative to TAR 2.0’s ability to combine training and review using non-expert judgments?
One of the bigger, and still enduring, debates among Technology Assisted Review experts revolves around the method and amount of training you need to get optimal results from your TAR algorithm. Over the years, experts prescribed a variety of approaches including:
- Random Only: Have a subject matter expert (SME), typically a senior lawyer, review and judge several thousand randomly selected documents.
- Active Learning: Have the SME review several thousand marginally relevant documents chosen by the computer to assist in the training.
- Mixed TAR 1.0 Approach: Have the SME review and judge a mix of randomly selected documents, some found through keyword search and others selected by the algorithm to help it find the boundary between relevant and non-relevant documents.
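The approaches above differ mainly in how the next training documents are chosen. The sketch below is a rough illustration, not any vendor's implementation: `select_batch`, the strategy names, and the sample scores are all hypothetical, and the scores are assumed to be relevance probabilities produced by some classifier. The "relevance" strategy is included for comparison with the continuous active learning used in TAR 2.0, which feeds reviewers the documents ranked most likely relevant.

```python
import random


def select_batch(scores, strategy, batch_size, seed=0):
    """Pick the next documents to send to a reviewer.

    scores:   dict mapping doc_id -> estimated probability of relevance (assumed input)
    strategy: "random", "uncertainty", or "relevance"
    """
    doc_ids = sorted(scores)
    if strategy == "random":
        # Random Only: ignore the scores entirely.
        return random.Random(seed).sample(doc_ids, batch_size)
    if strategy == "uncertainty":
        # Active Learning: documents closest to the relevant/non-relevant boundary.
        return sorted(doc_ids, key=lambda d: abs(scores[d] - 0.5))[:batch_size]
    if strategy == "relevance":
        # Continuous active learning style: highest-ranked documents first.
        return sorted(doc_ids, key=lambda d: scores[d], reverse=True)[:batch_size]
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical scores for five documents.
scores = {"doc1": 0.95, "doc2": 0.52, "doc3": 0.10, "doc4": 0.47, "doc5": 0.80}

uncertain = select_batch(scores, "uncertainty", 2)
top_ranked = select_batch(scores, "relevance", 2)
sampled = select_batch(scores, "random", 2)
print(uncertain, top_ranked, sampled)
```

A mixed approach would simply combine batches from more than one of these strategies, optionally adding documents found through keyword search.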
How does one reviewer do the work of 48? It may sound like a riddle, but a new infographic created by Catalyst illustrates the answer.
The question the infographic poses is this: In a review of 723,537 documents, how many reviewers would you need to finish in five days?
The answer depends on whether you are using an early version of technology assisted review (TAR 1.0) or a new-generation TAR 2.0 version.
It is difficult to pin down precise numbers on how much companies spend on e-discovery. A 2010 survey prepared for the Duke Conference on Civil Litigation found that the average company paid $621,880 to $3 million per case and that companies at the high end paid $2.4 million to $9.8 million per case. A RAND study put the cost at a median of $1.8 million per case.
What we do know for certain is that e-discovery costs continue to rise as data continues to become more voluminous and complex. According to RAND, roughly 70 percent of e-discovery costs are attributable to document review.
[Editor’s note: This is another post in our “Ask Catalyst” series, in which we answer your questions about e-discovery search and review. To learn more and submit your own question, go here.]
This week’s question:
Your blog and website often refer to “TAR 1.0” and “TAR 2.0.” While I understand the general concept of technology assisted review, I am not clear what you mean by the 1.0 and 2.0 labels. Can you explain the difference?
You may recall that, in an opinion issued last August, Hyles v. New York City, U.S. Magistrate Judge Andrew J. Peck denied the plaintiff’s request to force the defendant to use technology assisted review instead of keywords to search for relevant documents and emails. Now, another court has followed suit, similarly concluding that it was without legal authority to force a party to use a particular method of e-discovery search.
In the Aug. 1 Hyles decision, attorneys for Pauline Hyles, a black female who is suing the city for workplace discrimination, had sought to force the city to use TAR, arguing it would be more cost efficient and effective than keyword searches. But even though Judge Peck agreed with Hyles’ attorneys “that in general, TAR is cheaper, more efficient and superior to keyword searching,” he concluded that the party responding to a discovery request is best situated to choose its methods and technologies and that he was without authority to force it to use TAR.