As data volumes grow in litigation, analytics become increasingly important tools for litigators. Analytics can help lawyers make sense of electronic information and reveal the stories hidden among the bits and bytes. But how well do you really understand analytics and what they can do for your case?
In a recent webinar, Litigation Analytics: How to Find Information Critical to Your Case, three experts in the use of analytics in litigation demonstrated key types of this technology and explained how each can help you identify the core issues in your case more quickly and efficiently. Continue reading
The more documents a case involves, the harder it is for litigation teams to review and make sense of them. These days, even routine cases can involve many thousands of documents, while more complex cases can involve many millions. For litigators, these mountains of documents present a challenge: how to uncover the stories the documents contain so that you can prepare your cases for discovery and trial, and do so within the limits of available time and budgets.
This Thursday, Aug. 27, a free webinar will demonstrate how litigators can address these challenges using sophisticated analytics tools. The webinar, “Litigation Analytics: How to Find Information Critical to Your Case,” will show how analytics tools can turn mountains of documents into molehills, enabling litigators to quickly and affordably zero in on what they need to know. Continue reading
In Part One of this two-part post, I introduced readers to statistical problems inherent in proving the level of recall reached in a Technology Assisted Review (TAR) project. Specifically, I showed that the confidence interval around an asserted recall percentage can be large enough, at typical sample sizes, to undercut the basic assertion used to justify your TAR cutoff.
In our hypothetical example, we had to acknowledge that while our point estimate suggested we had found 75% of the relevant documents in the collection, it was possible that we had actually found a far lower percentage. For example, with a sample size of 600 documents, the lower bound of our confidence interval was 40%. Increasing the sample to 2,400 documents raised the lower bound only to 54%, and upping it to 9,500 documents got the lower bound to just 63%.
Even assuming a 63% lower bound is good enough, we would have a lot of documents to sample. Using basic assumptions about cost and productivity, we concluded that we might spend 95 hours reviewing our sample at a cost of about $20,000. If the sample didn't prove out our hoped-for recall level (or if we received more documents to review), we might have to run the sample several times. That is a problem.
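To make the sampling math concrete, here is a minimal sketch of how a lower confidence bound on an estimated proportion can be computed. It uses the Wilson score interval, one common choice; the original posts do not specify which method was used, and the counts below are hypothetical. The key intuition: when relevant documents are rare in the collection, even a 600-document sample may contain only a handful of relevant ones, and it is that tiny effective sample that makes the interval so wide.

```python
import math

def wilson_lower_bound(successes, n, z=1.96):
    """Lower bound of the 95% Wilson score interval for a binomial proportion."""
    if n == 0:
        return 0.0
    p = successes / n
    denom = 1 + z**2 / n
    centre = p + z**2 / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - margin) / denom

# Hypothetical sparse collection: a 600-document sample might contain only
# 6 relevant documents, of which the review found 4 (a ~67% point estimate).
print(wilson_lower_bound(4, 6))      # ~0.30: the lower bound collapses
# Same proportion with 100x more relevant documents in the sample:
print(wilson_lower_bound(400, 600))  # ~0.63: a far tighter bound
```

The point estimate is identical in both calls; only the number of relevant documents observed changes, which is why larger (and costlier) samples are the only way to tighten the bound.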
Is there a better and cheaper way to prove recall in a statistically sound manner? In this Part Two, I will take a look at some of the other approaches people have put forward and see how they match up. However, as Maura Grossman and Gordon Cormack warned in “Comments on ‘The Implications of Rule 26(g) on the Use of Technology-Assisted Review’” and Bill Dimm amplified in a later post on the subject, there is no free lunch. Continue reading
Predictive Ranking, aka predictive coding or technology-assisted review, has revolutionized electronic discovery, at least in mindshare if not in actual use. It has dominated the dais at discovery programs since 2012, when the first judicial decisions approving the process came out. Its promise of dramatically reduced review costs is top of mind for general counsel today. For review companies, the worry is declining business once these concepts really take hold.
While there are several “Predictive Coding for Dummies” books on the market, I still see a lot of confusion among my colleagues about how this process works. To be sure, the mathematics are complicated, but the techniques and workflow are not that difficult to understand. I write this article with the hope of clarifying some of the more basic questions about TAR methodologies. Continue reading
On Jan. 24, Law Technology News published John’s article, “Five Myths about Technology Assisted Review.” The article challenged several conventional assumptions about the predictive coding process and generated a lot of interest, and a bit of dyspepsia too. At the least, it got some good discussions going and perhaps nudged the status quo a bit.
One writer, Roe Frazer, took issue with our views in a blog post he wrote. Apparently, he tried to post his comments with Law Technology News but was unsuccessful. Instead, he posted his reaction on the blog of his company, Cicayda. We would have responded there but we don’t see a spot for replies on that blog either. Continue reading
In a recent blog post, Ralph Losey tackles the issue of expertise and TAR algorithm training. The post, as is characteristic of Losey’s writing, is densely packed. He raises a number of objections to doing any sort of training using a reviewer who is not a subject matter expert (SME). I will not attempt to unpack every one of those objections. Rather, I wish to cut directly to the fundamental point underlying the belief that an SME, and only an SME, must provide the judgments (the document codings) used for training: Continue reading
The future of legal technology is looking cloudy — and that’s not a bad thing. Cloud computing is on track to overtake on-premises computing within the legal services industry in the very near future, according to a recently published survey of legal IT professionals. Fifty-seven percent of those surveyed predicted that this will happen within five years and 81 percent said it will happen within 10 years. Only 16 percent said it would never happen.
The survey was conducted in September by the publication Legal IT Professionals and its results were published Nov. 26. The online survey of the publication’s global readership elicited 438 responses, representing law firms ranging in size from small boutiques to global megafirms. More than three-quarters of respondents work directly in legal IT, either within a firm (54 percent) or as external consultants (24 percent). Lawyers and paralegals made up 22 percent of respondents. Continue reading
Predictive coding is an effective e-discovery tool for ranking large sets of documents. However, it is commonly performed in a manner that may be severely under-inclusive, raising concerns about its defensibility.
When using predictive coding, it is common practice for the producing party to run keyword searches first, and then sample and rank only the resulting documents. Documents that don’t hit on the searches are culled out before they ever reach the predictive coding process. Continue reading
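The under-inclusiveness concern can be sketched with a toy example (the documents, keywords, and relevance flags below are all invented for illustration): a relevant document that happens not to contain any search term is removed before the ranking engine ever sees it, so no amount of downstream predictive coding can recover it.

```python
# Toy collection: each document is (id, text, truly_relevant).
docs = [
    ("d1", "merger agreement draft", True),
    ("d2", "let's discuss the deal offline", True),   # relevant, but no keyword hit
    ("d3", "fantasy football picks", False),
    ("d4", "merger timeline questions", True),
]

keywords = {"merger"}

# Step 1: keyword culling, as described above.
keyword_hits = [d for d in docs if any(k in d[1] for k in keywords)]

# Step 2: only the hits reach the ranking stage; everything else is dropped.
culled = [d for d in docs if d not in keyword_hits]
missed_relevant = [d for d in culled if d[2]]

print([d[0] for d in missed_relevant])  # ['d2'] is lost before ranking begins
```

However well the ranking model performs on the keyword hits, recall is capped by whatever the initial searches happened to capture.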
[This is one in a series of search Q&As between Bruce Kiefer, Catalyst’s director of research and development, and Dr. Jeremy Pickens, Catalyst’s senior applied research scientist.]
BRUCE KIEFER: In our last Q&A post (Q&A: How Can Various Methods of Machine Learning Be Used in E-Discovery?), you talked about machine learning and collaboration. More than a decade ago, collaborative filtering and recommendations became a distinguishing part of the online shopping experience. You’ve been interested in collaborative seeking. What is collaborative seeking and how does it compare to receiving a recommendation?
DR. JEREMY PICKENS: Search (seeking) and recommendation are really two edges Continue reading
After Recommind announced June 8 that it had obtained a patent on predictive coding, the news rapidly reverberated throughout the e-discovery industry. In a Law Technology News article published the same day, Recommind Intends to Flex Predictive Coding Muscles, reporter Evan Koblentz quoted Craig Carpenter, Recommind’s general counsel and vice president of marketing, as saying that the company would “seek to license the patents to other companies that already offer their own versions of predictive coding or that want to have the ability.”
Koblentz also spoke to Catalyst’s CEO, John Tredennick, who said, “We’re puzzled that you can get a patent on what seems to be 40 years in the making in the academic community.” The next day, in a post at the blog Above the Law, John’s response and those of other Recommind competitors were characterized as jealous and grumpy. Continue reading