Magistrate Judge Andrew Peck Discusses TAR in the Courtroom

U.S. Magistrate Judge Andrew J. Peck — author of the first-ever court decision approving the use of technology assisted review in e-discovery — was recently a guest on the Legal Talk Network podcast Digital Detectives. Hosts Sharon D. Nelson and John W. Simek, president and vice president of Sensei Enterprises, interviewed Judge Peck about how TAR works, what cases it is suitable for, and how it is being accepted in the courts.

Given Judge Peck’s leadership in broadening the adoption of TAR, we thought his comments would be of interest to readers of this blog. With the gracious permission of Sharon, John and the Legal Talk Network, below is a partial transcript of the show highlighting Judge Peck’s comments on TAR. You can hear the entire program through the Soundcloud player above or at the Legal Talk Network.

*              *               *              *

Sharon Nelson: There really is some good news about the proposed federal amendments. Would you share that?

Judge Peck: The process has been going down a long road, but on April 29 the Supreme Court transmitted the proposed rules amendments to Congress. If Congress does not act, the amendments will become effective on Dec. 1, 2015. Historically, Congress has not acted, so I think it is almost 100 percent certain that we will have these rules on Dec. 1 of this year.

John Simek: Can you explain briefly what TAR is?

Judge Peck: TAR seems to be the most accepted term at the moment. Technology assisted review is a combination of humans and technology, so we cannot forget the human aspect. Essentially, the human reviewers train the technology, and the technology then uses what it has learned to go through the potentially millions of documents of ESI (electronically stored information), email or otherwise.

Probably the best analogy is for those who listen to music or shop on Amazon: you make a purchase or two, and the next thing you know those services are recommending additional purchases. You bought The Complete Sherlock Holmes, so you might enjoy these Agatha Christie books, and things like that. Then maybe you make a further purchase and they realize you are not as interested in that kind of mystery but prefer the private detective novel, so they will make recommendations in that area. Pandora treats music similarly: you start listening to Billy Joel and it may recommend Elton John, and so on.

The more the computer is able to see what you like, the more it is able to target that, and that is essentially how technology assisted review works. A so-called seed set coded by the reviewers is fed into the computer. Depending on the model, the computer either gives back the documents it considers most likely relevant and asks, “Did I get it right?”, followed by further coding and further training of the system; or, in some of the other models, the computer gives back the gray area documents and says, “We really can’t tell from what we have learned so far how to code these documents; please code some more of them so we can better train the system,” and that process keeps going. The result is that ultimately the reviewers may only have to review 10 or 20 percent of the documents and know that the rest is likely to be non-responsive. That is where the savings comes in at the end.
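The train-and-ask loop Judge Peck describes can be sketched in a few lines. This is a toy illustration only, with a made-up word-overlap scorer standing in for the far richer models real TAR engines use; the documents, seed codings, and scoring function are all hypothetical:

```python
# Toy sketch of the human-plus-machine TAR loop: a reviewer codes a seed set,
# a scorer learns which words signal responsiveness, and the system surfaces
# the "gray area" document it is least certain about for the next round.
documents = [
    "merger agreement draft attached for review",
    "lunch order for the team on friday",
    "revised merger terms and closing conditions",
    "fantasy football league standings",
    "due diligence checklist for the merger",
    "parking garage closed for the week",
]
seed = {0: 1, 1: 0, 2: 1, 3: 0}  # index -> reviewer's coding (1 = responsive)

responsive_words, other_words = set(), set()
for i, label in seed.items():
    (responsive_words if label else other_words).update(documents[i].split())

def score(doc):
    """Positive -> looks responsive; negative -> looks non-responsive;
    near zero -> gray area the model cannot yet decide."""
    words = doc.split()
    return sum((w in responsive_words) - (w in other_words) for w in words) / len(words)

unreviewed = [i for i in range(len(documents)) if i not in seed]
# Uncertainty sampling: ask the human to code the least certain document next.
next_doc = min(unreviewed, key=lambda i: abs(score(documents[i])))
print("code next:", next_doc, "-", documents[next_doc])
```

Each round of coding would grow the training sets and re-rank the remainder, which is how the review can stop after only 10 or 20 percent of the documents.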

Sharon Nelson: There is a quote from the Rio Tinto March 2015 opinion that states, “It is now black letter law that where the producing party wants to utilize TAR for document review, courts will permit it.” Is it your view that this is really true for all federal courts, and what do you think about the state courts?

Judge Peck: It is actually true internationally at this point. The most recent decision in this area is the Irish Bank case from the Irish High Court, a March 3 decision of this year, where that court, following my Da Silva Moore opinion and other information, approved the use of TAR in Ireland, even though their system requires the production of all responsive information. The judge there acknowledged that TAR is unlikely to find everything, but neither is any other method, and TAR was better than anything else.

Every case that I have seen, federal and state, where the producing party (the responding party) wishes to use predictive coding, TAR, or whatever term we are using, the courts have allowed it. That includes state courts as well: in fact, the second-earliest decision, the one following my Da Silva Moore, was the Global Aerospace decision out of the circuit court for Loudoun County, Va., which approved the use of TAR.

Another interesting case from a state court was the Delaware Chancery decision in the EORHB case, known as the “Hooters case,” where Vice Chancellor Laster listened to some 60-odd pages of oral argument on cross motions to dismiss. At the end of that, he ruled on those motions and ordered the parties to show cause if they were not going to use TAR.

One of the more interesting recent decisions was in September 2014 in the Dynamo Holdings case. The tax judge said, “It’s sort of strange that you’re asking me to approve TAR because in the old paper days, nobody would have asked me what reviewers should be used and how they should be trained.” He then went on to say that, since this is the first time it has come up in the tax court, he would indeed rule on it and he approved the use of TAR in that case.

The only cases where TAR has been disallowed are those where a party stipulated to keywords and then manual review, and then midstream tried to change to TAR; switching horses midstream, as one judge called it. Other than that, every time, the courts have allowed it to be used.

In the few cases where the requesting party tried to force the responding party to use TAR, the courts have said, “No, they are not going to do that.” The question is reasonableness, not best. But as a sort of footnote to that, in the three cases I put into that bucket, the producing party had already spent over a million dollars, using either keywords alone or a hybrid approach of keywords first to winnow the collection down and then TAR. The courts said, “Enough money has been spent; we are not going to redo it.”

What would happen down the road if a requesting party tried, at day one, before the responding party has spent any money, to force the use of TAR remains to be seen. Now, probably not; five years from now, a court may well force its use as the most economical means of production, or at least say, “Do what you want, but don’t come running to me for the costs of manual review or keyword-plus-manual review if you’re not going to use the most efficient method.”

John Simek: One thing that we hear quite a bit is that there really isn’t a good way to ensure the training of the TAR tool is done appropriately. Some people complain about the effectiveness and say that you should be using subject matter experts and others say that review teams can be used just as effectively. Can you talk about some of those complaints and how they can be addressed?

Judge Peck: In the infancy of TAR, there was much more of a need for a partner-level or senior-associate-level reviewer, a so-called subject matter expert, and for one, or at most two, reviewers, because if the TAR tool gets inconsistent training, it is harder for it to stabilize.

I think where we are now, the technology has improved. It can be subject matter experts or it can be review teams, as long as there is a clear knowledge of the scope of the case and as long as the training is sufficient, meaning there are enough rounds to stabilize the system. And in Continuous Active Learning, where the training is really part of that continuous review, it matters even less.

Certainly I still believe that the Sedona Cooperation Proclamation applies, and that where possible the parties should be cooperating with each other in this area. But if cooperation does not or cannot occur one thing is certain, we should not be holding TAR to a different and higher standard than we hold keywords or manual review. To do that discourages parties from using the most effective, most cost-efficient method of analysis.

One can look at the end result and determine whether the training was effective that way. In what I would call gap analysis, you look for key documents that appear to be missing: if the primary person involved in the matter is suddenly radio silent for a month, there is probably a problem there. You can also look at statistics at the end, for recall and precision, and see how good a job was done. Cooperation is good, but if the parties are unable to cooperate with each other, there are other methods both to train the system and to determine at the end that the results were more than sufficient.
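The recall and precision statistics mentioned above are simple ratios. The counts below are purely hypothetical; in practice they would come from sampling the produced and discarded document sets:

```python
# Recall: share of all responsive documents that the review found.
# Precision: share of the documents retrieved that are actually responsive.
def recall_precision(true_pos, false_pos, false_neg):
    recall = true_pos / (true_pos + false_neg)
    precision = true_pos / (true_pos + false_pos)
    return recall, precision

# Hypothetical end-of-review numbers: 800 responsive documents found,
# 200 non-responsive documents pulled in, and a quality-control sample
# suggesting roughly 150 responsive documents were missed.
r, p = recall_precision(800, 200, 150)
print(round(r, 3), round(p, 3))  # -> 0.842 0.8
```

High recall with tolerable precision is the usual yardstick for showing that a TAR review did its job, just as it would be for keywords or manual review.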

Sharon Nelson: When using TAR, what do you think is the best way to come up with a seed data set to use for training? Should it be randomly selected? We were talking about it and it didn’t seem like the best method to us. Don’t we need some human expertise in there? Wouldn’t it be helpful to have conversations with primary known custodians?

Judge Peck: I think you are absolutely correct. Randomness may be necessary to make sure you are not missing something you don’t know about, though not so much for the seed set as for the control set, so one can see the likely richness of the collection. But since the richness of collections for document production is often in the single digits, finding only 1 relevant, responsive document out of every 100 is not a very useful way to train the system.
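The control-set use of random sampling that Judge Peck distinguishes here can be sketched as a simple proportion estimate. The sample size and responsive count below are hypothetical, and the confidence interval uses a basic normal approximation:

```python
# Estimate collection richness (prevalence of responsive documents)
# from a randomly drawn, human-coded control set.
import math

sample_size = 1500        # hypothetical control-set size
responsive_found = 45     # hypothetical number coded responsive

richness = responsive_found / sample_size  # point estimate
# Normal-approximation 95% confidence interval around the estimate
margin = 1.96 * math.sqrt(richness * (1 - richness) / sample_size)
print(f"richness \u2248 {richness:.1%} \u00b1 {margin:.1%}")  # -> richness ≈ 3.0% ± 0.9%
```

A richness in the low single digits, as here, is exactly why random sampling works for measuring the collection but poorly for building a seed set: 97 of every 100 randomly drawn documents teach the system little about what “responsive” looks like.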

A judgmental approach to the seed set is probably the best approach, and indeed it is one area where you can get that sort of transparency or cooperation with no downside. Ask the other side what keywords they would like you to run to find responsive documents. That doesn’t mean you eliminate documents using those keywords, and it doesn’t mean you find every responsive document with them, but you can use them to pick some of the documents for the seed set. In addition, probably the only way you are likely to learn the acronyms or abbreviations used by the document custodians is to talk to them. If you don’t know that a product now commercially available on the market was developed as Project Red or Code Red or anything like that, you are not going to find those documents. So good old-fashioned lawyering is very useful in coming up with the seed set.

John Simek: When is a case big enough to warrant TAR?

Judge Peck: There are some statistics showing that if you have 50,000 or more emails, it is an appropriate case for TAR. You have to balance the cost of the TAR vendor against how much it is going to cost to use keywords and manual review, or, even worse, manual review with no keyword screening, paying contract attorneys or associates at the firm x hundred dollars per hour to review documents. At almost any level of 50,000 documents and above, you are going to spend more on that sort of review than you would spend to use TAR, and it’s not that complicated.
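The cost balance Judge Peck describes is back-of-the-envelope arithmetic. Every figure below, review speed, hourly rate, vendor fee, is a hypothetical assumption, not a number from the interview; only the 10 to 20 percent review share comes from his earlier remarks:

```python
# Rough cost comparison: full manual review vs. TAR-assisted review.
docs = 50_000
docs_per_hour = 50        # assumed manual review speed
rate_per_hour = 200       # assumed reviewer billing rate, USD

manual_cost = docs / docs_per_hour * rate_per_hour

tar_vendor_cost = 40_000  # assumed TAR platform/vendor fee
tar_review_share = 0.15   # review only ~10-20% of documents (per the interview)
tar_cost = tar_vendor_cost + docs * tar_review_share / docs_per_hour * rate_per_hour

print(f"manual: ${manual_cost:,.0f}  TAR: ${tar_cost:,.0f}")  # -> manual: $200,000  TAR: $70,000
```

Even with a substantial vendor fee, cutting the human review to a fraction of the collection dominates the comparison at this document count, which is the point of the 50,000-email rule of thumb.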


About Bob Ambrogi

Bob is known internationally for his expertise in the Internet and legal technology. He held the top editorial positions at the two leading national U.S. legal newspapers, the National Law Journal and Lawyers USA. A long-time advisor to Catalyst, Bob now divides his time between law practice and media consulting. He writes two blogs, LawSites and MediaLaw, co-authors Legal Blog Watch, and co-hosts the weekly legal-affairs podcast Lawyer2Lawyer. A 1980 graduate of Boston College Law School, Bob is a life member of the Massachusetts Bar Foundation and an active member of the Massachusetts Bar Association, which honored him in 1994 with its President's Award.