In AI, No Evolution Without Evaluation

At the recent Legalweek New York AI Bootcamp Workshop, I was reminded of a very small, cheap pocket dictionary that I once bought at a book fair when I was in third grade. One day, while looking up definitions, I came across the entry for “bull.” Bull was defined as “the opposite of cow.” Curious, I looked up “cow.” It was defined as “the opposite of bull.” Neither entry said anything about bovines, gender or any other defining characteristic; just that bull and cow are each other’s opposites.

At the boot camp—designed to cover the foundations, use cases and legal considerations needed to separate the value of AI technology from “the noise”—I learned that “machine learning” is “not expert systems,” and “expert systems” is “not machine learning.” How is this any more helpful than my third-grade dictionary?

As a PhD in computer science who has been working in AI and information retrieval for the past 20 years, I attended the workshop to learn how lawyers are being taught about AI. At the very least, I hoped to better understand my lawyer colleagues and clients so I could have more productive conversations with them. What I walked away with was even more “noise” (hence the dictionary analogy above).

What the Workshop Did and Didn’t Teach Lawyers

In the boot camp, we learned that the area of AI making the most waves is machine learning. At a minimum, then, we should have learned the core components of a machine learning system: the objective function and how training relates to it, the inferred function, and what a supervisory signal is (or even what supervisory signal the vendor presenters are using). Most of these core concepts were never discussed, and participants left with even more “noise.”
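To make those terms concrete, here is a minimal, purely illustrative sketch in Python of how the pieces fit together. The data and the model (a one-parameter line fit by gradient descent) are invented for illustration; no vendor's system looks this simple, but every supervised machine learning system has these same parts.

```python
# Minimal illustrative sketch of a supervised machine learning system:
# training examples, a supervisory signal, an objective function, and
# the inferred function that training produces. Data is made up.

# Training data: each x is an input; each y is the supervisory signal,
# the "right answer" a human (or other process) supplied for that input.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # roughly y = 2x

def objective(w):
    """Objective function: mean squared error of the model y = w * x.
    Training is the search for the w that minimizes this number."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Training: simple gradient descent on the objective function.
w = 0.0
for _ in range(200):
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= 0.01 * grad  # nudge w downhill on the objective

def inferred_function(x):
    """The inferred function: what the system actually learned,
    now applicable to new, unlabeled inputs."""
    return w * x
```

The relationship the workshop never spelled out is visible here: training exists only to drive the objective function down, the supervisory signal is what the objective measures the model against, and the inferred function is the artifact you ultimately deploy.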

With all the brainpower in the room, doesn’t it feel like we could have done more in nine hours to actually educate lawyers and other legal professionals?

Where the Real Education in AI Needs to Happen in the Legal Community

If nothing else, the workshop should have taught us about a very important aspect of AI: evaluation. How does one evaluate an AI system, and using what criteria? (This is far more involved than validating a production, which we wrote about in “TAR for Smart Chickens,” regarding a new order that helps bring clarity to the gray space surrounding TAR.) Evaluation is about being able to correctly and fairly compare AI techniques relative to each other.

Why Is Evaluation Important?

A main observation from Legalweek, this year and in prior years, is that there is nothing fundamentally new in the technology being released. However, if one doesn’t know how to assess the value of a technology, then one also is unable to perceive whether or not something is new. There is a saying in research: evaluation drives innovation. Or, “what you can’t measure, you can’t improve.” Or, better, “what you won’t measure, you can’t improve,” because there has to be a willingness to measure and to properly compare in order to truly understand whether something is an improvement. Our industry falls short in this area.

Knowing how to do evaluation is, at bottom, the ability to recognize something for what it is. Or, sometimes, for what it isn’t. Evaluation prevents you from being swept away by a technology that appears shiny and new but delivers results no better, and perhaps even worse, than what your existing technology already provides.

Evaluation goes far beyond knowing what precision and recall are. Those are merely metrics; doing evaluation means knowing how to structure an honest, side-by-side comparison, not just how to choose a metric.
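A small sketch can show both halves of that point. The function below computes precision and recall; the comparison around it is the part that actually constitutes evaluation: two hypothetical systems scored on the same document collection against the same ground-truth relevance judgments. The document IDs and labels are invented for illustration.

```python
# Illustrative sketch: precision and recall are just metrics. An honest
# evaluation scores competing systems on the SAME labeled test set,
# under the same conditions. All documents and labels here are made up.

def precision_recall(predicted, relevant):
    """predicted: set of doc IDs a system flagged as relevant.
    relevant: the ground-truth set of truly relevant doc IDs."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Ground truth: documents a human reviewer judged relevant.
relevant = {1, 2, 3, 4, 5}

# Two competing systems, evaluated against the identical ground truth.
system_a = {1, 2, 3, 8}            # precise, but misses documents
system_b = {1, 2, 3, 4, 9, 10}     # finds more, flags more junk

for name, predicted in [("A", system_a), ("B", system_b)]:
    p, r = precision_recall(predicted, relevant)
    print(f"System {name}: precision={p:.2f}, recall={r:.2f}")
```

Here System A wins on precision (0.75 vs. about 0.67) while System B wins on recall (0.80 vs. 0.60). Neither number alone tells you which system is better; deciding that is exactly the kind of structured, apples-to-apples judgment that evaluation is about.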

Boot camp is about breaking down your old way of thinking and building you up into a new way of thinking. Is it easy? No. But that’s what boot camp is: intense training. It’s not about asking lawyers to become data scientists or technologists, but rather to understand the basic pieces of AI and how they relate to each other. You’ll then know how to ask the right questions of the vendors with whom you interact. And that’s the most important, most fundamental skill that an AI-savvy lawyer or e-discovery professional needs to have.

We will be discussing these issues and more in our own educational series on AI. Stay tuned.


About Jeremy Pickens

Jeremy Pickens is one of the world’s leading information retrieval scientists and a pioneer in the field of collaborative exploratory search, a form of information seeking in which a group of people who share a common information need actively collaborate to achieve it. Dr. Pickens has seven patents and patents pending in the field of search and information retrieval. As Chief Scientist at Catalyst, Dr. Pickens has spearheaded the development of Insight Predict. His ongoing research and development focuses on methods for continuous learning, and the variety of real world technology assisted review workflows that are only possible with this approach. Dr. Pickens earned his doctoral degree at the University of Massachusetts, Amherst, Center for Intelligent Information Retrieval. He conducted his post-doctoral work at King’s College, London. Before joining Catalyst, he spent five years as a research scientist at FX Palo Alto Lab, Inc. In addition to his Catalyst responsibilities, he continues to organize research workshops and speak at scientific conferences around the world.