Q&A: Collaborative Information Seeking: Smarter Search for E-Discovery

[This is one in a series of search Q&As between Bruce Kiefer, Catalyst’s director of research and development, and Dr. Jeremy Pickens, Catalyst’s senior applied research scientist.]

BRUCE KIEFER: In our last Q&A post (Q&A: How Can Various Methods of Machine Learning Be Used in E-Discovery?), you talked about machine learning and collaboration. More than a decade ago, collaborative filtering and recommendations became a distinguishing part of the online shopping experience. You’ve been interested in collaborative seeking. What is collaborative seeking and how does it compare to receiving a recommendation?

DR. JEREMY PICKENS: Search (seeking) and recommendation are really two edges of the same sword.  True, there are profound differences between search and recommendation, such as the difference between “pull” (search) and “push” (recommendation). But these differences are not what primarily distinguish collaborative information seeking from collaborative filtering. Rather, the key discriminator is the nature (size and goals) of the team that is doing the information seeking.

With collaborative filtering, the “team” is just one person. You, alone and individually, are looking for a new toaster oven, or a new musician to listen to, or a new restaurant at which to dine during your vacation in Cancun. If one of your friends already owns that toaster oven, or a copy of that CD, or has dined at that place in Cancun, you might get a better recommendation about which option to choose. But it is not the fact that the friend already owns or has already experienced something that satisfies your information need. Rather, you are relying on the already satisfied needs of others around you in order to get better information about what is available to you, and thereby satisfy your own need.
With collaborative search, on the other hand, you are a member of a team consisting of at least one other person, possibly more. You are actively working together with that person to satisy a jointly held information need. My favorite example is of a couple looking to find a house or apartment. It does not help you to know that “people who bought this house also bought that house,” or that “people who live in this apartment also have lived in that apartment.” You are not going to move in together with all those people. You are going to move in with your partner.

And so as you are both searching for places to live, each of you enters different criteria about what is and is not important to you. You might like to live somewhere with great southern-facing exposure. Your partner might like a place with a garden. You might like a kitchen on the upper floor, and your partner might like enough work space in which to tinker on her motorcycle. A collaborative information seeking system should then attempt to find houses or apartments that satisfy both of your needs, jointly and simultaneously.

It is my belief that collaborative information seeking is much more appropriate to e-discovery than is collaborative filtering. Imagine collaborative filtering (“people who bought this also bought that”) in an e-discovery context: “People who have judged this document as responsive have also judged that document as responsive.” Of what value is it to know this? Given that someone else has already judged the document as responsive, why do I need to look at it? Unless I am doing quality control, it is simply a waste of time and client resources for the reviewer to judge again a document that has already been judged. Collaborative filtering falls apart in the e-discovery context, as it yields unnecessary repetition of labor. Collaborative filtering might work very well for toaster ovens, as you will still buy the toaster oven even if your friend has already bought the same model. It does not work well for e-discovery, as there is no sense in judging a document if your “friend” has already judged it.

By contrast, this is where collaborative search shines. Collaborative search allows you to find information that has not been viewed/judged/assessed by any member of your team of two or more people, but that is jointly relevant to the task that you are all working on, together. Collaborative search allows you and your team members jointly to push deeper into the collection, to documents that none of you would have likely found, were you working alone. Just as collaborative search allows you to find that house or apartment with both the southern exposure as well as the motorcycle workshop, it allows you to find documents that satisfy both the lead counsel’s as well as the review manager’s understanding of the task.


About Jeremy Pickens

Jeremy Pickens is one of the world’s leading information retrieval scientists and a pioneer in the field of collaborative exploratory search, a form of information seeking in which a group of people who share a common information need actively collaborate to achieve it. Dr. Pickens has seven patents and patents pending in the field of search and information retrieval. As Chief Scientist at Catalyst, Dr. Pickens has spearheaded the development of Insight Predict. His ongoing research and development focuses on methods for continuous learning, and the variety of real world technology assisted review workflows that are only possible with this approach. Dr. Pickens earned his doctoral degree at the University of Massachusetts, Amherst, Center for Intelligent Information Retrieval. He conducted his post-doctoral work at King’s College, London. Before joining Catalyst, he spent five years as a research scientist at FX Palo Alto Lab, Inc. In addition to his Catalyst responsibilities, he continues to organize research workshops and speak at scientific conferences around the world.