A Revised Version of My Alternative EDRM Model

In a recent post, What’s Wrong With the EDRM Model?, I argued that the popular graphic that is used to illustrate the Electronic Discovery Reference Model had a mistake. The colored triangles used to show the overall volume of data and the volume of relevant data seemed off. They appeared to show that, as the e-discovery process moves forward, the overall volume of data would drop to zero and the number of relevant documents would rise from zero. Neither could be true, of course, since the relevant documents were there from the start and made up part of the overall volume. I created an alternative illustration that I thought better depicted the flow of documents.

In a comment to my earlier post, George Socha, a creator of the EDRM model, wrote:

As we discussed, there is reason behind the differently scaled triangle. The idea is that if you start with a terabyte of data, you should not end up with a terabyte of data. It is highly unlikely that the entire terabyte is relevant, however you define relevance. And who would want to put a terabyte in front of a jury, even if a judge were to allow such a folly?

By the way, you lost some arrows along the way, so that your version never allows for disposition of data, which is an information management task.

The point George makes is one I missed. When I made my model, I did not consider the fact that the original model had differently sized triangles. I am constantly amazed by how many things in plain view I can overlook.

That said, and after reflecting on George’s point, I am not sure I agree. I think the triangles should represent two different things: 1) a decrease in the absolute volume of data in play; and 2) an increase in the percentage of remaining data that is relevant to the case. With a small modification to the chart, I think we can show both in a way that makes sense and fits actual practices.

Try this slightly revised version:

The left triangle is higher than the right one, as George pointed out. But I might have a different reason for drawing it that way. Here is my thinking.

The left triangle represents the actual volume of data. In most cases, the relevant portion of that data is a small percentage of the total (but hopefully above zero). That is shown by the darker portion of the triangle that intersects the Y axis somewhere above zero (and we could always debate where that falls).

The right triangle represents the percentage of the total documents under consideration that are relevant to the inquiry. It is not as high as the left triangle because we know that in most cases some less relevant or non-relevant documents also make it into the courtroom (or into your trial notebook). For example, I may mark 100 exhibits but might use only 60 of them. The others were included just in case.

Thus, we have two values to consider on the Y axis of the right triangle: 1) the percent of relevant documents; and 2) the percent of non-relevant documents. If you add the two values together, you get 100%. I would represent that value at the same height as the top of the left triangle. On the left, we show 100% of total volume. On the right, we would show 100% of the total (relevant plus non-relevant) as well. That seems to make sense and has a certain elegance.
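To make the two axes concrete, here is a rough Python sketch of the arithmetic. The stage names are borrowed loosely from the EDRM model and the document counts are invented for illustration. Only the last row (100 exhibits marked, 60 used) comes from my example, and treating the 60 used exhibits as the "relevant" ones is my own simplifying assumption.

```python
# Hypothetical (stage, total documents, relevant documents) figures.
# Only the final row's 100 and 60 come from the exhibit example in the post;
# everything else is made up to illustrate the shape of the two triangles.
stages = [
    ("Collection", 1_000_000, 20_000),
    ("Review", 50_000, 15_000),
    ("Presentation", 100, 60),
]


def percent_split(total, relevant):
    """Return (percent relevant, percent non-relevant) for one stage."""
    pct_relevant = 100.0 * relevant / total
    return pct_relevant, 100.0 - pct_relevant


rows = []
for name, total, relevant in stages:
    pct_rel, pct_non = percent_split(total, relevant)
    rows.append((name, total, pct_rel, pct_non))
    print(f"{name:<13}{total:>10,} docs  "
          f"{pct_rel:5.1f}% relevant  {pct_non:5.1f}% non-relevant")
```

The document counts are what the left triangle tracks: absolute volume falling as the case moves forward. The two percentages are what the right triangle tracks: the relevant share rising, with the relevant and non-relevant shares always stacking to 100%.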

Maybe this was the intent all along of George and the others who created the model. If so, a tip of the hat. If not, what do you think? This is kind of fun to think about and it beats the other work that I should be doing.

Oh, on the lines at the bottom. I left them out because I think they make the model look overly complicated without adding much. As a lawyer, I always worked with copies of the original documents and rarely if ever sent them back to the client or the client’s records manager. Rather, we would hold them for a while and then dispose of them. The same is generally true for our work as an e-discovery platform. Sometimes we archive and return the documents when the case finishes. Sometimes we are asked to delete them. In either case, they don’t go back to the records manager. They go to the law firm or the GC’s office to hold just in case they need the data.

Thanks for the comments on my original post. Feel free to share your thoughts on this revised version.


About John Tredennick

A nationally known trial lawyer and longtime litigation partner at Holland & Hart, John founded Catalyst in 2000. Over the past four decades he has written or edited eight books and countless articles on legal technology topics, including two American Bar Association best sellers on using computers in litigation technology, a book (supplemented annually) on deposition techniques and several other widely read books on legal analytics and technology. He served as Chair of the ABA’s Law Practice Section and edited its flagship magazine for six years. John’s legal and technology acumen has earned him numerous awards, including being named by the American Lawyer as one of the top six “E-Discovery Trailblazers,” named to the FastCase 50 as a legal visionary and named one of the “Top 100 Global Technology Leaders” by London Citytech magazine. He has also been named the Ernst & Young Entrepreneur of the Year for Technology in the Rocky Mountain Region, and Top Technology Entrepreneur by the Colorado Software and Internet Association. John regularly speaks on legal technology to audiences across the globe. In his spare time, you will find him competing on the national equestrian show jumping circuit or playing drums and singing in a classic rock jam band.