In January 2022, I set out to solve a problem in the legal tech space.
From my stint as a lawyer, I noticed that a frustratingly large share of time was spent on legal research. I conducted user interviews with numerous lawyers, who uniformly echoed this sentiment - the existing legal research tools left much to be desired. One of them even told me, “If a relevant case can’t be found by page x, then it doesn’t exist”. Knowing that the incumbents’ solutions relied heavily on BM25, a decades-old algorithm, and that their natural language search capabilities were lacklustre, I thought I had found a problem space ripe for disruption.
What was Ariadne?
Ariadne was designed to be a search engine for case law, with a relevance scoring function comprising not only BM25, but also weighted graphs and techniques from the frontier of Information Retrieval. A key intuition was that legal authority is derived from precedents, meaning the landscape of judgments can be mapped naturally as a graph1. I settled on a bi-encoder reranker with sentences as the atomic unit2, which made intuitive sense, as only a few sentences in an arbitrary judgment should be relevant to a specific query. A key feature of the product was the ability to traverse the judgment landscape node by node, which allowed users to reach landmark judgments more expediently. Through my rose-tinted glasses, I failed to notice any red flags.
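To make the sentence-level reranking concrete, here is a minimal sketch of the bi-encoder shape. The `embed`, `cosine`, and `rerank_sentences` names are illustrative rather than Ariadne's actual code, and a bag-of-words vector stands in for the learned sentence encoder:

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Stand-in encoder: a bag-of-words vector. A real bi-encoder would use a
    # learned sentence embedding; the key property is that queries and
    # sentences are encoded independently, so sentence vectors can be
    # precomputed for every judgment offline.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rerank_sentences(query: str, sentences: list[str], top_k: int = 3) -> list[str]:
    # Score every sentence of a judgment against the query and keep the
    # handful most likely to be relevant.
    q = embed(query)
    return sorted(sentences, key=lambda s: cosine(q, embed(s)), reverse=True)[:top_k]
```

Since only a few sentences per judgment should match a query, surfacing the top-k sentences gives the user an immediate answer to “why is this case in my results?”.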
What went wrong?
I exuberantly hacked together the prototype - living, breathing, dreaming of Ariadne. I thought that Ariadne was the key to illuminating the way through the treacherous labyrinth of case law. Eventually, I had to test my hypotheses:
- Did Ariadne truly provide superior search results?
- Did the search experience truly increase the productivity of lawyers who relied on legal search engines?
- Was Ariadne truly a 10x product?
From my subjective experience using my prototype, I felt that the relevance algorithm did manage to produce search results of decent quality. The search experience was delightful (a bit self-congratulatory, yes), and the node traversal feature did allow me to save time on scanning cases for relevant sentences and citations. However, I had no way of scientifically evaluating the search quality in isolation. I was also acutely aware of my own biases, as I had built the prototype entirely from scratch, even anthropomorphizing it by giving it a name. The only way of ascertaining the value of the product was to solicit user feedback.
I showed the demo of Ariadne to a cohort of lawyers, and had them use it in my presence.
“Where are the case reports?”, asked most of them.
Case reports are commentaries written about a specific sub-domain of law, providing guidance on that domain through analyses of cases - essentially SparkNotes. I had rarely relied on case reports in the past, leading me to erroneously undervalue such information. The importance of case reports did not fit my notion of legal research, nor my then-framework of reality. Furthermore, case reports were not publicly available, as the legal search incumbents held a monopolistic grip on such data. Needless to say, it was my framework that had to adjust.
I came to the realization that even if I managed to deliver 10x search quality, Ariadne would still be missing a fundamental category of data. I had to go back to the drawing board.
Ariadne, however, was anything but a waste of time. I’m still convinced that the legal search space has room for disruption. There is a reason why the incumbents are still using BM25 - it’s good enough to solve the problem; it gets the job done.
What did I learn?
- Importance of Detailed User Interviews
- In hindsight, I should’ve conducted user interviews with significantly more breadth and depth. Instead of assuming that users’ search habits were similar to mine, I should’ve acquired a more granular understanding of their existing workflows. I also suspect that had I conducted more user interviews, I might’ve decided against tackling legal search - an absolute behemoth of a problem. Is it truly a “hair on fire” problem? Most importantly, I would likely have discovered the prevalence of relying on case reports, which was ultimately the fatal blow to Ariadne.
- Avoiding the Seductive Allure of Building
- To an extent, I fell in love with the technical problem of delivering superior search quality instead of with the customer. I became obsessed with applying state-of-the-art Information Retrieval techniques to a problem which had previously caused me pain. I now realize that part of me wanted to prove to myself that I was capable of building a full-stack search engine from scratch - it sounded like a sufficiently complex (technical) problem to solve, and I was fascinated by the practical applications of Natural Language Processing.
- Necessity of having a Co-founder
- I don’t think it’s controversial to assert that having a co-founder drastically increases the odds of success. At the very least, it would have increased the speed of iteration by more than 2x, and I would not have been stuck in an echo chamber of my own presuppositions about the market and the problem.
A lesson worth the price paid. Having performed backpropagation on the flaws of my previous approach, I’m eager to attempt my next shot on goal. Such mistakes will not be repeated, I’m sure of it - I’m now in a much better position to start my next venture, thanks to Ariadne. But first, let me find a co-founder.
- Case x cites cases y and z. Cases y and z would be the parent nodes of case x. Through recursive traversal, the root nodes would be identified - usually these are landmark judgments, containing declarations of key legal principles. ↩︎
- A cross-encoder architecture would have been ideal, as cross-encoder rerankers dominated the Information Retrieval benchmarks. However, given my lack of access to GPU compute and the theoretical increase in latency, I decided against it. ↩︎
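The latency trade-off follows from the shape of the two scorers, sketched below with placeholder callables rather than real models: a bi-encoder lets every sentence vector be computed once offline, while a cross-encoder needs a full forward pass per (query, sentence) pair at search time:

```python
from typing import Callable

def bi_encoder_score(query_vec, sentence_vec, sim: Callable) -> float:
    # Query and sentence were encoded independently; the sentence vector can
    # come from an offline index, so search-time cost is one query encoding
    # plus a cheap similarity computation per candidate.
    return sim(query_vec, sentence_vec)

def cross_encoder_score(query: str, sentence: str,
                        model: Callable[[str], float]) -> float:
    # The pair is scored jointly; every candidate requires its own forward
    # pass through the model at search time - hence the latency concern.
    return model(query + " [SEP] " + sentence)
```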