Human-friendly, yes, but are the answers ChatGPT gives anything like human reasoning?

Artificial intelligence is seeping into modern life like the fog rolling around Victorian London in the opening pages of Bleak House, Charles Dickens’ classic novel about the law.

The more organisations entrust decision-making to AI systems, the greater the concern about how much of this technology remains shrouded in mystery.

How do these machines actually work? How much will they transform workplaces? And how much will they affect jobs in white-collar, knowledge-intensive professions?

We have explored the effect of machine learning models on the work of the legal profession, the opportunities they create and the risks they pose.

Operating under the name of LegalTechCo, we built an in-house AI system for litigation from scratch. Its job was to provide in-depth analytics about judges’ decisions and their reasoning.

This presented enormous challenges. Court documents are unstructured and require legal analysts to interpret them. Any system replicating this work would have to develop something akin to human reasoning to independently understand legal texts.

How does AI replicate human reasoning?

The problem is that we do not really know what that ‘something akin to human reasoning’ is. Machine learning models learn from datasets to create their own logic, relying on opaque algorithmic rules. They are ‘black boxes’ whose internal workings are incomprehensible to humans.

We built a state-of-the-art deep learning model that analysed legal documents with remarkable accuracy. But, no matter how much we fine-tuned the dataset we fed into the computer, we could never be quite sure how the system had arrived at its findings. 

In our study, we looked for clues, using heat maps that highlighted the AI’s neuron activations in the text and cross-validated the AI results with human calculations.
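One common way to produce such heat maps is occlusion-based saliency: remove each token in turn, re-score the text, and treat the drop in the model’s score as that token’s importance. The sketch below illustrates the idea with a toy stand-in scorer; the article does not specify which attribution method the study actually used, so all names here are hypothetical.

```python
# Occlusion-based saliency sketch (illustrative only): a token's weight
# is the base score minus the score with that token removed.

def saliency(tokens, score_fn):
    """Return one importance weight per token."""
    base = score_fn(tokens)
    return [base - score_fn(tokens[:i] + tokens[i + 1:])
            for i in range(len(tokens))]

# Toy stand-in scorer: fraction of tokens equal to the keyword 'negligence'.
score = lambda toks: toks.count("negligence") / max(len(toks), 1)
weights = saliency(["the", "claim", "of", "negligence", "failed"], score)
```

The token whose removal hurts the score most receives the largest weight and would be shaded darkest on the heat map.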

Will AI eventually pass the Turing test?

The data science world is still grappling with how AI ‘brains’ work. Despite significant advances in large language models such as ChatGPT, which interact in a human-friendly way, it is unclear if the answers they give are anything like human reasoning. Is AI playing its old game of finding something correlated to the questions it is asked? The jury is still out.

The day that the machine finally passes the Turing test, demonstrating its ability to think like a human being, is probably some way off. Until then, the following findings from our research remain relevant.

1 Bespoke is best

One day it may be possible to choose ‘off-the-shelf’ machine learning tools, but for the time being, companies should develop their own AI systems for their unique situations. Successfully developing a machine learning system starts with properly defining problems for it to solve. This is a complex but critical task and needs senior staff to work with AI developers.

The company must explain the vision, then break it down into simple questions that need solving before AI developers can turn it into code. These questions must have ‘yes’ or ‘no’ answers. Staff working with the developers need to gain an intuitive understanding of the key concepts.

Sometimes AI will need human help or a new algorithm. In our case, lawyers wanted the AI system to scan whole documents, which can be 6,000 words or more, but the state-of-the-art algorithms had a limit of 512 words. It took two months to develop the new algorithm required to read whole documents.
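A standard workaround for a fixed input-length limit is to split a long document into overlapping chunks, classify each chunk, and aggregate the per-chunk answers. The sketch below illustrates that pattern with made-up parameters and a keyword-matching stand-in classifier; the article does not describe LegalTechCo’s actual algorithm.

```python
# Illustrative sketch: chunking a long document to fit a model's input
# limit, then aggregating per-chunk yes/no answers. All names and
# numbers are hypothetical.

MAX_TOKENS = 512   # input limit mentioned in the text
OVERLAP = 64       # tokens shared by consecutive chunks, so content
                   # spanning a boundary is seen whole at least once

def chunk_document(tokens, max_tokens=MAX_TOKENS, overlap=OVERLAP):
    """Yield overlapping windows of at most max_tokens tokens."""
    step = max_tokens - overlap
    for start in range(0, max(len(tokens) - overlap, 1), step):
        yield tokens[start:start + max_tokens]

def classify_document(tokens, chunk_classifier):
    """Answer 'yes' if any chunk answers 'yes' -- a common aggregation
    rule for 'does the document mention X?' questions."""
    return any(chunk_classifier(chunk) for chunk in chunk_document(tokens))

# Toy usage: a 6,000-word document with one key phrase near the end.
doc = ["word"] * 6000 + ["break-clause"]
found = classify_document(doc, lambda chunk: "break-clause" in chunk)
```

The “any chunk says yes” rule suits presence/absence questions; other questions might instead need a majority vote or a weighted combination across chunks.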

2 Refine through iteration  

Machine learning AI does not have to be perfect before being used. Refine it as you go. The more varied, good-quality data it is fed, the more accurate its decisions will be. This is why the training dataset is so important.

Don’t just select a random pile of data. Ensure there is a balanced number of examples of ‘yes’ and ‘no’ answers for each question. This helps to iron out biases.
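One simple way to achieve that balance is to downsample the majority class so ‘yes’ and ‘no’ examples appear in equal numbers. The sketch below shows that approach with invented data; oversampling or class weighting are alternatives, and the article does not say which technique LegalTechCo used.

```python
# Hypothetical sketch: balance a labelled dataset by downsampling the
# majority class to match the minority class.
import random

def balance_dataset(examples, seed=0):
    """examples: list of (text, label) pairs with label 'yes' or 'no'.
    Returns a shuffled list with equal counts of each label."""
    yes = [e for e in examples if e[1] == "yes"]
    no = [e for e in examples if e[1] == "no"]
    n = min(len(yes), len(no))
    rng = random.Random(seed)           # seeded for reproducibility
    balanced = rng.sample(yes, n) + rng.sample(no, n)
    rng.shuffle(balanced)
    return balanced

# Toy dataset: 150 'no' examples but only 50 'yes' examples.
data = ([("doc%d" % i, "no") for i in range(150)] +
        [("doc%d" % i, "yes") for i in range(50)])
balanced = balance_dataset(data)
```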

Save time by getting the machine learning system to 80 per cent accuracy. Experts in the field can then review its output, correct errors and feed the corrections back into the system. We started with 200 documents but, to gain accuracy, the training dataset eventually grew to 701 documents.
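The review-and-correct cycle described above can be sketched as a loop: each round, experts review a batch of the machine’s errors and fix them, and the loop stops once accuracy reaches a target. Everything below is an illustrative stand-in with invented numbers, not the study’s actual process.

```python
# Hypothetical sketch of an expert review loop: correct one batch of
# machine errors per round until accuracy hits the target.

def refine(machine_labels, expert_labels, batch=50, target=0.95):
    """Return the corrected labels and the number of review rounds."""
    labels = list(machine_labels)
    rounds = 0
    while True:
        wrong = [i for i, l in enumerate(labels) if l != expert_labels[i]]
        if 1 - len(wrong) / len(labels) >= target:
            return labels, rounds
        for i in wrong[:batch]:          # experts fix one batch per round
            labels[i] = expert_labels[i]
        rounds += 1

# Toy run: 701 documents, the machine starting at roughly 80% accuracy.
truth = ["yes"] * 140 + ["no"] * 561
machine = ["no"] * 140 + ["no"] * 561    # the 140 'yes' cases are missed
labels, rounds = refine(machine, truth)
```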

3 Ask for clues in measuring accuracy

The best way to understand AI’s decisions is to look for clues, rather than fixating on one metric, which can downplay the importance of others.

Our lawyers were told the machine learning system had reached 90 per cent accuracy, but they found a false-positive rate of 38 per cent and a false-negative rate of five per cent due to an imbalanced dataset.

The lawyers could not work with that many false negatives. Adjusting the model reduced these to one per cent and increased the number of false positives, which they could live with. The results should also be cross-validated with human calculations. Differences can challenge assumptions made by the firm, so analyse them carefully with an open mind.
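The lesson above can be made concrete with a few lines of arithmetic: on an imbalanced test set, headline accuracy can look strong while the error rate on the rarer class is unacceptable. The numbers below are invented for illustration, not the article’s actual dataset.

```python
# Illustrative sketch: compute per-class error rates rather than
# relying on a single accuracy figure.

def rates(y_true, y_pred, positive="yes"):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
    }

# Toy imbalanced test set: 90 negatives, 10 positives; the model
# predicts 'no' almost every time.
y_true = ["no"] * 90 + ["yes"] * 10
y_pred = ["no"] * 90 + ["no"] * 9 + ["yes"]
m = rates(y_true, y_pred)
# m["accuracy"] is 0.91, yet 9 of the 10 'yes' cases are missed:
# m["false_negative_rate"] is 0.9.
```

A model like this looks 91 per cent accurate while missing nine out of ten positive cases, which is exactly the trap a single headline metric can hide.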

4 Tap into local knowledge clusters

So much more can be achieved through partnerships with service providers or universities with data science expertise than by going it alone.

Programmes that encourage firms to make better use of the UK’s technology base, such as the Knowledge Transfer Partnerships programme, can help.

Contrary to early assumptions, AI is affecting knowledge-intensive professions more than low-skilled jobs. This technological revolution will impact everyone, from lawyers and journalists to accountants, architects and artists.

Our legal tech company reduced the time spent on each file by at least 90 per cent, requiring just three minutes of AI analysis followed by 15 minutes of human validation.

The most visible impact of this revolution is on job losses. IBM’s decision to freeze recruitment, as it expects AI to replace 7,800 non-customer-facing jobs, has rung alarm bells.

Yet AI will often augment human expertise. It will streamline the analysis of commercial lease reviews, picking out key provisions such as rent, lease terms, and break clauses. Or it will help to write news reports, giving journalists more time to gather quality stories.

Many of its functions will be extractive and analytical, allowing human beings to interpret and conceptualise.  

The new world will require people with flexible skill-sets. It will require innovation managers, problem solvers and ‘AI competent’ leaders.

But, as we adapt to our new environment, some of the fog surrounding AI may well clear. 


Further reading

Zhang, Zhewei, Nandhakumar, Joe, Hummel, Jochem T. and Waardenburg, Laura (2020) Addressing key challenges of developing machine learning AI systems for knowledge intensive work.

Joe Nandhakumar is Professor of Information Systems. He teaches on the MSc Management of Information Systems and Digital Innovation, the Doctor of Business Administration, and the Executive Diploma in Digital Leadership where one of the modules he teaches is Leading Digital Transformation.

Zhewei Zhang is an Assistant Professor of Information Systems Management. He teaches on the MSc Management of Information Systems and Digital Innovation.

Learn more about AI with the four-day Postgraduate Award in Leading Digital Transformation

For more articles on the Future of Work sign up to the Core Insights newsletter here.