How Large Language Models work
And how they might change how you do business.
The world has gone AI crazy. There have been a number of exciting AI breakthroughs lately, including several that generate images from nothing more than a written description of what you’d like to see. But the one that has everyone excited is ChatGPT.
ChatGPT is a type of AI known as a Large Language Model (LLM). In particular it is an LLM that uses the transformer architecture, an architecture proposed by researchers at Google who perhaps didn’t fully understand the value of what they discovered. The clever team at OpenAI took that approach and combined it with other techniques to produce an LLM that is truly incredible and continues to get better.
An overly simple explanation of how it works is that it consists of a model. Think of the model as a big box full of parameters, which are just numbers. This model is produced by training it on a very large amount of text: basically, a large part of the public internet as of 2021. We can’t really be sure exactly what OpenAI included in the training data, as they keep that part secret.
Once the model is trained, it works quite simply. You feed it some text and it predicts which word it thinks should come next (strictly speaking, a word or a piece of a word, called a token), based on all the data it was trained on. It runs the text you provided through all its parameters, does some math, and out the other end of the box comes the next word it has predicted.
So, if I type “Mary had a little” into ChatGPT, it will likely respond with “lamb”. Then it repeats the process and takes that whole new sentence and feeds it through the model again to predict the next word and then the next word after that and so on. It will keep going and probably produce the entire rest of the nursery rhyme before it predicts that it’s time to stop.
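The predict-then-repeat loop described above can be sketched in a few lines of Python. This is only a toy under loud assumptions: the "model" here simply counts which word follows which in three short sentences, whereas a real LLM looks at the whole text so far and runs it through billions of parameters. The names `train`, `predict_next` and `generate` are made up for this illustration.

```python
from collections import Counter, defaultdict

END = "<end>"  # hypothetical marker meaning "time to stop"

# Tiny stand-in for training data; a real LLM is trained on vastly more text.
CORPUS = [
    "mary had a little lamb its fleece was white as snow",
    "tom had a little dog",
    "mary had a little lamb its fleece was white as snow",
]

def train(corpus):
    """Count, for every word, which words followed it in the training text."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split() + [END]
        for current, following in zip(words, words[1:]):
            model[current][following] += 1
    return model

MODEL = train(CORPUS)

def predict_next(words):
    """Return the most likely next word, judged only by the last word seen."""
    counts = MODEL.get(words[-1])
    if not counts:
        return END  # never saw this word in training, so just stop
    return counts.most_common(1)[0][0]

def generate(prompt, max_words=20):
    """Predict a word, append it, and feed the longer text back in."""
    words = prompt.split()
    for _ in range(max_words):
        next_word = predict_next(words)
        if next_word == END:
            break
        words.append(next_word)
    return " ".join(words)

print(generate("mary had a little"))
# prints: mary had a little lamb its fleece was white as snow
```

Note that "lamb" wins over "dog" only because it followed "little" more often in the training text. Nothing in the loop checks whether the result is true or sensible.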
That in itself does not seem so remarkable, but this fairly simple process is capable of incredibly complex and emergent behaviour that gives the appearance of a very smart artificial intelligence. You could ask it to “explain quantum physics in the style of a Jerry Seinfeld standup routine”, for example, and it will happily do it. It’s truly mind-blowing that you can ask it to act in a particular way and it will do so like an Academy Award-winning character actor. Many are asking it to be their own personal therapist or science tutor, for example.
The problem with LLMs on their own, however, is that they have no concept of what is true or false. There is no real intelligence or logic here. The model is simply predicting the most LIKELY words to say next. If it doesn’t have a good answer for you, it does not tell you that it doesn’t know the answer or isn’t sure. Instead, it confidently makes something up as best it can. “Confident BS” or “hallucinating” are the terms many are using. The end result is that, unfortunately, you can never quite trust it. This presents some challenges when putting it to work in a business, and the work ahead is figuring out how to overcome them.
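A small Python sketch shows why "most likely" is not the same as "sure". The scores below are invented purely for illustration, as are the question about a fictional company and the function names. The one real ingredient is softmax, the standard step that turns a model’s raw scores into probabilities. The point is that picking the top word always returns an answer, whether the model is nearly certain or almost guessing.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that add up to 1."""
    highest = max(scores.values())
    exps = {word: math.exp(s - highest) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

def most_likely(scores):
    """Pick the top word and report how probable the model thought it was."""
    probs = softmax(scores)
    word = max(probs, key=probs.get)
    return word, probs[word]

# Made-up scores for "The capital of France is ...": one clear winner.
confident = {"Paris": 9.0, "Lyon": 2.0, "Rome": 1.0}

# Made-up scores for "Acme Widgets was founded in ..." (a fictional company
# the model has never read about): nearly a three-way tie.
unsure = {"1987": 1.1, "1992": 1.0, "2001": 0.9}

for name, scores in [("confident", confident), ("unsure", unsure)]:
    word, prob = most_likely(scores)
    print(f"{name}: answers {word!r} with probability {prob:.2f}")
```

In both cases the reader just sees an answer stated with the same fluency. The second is little better than a coin toss between three years, but nothing in the output says so.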
Every day, people are finding new ways to use LLMs and, by combining them with other technologies and systems, are producing exciting products and services at breakneck speed.
So, the question is, how is this going to change how we work and do business?
Here are just a few ideas of how we expect this to impact business processes…
- Summarising documents to produce more concise information
- Incorporate it into your software so that it suggests what you might want to write next as you type (auto-completion), based on the data you’ve trained it on
- Train it on a body of text such as a company’s business procedures or help documents to become a helpful Q&A bot for staff or customers
- Have it analyse data and provide you with an easy-to-understand written interpretation
- Produce personalised communications to customers based on their profile information
- Write marketing material, website copy or social media posts
- Help your staff be more productive by giving them prompts on how to approach a task
- Help with generating ideas and names for things
- Act as a tutor or coach for whatever skill you are performing
- Generate checklists and task lists for whatever task you are performing so you don’t miss anything
- Help people who are not software developers write scripts, tools, macros and Excel formulas to automate processes
Each of these is a large discussion in itself, which we’d love to have with you if you’re up for it!
In the future we expect it will be more advanced, and we will have fully autonomous AI agents that, given a task, will go away and figure out the steps on their own, produce code to perform those steps and then execute that code themselves to complete the task. The AutoGPT project is already experimenting with this, although it currently gets stuck often and it’s still early days.
Of course, the trouble will always be: how do you know it is correct and did the right thing?