The news about ChatGPT is everywhere, and like many businesses, you’re probably wondering: What is it? Does it work? Should I use it? Does it have capabilities that can benefit my business? Is it safe? Is it something we need to consider?
We recently announced our own integration with GPT, the large language model (LLM) that ChatGPT is built on, and you’ll be able to see how it works in our upcoming NLP LiveStream on Thursday, March 2 at 11 AM EST (Save the date to join us!).
By way of background, here is some information to help sort through the current ChatGPT buzz and understand if and how ChatGPT, and by extension large language models (LLMs), can be leveraged within your enterprise.
Daily accounts online of how people are experimenting with ChatGPT showcase both its novelties and its weaknesses, so it’s important to have a foundational understanding of the technology at work in order to separate hype from reality (and practicality for your business).
With this in mind, we’d like to share some guidance to help you understand how to think of ChatGPT when it comes to enterprise use cases.
What Is ChatGPT?
ChatGPT is a “natural language generation” application that runs on OpenAI’s GPT to produce text in response to conversational questions and prompts. Here are some quick facts:
- ChatGPT was released in late November 2022 as a chatbot and is an example of “NLP generation.”
- ChatGPT’s generation capabilities are impressive, mainly because it generates human-like language, often creatively or with authoritative declarations. But be forewarned: it is not grounded in fact and can get things very wrong or make them up (generally called “hallucinating”).
- If prompted properly, ChatGPT can extract language data in specific instances, but this is not ChatGPT’s core purpose at scale.
- While ChatGPT is the application currently creating buzz, it is based on OpenAI’s GPT model.
- GPT is a type of LLM pretrained on unlabeled data using deep learning AI techniques that have been around for several years now. Many LLMs are now available—often open-source versions—and they will continue to evolve.
How Does It Work?
It’s important to understand that the content ChatGPT generates in response to a request is not based on an understanding of relationships and context within language. Instead, it is a prediction of the words and sentences that are likely to come next, based on patterns learned by analyzing and modeling large (actually massive) amounts of language sources (hence, LLM).
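The idea of predicting the next word from historical inputs can be illustrated with a toy example. The sketch below is not how GPT is implemented (GPT uses a large neural network, not word counts); it is a deliberately tiny bigram model, with a made-up corpus and hypothetical function names, that shows how fluent-looking text can be produced purely from counts of which word has followed which.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words followed it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(followers, start, max_words=4):
    """Repeatedly append the most likely next word, starting from `start`."""
    output = [start]
    while len(output) < max_words:
        next_word = predict_next(followers, output[-1])
        if next_word is None:
            break
        output.append(next_word)
    return " ".join(output)


# Made-up training text; a real LLM is trained on vastly more data.
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram_model(corpus)

print(predict_next(model, "the"))  # "cat" follows "the" most often in the corpus
print(generate(model, "on"))
```

Note that the model has no notion of whether its output is true. It only reproduces the statistically likeliest continuation, which is the property the next paragraph describes.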
The result is content that is convincing, usually grammatically correct, intelligent sounding and presented in a conversational tone. Sometimes it is factual, but not always: when findable factual data is lacking, a response is generated nonetheless and, most importantly, the response doesn’t identify what is fact and what is fiction. It’s this factual unevenness and potential for “coherent nonsense,” as referenced by Forrester, that introduces risk when used exclusively in enterprise applications. There are ways to mitigate that risk, which we’ll identify later in this post. As with any AI model, one of the most important things to know about ChatGPT is how it arrives at the information that it generates.
ChatGPT Within the Enterprise
GPT and LLMs have limitations that must be considered:
- ChatGPT, GPT and LLMs are based on data in the public domain:
- Like other LLMs, ChatGPT has been trained on publicly available content (up until a specific point in time, currently September 2021). This content includes everything from articles and books to Reddit discussions, social media content and Wikipedia. As a result, this data is exposed to bias, discrimination and uneven factual accuracy. This is how LLMs and ChatGPT can return inaccurate or even toxic disinformation and misinformation. GPT, and specifically the GPT version ChatGPT is built upon, has addressed some of these concerns with human input and has improved as a result, but it has not eliminated the potential for toxic responses. That said, it has been reported that the extent to which this content has been “cleaned” of bias and harmful language depends on the work of outsourced contract labor.
- The use of this public data also creates concerns around copyright infringement and around data privacy and consent in the use and sharing of personally identifiable information (PII), along with related legislation such as the European Union’s GDPR and other consumer protection laws. Gartner has highlighted the lack of an enterprise-grade privacy policy in OpenAI’s existing ChatGPT privacy policy, and it recommends both humans in the loop to verify output and policies that limit how ChatGPT is used to avoid disclosing confidential information.
- GPT and all LLMs are considered “black box” transformer models that are not explainable:
- Because of the data sources and algorithms used by almost all LLMs, the results cannot be explained. Neither accurate nor inaccurate results can be practically traced back to understand how they were obtained.
- GPT and LLMs use “large” amounts of data and require massive compute power, creating concerns for explainability and transparency (as mentioned above), sustainability and the ‘human-in-the-loop’ aspects that are essential for Responsible AI:
- The cost of running ChatGPT’s more than 175 billion parameters has been referred to as “eye watering” by its own creators, with a carbon footprint of up to 23.04 kgCO2e per day, according to one estimate.
- ChatGPT is cloud-based:
- ChatGPT is publicly hosted, lacks any service level agreements (SLAs) and cannot ensure privacy. While GPT models have service level options in some cases, enterprises should research those options and any cost associated with SLAs. Note that some LLMs allow you to host the model yourself, but this is not the case with the GPT model ChatGPT is based on.
- ChatGPT is costly:
- While ChatGPT is free, other GPT versions add layers of accuracy and services. As you access the latest and best versions of GPT models, the cost per transaction increases, sometimes by orders of magnitude. While individual transaction costs appear low, NL applications typically involve tens to hundreds of thousands of transactions when automating language processes.
- ChatGPT is not domain-specific:
- ChatGPT’s out-of-the-box language capabilities are limited to public data. Even though this content covers many domains, it is not representative of the language used in most complex enterprise use cases, whether in vertical domains (Financial Services, Insurance, Life Sciences and Healthcare) or in highly specific use cases (contract review, medical claims, risk assessment and cyber policy review). So, even for chat/search use cases, the ones that work similarly to ChatGPT, it will be quite difficult to achieve consistent, high-quality performance within highly specific domains.
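The cost point above can be made concrete with a little arithmetic. The sketch below is illustrative only: the tier names and per-transaction prices are hypothetical placeholders chosen to differ by an order of magnitude each, not actual GPT pricing, and the volumes simply span the range mentioned above.

```python
def estimate_cost(transactions, cost_per_transaction):
    """Total cost for a number of transactions at a flat per-transaction price."""
    return round(transactions * cost_per_transaction, 2)


# HYPOTHETICAL prices per transaction, in dollars. These are assumptions for
# illustration; check your provider's current pricing before budgeting.
hypothetical_tiers = {
    "basic-tier": 0.002,
    "mid-tier": 0.02,
    "top-tier": 0.2,
}

volumes = [10_000, 100_000, 500_000]

for tier, price in hypothetical_tiers.items():
    for volume in volumes:
        total = estimate_cost(volume, price)
        print(f"{tier:>10} | {volume:>7,} transactions | ${total:>10,.2f}")
```

Even with made-up numbers, the pattern is the useful part: a price that looks negligible per transaction becomes a real budget line once it is multiplied by enterprise volumes and a higher model tier.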
In summary, while ChatGPT offers impressive results in the NL generation arena and has potential as a contributor to enterprise use cases, enterprises should know what its limitations are and how to manage them.
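As one example of managing these limitations, the recommendations cited above (policies that avoid disclosing confidential information, and humans in the loop to verify output) can be partly enforced in code. The following is a minimal sketch under stated assumptions: the function names `redact` and `release` are hypothetical, the patterns cover only email addresses and US-style phone numbers, and real PII detection would need far broader coverage.

```python
import re

# Simplified patterns, assumed for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact(text):
    """Mask email addresses and phone numbers before text leaves the enterprise."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def release(generated_text, approved_by=None):
    """Refuse to pass on generated text until a named human has reviewed it."""
    if not approved_by:
        raise PermissionError("Generated text requires human review before use.")
    return generated_text


prompt = redact("Summarize the note from jane.doe@example.com, phone 555-123-4567.")
print(prompt)  # Summarize the note from [EMAIL], phone [PHONE].
```

A step like `redact` would run before any prompt is sent to a publicly hosted model, and a gate like `release` would sit between the model’s output and any downstream business process.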
We have integrated GPT (the LLM that ChatGPT is built upon) in our platform to ensure we can continue to provide the most comprehensive set of tools and workflows to implement practical enterprise use cases. Our position is not to advocate for any single approach (ML, DL, Symbolic, LLM) to solve your natural language challenges, but to use the best set of tools and approaches to solve the problem at hand with a hybrid NL approach. We enable that with the ability to design, develop, deploy, manage and monitor your solution within an enterprise environment.
We welcome your questions and a chance to further the discussion. Join us for “GPT, Large Language Models and Hybrid AI: Picking the Best Tool for the Job” on March 2 at 11 AM EST.