As more people and companies use large language models (LLMs), I have started wondering how prompt engineering as a technique will change the processes and methods of developing natural language processing (NLP) solutions. Large language models and prompting certainly provide new opportunities for implementing NLP solutions, but the assumption I wanted to explore is this: shouldn’t a good prompt engineer also act as a knowledge engineer when it comes to implementing complex NLP pipelines?
The Role of a Knowledge Engineer
At expert.ai, knowledge engineers play a key role in developing NLP pipelines. They have a deep understanding of the domain-specific business processes that need to be automated, and they spend a lot of their time designing and implementing the NLP solution that mimics this process: extracting value from language.
In building an NLP solution, knowledge engineers can use different techniques, and in a hybrid AI environment, this could also include prompting LLMs. In fact, knowledge engineers are involved in all key phases of the data value chain and project implementation cycle.
To get an idea of a knowledge engineer’s responsibilities in this realm, let’s look at some of the steps and skills required:
Business problem understanding
By shadowing or interviewing subject matter experts (SMEs), knowledge engineers are able to dig into the business processes, explore the input data and assess its quality. A key factor in this phase is expertise in the project’s domain: domain knowledge allows a knowledge engineer to anticipate the needs and requirements of SMEs and helps them extract valuable information from the language.
Design and modeling
Designing the solution and structuring the knowledge to support process automation is a key phase. It’s when you ask questions like, “Do documents need to be classified?” “What data needs to be extracted?” “Can a question answering system solve the problem?” or “Do I need to summarize contents?” The outcome of this phase is the design of the NLP solution and the choice of the best implementation strategy, technique and approach for each NLP task. Here, efficiency and effectiveness are the key criteria.
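To make this concrete, here is a minimal sketch, in Python, of how the answers to these design questions might be captured as a solution blueprint. The task names, techniques and use case are hypothetical illustrations, not an expert.ai artifact:

```python
from dataclasses import dataclass, field

@dataclass
class NlpTask:
    """One step of the planned NLP solution."""
    name: str        # e.g. "classify_document"
    technique: str   # e.g. "ml_classifier", "symbolic_rules", "llm_prompt"
    rationale: str   # why this technique was judged most efficient/effective

@dataclass
class SolutionDesign:
    """Blueprint produced by the design-and-modeling phase."""
    business_goal: str
    tasks: list[NlpTask] = field(default_factory=list)

# Hypothetical design for an invoice-processing use case.
design = SolutionDesign(
    business_goal="Automate extraction of payment data from invoices",
    tasks=[
        NlpTask("classify_document", "ml_classifier",
                "labeled examples are plentiful; classes are stable"),
        NlpTask("extract_amounts", "symbolic_rules",
                "amounts follow predictable patterns; precision matters"),
        NlpTask("summarize_disputes", "llm_prompt",
                "open-ended text; generation fits better than rules"),
    ],
)

for task in design.tasks:
    print(f"{task.name}: {task.technique} ({task.rationale})")
```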
NLP solution implementation and configuration
Once the knowledge engineering team understands what needs to be done, it’s time to proceed with the actual implementation. At this point, data preparation, annotation and technical implementation (machine learning model training and testing, prompting, scripting or rule writing, etc.) keep the team busy.
Model evaluation
The final step is evaluating the solution: SMEs validate the results, weaknesses are identified, and the model is fine-tuned iteratively until the target accuracy is achieved.
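As a minimal sketch of that iterative loop, the snippet below compares a model’s extractions against SME-validated gold annotations and computes the usual metrics; all names and data are illustrative:

```python
def precision_recall_f1(predicted: set, gold: set) -> tuple[float, float, float]:
    """Compare predicted extractions against SME-validated gold annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {"ACME Corp", "2023-05-01", "EUR 1,200"}    # validated by SMEs
predicted = {"ACME Corp", "EUR 1,200", "Berlin"}   # current model output

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# If f1 is below target, refine the model (or prompts/rules) and re-run.
```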
Completing the steps outlined above requires:
- A holistic approach. Solid knowledge of the available implementation techniques is essential in order to select the most suitable technique for each task.
- NLP expertise. This includes the ability to understand and combine many different components (OCR, converters, data pre-processing modules, ML models, post-processing logic) into a solid analysis pipeline, achieving the best results from each component (see the sketch after this list).
- Tools and utilities for easy prototyping and feasibility assessments.
- Domain-specific knowledge.
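As a rough illustration of the pipeline-building skill mentioned above, here is a minimal sketch in which hypothetical stand-in functions play the role of the real OCR, pre-processing, ML and post-processing components:

```python
from typing import Callable

def ocr(document: bytes) -> str:
    return document.decode("utf-8")            # stand-in for a real OCR engine

def preprocess(text: str) -> str:
    return " ".join(text.split())              # normalize whitespace

def ml_model(text: str) -> dict:
    return {"label": "invoice", "text": text}  # stand-in for a trained model

def postprocess(result: dict) -> dict:
    result["label"] = result["label"].upper()  # apply business logic
    return result

def run_pipeline(document: bytes, steps: list[Callable]) -> object:
    """Chain the components into a single analysis pipeline."""
    data: object = document
    for step in steps:
        data = step(data)
    return data

print(run_pipeline(b"Invoice  no. 42", [ocr, preprocess, ml_model, postprocess]))
```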
The Role of a Prompt Engineer
Given that prompt engineering is a rather new discipline, let’s start with a definition.
Prompt engineering is the process of designing prompts that get a large language model or related application to produce the desired response. You can think of it as a kind of programming: providing the right instructions, phrased in just the right way, that tell the model what to do. Because LLMs are trained on such a wide range of information, there is a wide range of possible responses to any given question. To guide the model, prompts are carefully structured, containing context around the topic and specific instructions to frame the request, in order to get the expected output.
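As a minimal sketch of what such a structured prompt can look like (the template and field names below are illustrative and not tied to any particular LLM API):

```python
# Context to frame the topic, explicit instructions, and a constrained
# output format: the three ingredients of a carefully structured prompt.
PROMPT_TEMPLATE = """You are an assistant that extracts data from contracts.

Context:
{document}

Instructions:
- Extract the two parties, the effective date and the governing law.
- Answer only with a JSON object with keys: parties, effective_date, law.
- If a field is not present in the text, use null.
"""

def build_prompt(document: str) -> str:
    """Fill the template so the model receives both context and instructions."""
    return PROMPT_TEMPLATE.format(document=document)

print(build_prompt("This agreement between ACME and Globex is effective 2023-05-01..."))
```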
Since the release of ChatGPT last November, “prompt engineering” has become a desired skill, even an in-demand occupation in itself. Courses are being offered to train people to become prompt engineers, companies are offering prompt engineering services, and prompt libraries are being created for reuse.
I believe that prompt engineers and knowledge engineers share many similarities in terms of processes and objectives. But here are some questions that I want to put to the test: Can prompt engineers ignore the business case when crafting prompts? Can they skip the knowledge representation required in the design phase, which is a typical task of knowledge engineers? Do they need to have expertise with LLMs, or do they need domain-specific knowledge? Is the iterative process that leads to fine-tuning the prompt for a given task so different from the fine-tuning that knowledge engineers apply to ML/symbolic/hybrid models to get the best out of them?
When building custom NLP applications, the prompt engineer is tasked with creating and optimizing prompts (input data) to generate the desired output from language models such as GPT-3, GPT-4 or others. This requires certain skills:
- A good understanding of the business case (what are my prompts supposed to achieve?).
- Ability to design and model the solution. When building complex pipelines, it is unlikely that you will be able to solve the use case with a single prompt. Instead, you will need more than one prompt and more than one component to go through the data preparation, data mining and business logic implementation steps (see the sketch after this list).
- When it comes to implementing the solution, prompt engineers need to know the capabilities and characteristics of the AI model in use so that they can craft specific prompts that extract the best possible output, and they need to know other techniques as well so that they can adopt the best approach for a given task (again, consider the complexity of potential use cases). This is very similar to a knowledge engineer studying prompt engineering techniques in order to build hybrid pipelines and determine the best strategy for the given use case.
- Ability to test the accuracy of the results. Whether you’re performing Q&A tasks, data extraction or summarization, the quality of the results must be tested and the solution fine-tuned, iteratively, until you achieve the desired results.
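To illustrate the multi-prompt, multi-component pipelines mentioned in the list above, here is a minimal sketch; `call_llm` is a hypothetical stand-in for whatever model or API a real pipeline would use:

```python
def call_llm(prompt: str) -> str:
    return f"<model answer to: {prompt[:40]}...>"  # placeholder response

def prepare_data(raw: str) -> str:
    return raw.strip()                             # data preparation step

def mine_data(text: str) -> str:
    # First prompt: extract the relevant facts.
    return call_llm(f"Extract all dates and amounts from:\n{text}")

def apply_business_logic(facts: str) -> str:
    # Second prompt: reason over the extracted facts.
    return call_llm(f"Flag any payment over 30 days late given:\n{facts}")

raw_document = "  Invoice issued 2023-01-10, paid 2023-03-25, EUR 980.  "
result = apply_business_logic(mine_data(prepare_data(raw_document)))
print(result)
```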
Shared Skills
In addition to those AI- and NLP-specific skills we highlighted above, there are several key skills that are common to both roles. These include:
- Analytical skills. The ability to analyze business processes and represent the knowledge structures behind them.
- A holistic approach and comprehensive knowledge of NLP. This includes being aware of, and able to use, different NLP techniques; knowing each step of the data value chain; and being familiar with the components that perform the main tasks in each step of the NLP project implementation life cycle in the most effective and efficient manner.
- Business- and domain-specific knowledge. Again, understanding the business problem or use case, as well as the domain, is essential for getting the results you need. Prompt engineers and knowledge engineers could eventually become SMEs in the project’s domain.
Both roles must also be able to prepare data, evaluate results and compute metrics. They should also be familiar with strategies and techniques for fine-tuning when models don’t behave as expected.
However, there is a key caveat to point out: both knowledge engineers and prompt engineers must know the tools they are working with and apply their skills accordingly. Large language models and related applications, such as ChatGPT, Bard, LLaMA or any of the many open-source tools now available, are very different from your average AI model. They have been trained on massive amounts of data of varied provenance, and as a result, their capabilities, strengths and weaknesses are different too.
And, both roles must be aware of any potential biases that their training might bring to the table. This is especially the case for knowledge engineers who are much more involved in building a system’s knowledge from the ground up.
The linguistic knowledge that a knowledge engineer might rely on and use in their prompts could be useless for an LLM and end up biasing the prompt creation. For example, being able to think of all the possible ways in which a linguistic phenomenon can be expressed, so that general, comprehensive rules can be written, is very important in the symbolic AI world, while knowing how to craft a prompt with all the necessary details for the required output is a “new,” unrelated skill. Also, because knowledge engineers are used to building logic and knowledge into a system, the opacity of an LLM and what it may or may not “know” can lead to a tendency to write overcomplicated prompts that underestimate the LLM’s capabilities.
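A toy example may help illustrate the contrast. In the symbolic approach the engineer enumerates surface forms of the phenomenon; in the prompt approach the engineer specifies the desired output and its format and lets the model generalize. The task and patterns below are hypothetical:

```python
import re

text = "The agreement takes effect on May 1st, 2023, i.e. 01/05/2023."

# Symbolic approach: enumerate the surface forms of the phenomenon
# (here, just two of the many ways a date can be written).
DATE_RULES = [
    re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),                        # 01/05/2023
    re.compile(r"\b[A-Z][a-z]+ \d{1,2}(st|nd|rd|th)?, \d{4}\b"), # May 1st, 2023
]
matches = [m.group(0) for rule in DATE_RULES for m in rule.finditer(text)]
print(matches)

# Prompt approach: describe the required output instead of the input forms.
prompt = (
    "List every date mentioned in the text below, one per line, "
    "normalized to YYYY-MM-DD:\n" + text
)
```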
In conclusion, I believe that a prompt engineer in the field of NLP must also have knowledge engineering skills (this may not be true for content creators or artists, but it’s essential for NLP pipelines). Likewise, knowledge engineers need prompt engineering techniques in their skill set.