
13th of June 2024

GenAI and Mixed Models



The landscape of generative AI is rapidly evolving, with applications spanning various industries. In creative fields, generative AI is revolutionizing art, design, and entertainment by enabling the creation of unique works and enhancing creative processes. In business, it is being used to automate content creation, design marketing materials, and develop innovative products. The technology also has significant implications for healthcare, where it assists in drug discovery and the creation of synthetic medical data.

However, the rise of generative AI also brings challenges, particularly the risk of hallucinations, where the AI generates false or misleading information that appears plausible. These hallucinations can lead to the spread of misinformation, undermining the credibility of AI-generated content.

This article introduces our way of combating this risk: a mixed model approach.


Introducing Generative AI

Generative AI (GenAI) refers to artificial intelligence systems capable of creating new content, such as text, images, music, and even videos, by learning patterns from existing data. These systems use sophisticated algorithms, particularly deep learning models like Generative Adversarial Networks (GANs) and transformers, to generate outputs that are often indistinguishable from those produced by humans.

The dangers of GenAI are in the name: instead of following a prescriptive algorithm, it generates the output itself. This makes the AI unsupervised in the sense that we can never be certain what output it will produce. Although the most accurate generative models showcase an impressive range of understanding, even these systems carry underlying risks in principle:

  1. The AI might not follow every instruction (1)

  2. The AI might follow every instruction but generate additional content on top of that. When this information is factually inaccurate, we refer to this as hallucination. (1)


However, customer service is a highly prescriptive discipline, unlike creative industries or sales, for example. We expect customer service agents to provide the exact information they are taught, in the style that follows the company guidelines. Anything more or less is deemed unacceptable, and in the case of GenAI, failures could range from a slight stylistic issue to hallucinating factually incorrect information or outputting sensitive content.

From our research, the average GenAI product on the market answers chats at around 90% accuracy (2), which is only slightly higher than what a non-generative product can achieve. This comes with the heavy risk of unpredictability, though. In most cases, it only takes one disastrous chat before it becomes a public relations or legal issue, despite the fact that these systems tend to perform accurately for the most part. In the current iteration of GenAI, however, it is impossible to completely eliminate hallucination.


The prescriptive chatbot

The alternative to a generative system is a prescriptive one, where the system picks one of a set of outcomes derived by algorithmic logic. These systems tend to perform at a much lower accuracy rate than a generative model; however, they can eliminate hallucinations completely.

This is because the output is ultimately not controlled by the AI: the logic for choosing a response is hard-coded in the system rather than generated on the spot.
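
To make the idea concrete, below is a minimal sketch of a prescriptive chatbot in Python. The intents, keywords and responses are purely illustrative assumptions, and the keyword matching stands in for whatever matching logic a real system uses; the point is that every reply a user can ever receive is written and approved in advance.

```python
# Minimal sketch of a prescriptive chatbot: every possible reply is
# pre-approved, and simple keyword matching (a stand-in for a real system's
# matching logic) decides which one is sent. All names below are illustrative.

PRE_APPROVED_RESPONSES = {
    "opening_hours": "Our stores are open 9am-6pm, Monday to Saturday.",
    "returns": "You can return any item within 30 days with proof of purchase.",
    "delivery": "Standard delivery takes 3-5 working days.",
}

INTENT_KEYWORDS = {
    "opening_hours": ["open", "hours", "close"],
    "returns": ["return", "refund"],
    "delivery": ["delivery", "shipping", "arrive"],
}

def prescriptive_reply(query: str) -> str:
    """Pick the pre-approved response whose keywords best match the query."""
    words = query.lower().split()
    best_intent, best_score = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = sum(1 for w in words if any(k in w for k in keywords))
        if score > best_score:
            best_intent, best_score = intent, score
    if best_intent is None:
        return "Sorry, I didn't understand that. Could you rephrase?"
    return PRE_APPROVED_RESPONSES[best_intent]

print(prescriptive_reply("When do you open on Saturdays?"))
```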

What these models suffer from is not hallucination but inaccuracy. It's important to take a moment to define the difference between these two concepts:


Inaccuracy is when an AI system misunderstands the query but provides factual, albeit irrelevant, information. For example, we would consider it inaccurate if we asked an AI what colour the sky was and it replied that the largest animal is the blue whale. It is not answering our question, but at the same time, it is a factually accurate statement.

Hallucination occurs when the AI provides factually wrong information, whether it has understood the question correctly or not. If, in the example above, the AI responded that the sky is purple, that would be considered a hallucination, as it is factually incorrect.


We understand that an inaccurate AI will cause frustration; however, the impact of hallucination is far larger. One only needs to recall newspaper headlines such as those about the DPD chatbot (3) or Eliza (4).


Mixed model

Our proposal to combat both hallucination (generative models) and inaccuracy (prescriptive models) is the use of mixed models. This architecture combines aspects of both generative and prescriptive systems: a generative system picks from a set of outcomes, and its output is fed into prescriptive logic that produces pre-approved responses. On the simplest level, this could be a system where GenAI is used to simplify queries, which are then processed by a prescriptive system. This results in increased accuracy with no hallucination.
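
As an illustration, the sketch below shows what this simplest mixed model could look like. The call_genai function, the prompt wording and the pre-approved answers are hypothetical stand-ins rather than a description of any specific product: GenAI only rewrites the query, and the prescriptive lookup always produces the final, pre-approved text.

```python
# Sketch of the simplest mixed model: GenAI only simplifies the incoming
# query; a prescriptive lookup then chooses one of the pre-approved answers,
# so no generated text is ever shown to the user.

PRE_APPROVED = {
    "returns": "You can return any item within 30 days with proof of purchase.",
    "delivery": "Standard delivery takes 3-5 working days.",
}

def call_genai(prompt: str) -> str:
    """Hypothetical wrapper around a generative model; replace with a real provider call."""
    raise NotImplementedError("plug in your LLM provider here")

def simple_mixed_reply(raw_query: str) -> str:
    # Step 1 (generative): compress the query into a short canonical form,
    # instructing the model to keep every factual detail.
    simplified = call_genai(
        "Rewrite this customer message as one short sentence, keeping all "
        "factual details: " + raw_query
    )
    # Step 2 (prescriptive): hard-coded matching picks the pre-approved reply.
    text = simplified.lower()
    if "return" in text or "refund" in text:
        return PRE_APPROVED["returns"]
    if "delivery" in text or "shipping" in text:
        return PRE_APPROVED["delivery"]
    return "Sorry, I couldn't match your question. Could you rephrase?"
```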

This, however, can put significant pressure on the prescriptive system to understand queries. It also poses the underlying risk of the GenAI simplifying the query incorrectly and misleading further processes. For example, crucial information could be left out, such as the reason a person wants to return a product, which might be important context for processing the question.


A more advanced methodology of using a mixed model is to only use GenAI as an intermediate step. Here, the prescriptive model selects the closest matches to a query (this can be achieved by technology such as vector search, for example), presents them to GenAI as a numbered list, and asks it to select the best match. The selected match is then processed by the prescriptive system again, resulting in a pre-defined output.
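
The sketch below illustrates this intermediate-step approach under a few assumptions: a toy bag-of-words similarity stands in for a proper vector search, and call_genai is a hypothetical wrapper around whichever LLM is used. The generative model is only ever asked to answer with a number, and the reply sent to the user is always one of the pre-approved texts.

```python
from collections import Counter
from math import sqrt

# Illustrative pre-approved FAQ entries (question, approved answer).
FAQ = [
    ("How do I return an item?", "You can return any item within 30 days with proof of purchase."),
    ("When will my order arrive?", "Standard delivery takes 3-5 working days."),
    ("What are your opening hours?", "Our stores are open 9am-6pm, Monday to Saturday."),
]

def call_genai(prompt: str) -> str:
    """Hypothetical wrapper around a generative model; replace with a real provider call."""
    raise NotImplementedError("plug in your LLM provider here")

def similarity(a: str, b: str) -> float:
    """Toy bag-of-words cosine similarity; a real system would use vector search."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def advanced_mixed_reply(query: str, shortlist_size: int = 3) -> str:
    # Step 1 (prescriptive): retrieve the closest pre-approved entries.
    ranked = sorted(FAQ, key=lambda qa: similarity(query, qa[0]), reverse=True)
    shortlist = ranked[:shortlist_size]
    # Step 2 (generative): GenAI only chooses an index from a numbered list.
    menu = "\n".join(f"{i + 1}. {q}" for i, (q, _) in enumerate(shortlist))
    choice = call_genai(
        f"Customer asked: {query}\nWhich numbered question below matches best? "
        f"Answer with the number only.\n{menu}"
    )
    # Step 3 (prescriptive): the reply is always a pre-approved text; if the
    # model's choice cannot be parsed, fall back to the top-ranked entry.
    try:
        index = int(choice.strip()) - 1
        if not 0 <= index < len(shortlist):
            raise ValueError
    except ValueError:
        index = 0
    return shortlist[index][1]
```

Because the generative model's output is reduced to an index, even a badly behaved generation can at worst select the wrong pre-approved answer (an inaccuracy), never invent a new one (a hallucination).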

This results in both close-to-GenAI accuracy and no hallucination. Whilst this is our recommended approach, as it is the most complete product, it does have some drawbacks. A system like this relies on multiple AIs interacting with each other, which can increase both costs and processing times. And although this method is a step up from using a prescriptive system, it still requires significant manpower to maintain, as it is less automated than a fully generative product.


Conclusion

We have presented an overview of three methods for a modern chatbot, all with different approaches. The main considerations are the relative importance of accuracy and hallucination. Generative AI is the most accurate solution and is recommended for personal use and creative industries. It can, however, be a significant risk in more procedural environments, such as customer service. Here, a prescriptive system may be preferred to eliminate hallucinations. On the other hand, this system will reduce accuracy and be less autonomous than a generative one, thus requiring more resources to function. Our solution of using mixed models addresses the shortcomings of both systems by creating an environment where a prescriptive and a generative system interact with each other. This combines the accuracy of GenAI with the non-hallucinating nature of a prescriptive AI.

  

Author

Robert Flick

 

References

1

Van Steensel, J. (2024). Dangers of GenAI. Conversations by Ami.

2

Van Steensel, J., Flick, R., Chinavicharana, S., Draper, J., Zhao-Bird, W. Y. (2024). Industry standards in chatbot accuracy across multiple technologies. Conversations by Ami.

3

Gerken, T. (2024). DPD error caused chatbot to swear at customer. BBC News. Available at: https://www.bbc.co.uk/news/technology-68025677.

4

Atillah, I.E. (2023). AI chatbot blamed for ‘encouraging’ young father to take his own life. Euronews. Available at: https://www.euronews.com/next/2023/03/31/man-ends-his-life-after-an-ai-chatbot-encouraged-him-to-sacrifice-himself-to-stop-climate-.

 
