Five years ago, when the three of us founded Cresta and Google Contact Center AI (CCAI), we had a shared vision: AI would enable knowledge workers to be 10x more efficient and 10x more effective. Combined, that would make workers 100x more productive overall. We started with customer conversations, where deep neural networks were just beginning to revolutionize the field of natural language processing. Because language serves as the “interface” between humans, we firmly believed this would be the most exciting commercial application of conversational AI.
Since then, we’ve been fortunate to witness paradigm shifts in AI technology every two to three years. Our initial products used sequence models trained from scratch on each customer’s conversations. In 2018, deep transfer learning models such as BERT made it possible to build such models with much less data. And in 2019, we were one of the first companies to deploy GPT-2-like models for our Chat Agent Assist.
Fast forward to today: AI has crossed the inflection point for working with human language. The innovations around large language models have led to a Cambrian explosion of applications. At Cresta, we’ve been fortunate to have front-row seats to both lead and participate in this revolution. Today, large language models are under the hood in every one of our products.
One thing is certain: when we look back from 2028, the predictions we make today will look foolish in some areas and lack imagination in others. Five years ago, we could never have predicted the accelerated shift to CCaaS precipitated by a global pandemic, or the explosion of excitement around ChatGPT.
But what will not change is our obsession with customers. As we continue to invest in future-proof products, here’s what will remain constant: businesses want to have high-quality conversations with their customers and to gain deep insights from those conversations.
Before we get into the predictions, it’s important to mention the very real limitations of large language models (LLMs) today, as discussed in our previous blog post:
1. LLMs still hallucinate.
Some of ChatGPT’s answers are surprisingly good, but sometimes it can confidently reply with nonsense, making stuff up while still appearing authoritative. The lack of controllability and transparency limits its application in call centers.
2. Understanding business knowledge.
Today, LLMs lack controllability, predictability, and explainability. These limitations need to be mitigated before LLMs can be widely deployed in an enterprise context. One of the key research opportunities lies in seamlessly integrating with proprietary knowledge and structured data, such as the business rules and policies required for a contact center.
3. Integration with key enterprise workflows.
Customer data in most enterprises remain siloed. For Generative AI to drive real value for the enterprise, it shouldn’t stop just at generating responses. It should integrate into different workflows such as CRM entries, coaching software, order management, etc. In other words, it should drive actions across systems.
4. Cloud penetration for contact center platforms is still only at 25%.
The majority of contact center agents globally still use on-premises software, which adds to the complexity of integrating AI into their environment.
Looking ahead to the next five years, we believe that, more than ever, the future contact center will be fundamentally built around AI. Here are a few predictions that we are excited about. We intend our predictions to be provocative, to spark discussion and, hopefully, to provide direction for practitioners.
Prediction 1: AI Will Disappear
In five years, every single workflow of the contact center will use AI. It will become so ubiquitous that the thought of “I am using AI” will disappear, in the same way people don’t think about using electricity or the internet. All customer interactions will rely on some form of neural network computation.
Prediction 2: Typing Will Disappear
Smart Compose and Suggestions will be available on all chat contact center interfaces, with AI doing the majority of the typing for chat agents. Agents’ jobs will shift towards selecting suggestions and navigating call flows. Manual typing will be reduced to less than 5% of all keystrokes.
Prediction 3: Search Will Disappear
Agent Assist will listen to the conversation and automatically surface the right information at the right time. No more searching articles using keywords and interacting with various system interfaces. Large language models will overcome their hallucination issues and will seamlessly integrate with proprietary knowledge bases, policy models, and guardrails.
Prediction 4: Repetitive Work Will Disappear
Most repetitive conversations will be made self-serve or automated through Virtual Agents. More importantly, AI will learn from screen recordings to do repetitive work through learning by demonstration*.
Prediction 5: “Wait on Hold” Will Disappear
We’ll see a significant drop in Average Handle Time for voice conversations. Voice Virtual Agents will handle simple conversations. And AI will generate all call center notes and after-call summaries.
Prediction 6: Model Training Will Disappear
Large language models are going to accumulate enough expertise in contact centers that custom labeling and training will no longer be necessary. The model will read a business’s knowledge base, documents, KPI objectives … and know what to do to drive better conversational outcomes. Customers will be able to see AI’s value on day one.
Prediction 7: AI Will Talk to AI over the Phone
We will have more and more AI-to-AI conversations on the telephony system, which was originally designed for human conversations over a century ago. Over the next five years, with the proliferation of personal assistants on the consumer side (directly built into smartphones and dialers) and voice virtual agents on the business side, we will inevitably see an increasing number of phone conversations powered by AI on both ends: a personal assistant on one end and an enterprise virtual agent on the other.
(Bonus) Prediction 8: Generated by ChatGPT
On a more serious note, one non-AI-related prediction: over a longer time horizon, the medium through which people interact with businesses will continue to shift from traditional telephony to digital channels as consumer-facing messaging platforms aggregate more and more users and businesses. The telephone system is probably the greatest demonstration of the power of network effects. Since its invention over a century ago, telephony has become the de facto communication medium. While its capabilities have not evolved much for a long time, it remains a dominant channel for consumer-to-business conversations thanks to network effects and the resulting switching costs.
On the other hand, messaging platforms enable far richer capabilities for consumer-to-business communication. As demonstrated by WeChat in Asia, a messaging platform can support both async and sync communication, photo and video sharing (e.g. for troubleshooting), voice messages, precise GPS location sharing, and even sign-up forms or generic workflows via something like mini-programs.
However the next five years shape AI innovations and the contact center world, we look forward to being actively engaged, constantly iterating and improving for our customers. Share your thoughts on what is coming next in the world of AI with us on Twitter!
*See our early work at OpenAI & Stanford on learning to interact with web browsers: https://openai.com/research/universe