Why Your Bank Deserves Its Own Fine-tuned Banking LLM
General-purpose LLMs
Let’s go back to when we were starting high school. We had acquired the basic skills of writing, speaking and reasoning, along with a grounding in science, history, math and more. Having built a strong foundation after years of training and exposure to a range of subjects, we could hold intelligent discussions about these topics. Now it was time to immerse ourselves in our chosen fields and gain the depth required to become an accountant, a data scientist, a civil engineer or a literary expert.
This meant more training and learning, through instructors, self study and real world interactions.
It’s the same with foundational LLMs which, as the name suggests, have a rock-solid foundation: they are trained on datasets of up to a petabyte and can have as many as a trillion parameters. If you’re here, you’ve probably tried ChatGPT, Bard, Microsoft Copilot, DALL-E or one of the other Generative AI offerings. These feel surreal compared to where we were as recently as 2022.
But how do you make these general purpose LLMs work for your business?
The answer is fine-tuning
The advantage of LLMs is that they are pre-trained and can perform a range of tasks, from content generation, summarization, classification and recommendations to translation, which is a game changer. To make these models work for specialist tasks in functions such as marketing or sales, or in industries such as finance, retail or travel, they need fine-tuning.
What is fine-tuning, you ask? It’s about getting the LLM ready for Vertical AI by giving it domain knowledge: in this case, banking-specific knowledge, vocabulary and context. It’s essentially retraining a foundational LLM (one that no doubt does a fantastic job of creating text, answering questions, summarizing and coding) on new data. This isn’t easy and requires a large dataset of the domain in question.
Let’s understand this with an example. Imagine a deep learning model that has been designed to identify domestic dogs in cities and suburbs. It has learned the relevant visual features and performs with near-perfect accuracy on images of dogs in similar settings. What if you now want a model that can identify wild dogs and dholes in their natural habitat: forests, mountains and deserts? The model’s performance will dip, or it may misclassify these animals as domestic dogs, given the common visual characteristics.
One fix would be to train a new model from scratch on a fresh dataset of images of wild dogs and dholes. This can be prohibitively expensive and effort intensive, requiring a huge amount of data. Instead, what if you could delta Δ train (in other words, incrementally train) the same image detection model and optimize it for the new scenario? Thanks once again to the many common visual characteristics of domestic dogs and wild dogs, the dataset required would be small. The fine-tuning would optimize the same model and enable repurposing it for an additional application.
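To make the idea of incremental training concrete, here is a toy sketch in plain Python. It is not an image model: it assumes a one-variable linear model trained with gradient descent, and every name and number in it is made up for illustration. It only shows the general pattern, where a model that starts from already-learned weights reaches a lower error on a small, related dataset than a model that starts from zero with the same training budget.

```python
def train(w, b, data, steps, lr=0.05):
    """Plain gradient descent on mean squared error for y = w*x + b."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(w, b, data):
    """Mean squared error of the model on a dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


# "Pre-training": a larger dataset for the original task (y = 2x).
base_data = [(k / 10, 2 * k / 10) for k in range(-20, 21)]
base_w, base_b = train(0.0, 0.0, base_data, steps=500)

# New, related task with only three examples (y = 2x + 0.5).
new_data = [(-1.0, -1.5), (0.0, 0.5), (1.0, 2.5)]

# Same small training budget, two different starting points.
ft_w, ft_b = train(base_w, base_b, new_data, steps=20)   # fine-tuned
scratch_w, scratch_b = train(0.0, 0.0, new_data, steps=20)  # from scratch

finetuned_error = mse(ft_w, ft_b, new_data)
scratch_error = mse(scratch_w, scratch_b, new_data)
print(f"fine-tuned error: {finetuned_error:.4f}")
print(f"from-scratch error: {scratch_error:.4f}")
```

The fine-tuned run only has to learn the small difference between the two tasks, which is why it gets closer with the same number of steps.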
Unsupervised or supervised training
Now let’s turn to Generative AI and LLMs, where the model can create compelling content across a range of generic topics (like when you were in high school), but needs specialized training to answer banking-related questions such as ‘What are the foreclosure rules?’ or ‘What are the charges on credit card payment default?’.
Theoretically, this can be done in two ways: unsupervised or supervised learning. The former uses an unlabeled, unstructured dataset and is primarily reserved for foundational models, requiring petabytes of data and millions of dollars.
Fortunately, fine-tuning a foundational model can be done through the supervised approach. The idea here is to update the knowledge of an LLM (such as Llama 2, OpenAI’s GPT models or MPT) for the new domain. The language dataset used for fine-tuning could include policy documents, terms and conditions, past customer interactions and enquiries, the bank’s website and other articles. The training data is duly labeled and ranked, so the Banking LLM, purpose-built for conversations, can steer the exchange in the right direction and stay on topic.
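As a rough illustration of what labeled and ranked training data might look like, the sketch below stores each example as a prompt and response pair and writes the set out as JSON Lines, a common format for fine-tuning datasets. The field names (prompt, response, source, rank) and the placeholder answers are assumptions made for this example, not a description of any particular training pipeline.

```python
import json

# Hypothetical labeled examples. "rank" is assumed to be a reviewer-assigned
# quality rank (1 = best); the responses are placeholders, not real policy text.
examples = [
    {
        "prompt": "What are the foreclosure rules?",
        "response": "<answer grounded in the bank's policy documents>",
        "source": "policy_document",
        "rank": 1,
    },
    {
        "prompt": "What are the charges on credit card payment default?",
        "response": "<answer grounded in the card terms and conditions>",
        "source": "terms_and_conditions",
        "rank": 1,
    },
]


def to_jsonl(records):
    """Serialize records to JSON Lines, rejecting any without a prompt/response."""
    required = {"prompt", "response"}
    lines = []
    for record in records:
        missing = required - record.keys()
        if missing:
            raise ValueError(f"record is missing fields: {sorted(missing)}")
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


jsonl_text = to_jsonl(examples)
print(jsonl_text)
```

Keeping the source of each example alongside the text makes it easier to trace an answer back to the document it came from.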
ACE Banking: Gupshup’s fine-tuned Banking LLM
Model fine-tuning is an iterative process that mainly involves data preparation and processing, model training, evaluation against a benchmark, and repeating these steps till we have the best model.
Fine-tuning for Banking starts with collecting and processing training data. We draw on our deep domain expertise, proprietary data sources, human-created and synthesized examples and relevant public datasets to make this a high-quality dataset. This also requires sanitizing the data to ensure there’s no sensitive or private data in the mix.
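The sanitizing step can be pictured with a minimal sketch like the one below, which masks email addresses and long digit sequences (such as account or card numbers) using regular expressions. The patterns and placeholder tokens are assumptions chosen for illustration; real-world removal of private data needs much broader coverage (names, addresses, national IDs) and human review.

```python
import re

# Order matters: mask emails first, then long runs of digits.
# Both patterns are illustrative and deliberately simple.
PATTERNS = [
    ("[EMAIL]", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    # 8 to 19 digits, optionally separated by single spaces or hyphens.
    ("[NUMBER]", re.compile(r"\b\d(?:[ -]?\d){7,18}\b")),
]


def redact(text):
    """Replace each match of the patterns above with its placeholder."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


sample = "My account 1234 5678 9012 is linked to jane.doe@example.com"
cleaned = redact(sample)
print(cleaned)
```

Short numbers such as "3 working days" are left untouched, since only sequences of eight or more digits are masked.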
In addition to the training dataset, a benchmarking dataset to test and score the model is also created. This serves as a yardstick used to measure the improvements in the model through the various fine-tuning iterations. In this case, the benchmarking dataset also needs to be specific to Banking, ranging from use cases such as questions about banking products, processes, terms & conditions to banking account related enquiries and transactions.
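One simple way to picture a benchmarking dataset is as a list of questions, each with some expectation to check the answer against. The sketch below assumes a keyword-based check and a stand-in model function that returns canned text; both are hypothetical and far cruder than a real evaluation, but they show how a single score can be tracked across fine-tuning iterations.

```python
def score_model(answer_fn, benchmark):
    """Fraction of benchmark items whose answer contains every expected keyword."""
    passed = 0
    for item in benchmark:
        answer = answer_fn(item["question"]).lower()
        if all(keyword.lower() in answer for keyword in item["keywords"]):
            passed += 1
    return passed / len(benchmark)


# Hypothetical benchmark items; the questions and keywords are made up.
benchmark = [
    {"question": "How do I close a savings account?", "keywords": ["branch", "form"]},
    {"question": "What is a fixed deposit?", "keywords": ["deposit", "tenure"]},
]


def stub_model(question):
    """Stand-in for a fine-tuned model: returns canned, made-up answers."""
    canned = {
        "How do I close a savings account?": "Submit the closure form at your branch.",
    }
    return canned.get(question, "I am not sure.")


score = score_model(stub_model, benchmark)
print(f"benchmark score: {score:.2f}")
```

In practice the scoring rule would be richer (human ratings, semantic similarity or task-specific checks), but the yardstick idea is the same.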
The next step is to determine the appropriate training parameters and set up distributed model training. A fine-tuning run could take hours or days to finish, depending on the dataset size and the hyper-parameters set for the training. The fine-tuning is done on a base open source LLM that already scores high on the community leaderboard for general language tasks. At the time of writing this blog, the Llama 2 and MPT models rank highly, and these are the ones we use.
Once the model is fine-tuned, it is validated against the benchmarking dataset we prepared earlier. Based on the validation results, we may go back to the first step, add, remove or update the training data, and repeat the steps.
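The train, validate and revise cycle can be sketched as a small loop. Everything here is a placeholder: the three functions passed in are stubs standing in for real training, benchmarking and data revision, and the target score and round limit are arbitrary assumptions. The point is only the control flow, which keeps the best-scoring model seen so far and stops once the target is met.

```python
def iterate_fine_tuning(train_fn, evaluate_fn, revise_fn, dataset,
                        max_rounds=5, target=0.9):
    """Repeat train -> evaluate -> revise data, keeping the best model seen."""
    best_model, best_score = None, float("-inf")
    for _ in range(max_rounds):
        model = train_fn(dataset)
        score = evaluate_fn(model)
        if score > best_score:
            best_model, best_score = model, score
        if best_score >= target:
            break
        dataset = revise_fn(dataset, model)
    return best_model, best_score


# Stubs for illustration only: the "model" just remembers how many examples
# it saw, and its benchmark score grows with the size of the dataset.
def train_stub(dataset):
    return {"examples_seen": len(dataset)}


def evaluate_stub(model):
    return min(1.0, model["examples_seen"] / 4)


def revise_stub(dataset, model):
    return dataset + [f"extra example {len(dataset) + 1}"]


best_model, best_score = iterate_fine_tuning(
    train_stub, evaluate_stub, revise_stub,
    dataset=["example 1", "example 2"], max_rounds=5, target=1.0,
)
print(best_model, best_score)
```

A real loop would also log each round’s score, so that improvements can be traced back to specific changes in the data.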
This process continuously improves the ACE Banking LLM through better open source models, optimized datasets, and an iterative fine-tuning process.
Banks looking to streamline their operations and improve the customer experience can count on our Banking LLM to communicate with their customers via chatbots and virtual assistants, and also serve as a co-pilot for their sales and field teams.
The final step is to ingest your bank’s enterprise knowledge base into the model, creating your very own Banking GPT. It’s now ready to handle just about any topic: products offered, the bank’s policies, foreclosure rates, service charges, account tiers, loan processes and more.
Banking GPT provides a competitive advantage to banks
Lower cost of operations: Generative AI can automate routine tasks and processes such as KYC and customer service, freeing up RMs and staff to focus on higher value tasks and relationship building
Customer insights: GPT can analyze customer interactions, identify patterns and help the bank make informed decisions about their products and services, proactively address common dissatisfaction points and better meet customer needs
Higher customer satisfaction: With Generative AI, banks can anticipate customer needs, retain context and provide resolution faster – doing away with the standard ‘We’ll get back to you in 3 working days’ response
Better response accuracy: Responses need to be rooted in facts and based on the bank’s literature and actual policies, which is why rule-based systems break as the underlying source information changes. LLMs, however, can stay up to speed with revised policies, interest rates, terms and conditions and ensure customers receive the right information.
Now let’s look at an example. One of India’s premier financial institutions, operating across consumer and corporate finance, has improved customer access to information via a Generative AI powered chatbot that can answer customer queries in Hindi, English and, interestingly, Hinglish. This chatbot, built on ACE Banking GPT, provides customers with high quality content, personalized responses and on-demand support. Prior to this, the institution had launched a voice-based solution to help customers understand their loan eligibility, application process and more.
No wonder LLMs are all the rage and are finding newer applications – banks can provide a personalized service to their customers, improve response times and optimize their cost to serve. We can only expect more use cases to emerge as the technology evolves. It’s time to ACE Banking with Gupshup and the power of LLMs.