Q&A: How Discover Financial Services created an AI governance council

The financial services industry has gone through an upheaval over the past several years as “open banking,” where customers control their financial data, has replaced the traditional model. That change has forced the industry to accelerate the adoption of digital technology.

At the same time, customer data remains at the epicenter of the financial services industry, so the need to protect, store, and leverage it is gaining importance. Along with big data and advanced analytics, artificial intelligence is the new frontier in financial services’ quest to stay competitive while also protecting sensitive data.

AI can be used in financial services for demand and revenue forecasting, anomaly and error detection, decision support, cash collections, and a myriad of other use cases.

Financial services is also among the most regulated of all markets, so while it may have the resources to deploy the latest tech to create better products and services, as well as increase efficiencies, risk is always a concern.

Discover Financial Services has been slowly exploring AI to create efficiencies in its processes, such as summarizing customer service interactions and detecting fraud.

Raghu Kulkarni is senior vice president and chief data science officer for Discover Financial Services, and one of the first things he did before rolling out the first large language model (LLM) at the firm was to create an AI governance council to ensure repeatable processes and safeguards.

Kulkarni spoke to Computerworld about his approach to deploying AI and what guardrails his team established to ensure its safe but productive use.

What’s your role at Discover? “There are two or three parts to my role. The first part is to develop decisioning models. What do I mean by decisioning models? Like when people apply for a card or a loan, we develop underwriting models. We develop line management models. We develop models that can detect fraud or money laundering. So, behind the scenes are a lot of analytics that happens to depict each one of these areas.

“Our shop develops machine learning models, which can underwrite, line manage, detect fraud, collections — the entire [financial product] lifecycle of a customer. And beyond that, we need a platform on which to develop these models and a base on which to implement these models. So the engineering of how we back end the data science also falls under my purview.”

In what ways has Discover utilized AI to create efficiencies, improve customer service, etc.? “First, you want to ensure responsible AI. So, ensuring there are no biases and it’s accurate. My first job is to understand the math behind what I’m using. Within the sphere of AI and ML, we use static supervised learning — known data and known target. We have a very defined problem statement. With that, we use models to assist in acquisitions to get into house and detect fraud.

“When it comes to customer service, that’s where AI really picked up. Let’s say we want to look at summarizations of calls. We have genAI models for that. And the term genAI has been used very broadly. ChatGPT has really expanded the use, but genAI previously was neural networks for NLP [natural language processing]. I prefer very explainable NLPs and ground-level LLMs, which can understand and then help assist customer agents.

“No matter what we do, we always have a human in the loop. These models were meant to assist humans. We want to make sure we understand all the math, but we also want to make sure there’s a human on the output end to make sure the output is tangible.”

Do you have a prompt engineering job role at Discover today? “We don’t yet. With generative AI, though, that is going to become a field of importance. What we do have right now are data engineers, application developers, the folks who can convert the mathematical models into APIs. That’s what we have today.”

Do you see prompt engineer becoming a job title in the future? “I do see it becoming a job title, but not as soon as you may think. We don’t have ChatGPT. If you fine-tune the model enough, though, that’s where prompt engineering comes in. We’re maybe a few years away.”

What prompted Discover to create its AI governance council? “In this AI venue, especially with genAI, we realized it’s a team sport. Just a data scientist’s model isn’t going to cut it. We have seen both the good and bad side of genAI. So, we want to make sure our partners in cybersecurity are comfortable, and we want to make sure data privacy is taken care of. We want to make sure that infrastructure [is solid so] even if I have a lot of dreams, I want to be able to implement them and use them. And then we want to talk about model risk, compliance, legal.

“So, genAI was a logical combination of end-to-end partners that we need to work together in a cohesive way so we can go from ideation to realization with risk controls.”

Who is on the council — that is, what parts of the business have a say, and why? “Our head of cybersecurity, head of architecture, head of model risk, head of compliance risk, legal, and me, the chief data science officer. This is an end-to-end team sport.

“This has really helped us. We realized there’s protection, but there could be gaps also. So we want to make sure everyone understands what we’re doing and how it helps our offices.”

What kinds of standards did the council create? “We’re working on the policies and standards as they’re evolving. We already have some principles we go by. The first is, what is the use case? What are we trying to solve? Is it a call center optimization; is it sentiment analysis? Everything starts from problems we’re trying to solve. Then you see what kind of a solution might fit the problem. What kind of a model do I need?

“More often than not, you know that GPT-2 may actually do the job. You don’t need a [GPT] 4, you don’t need a Bard, or what have you.

“Then you have to look at the risk it imposes — the cybersecurity principles, the architectural principles, so what are the things we could architect together, model risk principles? We’re still using SR 11-7 — that’s our model risk document. This is a governing policy by the Federal Reserve Board for bank models. We adhere to it. It’s expansive enough to accommodate for it.

“So, for me, nothing really changed. I still go through the same rigor as any other banking model. We follow that. We talk to legal and they hand us some principles and compliance rules.”

In what way did regulatory proposals influence your standards? “One of them is the NIST framework. We have been able to work with one of the authors of the NIST framework as a colleague. We follow what is happening as of today. We also follow what the US Chamber of Commerce and others are talking about in terms of AI policy. At the end of the day, it’s about responsibility.

“You don’t want to go to extremes. You want to see what we’re doing. Risk tier the usage and then make sure you follow existing guidelines as well as evolving guidelines. And evolving guidelines are evolving. We’re still learning.”
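To make the idea of risk tiering concrete, here is a minimal Python sketch. It is only an illustration: the tier names, the `UseCase` fields, and the rules in `risk_tier` are hypothetical and are not Discover's actual policy or any regulator's scheme.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class UseCase:
    # All fields are hypothetical attributes chosen for illustration.
    name: str
    customer_facing: bool      # output reaches customers directly
    uses_personal_data: bool   # model input includes customer data
    human_in_loop: bool        # a person reviews output before it is used


def risk_tier(use_case: UseCase) -> RiskTier:
    """Assign a tier using simple, assumed rules (not a real policy)."""
    # Unreviewed output that reaches customers is treated as highest risk.
    if use_case.customer_facing and not use_case.human_in_loop:
        return RiskTier.HIGH
    # Personal data or customer exposure raises the tier even with review.
    if use_case.uses_personal_data or use_case.customer_facing:
        return RiskTier.MEDIUM
    return RiskTier.LOW
```

Under these assumed rules, an internal call-summarization use case that touches customer data but keeps a human reviewer would land in the middle tier, while an unreviewed customer-facing chatbot would land in the top tier.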

How do you incorporate evolving guidelines into your internal policies, especially when these guidelines are coming out at the local, state, and national level? “At Discover, we abide by existing guidelines, which are more stringent than evolving guidelines in most cases. That’s why I mentioned explainability, transparency, and bias. All those are basics for us.

“Then there are evolving risks. What do you do with hallucinations? Guess what, the new word may be hallucinations, but an old word was model error. And why do model errors occur? Because of a lack of data. So, if you have basics and foundational elements right, you can still [achieve the right policies].

“So, there are two parts of it. We look at evolving policy, but we also keep it very structured and simple. The council really helps, because we all understand the moving parts, and if there’s a place where it’s a little too risky, we actually go for a simpler model.”

What do you see as the current greatest threat from AI and why? “If I talk to my cybersecurity friends, it’s going to be utilizing this for dark uses like fraudulent activity or what have you. But there are also data privacy issues. As a data scientist, it’s more than a threat; it’s really the overhyping of it. It’s not really a threat, but you use the actual discussion like we’re having rather than saying, ‘AI is going to take over the world.’

“We’re just summarizing certain documents right now. It will evolve. It will do a lot more things. But when it comes to banking and banking regulations, we want to be simple, straightforward, and transparent.”

What do you see as the greatest future threat posed by AI as it approaches general AI status? “I still feel cybersecurity and the weaponization of AI. To be honest, I’m more of an optimist. For every threat there’s a solution. If there’s fraud activity, then we have a fraud model. So, there’s always good folks to counter that. That’s the way I feel about it.”

What advice do you have for other businesses facing security, privacy, and regulatory challenges from AI? “Work together. It’s not one department. This technology is going to impact in a good way, but we have to be responsible end to end. Second, I’d keep it simple. Abide by all the regulations. Learn through simulations. Learn through simpler models. Keep it transparent and explainable.”

In the future, do you see large language models shrinking so as to be more domain- or even job-specific? “I do see that happening. Even today, if you use these humongous billion, trillion-parameter trained models, you still have to fine-tune your data. It’s like a segmented model. What if you had a model trained on our data for all those actual business purposes?

“What’s stopping institutions from developing these domain-specific trained models versus the LLMs is computing power. The whole conversation about generative AI today is because computing power has caught up. The big companies are able to afford the computing power. As computing power costs reduce, every domain-specific problem can have its own LLM, which is more suited to their own use. That’s why I go back to my original point: figure out what the problem is that you’re trying to solve.”

So, if not GPT-4, what LLMs are you using? “Right now we’re looking at NLP and GPT-2. We are very cautious and very careful when it comes to these super humongous large language models. Let’s keep it simple. Let’s see the usage. We have deployed smaller LLMs. Nothing based on the ones that make the news.”

Did you create these LLMs in-house? Are they based on open-source models? “These are open source. These are explainable enough and manageable enough to see the risks and the benefits.”

How large are your LLMs? “I’m not sure, but it’s not in billions and trillions [of parameters] — maybe millions? Our purpose right now is to have them read documents and summarize them with a human in the loop. How much do you need? [is the question].”

How would you connect LLMs to back-end systems, databases, documents? “This is part of a future road map. Let’s say I’m a free thinker and don’t have any restrictions, which I’m not saying I don’t. Eventually you need a vector database; they need a lot of prompt engineering and they need to be fine-tuned. With that architecture and current regulation, we are still trying to bridge it. Right now they’re batch models where we run them to reduce the total time and then bring a human into the loop.”
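The batch-then-review pattern Kulkarni describes can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Discover's pipeline: `placeholder_summarize` is a hypothetical stand-in for a small summarization model (it simply keeps the first words of the text), and the `Draft` type and `reviewer` callback are invented here to represent the human sign-off step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Draft:
    doc_id: str
    summary: str
    approved: bool = False  # set only by the human review step


def placeholder_summarize(text: str, max_words: int = 20) -> str:
    # Hypothetical stand-in for a model call: keep the first max_words words.
    return " ".join(text.split()[:max_words])


def run_batch(docs: Dict[str, str], summarize: Callable[[str], str]) -> List[Draft]:
    # Batch step: produce a draft summary for every document, none approved yet.
    return [Draft(doc_id, summarize(text)) for doc_id, text in docs.items()]


def review(drafts: List[Draft], reviewer: Callable[[Draft], bool]) -> List[Draft]:
    # Human-in-the-loop step: only drafts the reviewer accepts are released.
    for draft in drafts:
        draft.approved = reviewer(draft)
    return [draft for draft in drafts if draft.approved]
```

In practice the `reviewer` callback would be a person working through a queue; here it is a function so the flow can be exercised end to end, for example `review(run_batch(docs, placeholder_summarize), lambda d: bool(d.summary))`.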
