Knowledge Engineering: Intuit's Chief Innovation Officer Explains Their Approach to AI


We are entering a new era in artificial intelligence, one where exciting breakthroughs seem to arrive every week. But the roots of today’s AI revolution actually date back many decades. So what changed? The catalyst for the recent advances is the fact that these techniques can now leverage unprecedented amounts of data and computing horsepower. The result has been a gold rush to build smarter machine learning algorithms.

Despite all the buzz around artificial intelligence, however, one tried-and-true area of AI has been largely untapped: knowledge engineering. Instead of throwing huge amounts of data at a system and letting it learn through extensive trial and error, knowledge engineering uses hand-crafted rules to build a dynamic decision system tailored to the needs of the situation.

How can this technique be applied in today’s world? We sat down with Bharath Kadaba, Chief Innovation Officer at Intuit, to discuss how his company has been working to utilize this unusual approach. He explained how Intuit has found this classical AI method to be key to disrupting the highly complex financial services industry.

Q: First off, tell us a little bit about yourself.

A: Well, I was lucky enough to begin using a computer and writing code as a teenager at the Indian Institute of Science in Bangalore, India. That led me to move to the US, where I earned a Ph.D. in Information Theory and Networks from the University of Hawaii.

My first job, after completing my studies, was at IBM TJ Watson Research Center, where I worked for 15 years on distributed systems and networks. In the late 1980s, I had the good fortune to work with the National Science Foundation to contribute to the early development of the backbone of the Internet. During the dot-com days, I led technology teams at several startups. Before joining Intuit, I was responsible for Media Engineering at Yahoo, where we built a shared services platform for all media properties (news, finance, sports, games, etc.).

In 2008, I joined the team at Intuit to lead global product development. Three years ago, I assumed the role of Chief Innovation Officer with a focus on exploring and developing cutting-edge technologies to solve challenging consumer and small business financial problems.

Q: Over the last ten years, machine learning techniques using large datasets have become the dominant trend in AI. Increasingly that means we feed large amounts of data through algorithms from logistic regression to neural networks, tweak hyperparameters, and let the systems learn on their own. You still put a lot of emphasis on knowledge engineering, an approach based on rules devised by human beings. What draws you to this older approach?

A: It’s interesting that you use the word “older” to describe knowledge engineering. The term I would use is “not in vogue.” Neural networks date back to the 1940s. In that sense, today’s practitioners of deep learning are building on “older” concepts, just like we are with knowledge engineering. What we are doing in knowledge engineering today is drastically different from the classic production rule systems of the 1980s, though the two have many connections.

We place an emphasis on knowledge engineering because of the nature of the customer problems we solve at Intuit: personal and business taxes, payroll, and financial compliance in general. These problems are often mission critical, requiring precise and logically interconnected results with a clear and convincing explanation of why the system provided its response. These challenges are often difficult if not fatal for pure data-driven approaches where prediction errors and uncertainty are inherent.

In the field of financial services, the margin for error is slim to none and the need for explainability is high. For example, in a U.S. income tax return, being off by $1 could result in a noncompliant return, and a customer likely wouldn’t feel comfortable hiring an accountant who cannot explain why the customer owes a tax payment this year. Moreover, compliance requirements change frequently. According to a recent Thomson Reuters survey, a new compliance alert is issued every 7 minutes around the world. Frequent changes in compliance laws mean there is little to no data that can be used to “learn” these changes ahead of time.

All of that said, I would like to note that there is a complementary set of problems at Intuit which are an excellent fit for modern, data-driven machine learning, and we have a large group of data scientists and machine learning engineers working on important customer problems in this technology spectrum.

But the true differentiator for AI at Intuit is the work we are doing to marry these two approaches: classical knowledge-driven AI and data-driven machine learning. Combining the power of rules-like knowledge with the statistical insights derived from a large amount of data is the secret sauce that gives us the best of both worlds. We have been working in this direction since 2010, starting with our core tax preparation software, and millions of customers have benefited from this work.

Q: Rich Sutton, a Canadian professor of computer science, recently wrote a popular blog post called The Bitter Lesson. It argues that time and again researchers have returned to knowledge engineering because we want to find systems that “think” the way we do. Despite those efforts, techniques that learn from data itself have prevailed when it comes to setting major milestones. How would you respond to this argument? Where do you see knowledge engineering prevailing?

A: We completely agree with Prof. Sutton’s point of view that manually powered knowledge engineering systems are often not the answer. However, note that Prof. Sutton is arguing for systems which can learn the rules over abstractions from data by themselves, and then operate via these rules. In particular, if you look at the last paragraph of the blog post, you realize that he’s talking about machines learning the meta-methods. In the case of financial compliance, those meta-methods are rules.

At Intuit we believe that, though relying on human experts to manually curate financial rules may be a reasonable starting point for certain applications or domains, that approach is simply not scalable. These rules should be learned using natural language processing (NLP) and machine learning techniques. This is not an easy task by any means, but this is the very problem we have been working on for many years with amazing success. For example, we now have a system that can automatically convert a high percentage of tax form instructions into knowledge engine-driven compliance software to power tax and other compliance applications. With this framework, we are now better able to scale compliance solutions globally, an impossible feat if it were driven purely through manual knowledge engineering in isolation.

This approach of developing systems that learn abstractions and rules is the path to building much more intelligent products and user experience for our businesses. Pure encoding of human knowledge will not scale. But pure data-driven methods, such as deep learning solutions, will ultimately hit a wall once the applications exceed a certain level of complexity beyond perception-driven tasks, such as image and speech recognition, or for playing games like chess or Go, where millions of simulations are possible. To create the intelligence of a five-year-old child, or even primates for that matter, we will need to marry these two approaches. The fact that we at Intuit have been heading in this direction for a long time gives us tremendous confidence that, in the years to come, we can unleash the full potential of AI, removing drudgery and ultimately powering prosperity for our customers.

Q: How do you put AI to work when it comes to financial products? How can these techniques be used to help customers save money or earn a better return?

A: One of the biggest benefits to our customers is that AI can help hide massive compliance complexities, allowing them to file their taxes quickly and get back to what they are passionate about. Thanks to the knowledge engine, we can generate personalized user experiences fully tailored to each customer’s situation by asking only for the minimal set of missing information, without sacrificing accuracy or violating compliance requirements. This ability saves hundreds of millions of hours in tax return preparation for the country.
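To make the idea of asking only for the missing information concrete, here is a minimal sketch. It is not Intuit's implementation: the dependency table, the field names, and the `missing_inputs` helper are all hypothetical. It assumes each computed field declares which fields it depends on, and walks that graph to find the raw inputs the user has not yet supplied.

```python
# Hypothetical toy dependency graph: each derived field lists the fields it needs.
# Field names are invented for illustration; they are not Intuit's schema.
DEPENDS_ON = {
    "tax_owed": ["taxable_income", "filing_status"],
    "taxable_income": ["wages", "deduction"],
    "deduction": ["filing_status"],
}


def missing_inputs(target, known):
    """Return the raw inputs still needed to compute `target`, in visit order."""
    needed, seen = [], set()

    def visit(field):
        if field in seen or field in known:
            return  # already handled, or the user already answered it
        seen.add(field)
        if field in DEPENDS_ON:
            # Derived field: recurse into the fields it is computed from.
            for dep in DEPENDS_ON[field]:
                visit(dep)
        else:
            # Leaf field: only the user can supply it, so it becomes a question.
            needed.append(field)

    visit(target)
    return needed


# Wages are already known, so only the filing status has to be asked.
print(missing_inputs("tax_owed", known={"wages": 52000}))
```

Because each field is visited at most once, a fact shared by several rules (here, the filing status) produces a single question rather than one per rule.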

Furthermore, with personalized explanations to questions such as “why didn’t I qualify for earned income credits?” and “why is filing as head of household better for me than filing as single?” we can answer the most confusing questions related to taxes right on the spot, which removes a tremendous amount of anxiety and doubt. Such explainability has proven to be critical in driving confidence in our products. Finally, by codifying the tax compliance requirements into machine computable knowledge representation, we can potentially find misses and errors made by customers so that they can get every penny they deserve and avoid penalties.
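One way such explanations can fall out of a rules-based design is sketched below. This is an illustrative assumption, not a description of Intuit's engine: the rule names and dollar thresholds are made up and are not real tax law. The point is only that when each condition carries a human-readable name, the engine can report exactly which conditions failed.

```python
# Hypothetical eligibility rules with made-up thresholds; NOT real tax law.
# Each rule pairs a readable description with a predicate over the user's facts.
RULES = [
    ("earned income is greater than zero",
     lambda f: f["earned_income"] > 0),
    ("earned income is under the illustrative limit of 50000",
     lambda f: f["earned_income"] < 50000),
    ("investment income is under the illustrative limit of 10000",
     lambda f: f["investment_income"] < 10000),
]


def evaluate_credit(facts):
    """Return (qualifies, failed) where `failed` names every unmet condition."""
    failed = [name for name, test in RULES if not test(facts)]
    return (not failed, failed)


qualifies, failed = evaluate_credit(
    {"earned_income": 42000, "investment_income": 12500}
)
if not qualifies:
    for reason in failed:
        print(f"Not met: {reason}")
```

A learned classifier could return the same yes or no answer, but it would not come with a list of named conditions that can be shown to the customer.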

AI also helps a great deal in our development phase by helping us be more productive and data-driven. Using knowledge engineering, for example, we can speed up the development process of getting our software to market each year with the latest tax codes while ensuring maximum accuracy. AI allows us to convert 80,000 pages of U.S. tax code into an easy-to-use product that simplifies taxes for our customers. Combining knowledge engineering and large-scale machine learning, we can be efficient in our testing and verification by creating more compact test sets that cover all potential permutations of tax situations.
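The interview does not say how the compact test sets are built. As one plausible sketch, assume each candidate test case is tagged with the rule branches it exercises (the case and branch names below are hypothetical); a greedy set cover then picks a small subset that still exercises every branch. Greedy selection is an approximation and does not guarantee the smallest possible set.

```python
def compact_test_set(candidates):
    """Greedy set cover.

    `candidates` maps a test-case name to the set of rule branches it exercises.
    Returns a list of case names that together cover every branch.
    """
    uncovered = set().union(*candidates.values())
    chosen = []
    while uncovered:
        # Pick the case covering the most still-uncovered branches;
        # sorting the names first makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda name: len(candidates[name] & uncovered))
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen


# Hypothetical cases and branch labels, for illustration only.
cases = {
    "single_wages_only": {"wages", "status_single"},
    "joint_wages_only": {"wages", "status_joint"},
    "single_with_dividends": {"wages", "status_single", "dividends"},
    "joint_with_dividends": {"wages", "status_joint", "dividends"},
}
print(compact_test_set(cases))
```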

Q: How does AI, whether rules-based or machine-learning-based, interact with the regulatory side of the business? How do you build a system that is flexible enough to adapt to new laws and tax codes over time?

A: The answer lies in the architecture of our AI. We designed our compliance systems through a well-partitioned architecture which cleanly separates user experiences from domain business logic encoded within the knowledge engine. At the core of our knowledge engine is the knowledge graph that codifies all the tax rules and the interconnections amongst them. This partitioning ensures that any regulations can be reconciled purely with the knowledge graph, which serves as a single source of truth.
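The partitioning described here can be sketched in a few lines. Everything below is an assumption made for illustration: a small rule table stands in for the knowledge graph, the figures are invented rather than real tax parameters, and `RuleSet`, `tax_owed`, and `render_summary` are hypothetical names. What it shows is that a change in regulation becomes a change to the rule data alone, while the user-facing code stays untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleSet:
    """Domain knowledge for one tax year (all values invented for illustration)."""
    year: int
    deduction: dict  # filing status -> deduction amount
    rate: float      # single flat rate, a deliberate simplification


# Stand-in for the knowledge graph: the single source of truth for the rules.
KNOWLEDGE = {
    2023: RuleSet(2023, {"single": 10000, "joint": 20000}, 0.10),
    2024: RuleSet(2024, {"single": 11000, "joint": 22000}, 0.10),
}


def tax_owed(year, status, income):
    """Domain logic: reads every rule from KNOWLEDGE, hard-codes none."""
    rules = KNOWLEDGE[year]
    taxable = max(0, income - rules.deduction[status])
    return round(taxable * rules.rate, 2)


def render_summary(year, status, income):
    """User-experience layer: formats results and knows nothing about the rules."""
    return f"{year}: estimated tax ${tax_owed(year, status, income):,.2f}"


print(render_summary(2023, "single", 50000))
print(render_summary(2024, "single", 50000))
```

Supporting a new year in this sketch means adding one entry to `KNOWLEDGE`; neither function needs to be edited.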

As I mentioned above, as our technology advances, an increasingly high percentage of the knowledge graph is automatically curated using NLP and machine learning. To ensure the correctness of the machine-generated parts, we built an integrated knowledge engineering pipeline that allows human experts to review and correct any anomalies that the AI flags, either because of a low confidence score or because they fail our stringent tests.
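A human-in-the-loop gate of this kind might look like the following sketch. The threshold value, the `ExtractedRule` record, and the `triage` function are hypothetical and are not taken from Intuit's pipeline; the sketch only encodes the two triggers mentioned above, a low confidence score or a failed test.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.9  # assumed cutoff; a real system would tune this value


@dataclass
class ExtractedRule:
    """A machine-generated rule awaiting acceptance (hypothetical record)."""
    rule_id: str
    confidence: float   # confidence score reported by the extraction model
    passed_tests: bool  # outcome of the automated verification suite


def triage(rules, threshold=REVIEW_THRESHOLD):
    """Split rules into (auto_accepted, needs_human_review) lists of rule ids."""
    accepted, review = [], []
    for rule in rules:
        if rule.confidence >= threshold and rule.passed_tests:
            accepted.append(rule.rule_id)
        else:
            # Either trigger alone is enough to route the rule to an expert.
            review.append(rule.rule_id)
    return accepted, review


batch = [
    ExtractedRule("rule_a", 0.98, True),
    ExtractedRule("rule_b", 0.72, True),   # low confidence
    ExtractedRule("rule_c", 0.95, False),  # failed tests
]
print(triage(batch))
```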

We’ve been working for a long time on applying AI to taxes and compliance. We started building our knowledge engine in 2010 and we’ve been using it in production since tax year 2014. Through machine learning, we’re able to learn useful patterns and changes in taxes from the millions of users that file with us each year, helping us optimize returns. Moreover, we continuously enhance our knowledge engine and other AI solutions so that their performance keeps improving over time.
