In this episode of Tax Notes Talk, Professors Joshua Blank and Leigh Osofsky, authors of Automated Agencies: The Transformation of Government Guidance, discuss the IRS and other federal agencies’ use of artificial intelligence and chatbots in providing legal guidance to the public.
Tax Notes Talk is a podcast produced by Tax Notes. This transcript has been edited for clarity.
David D. Stewart: Welcome to the podcast. I’m David Stewart, editor in chief of Tax Notes Today International. This week: I, taxbot.
As automation and artificial intelligence become ubiquitous, U.S. government agencies like the IRS have been incorporating automated tools into their interactions with the public. While these tools do make it easier to share information, how accurate is the automated guidance? Does it always reflect agency positions, and what does that mean for taxpayers seeking the advice? Do the benefits of these tools outweigh the drawbacks?
Two law professors recently explored these questions and more in their book, Automated Agencies: The Transformation of Government Guidance. Joining me now are the authors, Joshua Blank and Leigh Osofsky. Josh, Leigh, welcome to the podcast.
Joshua Blank: Thank you. It’s a pleasure to be here.
Leigh Osofsky: Thank you so much for having us here.
David D. Stewart: So why don’t we start off with a basic question: What sort of automated tools were you looking into, and what sorts are being used at the IRS and other agencies?
Leigh Osofsky: For the purposes of our research, we conducted a review of automated tools across the federal government, and we specifically focused on chatbots and other automated tools that provide guidance to the public about the law.
Some of the major tools that we examined included, at the IRS, the Interactive Tax Assistant. Also, at other agencies, tools like “Emma,” which provides guidance about immigration and citizenship questions that you might have, and also at Federal Student Aid, “Aidan,” which provides answers to questions you might have about your student aid and the law surrounding that.
David D. Stewart: What led you down this path of research? What was the initial inspiration?
Leigh Osofsky: Yeah, for over a decade we’ve been researching the ways that the IRS communicates complex law to the public. The issue is that tax law is incredibly complex and often ambiguous, and yet it has to be applied by hundreds of millions of individuals on an annual basis, many of whom simply don’t have the time, or sometimes even the capacity, to understand how the complex tax law applies in their situations.
At the same time, the IRS is obligated to help taxpayers comply with their tax obligations. Indeed, this is part of the IRS’s mission statement. This puts the IRS in a difficult position: How can the IRS explain complex law that the public can’t really understand? To address this problem, the IRS is increasingly turning to automated tools like chatbots to help the public understand the law. We got really interested in chatbots and started investigating them, and one of the chatbots that we specifically focused on was the Interactive Tax Assistant.
If you go on the IRS’s website, the Interactive Tax Assistant will be able to answer a whole number of questions you might have about your tax obligations, such as, “Do I have to file a tax return?” Or, “When is the tax return due?” Or even, “Can I take this particular deduction?” We started doing some of our own work around the Interactive Tax Assistant.
We were then hired by the Administrative Conference of the United States, or ACUS, to examine chatbots across the federal government. Based on that research and other work we’ve done, we eventually published our book, Automated Agencies: The Transformation of Government Guidance. This book sets forth a comprehensive examination of the ways that federal agencies are using automated technology like chatbots to advise the public about the law.
David D. Stewart: Let’s start off with the good of what you found. What are the positives that you found in the government’s use of chatbots?
Joshua Blank: Well, first, automated tools like the Interactive Tax Assistant do a great job of providing fast answers. They are much quicker and more efficient than a human customer service representative, especially if the issue is really simple, like “When is my tax return due?”
Second, agency officials view these types of tools as helping the public navigate complex legal rules and procedures and also aiding third-party advisers like accountants and lawyers.
The last real benefit that we discovered in our research is that agency officials view these tools as a way to clearly and transparently state the agency’s views of legal issues to the public, and also to advisers.
David D. Stewart: It sounds like these can be helpful, but what are the downsides of them?
Joshua Blank: Well, one of the features of tools like the IRS’s Interactive Tax Assistant is that it illustrates what we describe as simplexity. Simplexity — in our research, we’ve described this as occurring when the law is complex and the government doesn’t simplify the underlying law; instead, it presents the law as though it’s simple. Our book shows how the government’s simple presentation of complex law can result in systematic deviations from the underlying law. In other words, the government can present the formal law as something other than what it actually is.
So just to take a couple of examples, sometimes the Interactive Tax Assistant will present the law in ways that are favorable to the taxpayer. Imagine that you are interested in getting an MBA — a Master of Business Administration — and you want to know whether you can deduct the cost of that degree. So you go to the Interactive Tax Assistant, and in a few clicks you reach a question where it asks, “Well, is this necessary to meet the minimum educational requirements of your trade or business?” And you think about it and say, “No, actually, I’m already in the business, and the MBA isn’t necessary to be in this business.” The Interactive Tax Assistant quickly tells you, “Well, in that case, the MBA costs are deductible.”
Now, that’s a clear, quick answer, but if we dig a little bit deeper, we’ll quickly see that the issue is a lot more complex. In fact, this particular issue has been subject to lots of litigation. The Interactive Tax Assistant may tell you the expense is deductible, but that was only because of the information you provided. Very simple inputs lead to clear outputs, but they aren’t necessarily consistent with the underlying law.
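To make those mechanics concrete, here is a minimal sketch of the kind of scripted decision tree a tool like the Interactive Tax Assistant appears to walk through. The questions and outcomes below are hypothetical simplifications of the MBA example above, not the IRS’s actual logic; the point is how a few yes/no branches collapse a heavily litigated question into one unqualified answer.

```python
# A minimal sketch of a scripted decision-tree guidance tool. The questions
# and outcomes are hypothetical illustrations, not the IRS's actual logic.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    text: str                     # a question to ask, or a final answer
    yes: Optional["Node"] = None  # branch taken if the user answers yes
    no: Optional["Node"] = None   # branch taken if the user answers no

    def is_leaf(self) -> bool:
        return self.yes is None and self.no is None

# A toy version of the MBA-deduction interview discussed above.
tree = Node(
    "Is the education needed to meet the minimum educational "
    "requirements of your trade or business?",
    yes=Node("Answer: The costs are not deductible."),
    no=Node("Answer: The costs are deductible."),
)

def run(node: Node) -> str:
    """Walk the tree one yes/no question at a time until a leaf is hit."""
    while not node.is_leaf():
        reply = input(node.text + " (y/n) ").strip().lower()
        node = node.yes if reply.startswith("y") else node.no
    # A single, unqualified answer: the nuance behind each branch is gone.
    return node.text

if __name__ == "__main__":
    print(run(tree))
```

Notice that nothing in such a tree surfaces the litigation or exceptions behind each branch; whatever nuance the drafters pruned away is invisible to the user.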
Sometimes a tool like the Interactive Tax Assistant can deliver advice that is unfavorable to the taxpayer. For example, imagine a chronically ill taxpayer who can’t take care of daily needs like bathing and cooking and hires a home health aide, and the taxpayer wants to know, “Well, are my expenses deductible?” The Interactive Tax Assistant very quickly will tell the taxpayer, “No, household help expenses are not deductible.”
But if we take a look at the governing law, we’ll quickly find that the relevant section of the Internal Revenue Code, section 213, has a provision involving qualified long-term care services, and the legislative history makes it clear that maintenance or personal care services can include things like meal preparation, household cleaning, and other similar services. So this is an example that shows the Interactive Tax Assistant may tell the taxpayer you can’t claim a deduction that the taxpayer may actually be entitled to. And we talk about many more examples of this in our book, Automated Agencies.
David D. Stewart: I know in other areas of AI and these sorts of automated systems there’s a danger of hallucinations, where the AI simply assumes a thing exists and wills it into being in its text, even though it’s not really true. Are we seeing that with these government chatbots?
Leigh Osofsky: Interestingly, at least at present, the federal government maintains careful control over the content that its chatbots provide to members of the public. So the chatbots are not currently making up their own content. Rather, what they’re doing is trying to answer questions posed to them with content that’s been provided to them in advance by the agency.
As a result, again, at least at present, there’s not the same risk of hallucination with these chatbots that we see in some of the technology that might be available through private sources, but we definitely have to be careful about expanding the technology beyond its current form. If we did see such expansions, then we would have to worry about hallucination and other potential problems.
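For contrast with a generative system, here is a minimal sketch of the retrieval-only design Osofsky describes: the bot can return only agency-curated text, so it cannot invent content, though it can still match a question to the wrong entry. The entries and the crude keyword-overlap matching rule below are hypothetical, not any agency’s actual implementation.

```python
# A minimal sketch of a retrieval-only chatbot: every response is verbatim
# agency-curated text, so the bot cannot invent content, though it can still
# match a question to the wrong entry. All entries here are hypothetical.

CURATED_ANSWERS = {
    "when is my tax return due": "For most filers, the return is due April 15.",
    "do i have to file a tax return": "Whether you must file depends on your "
                                      "income, filing status, and age.",
}
FALLBACK = "I don't have an approved answer for that question."

def respond(query: str) -> str:
    """Pick the pre-approved entry with the most words in common."""
    words = set(query.lower().rstrip("?.!").split())
    best = max(CURATED_ANSWERS, key=lambda k: len(words & set(k.split())))
    if not words & set(best.split()):
        return FALLBACK  # no overlap at all: refuse rather than guess
    return CURATED_ANSWERS[best]

print(respond("When is my tax return due?"))  # always pre-approved text
```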
David D. Stewart: What have you heard from people in the agencies about these automated systems? Are they in favor? Is it helping them do their work?
Leigh Osofsky: Yeah, one of the most interesting parts of our research was being able to talk to the folks who create these tools for the federal government. The folks who create them are definitely very optimistic about them and feel that they’re doing a good job. They’re answering lots and lots of queries; we’re talking about millions of queries per year per chatbot. The federal agencies think that this is a good way for them to provide more guidance to the public.
Indeed, the federal agencies are currently evaluating these chatbots principally through user surveys. So after you complete your interaction, some of these chatbots will ask you, “How did you like your experience? Did you feel like it was helpful?” And federal agencies are using those statistics to show that people feel they’re getting valuable information from these chatbots.
One of the problems with that, though, is that, of course, the people who are using these chatbots don’t understand the underlying law. That’s why they’re turning to the chatbots — to get a sense of what the legal answers are to their questions. And so when agencies are using user surveys to evaluate the quality of these chatbots, they’re missing some of the things that the chatbots might not be getting right in their interactions with the public.
David D. Stewart: Another question that this raises: You have an agency that is determining what the answers are going to be for individuals coming to it, so as it simplifies, that simplification is itself a sort of interpretation on behalf of the agency. In a world where the courts are getting more skeptical of agency interpretations, what does that mean for the answers these chatbots are spitting back?
Leigh Osofsky: The sea change that you’re referring to is Loper Bright, the Supreme Court case that overturned the Chevron doctrine. Interestingly, Loper Bright shouldn’t really affect the chatbots’ output, and the reason is because Loper Bright was a case about deference. The question was to what extent should courts defer to agency interpretations, and informal interpretations like those given by chatbots were never given a whole lot of deference by courts to begin with. And so Loper Bright doesn’t formally change the regime around the types of answers that chatbots give.
However, somewhat ironically, Loper Bright might actually cause agencies like the IRS to make interpretations to a greater extent through the use of chatbots. And the reason why is because if agencies are going to get less deference for the regulations that they’ve painstakingly issued, they might actually issue fewer regulations. In order to issue regulations, agencies have to go through a very laborious notice and comment process, which takes a lot of time and money. If those regulations aren’t going to get the same level of deference, then agencies might decide, “You know what? We’re just going to make what the rules are in the context of these more informal interpretations.”
And so after Loper Bright we might actually see greater use of informal guidance, including in the context of chatbots. So imagine, for instance, that Congress passes some new small business deduction and leaves a lot of questions about the deduction unanswered in the code. Treasury might decide that rather than using the laborious notice and comment process to issue regulations about that deduction, it’s just going to make a lot of these interpretations more informally, including by having chatbots tell us some of the answers to the questions.
David D. Stewart: What are the processes involved in keeping these chatbots up to date and adding new information into them?
Joshua Blank: This was one of the big questions we had as we talked to the IRS and other agencies in our study. It varies from agency to agency, and sometimes, even within a particular agency, people who are involved in part of the process of creating a tool like a chatbot might not know all of the different participants in the process.
And so one of the main recommendations that we have in terms of process is that agencies should adopt a clear chain of command regarding how they design, maintain, and review their automated tools. We also feel that they should publish that information, because sometimes there will be questions involving ambiguous, unsettled law, and it would be helpful for the public to know exactly how the government’s chatbot arrived at the advice that it gave. Who was involved? Were only people within the agency involved, or did outside experts evaluate the issue as well? Those are the types of things that we can’t see easily from the outside, but it certainly would be possible to adopt transparency measures that give the public the ability to see those processes.
David D. Stewart: And one of the issues that you raised in your book was the question of inequities: the advice being given to people without the means to seek out independent advice. So could you explain that issue and how acute it is for taxpayers?
Joshua Blank: Automated tools like the Interactive Tax Assistant can be very efficient and user friendly and often provide accurate information to the user, but there are certainly downsides. First, we found that automated tools can provide guidance to members of the public that can deviate from the formal law. Sometimes they portray the law as unambiguous, or they may add administrative gloss to the law, or they may omit discussion of statutory and regulatory exceptions.
Another issue is that automated legal guidance tools often provide little, if any, notice to the user about some of these drawbacks, including the failure of the tool to capture the nuances of the formal law and the inability of the user to rely on the tool in order to bind the government or defend against tax penalties, for example.
And finally, without reform, automated legal guidance tools like the Interactive Tax Assistant can have the counterintuitive effect of exacerbating inequities in terms of access to the law. If you think about who has access to the “law,” well, if you can hire a tax lawyer who can research statutes and regulations and cases, you will have much greater insight into the formal elements of the law. That tax lawyer might even write you an opinion, and you can use that written opinion to potentially defend against tax penalties using the reasonable cause and good-faith defense, for instance, if you end up being audited.
But that’s not the case for taxpayers who can’t afford legal counsel and who turn to tools like the Interactive Tax Assistant and other types of automated tools. In these cases, the tool is providing free advice. As a legal matter, the taxpayer can’t use what the tool says to activate certain defenses to tax penalties. For instance, there is a defense called the reasonable basis defense. In order to use that defense, the taxpayer has to rely on an authority that’s on a specific list provided in the Treasury regulations. The statements of chatbots like the ones we’re describing and the Interactive Tax Assistant do not fall into that list.
If the taxpayer wants to go back and say, “Hey, well, the tool told me these different things, and I relied on them,” the taxpayer really needs a record to do that. And when we started our research, we found that many of these tools did not provide any way to download a record. The IRS has been making some strides to adjust its tool and to make available some of these features that previously didn’t exist. Many agencies, however, do not do that. It’s very difficult to capture the discussion; you have to have the foresight to take screenshots of your conversation with a chatbot.
In general, a tool like the Interactive Tax Assistant is so effective and pervasive because it offers personalized advice. It uses words like “you” and “your” when it answers your question. The advice is often unqualified. In fact, the IRS will say “answer” at the top of the screen as the Interactive Tax Assistant responds to your question. And of course it’s effective because the advice is almost instantaneous and it’s free.
David D. Stewart: So what sort of steps should the government take to improve this? If we’re going to be relying on these chatbots going forward as a means of providing customer service, what can be done so that you minimize those inequities and improve the service?
Joshua Blank: We think there are many steps that the IRS, and actually all federal agencies, can take when employing chatbots and other automated tools. We don’t think there’s any question that these tools are going to be a very prevalent part of how the government communicates with the public. The private sector has been using tools like chatbots and other automated systems for years. The government has caught up, and we think these are only going to grow in the future.
In terms of steps that the government can take — first, transparency: Agencies should notify users when the formal law is contrary to what the automated tool is saying, or at least when the law is not settled. Sometimes there are major disagreements between circuits, for instance, and it’s possible to notify the user that there’s at least something to think more about because the law is unsettled. We also think that, as a transparency matter, agencies should create publicly accessible archives that show, and explain, changes to statements made by chatbots and other tools.
Reliance: Agencies should allow users to reasonably rely on statements provided by chatbots to defend against penalties for noncompliance like tax penalties. One important step that agencies should take to allow users to do that is to create ways for them to download records of their exchanges with the tools.
Disclaimers: Agencies should include disclaimers regarding the limits on the user’s ability to bind the agency and on the user’s ability to rely on the advice as a defense against penalties. Certainly, these tools should also include a disclaimer that they’re not human. Some of the agencies have very human-looking chatbots with names and faces, and agencies should certainly inform users that there are not actually people responding to their questions.
And finally, process: Agencies should adopt a clear chain of command regarding how they design, maintain, and review their automated legal guidance, and they should publish that information. In June 2022 the Administrative Conference of the United States adopted 20 recommendations based on our report, including many of these. We differ from ACUS in certain areas, but those recommendations were printed in the Federal Register and distributed to all federal agencies.
More broadly, I should also just add that chatbots and online tools like the Interactive Tax Assistant highlight what we would describe as a greater democracy deficit in administrative law. To address this deficit, we think, first, that the public should have a role in how these different automated tools are designed and in agency explanations of the law. Agencies like the FDA have done this, and others have as well. The IRS should include the public when creating informal guidance such as the guidance that the Interactive Tax Assistant offers.
We also feel that there are limited opportunities under current law for the public to raise any formal challenge to inconsistent explanations of the law, that is, where the agency’s explanation differs from the underlying law. There are approaches to this issue, too. Congress could, for example, task agency ombuds offices with reviewing agency explanations and issuing regular reports.
And finally, public reliance. Again, agencies should allow members of the public to reasonably rely on certain agency explanations. And we also feel that agencies should bind themselves to follow explanations where the agency uses language that is fixed and the same for all users.
David D. Stewart: In the last several months we’ve seen some major changes in tax administration in the U.S. Does that raise the importance of these chatbots, or do you have any concerns that maybe they won’t be maintained at the level that they need to be, given the reduced staffing at the IRS?
Leigh Osofsky: Yeah. We think that the chatbots are likely to only become more important as a result of recent changes at the federal government level. We see, certainly, a fair amount of downsizing in terms of federal government employees. At the same time, taxpayers and other users of various agency services across the federal government are going to continue to have questions about how they can comply with the law and how the law applies to them.
And so the more limited availability of human resources and the continued questions that folks are going to have are just going to increase the need for chatbots to fill some of the gap that we’re likely to see in available information for the public. So we’re likely to see increased use of chatbots, and indeed, we’ve seen various statements from the administration that there is likely to be an emphasis on the development of more automated technology going forward. So we think that the research that we’ve produced regarding chatbots is likely to become only more important.
We think that a lot of the lessons that we’ve learned in recent years will apply to this technology. So in the existing chatbots, we see some of the issues that we’ve talked about in terms of giving people answers that might seem simple and straightforward but aren’t necessarily accurate given their situation. We’ve seen a lack of process and procedure around their creation, we’ve seen a lack of interest in doing a deep substantive evaluation of the information that the chatbots are giving, and we’ve sometimes seen a lack of a clear line of authority within the agencies in terms of who’s producing and maintaining the chatbots.
And so this is definitely going to continue to be something that we need to keep our eye on, and it becomes only more important as the federal government turns to automation to a greater extent in the public’s interactions with the law.
Joshua Blank: The public also wants to access information differently than in the past. When I was a kid, I would go with my mom to the public library in April, and we would pick up tax forms and the IRS publications — these thick booklets where the government would explain the tax law to the public. Today, members of the public want to pick up their phone and take care of different tasks, whether it’s banking or booking airline tickets.
And if they have a question about a legal obligation or a government benefit they may be entitled to, they also expect to be able to just tap on their phone and receive an answer quickly. So the way that we absorb information has changed. Technology has had a major role to play in that, and the use of technology to communicate with the public is only going to increase in the future.
Leigh Osofsky: And one other thing that we discuss in our book, Automated Agencies, is how there’s likely at some point to be a turn from automated guidance to automated compliance. And indeed, we’ve already seen this turn begin to happen in various ways. So what we mean is that a lot of the chatbots that we examined provide guidance to the public in an automated fashion.
If you want to know, “Can I take this deduction on my tax return?” the Interactive Tax Assistant will answer that question for you, but you still have a role to play there. You have to ask the Interactive Tax Assistant a question. You have to input information. And based on that information, the Interactive Tax Assistant will give you an answer. You then take that answer and make a decision when you fill out your tax return.
Instead, we can imagine a system that takes this automation but employs it to a greater extent in the form of automated compliance. And what that would look like is you either input information, or maybe you don’t have to input information at all because the IRS already has a lot of information through various data sources, and then the IRS just fills out your tax return for you.
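To illustrate what that shift would mean, here is a minimal sketch of an automated-compliance flow. The data sources, amounts, and flat tax rate are entirely hypothetical; the point is that the draft return is assembled with no taxpayer input, so every interpretive choice, such as defaulting to a standard deduction, is made silently by the agency.

```python
# A minimal sketch of "automated compliance": the agency assembles a draft
# return from data it already holds, with no input from the taxpayer.
# Data sources, amounts, and the flat 10 percent rate are all hypothetical.

W2_WAGES = {"taxpayer-123": 52_000}    # employer-reported wages
BANK_INTEREST = {"taxpayer-123": 350}  # bank-reported interest
STANDARD_DEDUCTION = 14_600            # assumed single-filer amount

def prefill_return(taxpayer_id: str) -> dict:
    """Build a draft return entirely from third-party data."""
    income = W2_WAGES.get(taxpayer_id, 0) + BANK_INTEREST.get(taxpayer_id, 0)
    taxable = max(income - STANDARD_DEDUCTION, 0)
    return {
        "income": income,
        "deduction": STANDARD_DEDUCTION,  # chosen for you: itemizing never asked
        "tax": round(taxable * 0.10),     # hypothetical flat rate
    }

print(prefill_return("taxpayer-123"))  # the taxpayer never answered a question
```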
That’s obviously a more dramatic use of automation, but we would actually see some of the same issues that we examine in our book, Automated Agencies, carried over into this automated compliance. It would just be even less transparent to the public what’s happening. And what I mean by that is, as we show in our book, when the government gives you an automated answer to your question, it’s often simplifying the law and making decisions about how it thinks the law best applies to the facts, in ways that aren’t always clear or beyond dispute.
And so if the government turns to automated compliance to do more of the work for you with less of your own realization of what’s happening, the government is still going to be making a lot of those same decisions, but it will be even less transparent to you how those decisions are being made. And so we think not only is there likely to be greater use of automated guidance in the coming years along the lines that we talk about in our book, but there’s likely to be increasing pressure to turn in various ways to automated compliance. And we need to be very careful to think about some of the issues that we’ve identified as we make that turn.
David D. Stewart: All right, well, it’ll definitely be interesting to see how technology evolves in the area of tax. This has been a fascinating conversation. Leigh, Josh, thank you for being here.
Joshua Blank: Thank you so much.
Leigh Osofsky: Thank you so much for having us. It was great to speak with you.