Model training data already contains all the text there is[0], so models can already answer questions like this (especially with web search), but they aren't good at actually doing the tax calculations.
The problem is that the text of the US tax code isn't enough to know the correct action to take. The IRS has semi-formal policies based on how it has chosen to interpret the statutes, and there are gray areas it doesn't clearly specify. Some of this is covered in supplementary publications, but even those have subjective elements. One example: settlements for "serious injuries" are treated as non-taxable income, and what counts as serious is a squishy concept.
You can technically use the language model itself as the data model. That was the quick hack that started it all: autocomplete on a question produces the answer.
However, it's clear we're moving towards separating the data from the language model. Even base ChatGPT is given search tools and Python tools instead of producing those results as text, though the tool call itself is still generated by the model.
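Roughly what that separation looks like, as a toy sketch (nothing here is a real API: `model_generate` stands in for whatever LLM you call, and the JSON tool-call format and `run_python` helper are made up for illustration):

```python
import contextlib
import io
import json

def run_python(code: str) -> str:
    """Run a snippet and capture its stdout (only safe inside a sandbox)."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()

TOOLS = {"python": run_python}

def step(model_generate, prompt: str) -> str:
    out = model_generate(prompt)  # the model only ever produces text
    try:
        # e.g. '{"tool": "python", "args": "print(0.22 * 50000)"}'
        call = json.loads(out)
    except json.JSONDecodeError:
        return out  # plain prose answer, no tool call
    result = TOOLS[call["tool"]](call["args"])
    # Feed the tool's output back so the model can finish the answer.
    return model_generate(prompt + "\nTool result: " + result)
```

The point is the division of labor: the model decides *that* a calculation is needed and emits the call as text, while the harness, not the model, does the arithmetic.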
You can for sure use a pure LLM to ask questions about the tax code, but we'll probably see purpose-built tools that contain only the canonical statutes and kosher case law, and that cite their sources properly. Y'know, instead of hallucinating.
https://arxiv.org/abs/2507.16126v1
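A toy sketch of what such a tool might look like: retrieval is restricted to a vetted corpus, and every answer carries its citation. The statute text below is paraphrased, the word-overlap scoring is a placeholder for real retrieval (embeddings, BM25), and `model_generate` again stands in for any LLM API:

```python
from dataclasses import dataclass

@dataclass
class Passage:
    source: str  # e.g. "26 U.S.C. § 104(a)(2)"
    text: str

# The vetted corpus: statutes, IRS publications, real case law, nothing else.
CORPUS = [
    Passage("26 U.S.C. § 104(a)(2)",
            "gross income does not include damages received on account of "
            "personal physical injuries or physical sickness"),  # paraphrase
]

def retrieve(query: str, k: int = 3) -> list[Passage]:
    # Toy relevance: word overlap with the query.
    q = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda p: -len(q & set(p.text.lower().split())))
    return ranked[:k]

def answer(model_generate, question: str) -> str:
    context = "\n".join(f"[{p.source}] {p.text}" for p in retrieve(question))
    prompt = ("Answer using ONLY the sources below, and cite them.\n"
              f"{context}\n\nQ: {question}\nA:")
    return model_generate(prompt)
```

The model can't cite anything the retriever didn't hand it, which is the whole point versus asking a pure LLM cold.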
[0] but it's quite possible the conversion from HTML to text is bad