This post continues my look at the relationship between taxes and growth (what I modestly called the “Kimel curve”), which I will continue expanding on over a series of posts. Today I want to look at marginal rates, effective tax burdens, and how each or both affect growth rates. As an added bonus for non-economists and folks who don’t deal with statistics on a daily basis, I will also expand a bit on regression analysis and the process of building a rigorous “econometric model.” (Some basic material appeared at the Presimetrics and Angry Bear blogs).
To begin… it’s no secret that marginal tax rates don’t always have all that much to do with the amount that taxpayers actually pay. This is especially true for folks with extremely high incomes, particularly if a big chunk of their income doesn’t come with a W-2 attached. (Don’t take my word for it – I’ll give you a number later in the post.)
So, while the Kimel curve equation I provided last week dealt with the effect of the top individual federal marginal tax rate on economic growth, this week I want to throw the federal “tax burden” into the mix. The tax burden is simply the percentage of income that actually gets paid in taxes, which we can calculate as personal current federal taxes divided by personal income. The former comes from line 2 of NIPA Table 3.2. The latter comes from line 1 of NIPA Table 2.1.
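For anyone who would rather see the ratio as code than as a sentence, here is a minimal sketch. The function name and the dollar figures are made up for illustration; they are not values from the NIPA tables.

```python
def tax_burden(personal_current_taxes: float, personal_income: float) -> float:
    """Share of personal income that is paid in personal current federal taxes."""
    if personal_income <= 0:
        raise ValueError("personal income must be positive")
    return personal_current_taxes / personal_income


# Hypothetical figures (billions of dollars), NOT actual NIPA data.
burden = tax_burden(personal_current_taxes=1_100.0, personal_income=12_000.0)
print(f"tax burden: {burden:.1%}")
```

Both inputs need to be in the same units and cover the same year; the ratio itself is unit-free.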
(If you’re new to my posts, the NIPA tables, or National Income and Product Accounts tables, are computed by the Bureau of Economic Analysis of the Commerce Department, the agency responsible for computing GDP. They also keep track of a lot of other interesting data about the US economy.)
Real GDP comes from any number of tables on the BEA website – let’s just pull ‘em from here this time around. And of course, we need top individual marginal tax rates, which come from the IRS’ Statistics of Income Historical Table 23.
Before we go on, let’s just note that the correlation between the tax burden and the top marginal rate from 1929 to 2008 is 4.5%, which is to say, pretty close to zero.
Now, I’m setting up a simple model of growth as follows:
Growth in Real GDP, t to t+1 = B0 + B1*Top Marginal Tax Rate, t
+ B2*Top Marginal Tax Rate Squared, t
+ B3*Tax Burden, t
+ B4*Tax Burden Squared, t
As I mentioned in the previous post, I’m throwing in the X and X squared terms to account for the fact that the effect of variable X on growth rates can change as X rises or falls. For example, maybe when tax rates are low, increasing taxes has only a small effect on growth, but as tax rates rise, further increases in those tax rates can have a very big effect on growth. I’m also fitting the model using a regression. If none of this makes sense, or you don’t remember how to interpret a regression, please take a look here again.
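If you want to see what fitting this model involves mechanically, here is a rough sketch in plain Python. It is not the spreadsheet regression behind this post: the function names `design_row` and `fit_ols` are my own, the solver is a bare-bones normal-equations approach, and the data are invented purely so the code runs.

```python
def design_row(top_rate, burden):
    """Regressors for one year: intercept, rate, rate^2, burden, burden^2."""
    return [1.0, top_rate, top_rate ** 2, burden, burden ** 2]


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))]
         for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("regressors are perfectly collinear")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (a[i][k] - tail) / a[i][i]
    return b


# Invented observations, NOT the BEA/IRS series. Each growth figure is
# growth from year t to t+1, lined up with the tax variables in year t.
top_rates = [0.25, 0.63, 0.79, 0.91, 0.70, 0.50, 0.28, 0.396]
burdens = [0.02, 0.03, 0.08, 0.11, 0.10, 0.12, 0.09, 0.105]
next_year_growth = [0.01, 0.05, 0.08, 0.04, 0.03, 0.035, 0.03, 0.025]

rows = [design_row(r, t) for r, t in zip(top_rates, burdens)]
b0, b1, b2, b3, b4 = fit_ols(rows, next_year_growth)
print(b0, b1, b2, b3, b4)
```

A real statistics package will also hand you standard errors, P-values, and R-squared, which this sketch skips entirely. It only returns the B0 through B4 estimates.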
OK. So we let it rip, and get this output:
So now we can just go ahead and compute the optimal tax rate and optimal tax burden, right? Well, not so fast. Just because we ran a regression doesn’t mean it’s any good. Last week we discussed some of the diagnostics you can find in the output above, but what I didn’t mention is that you really should look at the error terms of the regression as well. The errors, or residuals, in a “good” regression look like they came out of a shotgun – they don’t have any obvious patterns. Patterns in the residuals from a regression mean something is systematically wrong with the way the model being estimated fits the data, and if something is systematically wrong, it can (and should) be fixed. Worse still, one of the mathematical assumptions of regression analysis is that you didn’t specify a model that has something systematically wrong with it, which means that the output of a regression is misleading in various ways if there is something systematically wrong with the model. (In practice, you will never see a perfect shotgun pattern, but you want to shoot for something close.)
But, errors in this regression do show a pattern:
As the graph above shows, the errors tend to be pretty big in the beginning, and they tend to shrink over time. Since OLS regressions minimize the sum of squared errors, big errors early on mean the model is putting an overemphasis on the early years. Additionally, the correlation between the errors in one period and the errors in the next is about 50%; big errors tend to be followed by big errors, small errors by small errors, positive errors by positive errors, and negative errors by negative errors. Now, if you’re in a Ph.D. program where showing you have chops is a big deal, you’ll deal with this using any number of cool sounding techniques, each of which is built on a number of assumptions that are truly horrifying if you stop and think about it. But if you’re long gone from academia, and have spent a decade post grad school working with these cool sounding techniques, you might have gotten smart and comfortable enough to have rediscovered the KISS rule. If that’s the case, you’ll take a second look at the residual graph, and conclude a few things:
1. The 1929 – 1932 recession was a major outlier early on
2. The early part of the US’ involvement in WW2 (starting in 1940; think Lend-Lease and other gov’t expenditure) is a major outlier
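The "errors in one period versus errors in the next" check mentioned above is just a correlation between the residual series and itself shifted by one year. Here is a small sketch of one way to compute it; the function name is mine and the residuals are invented, not the ones from this regression.

```python
from math import sqrt


def lag1_autocorrelation(residuals):
    """Pearson correlation between each residual and the one that follows it."""
    prev, nxt = residuals[:-1], residuals[1:]
    n = len(prev)
    if n < 2:
        raise ValueError("need at least three residuals")
    mean_prev = sum(prev) / n
    mean_next = sum(nxt) / n
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    var_next = sum((q - mean_next) ** 2 for q in nxt)
    if var_prev == 0.0 or var_next == 0.0:
        raise ValueError("residuals have no variation")
    return cov / sqrt(var_prev * var_next)


# Invented residuals, purely illustrative: big early on, smaller later.
example_residuals = [-0.09, -0.07, -0.10, 0.02, 0.08, 0.06, 0.01, -0.01, 0.02, 0.00]
print(round(lag1_autocorrelation(example_residuals), 2))
```

A value near zero is what the "shotgun" pattern looks like in number form; a value well away from zero says this year's miss helps predict next year's.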
So you might, as a first pass, create a couple of dummies – one for the 1929 – 1932 recession, and another for “major US involvement in WW2” aka 1940 – 1944. A dummy variable takes a value of 1 or 0, which amounts to “yes the condition is met” or “no the condition is not met.”
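In code, the two dummies amount to a range check on the year. A minimal sketch follows, with variable names of my own choosing. One assumption to flag: the post does not say whether the dummy is keyed to year t or year t+1 of the growth window, and this sketch simply keys it to whatever year you pass in.

```python
def dummy(year, start, end):
    """1 if start <= year <= end (inclusive), else 0."""
    return 1 if start <= year <= end else 0


def dummies_for(year):
    """The two dummy regressors described above, for a single year."""
    return {
        "recession_1929_1932": dummy(year, 1929, 1932),
        "ww2_1940_1944": dummy(year, 1940, 1944),
    }


for yr in (1930, 1938, 1942, 1950):
    print(yr, dummies_for(yr))
```

Each dummy becomes one more column in the regression, alongside the tax rate and tax burden terms.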
Rerun the regression with those dummies and you get a regression with these residuals:
I’ve kept the scale in this graph the same as on the other. Notice… most of the big errors have dropped away, much of the “heavy early on” pattern is gone, and the correlation between errors in one period and errors in the next has dropped quite a bit. A simple fix, and we’re good enough to move on for now. Here’s the output:
Notice… the new model (using tax data and a couple dummies alone) explains about 57% of the variation in the growth in real GDP. Also… the tax burden is not significant. (The P-values are too far above a “significant” value such as 0.01, 0.05, or 0.1 depending on how strict you want to be, or how many asterisks you want to put in your paper.) The two dummies, not surprisingly, are significant; growth was slower than the model would otherwise predict during the 1929-1932 recession/depression, and faster than the model would otherwise predict during the 1940-1944 period when the US gov’t ramped up its involvement in the War. (BTW… anyone thinking that war is a way to promote economic growth should consider we’ve had a number of other wars during this period. What was unique about 1940-1944 was the degree to which the government decided to run the economy.)
The top marginal tax rate and top marginal rate squared are both significant, and we can use them to compute a top marginal rate that maximizes growth (at least in this model). That figure is (drumroll): 62%. Pretty close to the 67% we computed using last week’s model. And nothing like what Congressman Ryan is likely to glean from reading Atlas Shrugged…
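For the curious, the growth-maximizing rate falls out of the two rate coefficients. Holding everything else fixed, growth as a function of the top rate is B1*rate + B2*rate squared, and when B2 is negative that parabola peaks at rate = -B1 / (2*B2). Here is a sketch of the arithmetic; the coefficients in the example are invented round numbers, not the estimates from the regression above.

```python
def growth_maximizing_rate(b1, b2):
    """Top rate at which b1*rate + b2*rate**2 peaks; requires b2 < 0."""
    if b2 >= 0:
        raise ValueError("no interior maximum unless the squared term is negative")
    return -b1 / (2.0 * b2)


# Invented coefficients for illustration only (NOT the post's estimates).
print(growth_maximizing_rate(b1=0.5, b2=-0.5))
```

If B2 came out positive, the curve would be a valley rather than a hill and there would be no "optimal" rate to report, which is why the function refuses to answer in that case.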
By the way, the list of things I want to look at in future posts, in no particular order, includes:
1. Is the post-WW2 (or post 1963, or post 1981, or post 1986) era different?
2. What is the effect of different demographic groups?
3. Does this work for other forms of growth?
4. Does this type of model always provide an “optimal” result? Does this apply to states? What about other countries?
5. What about other types of taxes, such as corporate taxes? Should we focus on the tax rates paid by middle income earners rather than (or in addition to) tax rates paid by folks at the top?
6. What about the national debt? Or government spending? Or other variables?
7. Does the political party of the President or the Congress matter?
8. What is the effect of the Fed on all of this?
9. How do we know whether this is all merely correlation or is there any sign of causality going on here?
10. Given that this isn’t rocket science, why aren’t “real economists” doing stuff like this? (I would be derelict in not mentioning this paper by Pietro Peretto at Duke, which provides a model showing that “the endogenous increase in the tax on dividends necessary to balance the budget has a positive effect on growth.”)
This seems to have the potential to become the Mike Kimel full-employment act, though sadly, it isn’t my job and it doesn’t pay. Running regressions is quick and easy, and interpreting them (and spotting pitfalls) is second nature to someone who works with them on a daily basis, but pulling data, sorting and organizing it, and even just thinking about that data is very, very time consuming. So please have some patience as it’s going to take a while to get somewhere. Also… I will probably have occasional posts on other topics in the meantime as well.
All that said, one of my goals with these posts is to give non-economists a view of the way this sort of thing is (or should be) done in the profession. If I’m not explaining enough, or not keeping it intuitive enough, let me know.
Finally, as always, my spreadsheets are available to anyone who wants ‘em. This one has some cool info that I didn’t get a chance to use in this post, including corporate income, corporate taxes, and some demographic information. If you want to play along at home, or even move ahead of my posts, drop me a line and I’ll send you what I have. Until next time, folks.