Tuesday, March 18, 2008

How to Pick Your Brackets

So, for the past two years I've been putting together data on the teams that entered the NCAA tournament over the past 4 seasons: about 60 different variables, every stat you can think of and more.

By using regression analysis and a Tobit model, I have now made an econometric forecast for this year's tournament outcomes based on the profiles of this year's teams.

I'm told my model will be published on BracketScience.com. Hopefully, this will be the start of a second career for me. Since nobody wants to pay me to do econometric analysis, I do it on my own for fun anyway.

How does this work?

Well, suppose you had a sort of reverse BracketMaster program, where you could enter the various historical attributes of teams and how well they did, and let the program figure out which attributes correlated significantly with a team’s performance in the tournament. You could see how each attribute correlated with wins, holding everything else constant. Attributes such as Effective FG%, Defensive TO%, age of the team, experience of the coach, etc.

Now, suppose you could tell the program the attributes of this year’s teams and it would tell you how many games you could expect each team to win. This is basically what I did.
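For the curious, here is a rough sketch of that two-step idea in Python: fit coefficients on past teams, then plug in a new team's attributes. It is not my actual model. It uses plain least squares instead of a Tobit, only two attributes, and every number in it is made up for illustration. The function names (fit_ols, predict_wins) are just names I picked for the example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, wins):
    """Return [intercept, coef_1, ...] that minimize squared error."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * w for d, w in zip(design, wins)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_wins(coefs, row):
    """Expected wins for one team given its attribute values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Made-up past teams: (Effective FG%, Defensive TO%) and tournament wins.
past_attributes = [
    (54.0, 22.0), (51.5, 19.5), (49.0, 21.0),
    (56.0, 23.5), (50.5, 18.0), (52.5, 20.5),
]
past_wins = [4, 2, 1, 6, 0, 3]

coefs = fit_ols(past_attributes, past_wins)

# A made-up team from this year's field.
print(round(predict_wins(coefs, (53.0, 21.5)), 2))
```

Each coefficient is the change in expected wins for a one-point change in that attribute with the other held fixed, which is the "holding everything else constant" part. A Tobit model goes a step further by treating observed wins as a censored version of an underlying latent score, which matters because a team can't win fewer than zero games.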

How did it turn out? Send me $5 and I'll tell you. Or pay $20 to see my bracket on BracketScience.

It's my personal effort to have a completely data-driven bracket, no emotions whatsoever! I have to give a hearty thanks to the head of a certain Economics department at a Big 12 university for generously helping me out. Also a thanks to Jared for reminding me that my adjusted R-squared matters. Thanks to Joni as well for checking my model's predictions for previous tournaments, and also bearing with the countless long hours of working on this both at home and at the office.

Given the attributes and their magnitudes, the model correctly predicts 81% of games from the previous 4 tournaments and picks 11 of the 16 Final 4 teams over that span.
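One simple way to score a back-test like this is to pick, in each game, the team with the higher predicted win total and count how often that pick matches the real result. The sketch below assumes that scoring rule for illustration; the team names, predictions, and results are placeholders, not real data.

```python
def pick_winner(predicted_wins, team_a, team_b):
    """Pick the team with the larger predicted win total (ties go to team_a)."""
    if predicted_wins[team_a] >= predicted_wins[team_b]:
        return team_a
    return team_b


def hit_rate(predicted_wins, games):
    """Share of games where the pick matches the actual winner."""
    if not games:
        raise ValueError("need at least one game")
    hits = sum(
        1 for team_a, team_b, winner in games
        if pick_winner(predicted_wins, team_a, team_b) == winner
    )
    return hits / len(games)


# Placeholder predictions and results: (team_a, team_b, actual_winner).
predicted = {"Team A": 3.2, "Team B": 1.1, "Team C": 0.4, "Team D": 2.0}
games = [
    ("Team A", "Team C", "Team A"),
    ("Team B", "Team D", "Team D"),
    ("Team A", "Team D", "Team D"),
]

print(f"{hit_rate(predicted, games):.0%}")
```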

I can tell you that the data say this will be the Year of the Mid-Major. Look forward to some big-time upsets and both of my alma maters being in the Sweet 16. It's not what I think; it's what the data say.

3 comments:

queenhaiku said...

wow...that does sound like a fun thing to sit and figure out. i am going to have to do some stats in my spare time!

TaylorW said...

brilliant!

will you reveal your final four ?

JTapp said...

I'll say that Memphis and Kansas are in my Final 4. The model says that they have the strongest numbers of any team since UNC in 2005.