  1. #1
    driegner

    Engineering a Statistically Average Bracket

    Hey guys. This is a tad long, but I think it revealed some interesting stuff to use on your brackets this week. I hope you'll take a few minutes to read through!

    Since I'm a sucker for Cinderella, my brackets usually have WAY too many upsets. With this in mind I wanted to gather some data, present it in a digestible way, and use it to write a computer program that will pick statistically non-absurd brackets and see how the computer does against my friends.

    I needed to figure out, on average, which seeds tend to be upset the most, and how often. So I used bracket information from 2000-2011 and counted the number of upsets in each first-round matchup (1/16, 2/15, … 8/9). Since there are 4 regions over 12 years, that gave me 48 total games at each seeding.

    Dividing the total number of upsets in each matchup by 48 gave me the average likelihood of an upset in that particular matchup.
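
    The arithmetic described above is small enough to sketch in Python (the thread itself used Excel and MatLab, so this is a stand-in, not the original code). The function name `upset_rates` and the counts in the example are placeholders made up for illustration; they are not the numbers from the missing table.

```python
def upset_rates(upset_counts, regions=4, years=12):
    """Map each matchup to (upset rate, expected upsets per tournament).

    upset_counts: dict of matchup label -> number of upsets observed.
    With 4 regions over 12 years, each matchup was played 48 times.
    """
    games = regions * years
    rates = {}
    for matchup, count in upset_counts.items():
        if not 0 <= count <= games:
            raise ValueError(f"{matchup}: count {count} outside 0..{games}")
        rate = count / games
        # Each matchup is played once per region, so multiply by regions.
        rates[matchup] = (rate, rate * regions)
    return rates


# Placeholder counts for illustration only, NOT the thread's data.
example_counts = {"4/13": 12, "8/9": 24}
for matchup, (rate, per_year) in upset_rates(example_counts).items():
    print(f"{matchup}: {rate:.1%} upset rate, {per_year:.2f} upsets per year")
```

    The second number in each pair is the "how many upsets should I expect this year" figure: the per-game rate times the four games played at that seeding.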

    The upsets in 12 years with some useful averages:



    So pause here and consider this: on average there will be about one 4/13 upset per year. Also, on average, you should predict AT LEAST one upset in each of the 5/12, 6/11, 7/10 and 8/9 matchups.

    In order to use this data to pick my bracket unemotionally, I needed to fit the data to create a usable model for the likelihood of an upset. Cue Excel and some neato graphs.

    Here's a graph showing the percentages from above with a linear fit. As you can see, it isn't terrible for 6, 7 and 8, but it predicts far too many upsets in games for the 2 and 3 seeds.



    So, I thought this data looked kind of like an error function (erf). I used Excel to fit an erf curve to the data and the result was MUCH better.
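
    For anyone who wants to try an erf fit outside Excel, here is a minimal Python sketch. The exact formula used in the original post is not given, so the model below (a normal-CDF shape in the favorite's seed, with parameters `mu` and `sigma`) and the brute-force grid search are assumptions, and the data in the example is synthetic.

```python
import math


def erf_model(seed, mu, sigma):
    """Assumed model: upset probability as a normal CDF in the favorite's seed."""
    return 0.5 * (1.0 + math.erf((seed - mu) / (sigma * math.sqrt(2.0))))


def fit_erf(seeds, rates):
    """Coarse least-squares grid search; returns (mu, sigma, sum_squared_error)."""
    best = None
    for mu_tenths in range(10, 161):        # mu from 1.0 to 16.0 in steps of 0.1
        mu = mu_tenths / 10
        for sigma_tenths in range(5, 101):  # sigma from 0.5 to 10.0 in steps of 0.1
            sigma = sigma_tenths / 10
            sse = sum((erf_model(s, mu, sigma) - r) ** 2
                      for s, r in zip(seeds, rates))
            if best is None or sse < best[2]:
                best = (mu, sigma, sse)
    return best


# Synthetic check only: generate rates from known parameters and refit them.
seeds = list(range(1, 9))
synthetic_rates = [erf_model(s, 8.5, 3.0) for s in seeds]
mu, sigma, sse = fit_erf(seeds, synthetic_rates)
print(f"mu={mu}, sigma={sigma}, sse={sse:.3g}")
```

    With this shape, `mu` is the seed at which the game becomes a coin flip, so a value near 8.5 would mean the 8/9 game is roughly even. A real fit would swap the synthetic rates for the measured upset percentages.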



    I shoved the formula for this fit into MATLAB and used a fairly simple algorithm to simulate upsets and output results. Think of this like the pre-tourney "Eye Tests" we love so much: No names, no brands, no emotion. Just numbers.
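
    The MatLab code was never posted in this part of the thread, so the following Python is only a guess at what a "fairly simple algorithm" could look like: for every first-round game, draw a random number and call it an upset if the draw falls below that matchup's upset probability. The function name and the probabilities in the example are placeholders, not the fitted values.

```python
import random


def simulate_first_round(upset_prob, regions=4, rng=None):
    """Simulate one tournament's first round.

    upset_prob: dict mapping the favorite's seed (1..8) to its upset probability.
    Returns a list of (region, favorite_seed, underdog_seed) for each upset.
    """
    if rng is None:
        rng = random.Random()
    upsets = []
    for region in range(1, regions + 1):
        for favorite in range(1, 9):
            underdog = 17 - favorite  # seeds in a first-round game sum to 17
            if rng.random() < upset_prob[favorite]:
                upsets.append((region, favorite, underdog))
    return upsets


# Placeholder probabilities for illustration only, NOT the fitted values.
placeholder_prob = {1: 0.0, 2: 0.05, 3: 0.15, 4: 0.25,
                    5: 0.35, 6: 0.35, 7: 0.40, 8: 0.50}
for region, favorite, underdog in simulate_first_round(
        placeholder_prob, rng=random.Random(2012)):
    print(f"Region {region}: {underdog} seed over {favorite} seed")
```

    Note that this only says how many upsets to expect at each seeding and scatters them across regions at random; it knows nothing about the actual teams.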

    I'll let you guys know how my brackets do, and if anyone is interested I'll share the MATLAB code.


    Last edited by driegner; 03-12-2012 at 03:04 AM.

  2. #2
    IHavNoCyCash

    nice work, thanks for sharing...looks like I'm going to update my picks a little.



  3. #3
    driegner

    looking for some feedback...so...bump



  4. #4
    GoCubsGo

    Interesting - thanks for sharing. It would also be interesting to take this analysis into subsequent rounds - what percentage of the time do various seeds advance to the Sweet Sixteen, Elite Eight, etc.

    As we all know, it's nice to get the first-round games right, but the big payoff for NCAA brackets comes with picking the following rounds.



  5. #5
    Doc

    Originally posted by driegner:
        looking for some feedback...so...bump

    My feedback: You are a nerd.

    But seriously, I love analysis like this and will take a look at it.



  6. #6
    jaretac

    How about the second round match ups?



  7. #7
    driegner

    Originally posted by jaretac:
        How about the second round match ups?

    I have a pretty solid idea about how to do it, but I have a final tomorrow morning. After my exam I may put some time into it, but because far more matchups are possible, it will add complexity to the model.



  8. #8
    besserheimerphat

    Have you figured out yet how you are going to pick your upsets? Obviously it's not enough to get the right number of upsets, you need to pick the actual upsets as well. I've messed around some with football score analysis and gotten good correlations with conference records, but never anything that was predictive (other than a rough estimate of odds to win, but most of the time that's doable without a whole lotta math).



  9. #9
    Kyle

    Originally posted by GoCubsGo:
        Interesting - thanks for sharing. It would also be interesting to take this analysis into subsequent rounds - what percentage of the time do various seeds advance to the Sweet Sixteen, Elite Eight, etc.

        As we all know, it's nice to get the first-round games right, but the big payoff for NCAA brackets comes with picking the following rounds.

    I too would be very interested in seeing an analysis like this of later rounds.

    I have a theory that picking no upsets whatsoever will generally result in a well above-average bracket, but often not a winning one. While there will be upsets, the chances of correctly picking the upsets are not favorable. As your numbers bear out, for any given game, you are more likely to pick correctly if you pick the higher seed. For the past several years I have submitted a number of brackets in which I pick no upsets. I find that such brackets are usually above-average and I am usually in the running until the Final Four is completed, but that the no-upset bracket usually loses, as someone often correctly picks a few later-round upsets.


    Seneca Wallace.

  10. #10
    cyclones500

    Originally posted by besserheimerphat:
        Have you figured out yet how you are going to pick your upsets? Obviously it's not enough to get the right number of upsets, you need to pick the actual upsets as well. I've messed around some with football score analysis and gotten good correlations with conference records, but never anything that was predictive (other than a rough estimate of odds to win, but most of the time that's doable without a whole lotta math).

    That's the tricky part, I think.

    I've done brackets long enough, and seen enough statistics, that I already use an approach similar to what the OP is trying (mine is less number-reliant, obviously, since I don't have firm data at hand). So I know to find a 10-2 upset, advance at least one double-digit seed to the Sweet 16, look for 12-5's and maybe a 13-4.

    Trouble is always, which?

    Even though that part is the challenge, and history can't predict anything, it's generally good to at least have a starting point, and not go too far w/ upsets (or not far enough) in first few rounds.



  11. #11
    cyclones500

    Originally posted by jaretac:
        How about the second round match ups?

    Good point.

    I think most people who do brackets for many years zero-in on where to find 3-4-5-6 upsets, then ignore the next step.

    Where might we get a 12/13 matchup in Round-of-32? Where would you be willing to 'waste' both a 4 and 5 to move a 12 to the Sweet 16? It doesn't happen every year, but has occurred far more often than seems reasonable.

    One 2 and one 3 are almost certain to bounce by the end of the first weekend. Who is it? Do you go with perceived vulnerability factor? Or which of the 6-11/7-10 teams has the best shot at surviving first two games?

    And so on.

    It's even further risk (but higher reward) to try an outside-the-box Elite 8. One of my brackets last season managed to include (8) Butler vs. (2) Florida. Of course, I whiffed by advancing Florida to Final Four. But at least I got that close.



  12. #12

    If you want a more robust data set (particularly for later rounds analysis) you can use this site, which has the records of all 1-16 seeds since the tourney expanded to 64 in 1985:

    mcubed.net : Men's NCAA Basketball Tournament : Records per seed



  13. #13
    Clone9

    Originally posted by Kyle:
        I too would be very interested in seeing an analysis like this of later rounds.

        I have a theory that picking no upsets whatsoever will generally result in a well above-average bracket, but often not a winning one. While there will be upsets, the chances of correctly picking the upsets are not favorable. As your numbers bear out, for any given game, you are more likely to pick correctly if you pick the higher seed. For the past several years I have submitted a number of brackets in which I pick no upsets. I find that such brackets are usually above-average and I am usually in the running until the Final Four is completed, but that the no-upset bracket usually loses, as someone often correctly picks a few later-round upsets.

    This. The OP's data actually say that you shouldn't pick any upsets. In fact, trying to pick one upset (his data suggest there is usually about one upset each in the 12 and 13 seed games) will probably reduce your chances of getting them all right. Of course, this will just lead to a better-than-average bracket that is unlikely to win any decent-sized tournament pool. Unfortunately, I think you've gotta just roll the dice and hope you get lucky....
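
    The point about a single upset pick hurting you can be checked with a few lines of arithmetic. This is a sketch under two assumptions: games are independent, and a 4/13 game is an upset 25% of the time (which is what "about one 4/13 upset per year" across four games works out to). The function names are made up for this example.

```python
def expected_correct(upset_prob, picks_upset):
    """Expected number of correct picks over independent games.

    upset_prob[i]: chance game i ends in an upset.
    picks_upset[i]: True if the bracket picks the underdog in game i.
    """
    return sum(p if pick else 1.0 - p
               for p, pick in zip(upset_prob, picks_upset))


def prob_all_correct(upset_prob, picks_upset):
    """Chance that every pick in the list is correct."""
    prob = 1.0
    for p, pick in zip(upset_prob, picks_upset):
        prob *= p if pick else 1.0 - p
    return prob


four_13_games = [0.25] * 4                  # assumed 25% upset chance per game
chalk = [False, False, False, False]        # pick all four 4 seeds
one_upset = [True, False, False, False]     # pick one specific 13 seed

for name, picks in (("chalk", chalk), ("one upset", one_upset)):
    print(name,
          "expected correct:", expected_correct(four_13_games, picks),
          "chance all four right:", prob_all_correct(four_13_games, picks))
```

    Under those assumptions the all-favorites picks average 3 correct out of 4, while guessing one specific 13 seed averages 2.5, and the chance of getting all four right drops from roughly 32% to roughly 11%. Knowing that an upset is likely somewhere is not the same as knowing where.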



  14. #14
    driegner

    Originally posted by Clone9:
        This. The OP's data actually say that you shouldn't pick any upsets. In fact, trying to pick one upset (his data suggest there is usually about one upset each in the 12 and 13 seed games) will probably reduce your chances of getting them all right. Of course, this will just lead to a better-than-average bracket that is unlikely to win any decent-sized tournament pool. Unfortunately, I think you've gotta just roll the dice and hope you get lucky....

    As I said, I tend to pick too many upsets. This data has given me guidelines as to how many actually tend to happen in a given year. The answer to that question is ~1 in the 4/13 and almost 2 in the 8/9.

    Moving forward my next 2 steps are to build in the next 2 rounds, and to determine actual bracket configurations with a realistic number of upsets in the "best" positions.

    I want to see if I can get MATLAB to pick a better bracket than an average person. Can I get a model that will actually do better than someone who picks the favorite every time?

    I can't wait to play around with this more.

    P.S. Anyone familiar with the subject may realize that this is very similar to a statistical thermodynamics question. We have a list of configurations and their relative likelihood of occurring. The "energy" in this situation is analogous to the probability of a given configuration occurring. It's unlikely that the system will be perfect (No upsets) but it's also unlikely to be completely disordered (all upsets). "Equilibrium" is somewhere in between.
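
    One way to make the analogy concrete is to group individual outcomes by how many upsets they contain, a bit like grouping microstates into a macrostate. The sketch below assumes independent games and uses a made-up function name; it computes the probability of seeing exactly k upsets among a set of games.

```python
def upset_count_distribution(upset_prob):
    """dist[k] = probability of exactly k upsets among independent games.

    Builds the distribution one game at a time (a Poisson binomial),
    so the games may have different upset probabilities.
    """
    dist = [1.0]  # before any game is played: zero upsets with certainty
    for p in upset_prob:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # favorite wins, count unchanged
            nxt[k + 1] += mass * p       # upset, count goes up by one
        dist = nxt
    return dist


# Four 4/13 games at an assumed 25% upset chance each.
for k, prob in enumerate(upset_count_distribution([0.25] * 4)):
    print(f"{k} upsets: {prob:.3f}")
```

    With four games at 25%, "no upsets" is the single most likely individual outcome at about 32%, but "exactly one upset somewhere" is more likely as a count, at about 42%, because there are four different ways for it to happen. That multiplicity effect is the sense in which the "equilibrium" sits between no upsets and all upsets.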



  15. #15
    Kyle

    Originally posted by driegner:
        I want to see if I can get MATLAB to pick a better bracket than an average person. Can I get a model that will actually do better than someone who picks the favorite every time?

    The first question is not all that interesting. As discussed, a no-upset bracket is almost always going to do better than average.

    If you could build a formula such that the answer to the second question was a definitive yes, you could make a fortune as a Vegas odds-maker. That would undoubtedly require the input of far more information than you plan to use though, and I'm sure you are not the only one that has tried it. Here's a short article on predictive statistics.
    The Secret Formula for Picking NCAA Basketball Tournament Winners | Wall St. Cheat Sheet

    It seems that the most interesting thing for you to try to accomplish is to use the statistical data to create a system for creating brackets that is more likely to produce winning brackets than other methods. Using the numbers you provided for the first-round games, it seems that a bracket that picks upsets in the ratio your numbers suggested is more likely to be a "perfect bracket." I suspect that as the pool size increases it becomes more necessary to get closer to a perfect bracket in order to have the top bracket in the pool. For example, if I am competing against only one other person, I suspect that a no-upset bracket would be the odds-on favorite to win, as the chances that the one other person correctly picked the upsets are small. It would seem that as the pool size gets larger, the odds that a no-upset bracket will be the best should get smaller faster than the odds that would be predicted by simply increasing the size of the pool. I'd be curious at what point the odds of a no-upset bracket are reduced to less than one over the pool size (e.g. less than a 20% win probability for a five-person pool). It seems that as the pool gets larger, one must pick more upsets in order to win. Here's an article that suggests the same thing.
    March Madness: Using Game Theory To Win Your Upset Picks | Sports Business | Minyanville.com

    The basic technique you are using seems best suited to determining the number of upsets that should be picked when in an extremely large pool. For smaller pools, you are probably better off picking fewer upsets than are suggested by your numbers.
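
    The pool-size question could be explored with a rough Monte Carlo simulation. Everything in this sketch is an assumption made for illustration: it covers the first round only, scores one point per correct pick, has every rival pick upsets at random at the same rates the games actually follow, and counts a tie for first as a win. The function name and the probabilities are placeholders.

```python
import random


def chalk_win_rate(upset_prob, pool_size, trials=500, seed=0):
    """Estimate how often an all-favorites bracket finishes first or tied.

    upset_prob: one upset probability per game.
    pool_size: total brackets in the pool, including the all-favorites one.
    """
    rng = random.Random(seed)
    wins = 0
    for _trial in range(trials):
        # True means the game ended in an upset.
        outcome = [rng.random() < p for p in upset_prob]
        chalk_score = sum(not upset for upset in outcome)
        best_rival = -1
        for _rival in range(pool_size - 1):
            picks = [rng.random() < p for p in upset_prob]
            score = sum(pick == upset for pick, upset in zip(picks, outcome))
            best_rival = max(best_rival, score)
        if chalk_score >= best_rival:
            wins += 1
    return wins / trials


# Placeholder first-round probabilities (seeds 1..8, four regions).
placeholder = [0.0, 0.05, 0.15, 0.25, 0.35, 0.35, 0.40, 0.50] * 4
for size in (2, 5, 20, 50):
    rate = chalk_win_rate(placeholder, size)
    print(f"pool of {size}: chalk wins or ties {rate:.1%} (1/size = {1 / size:.1%})")
```

    Comparing each estimate against one over the pool size gives a feel for where a no-upset bracket stops being better than a random entry, though counting ties as wins flatters the no-upset bracket and later rounds would change the picture considerably.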


    Last edited by Kyle; 03-12-2012 at 06:54 PM.
