
Gordon Summers

Senior Data Scientist

Gordon.Summers@CabriGroup.com

Gordon is a seasoned machine learning expert with over two decades of experience implementing analytics solutions at a Fortune 200 company. In his free time he enjoys rock climbing and reading geology books.


After the first weekend of basketball, our Machine Learning Prediction tool has good results.

We had two measures of success: we wanted to win at least 46% of our picks, and we wanted to “win” using virtual money bet on the money lines. By both measures, we succeeded: we correctly picked 6 upsets out of the 13 games we chose (46%), and we won $1,359 off the 6 correctly picked upsets (a profit of $59 on $1,300 laid at $100 per game, or roughly 5% ROI).
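The arithmetic behind those two measures can be checked with a short sketch. The figures come from the paragraph above; the variable names are illustrative only and are not taken from our tool.

```python
# Figures quoted in the post; the names are illustrative, not from the actual tool.
picks = 13             # games chosen across the first two rounds
correct = 6            # upsets picked correctly
stake_per_game = 100   # virtual dollars laid on each pick
total_won = 1359       # virtual dollars won off the 6 correct picks

total_staked = picks * stake_per_game   # total virtual dollars laid
win_rate = correct / picks              # share of picks that hit
profit = total_won - total_staked       # net result across all 13 picks
roi = profit / total_staked             # return on the money laid

print(f"Win rate: {win_rate:.1%}")
print(f"Profit: ${profit} on ${total_staked} staked")
print(f"ROI: {roi:.1%}")
```

Run as written, this gives a win rate of about 46.2% and an ROI of about 4.5%, in line with the rounded figures quoted above.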

The details:

Overall, there were 10 instances where the lower seed won in the first two rounds. This year is on track for a lower rate of lower-seed wins (22%) than the historic rate (26%). So even with “tough headwinds,” we still came close to our expectations.

“But CAN, there were multiple lower seeds winning that you didn’t pick. Why didn’t the model see Middle Tennessee upsetting Minnesota?” The answer is simple: MT’s win was the result of variables that we weren’t measuring. Our picks were games that matched our criteria, and those criteria were based on variables found in most (not all) of the games in which the lower seed won in past years. Lower seeds can and will still win; our model was built to predict the highest number of upsets without over-picking. This is actually the perfect example of how a model, even a great one, will not predict everything. However, predicting most, or even some, outcomes in business can mean huge revenue increases or money saved.

Besides, we had some really, really close calls that would have put us way, way ahead. There were several games where we had the lower seed with a good chance of winning and they simply lost (both Wichita State and Rhode Island had their games tied with under a minute to go). We picked multiple games where the money lines showed Vegas gave no chance of the upset, yet the teams came very close.

Our goal was not to choose games in a vacuum (which is how you bet), but instead to choose games that match the criteria and spread the risk over several probable winners. This wasn’t about picking the only upsets or all of the upsets; it was about picking a set of games that had the highest probability of the lower seed winning. And by our measures of success, we achieved our goal.
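To make “spreading the risk” concrete, here is a minimal sketch of how American money lines translate into payouts and implied win probabilities. The conversion formulas are the standard ones for American odds; the function names, the odds, and the outcomes below are hypothetical and are not taken from our tool or from the actual tournament lines.

```python
def moneyline_profit(odds: int, stake: float = 100.0) -> float:
    """Profit on a winning bet at American money-line odds."""
    if odds > 0:
        # Underdog line, e.g. +250: a $100 stake wins $250.
        return stake * odds / 100
    # Favorite line, e.g. -150: a $150 stake wins $100.
    return stake * 100 / abs(odds)


def implied_probability(odds: int) -> float:
    """Win probability implied by the line, ignoring the bookmaker's margin."""
    if odds > 0:
        return 100 / (odds + 100)
    return abs(odds) / (abs(odds) + 100)


# Hypothetical money lines for three underdog picks (NOT real tournament odds).
hypothetical_lines = [250, 180, 400]
# Hypothetical results: only the first pick wins.
outcomes = [True, False, False]

# Net result of laying $100 on each pick: winners pay out, losers cost the stake.
net = sum(
    moneyline_profit(line) if won else -100.0
    for line, won in zip(hypothetical_lines, outcomes)
)
print(f"Net on $300 laid: ${net:.2f}")
```

In this made-up case a single hit at +250 more than covers the two losing $100 stakes, which is why a set of long-shot picks can come out ahead even when fewer than half of them win.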

We aren’t done quite yet either.

For the next round, we have 5 games that match our criteria:

Wisconsin over Florida
South Carolina over Baylor
Xavier over Arizona
Purdue over Kansas
Butler over North Carolina

* If any games match our predictive criteria in the next round, we’ll post them Saturday before tip-off.

The results of the first rounds:

The Machine Learning algorithm performed as advertised: it identified a set of characteristics from historic data that was predictive of future results. The implication for any business is clear: if you have historic data and you leverage this type of expertise, you can predict the future.

For more information about how we created the Machine Learning algorithm and how we are keeping score, you may read the Machine Learning article here:

http://canworksmart.com/machine-learning-basketball-methodology

If you would like to see how Machine Learning could improve your business, please feel free to reach out to either of us: contact Gordon Summers of Cabri Group (Gordon.Summers@CabriGroup.com) or Nate Watson of CAN (nate@canworksmart.com).

 



The Cabri Group / CAN Machine Learning Lower Seed Win Prediction tool has made its first round forecast! Without further ado:

East Tennessee St. (13) over Florida (4)
Xavier (11) over Maryland (6)
Vermont (13) over Purdue (4)
Florida Gulf Coast (14) over Florida St. (3)
Nevada (12) over Iowa St. (5)
Rhode Island (11) over Creighton (6)
Wichita St. (10) over Dayton (7)

 

* If the last play-in games add another predicted upset, we’ll update this list prior to the game starting.

Update: USC (11) over SMU (6)

One obvious reaction to the predictions is: “Wait, no 8/9 upsets????” Remember, these games show the most similar characteristics to the largest historic collection of upsets. This doesn’t mean that there will be no 8/9 upsets, nor that all of the predictions above will hit (remember, we are going for 47% upsets), nor that the favorites will win every game not listed. The games on the list are there because they share the most characteristics with historic times when the lower seed won.

Also, one of the key team members on this project, Matt, is a big Creighton fan (and grad). He was not happy to see Creighton on the list. I’ll speak to that one specifically. In the technical notes, I indicated that one of the many criteria being used is Defensive Efficiency (DE). The Machine Learning algorithm (Evolutionary Analysis) doesn’t favor the lower seed when there is a large gap in DE between the lower seed and the higher seed. Creighton actually has a lower Defensive Efficiency than Rhode Island. Sorry, Matt. Again, it doesn’t mean Creighton won’t win; it only means that the Rhode Island v. Creighton game shares more criteria with the largest collection of historic upsets than the other games in the tournament.

As we indicated, we will use the odds as well as a count of upsets to determine how well we do as the tournament goes on. We’ll have a new set of predictions on Saturday for the next round of the tournament and a recap coming on Monday.

For more information about how we created the Machine Learning algorithm and how we are keeping score, you may read the Machine Learning article here:

http://canworksmart.com/machine-learning-basketball-methodology


