
Nate Watson

President

nate@canworksmart.com

Nate is well known throughout the region as a leader who has helped numerous companies bridge the gap between data overload and actionable intelligence. He has been CAN's President since 2015 and since then has worked ceaselessly to strengthen and expand its operations.

Like what you see? Connect with Nate.

Related

Re-Blog: Why Visualizing Data is Important

on February 24

10 Questions to Ask Before Buying Sales Leads

on April 7, 2014

At the beginning of the project, we set out to show how the 2017 NCAA College Basketball Tournament could be a proving ground for Machine Learning analysis. There are very few places in the world where we can use the same model to predict multiple outcomes in a short period of time, have a ready-made scorecard (Vegas), have the general public understand what we are trying to do, and have a chance to “beat” the algorithm with their own knowledge.

You could say our findings have been a “Slam Dunk” (I couldn’t help myself).

Before diving into the results, I want the reader to understand what we were up against. It’s easy to pick chalk (always picking the better seed). In fact, that is how the games are supposed to work: the 8 seed is supposed to beat the 9. And for the most part, the NCAA does a decent job of seeding. Historically, only 26% of tournament games end in an upset (this includes games from all rounds). That’s about 17 out of 64 games. This was never going to be easy.

 

Project Recap

We predicted 20 upsets and got 10 right (50%). We only missed predicting 3 upsets.

Using Vegas odds as a scorecard and betting a simulated $100 on each predicted upset, we would have ended up +$2,605 off our simulated bets (a 30% ROI), with the majority of this coming from long-shot underdogs.
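As a quick check on the arithmetic, the sketch below computes ROI for flat simulated bets. It assumes, since the post does not spell it out, that the $2,605 figure is the total returned on the 20 simulated bets rather than the net profit; that reading is the one that lines up with the quoted 30% ROI. The function name is illustrative.

```python
def roi(total_staked: float, total_returned: float) -> float:
    """Return on investment as a fraction of the amount staked."""
    return (total_returned - total_staked) / total_staked


# 20 predicted upsets at a simulated $100 each (figures from the recap).
total_staked = 20 * 100

# Assumption: $2,605 is the total returned, not the net profit.
total_returned = 2605

print(f"Staked ${total_staked}, returned ${total_returned}, "
      f"ROI {roi(total_staked, total_returned):.1%}")
```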

Think about this. If we had bet chalk on every game except the ones the algorithm predicted as upsets, then out of 61 games we would have missed only 13. That’s 79% accurate!

Let’s look at this another way. Our algorithm caught 77% (10 of 13) of the upsets it had a chance to call, an outcome that is only 26% likely to happen in the first place. Now think about what you would do if you could flag 77% of the occurrences of an unlikely event in your business before they happened.

  • What would you do if you knew 77% of the customers who were going to leave before they left?
  • What would you do if you knew 77% of failed batches before they happened?
  • What would you do if you knew 77% of your plant’s machine failures before they happened?
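The recap figures above can be reproduced with a few lines of arithmetic. The sketch below uses only numbers quoted in this post; the variable names are illustrative. It keeps two rates apart that are easy to blur: the share of our picks that were right (50%) and the share of actual upsets that we caught (77%).

```python
predicted_upsets = 20  # games the algorithm flagged as upsets
correct_upsets = 10    # flagged games where the lower seed actually won
missed_upsets = 3      # upsets that happened but were not flagged
games_played = 61      # games scored so far

# Share of our picks that turned out right.
hit_rate = correct_upsets / predicted_upsets

# Share of the actual upsets that we flagged in advance.
upsets_caught = correct_upsets / (correct_upsets + missed_upsets)

# Chalk everywhere except flagged games: wrong picks plus missed upsets.
wrong_calls = (predicted_upsets - correct_upsets) + missed_upsets
overall_accuracy = (games_played - wrong_calls) / games_played

print(f"Hit rate on picks: {hit_rate:.0%}")
print(f"Upsets caught:     {upsets_caught:.0%}")
print(f"Overall accuracy:  {overall_accuracy:.0%}")
```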

Business Scenario

You have a theory that some of your clients would buy more “product” if they were called and offered an upgraded deal. However, you don’t want to call all of your clients because you have so many. What you do have is a dataset of past customers who successfully responded to this type of nudge. Using your data, our machine learning algorithm could pick out a set of clients who are far more likely than average to purchase more product if called, much as it caught 77% of this year’s upsets.

 

Game changer, right?

 

Why this is huge

Our Machine Learning project set out to predict, as accurately as we could, when a lower-seeded team would win in the NCAA tournament. Our stated goal from the beginning was to get 47% of our picks correct and a mere 10% ROI. We beat both of those goals. Our Machine Learning algorithm, which uses a custom optimization engine called Evolutionary Analysis, compared 207 different metrics of college basketball teams against their results in prior tournaments. It selected ranges of those 207 measures that best matched historic wins by lower-seeded teams. We then confirmed that the ranges were predictive by testing them against a “clean” historic data set. This comparison is how we arrived at our goal percentage and ROI. We then published our forecasts before each round was played, and the results speak for themselves.

While we still have 3 games to go, our initial point that Machine Learning can help you be better at making decisions from your data has been proven. Implementing Machine Learning isn’t hard so long as your business has these three characteristics:

  • A data set with a large number of characteristics
  • A measure of success to optimize upon
  • A desire to learn from data to make changes in your organization

 

If this sounds like something that your business could use, please contact Nate Watson of CAN (Nate@CanWorkSmart.com) or Gordon Summers of Cabri Group (Gordon.Summers@CabriGroup.com) today.

 


Prediction Results

Here is a summary of our picks from the beginning of the project ($ indicates our successful pick where “money” was made):

East Tennessee St. over Florida
$ Xavier over Maryland
Vermont over Purdue
Florida Gulf Coast over Florida St.
Nevada over Iowa St.
$ Rhode Island over Creighton
$ Wichita St. over Dayton
$ USC over SMU
$ Wisconsin over Villanova
$ Xavier over Florida St.
Rhode Island over Oregon (tied with a minute to go)
Middle Tennessee over Butler
Wichita St. over Kentucky (tied with a minute to go)
Wisconsin over Florida (OT last second shot)
$ South Carolina over Baylor
$ Xavier over Arizona
Purdue over Kansas
Butler over North Carolina
$ South Carolina over Florida
$ Oregon over Kansas

And for those who are curious, our algorithm has detected one Final Four upset for this weekend:

Oregon over North Carolina

For more information about how we created the Machine Learning algorithm and how we kept score, please read our Machine Learning technical document. Additionally, you can find results for the whole tournament here.



We are doing better than anticipated even after the heartbreaker of a game last night between Wisconsin and Florida and are still ahead $165.

 

Here are our Round 4 Potential Upsets:

 

South Carolina over Florida

Oregon over Kansas

 

We will have a weekend breakdown of all of our picks on Monday.

 

For more information about how we created the Machine Learning algorithm and how we are keeping score, you may read the Machine Learning article here:

 

http://canworksmart.com/machine-learning-basketball-methodology

 



Machine Learning and the NCAA Men’s Basketball Tournament Methodology

 <<This article is meant to be the technical document following the above article. Please read the following article before continuing.>>

“The past may not be the best predictor of the future, but it is really the only tool we have”

 

Before we delve into the “how” of the methodology, it is important to understand “what” we were going for: a set of characteristics that would indicate that a lower seed would win. We use machine learning to look through a large collection of characteristics and find a result set of characteristics that maximizes the number of lower seed wins while simultaneously minimizing lower seed losses. We then apply the result set as a filter to new games. The new games that make it through the filter are predicted as more likely to have the lower seed win. What we have achieved is a set of criteria that are most predictive of a lower seed winning.
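To make the idea of "applying the result set as a filter" concrete, here is a minimal sketch. It assumes a result set can be represented as an allowed range per chosen characteristic; the characteristic names, bounds, and games are all hypothetical and are not taken from our actual model.

```python
from typing import Dict, List, Tuple

# Hypothetical result set: an allowed (low, high) range for each chosen
# characteristic. Names and bounds are made up for illustration.
ResultSet = Dict[str, Tuple[float, float]]


def passes_filter(game: Dict[str, float], result_set: ResultSet) -> bool:
    """True when every chosen characteristic falls inside its range."""
    return all(low <= game[name] <= high
               for name, (low, high) in result_set.items())


result_set: ResultSet = {
    "defensive_efficiency_rank_diff": (-50.0, 20.0),
    "turnover_margin_diff": (0.5, 10.0),
}

# Hypothetical new games, described by differences between the two teams.
new_games: List[Tuple[str, Dict[str, float]]] = [
    ("Game A", {"defensive_efficiency_rank_diff": 12.0,
                "turnover_margin_diff": 1.8}),
    ("Game B", {"defensive_efficiency_rank_diff": 85.0,
                "turnover_margin_diff": 2.1}),
    ("Game C", {"defensive_efficiency_rank_diff": -5.0,
                "turnover_margin_diff": -0.4}),
]

# Games that make it through the filter are the predicted upsets.
predicted_upsets = [label for label, game in new_games
                    if passes_filter(game, result_set)]
print(predicted_upsets)
```

Only games that satisfy every range come through, which is why the filter yields a short list of picks rather than a prediction for every game.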

 

This result set is fundamentally different from an approach that tries to determine the results of all new games, whereby an attempt is made to find a result set that would apply to every game. There is a level of complexity and ambiguity with a universal model, which is another discussion entirely. By focusing on one result set (lower seed win), we can get a result that is more predictive than attempting to predict all games.

 

This type of predictive result set has great applications in business. What is the combination of characteristics that best predicts a repeat customer? What is the combination of characteristics that best predicts a more profitable customer? What is the combination of characteristics that best predicts an on-time delivery? This is different from just trying to forecast demand by using a demand signal combined with additional data. Think of it as the difference between a stock picker that picks stocks most likely to rise vs. forecasting how far up or down a specific stock will go. The former is key for choosing stocks, the latter for rating stocks you already own.

 

One of the reasons we chose “lower seed wins” is that almost every game played in the NCAA tournament provides a data point. There are a few games where identical seeds play: most notably, the first four games involve identical seeds, and the final four can possibly have identical seeds. However, that still gives us roughly 60 games a year. The more data we have, the better predictions we get.

 

The second needed item is more characteristics. For our lower seed win we had more than 200 different characteristics for the years 2012-2015. We used the difference between the two teams’ characteristics as the selection input. We could have used the absolute characteristics for both teams as well. As the analysis is executed, if a characteristic is unneeded it is ignored. What the ML creates is a combination of characteristics. We call our tool “Evolutionary Analysis”. It works by adjusting the combinations in an ever-improving manner to get a result. There is a little more in the logic that allows for other aspects of optimization, but the core of Evolutionary Analysis is finding a result set.
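The general idea of adjusting combinations "in an ever-improving manner" can be sketched with a very simple evolutionary loop. To be clear, this is not Evolutionary Analysis itself; it is a generic mutate-and-keep-if-no-worse search over per-characteristic ranges, run on synthetic data, and every name and parameter in it is an assumption made for illustration.

```python
import random
from typing import List, Optional, Tuple

Range = Optional[Tuple[float, float]]  # None means the characteristic is ignored
Candidate = List[Range]
Game = Tuple[List[float], bool]        # (difference features, lower seed won?)


def passes(features: List[float], candidate: Candidate) -> bool:
    return all(r is None or r[0] <= x <= r[1]
               for x, r in zip(features, candidate))


def fitness(candidate: Candidate, games: List[Game]) -> int:
    # Reward lower seed wins that pass the filter, penalize losses that pass.
    return sum((1 if won else -1)
               for features, won in games if passes(features, candidate))


def mutate(candidate: Candidate, rng: random.Random) -> Candidate:
    child = list(candidate)
    i = rng.randrange(len(child))
    if child[i] is None:
        # Start using a characteristic that was previously ignored.
        a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
        child[i] = (min(a, b), max(a, b))
    elif rng.random() < 0.2:
        child[i] = None  # drop a characteristic that may be unneeded
    else:
        low, high = child[i]
        low += rng.gauss(0, 0.1)
        high += rng.gauss(0, 0.1)
        child[i] = (min(low, high), max(low, high))
    return child


def evolve(games: List[Game], n_features: int,
           generations: int = 2000, seed: int = 0) -> Tuple[Candidate, int]:
    rng = random.Random(seed)
    best: Candidate = [None] * n_features
    best_score = fitness(best, games)
    for _ in range(generations):
        child = mutate(best, rng)
        score = fitness(child, games)
        if score >= best_score:  # keep the child only if it is no worse
            best, best_score = child, score
    return best, best_score


def make_toy_games(n: int, n_features: int, seed: int = 1) -> List[Game]:
    # Synthetic data: the lower seed wins more often when feature 0 is high.
    rng = random.Random(seed)
    games = []
    for _ in range(n):
        features = [rng.uniform(-1, 1) for _ in range(n_features)]
        p_win = 0.6 if features[0] > 0.3 else 0.15
        games.append((features, rng.random() < p_win))
    return games


games = make_toy_games(300, n_features=3)
baseline = fitness([None] * 3, games)
best, best_score = evolve(games, n_features=3)
print("no filter:", baseline, "evolved filter:", best_score, best)
```

The fitness function mirrors the goal described earlier: count lower seed wins that pass the filter and subtract lower seed losses that pass. A characteristic set to `None` is simply ignored, which is how unneeded characteristics fall out of the result set.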

The result set was then used as a filter on 2016 to confirm that the result is predictive. It is possible that a result set from 2012-2015 doesn’t actually predict 2016 results. Our current result set, applied as a filter on 2016 data, had 47% underdog wins among the games it selected, compared with the overall population. The historic average is 26% lower seed wins, and by random chance a 47% underdog win rate could happen only about 3.4% of the time. Our current result set is therefore very likely to be a genuinely predictive filter.
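One common way to ask "how often could this happen by chance?" is a binomial tail probability: if each selected game independently had the historic 26% chance of an upset, how likely is it to see at least as many upsets as the filter produced? The sketch below shows that calculation. The article does not state how the 3.4% figure was derived or the exact counts behind it, so the inputs here are placeholders and the output is not expected to match that figure.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Placeholder inputs (hypothetical): games selected by the filter and how
# many of them the lower seed won, roughly a 47% win rate.
flagged_games = 17
lower_seed_wins = 8
baseline_rate = 0.26  # historic share of lower seed wins quoted above

chance = binomial_tail(flagged_games, lower_seed_wins, baseline_rate)
print(f"Chance of {lower_seed_wins}+ upsets in {flagged_games} games "
      f"at a {baseline_rate:.0%} base rate: {chance:.1%}")
```

The smaller this probability, the harder it is to explain the filter's win rate as luck alone.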

 

The last step in the process is to look at the filter criteria that have been chosen and check whether they are believable. For example, one of the criteria was Defensive Efficiency Rank. Evolutionary Analysis chose a lower limit of … well, it set a lower limit, let’s just say that. This makes sense: if a lower seed has a defense that is ranked far inferior to the higher seed’s, it is unlikely to prevail. A counterexample is that the number of blocks per game was not a criterion that was chosen. In fact, most of the >200 criteria were not used, but the handful of around ten criteria that were chosen set the filter that selects a population of games more likely to contain a lower seed winning.

 

And that is one of the powerful aspects of this type of analysis: you don’t get just the one key driver, or even two metrics that have a correlation. You get a whole set of filters that points to a collection of results that deviates from the “normal”.

 

Please join us as we test our result set this year. We’ll see if we get around 47%. Should be interesting!

 

If you have questions on this type of analysis or machine learning in general, please don’t hesitate to contact Gordon Summers of Cabri Group (Gordon.Summers@CabriGroup.com) or Nate Watson at CAN (nate@canworksmart.com).

**Disclaimer: Any handicapping sports odds information contained herein is for entertainment purposes only. Neither CAN nor Cabri Group condone using this information to contravene any law or statute; it’s up to you to determine whether gambling is legal in your jurisdiction. This information is not associated with nor is it endorsed by any professional or collegiate league, association or team. Machine Learning can be done by anyone, but is done best with professional guidance.

 

 

 


Contemporary Analysis (CAN) and Cabri Group have teamed up to use Machine Learning to predict the upsets for the NCAA Men’s Basketball Tournament. By demonstrating the power of ML through our results, we believe more people can give direction to their ML projects.

 

Machine Learning (ML) is a powerful technology and many companies rightly guess that they need to begin to leverage ML. Because there are so few successful ML people and projects to learn from, there is a gap between desire and direction. 

 

We will be publishing a selection of games in the 2017 NCAA Men’s Basketball Tournament. Our prediction tool estimates games where the lower seed has a better than average chance of winning against the higher seed. We will predict about 16 games from various rounds of the tournament. The historical baseline for lower seeds winning is 26%. Our current model predicted 16 upsets for the 2016 tournament. We were correct in 7 of them (47%), which in simulated gambling gave the simulated gambler an ROI of 10% (because of the odds). Our target for the 2017 tournament will be to get 48% right.

 

Remember, our analysis isn’t to support gambling, but to prove the ability of ML. However, we will be keeping score with virtual dollars. We will be “betting” on the lower seed to win. We aren’t taking into consideration the odds in our decisions, only using them to help score our results.
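For readers curious how keeping score with virtual dollars might work, here is a minimal sketch. It assumes flat $100 stakes and American-style moneyline odds (for example, +250 means a $100 bet wins $250); the odds format and the sample bets are assumptions for illustration, not our actual scorecard.

```python
from typing import List, Tuple


def profit_on_win(stake: float, american_odds: int) -> float:
    """Profit (excluding the returned stake) for a winning moneyline bet."""
    if american_odds > 0:
        return stake * american_odds / 100
    return stake * 100 / abs(american_odds)


def score_bets(bets: List[Tuple[int, bool]],
               stake: float = 100.0) -> Tuple[float, float]:
    """bets holds (american_odds, lower_seed_won); returns (net, roi)."""
    if not bets:
        return 0.0, 0.0
    net = sum(profit_on_win(stake, odds) if won else -stake
              for odds, won in bets)
    return net, net / (stake * len(bets))


# Hypothetical picks: the odds on the lower seed and whether it won.
sample_bets = [(250, True), (400, False), (180, True), (600, False)]
net, roi = score_bets(sample_bets)
print(f"Net: ${net:.2f}, ROI: {roi:.1%}")
```

Because underdogs pay more than the stake when they win, a set of picks can show a positive ROI even when fewer than half of them are correct.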

 

We will be publishing our first games on Wednesday the 15th, after the first four games are played. We won’t have any selections for the first four games as they are played by teams with identical seeds. Prior to each round, we will publish all games that our tool thinks have the best chance of the lower seed winning. We’ll also publish weekly recaps with comments on how well our predictions are doing.

 

Understand that the technique that finds a group of winners (or losers) in NCAA data can be used on any metric. Our goal is to open people’s minds to the possibilities of leveraging Machine Learning for their businesses. If we can predict things as seemingly complex as a basketball tournament (something that has never been correctly predicted), then imagine what we could do with the data that drives your decisions.

 

If you have questions on this type of analysis or machine learning in general, please don’t hesitate to contact Gordon Summers of Cabri Group (Gordon.Summers@CabriGroup.com) or Nate Watson at CAN (nate@canworksmart.com).

 

Those interested in the detailed description of our analysis methodology can read the technical version of the article found here.

**Disclaimer: Any handicapping sports odds information contained herein is for entertainment purposes only. Neither CAN nor Cabri Group condone using this information to contravene any law or statute; it’s up to you to determine whether gambling is legal in your jurisdiction. This information is not associated with nor is it endorsed by any professional or collegiate league, association or team. Machine Learning can be done by anyone, but is done best with professional guidance.



We are now accepting applications for the June 2017 cohort of the Omaha Data Science Academy!

Apply at Interface Web School’s website.

Are you interested in predictive analytics? Are you applying for jobs involving machine learning? Would you like to learn how to design and create algorithms? If so, the Oma-DSA may be a perfect fit. The Oma-DSA is designed for people who want to turn added data science knowledge into marketable skills. We use hands-on teaching from leading data scientists in the Omaha area to craft courses that will boost your knowledge exponentially. More details at canworksmart.com



The next Omaha Data Science Academy cohort starts June 13th. Beat the competition and apply early! Applications open on Monday the 16th.

Contact Nate Watson at nate@canworksmart.com or see Interface’s webpage for more details.

Are you interested in predictive analytics? Are you applying for jobs involving machine learning? Would you like to learn how to design and create algorithms? If so, the Oma-DSA may be a perfect fit. The Oma-DSA is designed for people who want to turn added data science knowledge into marketable skills. We use hands-on teaching from leading data scientists in the Omaha area to craft courses that will boost your knowledge exponentially. More details at canworksmart.com



Every week CAN will highlight a past or present CAN employee as part of a CAN alumni network series. This week we feature Grant Stanley.

Grant Stanley founded Contemporary Analysis in 2008. For 6 years he served as CEO and president before handing off the company to Nate Watson to pursue new ventures. In 2014, Stanley launched Bric, a management software system designed specifically for creative agencies. Today we highlight a post on Bric’s blog about the art of time tracking and the importance of the data it collects:

https://getbric.com/time-tracking-needs-new-purpose/



Every week CAN will highlight a past or present CAN employee as part of a CAN alumni network series. First up to bat is Eric Burns.

Eric Burns is a former employee of Contemporary Analysis. In 2011, he brought on CAN’s first international clients. Today he is the CEO and founder of Gazella Wifi Marketing, which turns restaurant guest information into a marketing tool. He continues to be an active member of CAN’s alumni network.

Here are his thoughts on analyzing wifi marketing:

http://blog.gazellawifi.com/10000-visits-to-a-coffee-shop-wifi-marketing-data



No matter what kind of website you are running, web hosting is the first pillar of Website Analytics. In this day and age, it is crucial to rank high with Google and Bing if you expect to draw organic traffic to your site. Many people never make it past page one; no one makes it past page two. It doesn’t matter if your website is a blog, an online marketplace, or simply a static landing page detailing what your business is and how you can be reached. The end goal is to create warm leads of people interested in what you are selling. How do we do this? By measuring who comes to the site, what they read, and how long they stay. But how do you know your metrics aren’t skewed by unreliable hosting? The answer lies in understanding the measurements. Here are a few things you should know:

First Pillar: Reliable Web Hosting

Uptime and Speed are the key words here. People simply get tired of waiting for a site that is ‘temporarily down’ or loading slowly, so they will invariably hit the back button to return to the Search Engine Results Pages (SERPs), never to return. It is through the SERPs that you gain organic traffic, so if you don’t have reliable hosting and your site is often down or slow to load, you won’t gain anything from all that SEO you so painfully worked for (or paid for) to move you up to that coveted first page of Google. Look for hosting from providers such as Flywheel or Best Web Hosting that give you tools to maximize your uptime as well as speed. Then, and only then, can you rest assured you won’t lose traffic due to inaccessible or slow web pages.

Second Pillar: Metrics

Once you are assured that you have a web hosting company that will keep your site up and running at speeds that won’t frustrate visitors, it’s time to start tracking who is coming to your website. Externally, this can be done by harvesting Wi-Fi log-ins (Gazella Wifi), or internally (Google Analytics). Ideally, you want to track who came, how long they stayed, and whether they moved about through internal pages or simply didn’t like what they saw and left. It should also be stated that leading visitors to other content is necessary to keep them on your website, but ultimately your goal is to get them to download something or sign up for something.
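As a rough illustration of the measurements described above, the sketch below summarizes a handful of visits into a bounce rate, an average time on site, and a conversion rate. The `Session` record and the sample data are hypothetical; real figures would come from an analytics tool rather than hand-entered values.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Session:
    visitor_id: str
    pages_viewed: int
    seconds_on_site: float
    converted: bool  # downloaded something or signed up


def summarize(sessions: List[Session]) -> Dict[str, float]:
    total = len(sessions)
    if total == 0:
        return {"bounce_rate": 0.0, "avg_seconds": 0.0, "conversion_rate": 0.0}
    # A bounce is a visit that never moved past the first page.
    bounces = sum(1 for s in sessions if s.pages_viewed <= 1)
    return {
        "bounce_rate": bounces / total,
        "avg_seconds": sum(s.seconds_on_site for s in sessions) / total,
        "conversion_rate": sum(1 for s in sessions if s.converted) / total,
    }


# Hypothetical visits for illustration only.
sessions = [
    Session("a", 1, 8.0, False),
    Session("b", 4, 210.0, True),
    Session("c", 2, 95.0, False),
    Session("d", 1, 3.0, False),
]
stats = summarize(sessions)
print(stats)
```

If slow or unavailable pages drive visitors away after a few seconds, those visits show up here as bounces, which is exactly how unreliable hosting can distort the picture.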

How does web hosting have an impact on this? Remember, you can’t get accurate results if you are losing visitors due to unreliable hosting! Your numbers will not paint an accurate picture so your analytics will be skewed.

Third Pillar: Predictive Analytics

Once you have a stable hosting company and are measuring metrics, you can begin to do things like predict which advertisements or offers get a person to buy or interact. This is done through predictive analytics. Contemporary Analysis (CAN) uses data collected from its website to train its machine learning model to offer its potential customers articles they might be interested in. This is similar to what Amazon does with products, what Netflix does with movies, and what you can do with the right data.

 

Hosting, Metrics, and Analytics are the three elements that must work together if any type of website is to survive, grow, and provide leads. Without reliable hosting, your click-through rates are diminished and your bounce rates are inflated, not by bad content, but by slow or unavailable pages, so your website metrics will be skewed. If your site is being damaged by poor hosting, I highly recommend changing hosting companies. With reliable hosting, your metrics will be accurate and your data can be used to predict customer interactions.

Your company will thank you for it.



CAN is excited to announce, in partnership with the Interface Web School, the creation of Omaha’s first Data Science Academy (Oma-DSA).

This is something we have been working on for a long time. It is actually a continuation of a service we currently offer to clients where we train a company’s first data scientist. We feel this unique person, trained in both data science and business problem solving, is needed by their company to help implement the ideology more than produce mathematical models or produce visualizations.

In the past, we heard that while companies know how to find and hire a data scientist, they fear not being able to utilize this person or even know how to correctly scope how to use predictive analytics in their business. This caused them to not execute or to execute poorly and leave a bad taste in the organization’s mouth.

CAN has discovered that having a data science advocate (instead of just a data scientist) usually fixes the hangup with implementation in most companies trying to use data science for the first time. The realization that there was a considerable lack of talent to fill this need led us to develop a school that teaches not only entry-level data science, but also how to address the political red tape prevalent in changing how an operation thinks and makes decisions.

This academy will help CAN reach its goal of putting a data science advocate in every company in Omaha. While audacious, we feel this is a must to keep Omaha companies relevant in an economy where we are in competition not just with the company down the street but with every other company doing similar work around the world.

 

Details.

This certificate will teach some of the most important techniques and tools necessary to introduce data science into company culture; get the necessary political buy-in; find, manipulate, and analyze the data present inside your company’s database; make predictions of outcomes; and create visualizations that help non-technical users understand and see the trends and patterns identified in the data.

The Oma-DSA is designed to help set a company down the road of data discovery and data-driven decision making. While not the heavy mathematician or economist created by four year degrees, the graduate will leave the Academy with the confidence and the skills of an entry level data scientist and be able to have conversations with business units, build predictive analytical MVPs, and be able to know and manage the skill sets needed for future data scientist projects.

The Certificate Consists of 4 Modules: 

  • Basics of Python Programming
  • Data Manipulation and Management
  • Statistics and Computational Modeling
  • Data Visualization

 

All classes meet 2 nights per week for 22 weeks over the course of 28 weeks for a total of 154 hours of in-class instruction to complete the certificate.

 

For more information on course offerings and to apply, go to https://interfaceschool.com/course/data-science-academy/.

You may also contact Nate Watson, director of the academy, at nate@canworksmart.com if you have specific questions about offerings or custom classes. 

 


