Related

Happy Fourth of July from CAN

on July 4

Bloggers Writing About Tableau

on June 27

After the first weekend of basketball, our Machine Learning Prediction tool has good results.

We had two measures of success: we wanted to win at least 46% of our picks, and we wanted to “win” using virtual money bet on the money lines. By both measures, we succeeded: we correctly picked 6 upsets out of the 13 games we chose (46%), and those 6 winning picks returned $1,359 (a profit of $59 on $1,300 laid at $100 per game, or roughly a 4.5% ROI).
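Money-line accounting is easy to get wrong, so here is a minimal sketch of how profit on a winning money-line bet is computed. The function is a generic illustration; the odds in the examples are hypothetical, not the actual lines we bet.

```python
def moneyline_profit(odds: int, stake: float = 100.0) -> float:
    """Profit (excluding the returned stake) on a winning American money-line bet.

    Positive odds are quoted for underdogs: +250 pays $250 profit per $100
    staked. Negative odds are quoted for favorites: -150 pays $100 profit
    per $150 staked.
    """
    if odds > 0:
        return stake * odds / 100.0
    return stake * 100.0 / abs(odds)

# Hypothetical examples (not our actual lines):
print(moneyline_profit(250))   # underdog at +250: $250 profit on a $100 stake
print(moneyline_profit(-150))  # favorite at -150: about $66.67 profit on $100
```
Summing the profits on winning underdog picks and subtracting the stakes lost on the others gives the overall return.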

The details:

Overall, there were 10 instances where the lower seed won in the first two rounds. This year is on track for fewer lower-seed wins (22%) than the historic rate (26%), so even with tough headwinds we still came close to our expectations.

“But CAN, there were multiple lower seeds that won that you didn’t pick. Why didn’t the model see Middle Tennessee upsetting Minnesota?” The answer is simple: MT’s win came down to variables we weren’t measuring. Our picks were limited to games matching our criteria, and those criteria were built from variables found in most (not all) of the games in which the lower seed won in past years. Lower seeds can and will still win outside the filter; our model was built to predict the highest number of upsets without over-picking. This is actually a perfect illustration that a model, even a great one, will not predict everything. But predicting most, or even some, outcomes can mean huge revenue increases or cost savings for a business.

Besides, we had some really close calls that would have put us well ahead. There were several games where the lower seed had a good chance of winning and simply lost (both Wichita State and Rhode Island had their games tied with under a minute to go). We picked multiple games where the money lines showed Vegas giving the upset almost no chance, yet the teams came very close.

Our goal was not to choose games in a vacuum (which is how most people bet), but to choose games that matched the criteria and to spread the risk over several probable winners. This wasn’t about picking the only upsets, or all of the upsets; it was about picking the set of games with the highest probability of the lower seed winning. And by our measures of success, we achieved our goal.

We aren’t done quite yet either.

For the next round, we have 5 games that match our criteria:

Wisconsin over Florida
South Carolina over Baylor
Xavier over Arizona
Purdue over Kansas
Butler over North Carolina

* If any additional games match our predictive criteria in later rounds, we’ll post them before tip-off.

The results of the first two rounds:

The Machine Learning algorithm performed as advertised: it identified a set of characteristics in historic data that was predictive of future results. The implication for any business is clear: if you have historic data and leverage this type of expertise, you can make useful predictions about the future.

For more information about how we created the Machine Learning algorithm and how we are keeping score, you may read the Machine Learning article here:

http://canworksmart.com/machine-learning-basketball-methodology

If you would like to see how Machine Learning could improve your business, please reach out to either of us: Gordon Summers of Cabri Group (Gordon.Summers@CabriGroup.com) or Nate Watson of CAN (nate@canworksmart.com).

 



The Cabri Group / CAN Machine Learning Lower Seed Win Prediction tool has made its first round forecast! Without further ado:

East Tennessee St. (13) over Florida (4)
Xavier (11) over Maryland (6)
Vermont (13) over Purdue (4)
Florida Gulf Coast (14) over Florida St. (3)
Nevada (12) over Iowa St. (5)
Rhode Island (11) over Creighton (6)
Wichita St. (10) over Dayton (7)

 

* If the play-in games add another predicted upset, we’ll update this list prior to the game starting.

Update: USC (11) over SMU (6)

One of the obvious observations on the predictions is: “Wait, no 8/9 upsets?” Remember, these games show the characteristics most similar to the largest historic collection of upsets. This doesn’t mean there will be no 8/9 upsets, nor that all of the predictions above will hit (remember, we are going for 47% upsets), nor that the favorites will win every game not listed. The games on the list are there because they share the most characteristics with the historic games in which the lower seed won.

Also, one of the key team members on this project, Matt, is a big Creighton fan (and grad). He was not happy to see Creighton on the list, so I’ll speak to that one specifically. In the technical notes, I indicated that one of the many criteria being used is Defensive Efficiency (DE). Our Machine Learning algorithm (Evolutionary Analysis) doesn’t like it when there is a large DE gap between the lower seed and the higher seed. Creighton actually has a lower Defensive Efficiency than Rhode Island. Sorry, Matt. Again, it doesn’t mean Creighton won’t win; it only means that the Rhode Island v. Creighton game shares more criteria with the largest collection of historic upsets than the other games in the tournament.

As we indicated, we will use the odds as well as a count of upsets to determine how well we do as the tournament goes on. We’ll have a new set of predictions on Saturday for the next round of the tournament and a recap coming on Monday.

For more information about how we created the Machine Learning algorithm and how we are keeping score, you may read the Machine Learning article here:

http://canworksmart.com/machine-learning-basketball-methodology



Machine Learning and the NCAA Men’s Basketball Tournament Methodology

 <<This article is the technical companion to the article above. Please read that article before continuing.>>

“The past may not be the best predictor of the future, but it is really the only tool we have”

 

Before we delve into the “how” of the methodology, it is important to understand the “what”: a set of characteristics indicating that a lower seed is likely to win. We use machine learning to search a large collection of characteristics for a result set that maximizes the number of lower-seed wins captured while simultaneously minimizing the lower-seed losses admitted. We then apply that result set as a filter to new games: the games that pass the filter are predicted as more likely to have the lower seed win. What we end up with is the set of criteria most predictive of a lower seed winning.
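As a rough sketch of the result-set-as-filter idea: applying the filter to new games can be thought of as checking each game against every chosen criterion. The feature names and thresholds below are invented for illustration; the real model’s criteria are not disclosed here.

```python
# Each criterion is a predicate on a game's feature dictionary. The
# feature names and thresholds are hypothetical, for illustration only.
CRITERIA = {
    "def_efficiency_gap": lambda g: g["def_efficiency_gap"] <= 5.0,
    "seed_gap": lambda g: g["seed_gap"] <= 7,
}

def predicts_upset(game: dict) -> bool:
    """A game passes the filter only if it satisfies every criterion."""
    return all(check(game) for check in CRITERIA.values())

games = [
    {"matchup": "A (12) vs B (5)", "def_efficiency_gap": 3.1, "seed_gap": 7},
    {"matchup": "C (14) vs D (3)", "def_efficiency_gap": 9.4, "seed_gap": 11},
]
picks = [g["matchup"] for g in games if predicts_upset(g)]
# picks -> ["A (12) vs B (5)"]
```
The filter makes no claim about games that fail it; it only flags the subset of games most similar to historic upsets.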

 

This result set is fundamentally different from an approach that tries to determine the results of all new games, where the goal is a single model that applies universally. A universal model carries a level of complexity and ambiguity that is another discussion entirely. By focusing on one result set (lower-seed wins), we get predictions that are more accurate than an attempt to predict every game.

 

This type of predictive result set has great applications in business. What combination of characteristics best predicts a repeat customer? A more profitable customer? An on-time delivery? This differs from forecasting demand by combining a demand signal with additional data. Think of it as the difference between a stock picker that selects stocks most likely to rise and a forecast of how far up or down a specific stock will go. The former is key for choosing stocks; the latter, for rating stocks you already own.

 

One of the reasons we chose “lower seed wins” is that almost every game played in the NCAA tournament provides a data point. The exceptions are games between identical seeds: the First Four games involve identical seeds, and the Final Four can as well. Even so, that still gives us roughly 60 games a year, and the more data we have, the better our predictions.

 

The second requirement is characteristics, and plenty of them. For our lower-seed-win model we had more than 200 different characteristics for the years 2012–2015. We used the difference between the two teams’ values for each characteristic as the input; we could have used the absolute values for both teams instead. As the analysis runs, any characteristic that isn’t needed is ignored. What the ML produces is a combination of characteristics. We call our tool “Evolutionary Analysis”: it works by adjusting the combinations in an ever-improving manner. There is a little more to the logic that allows for other aspects of optimization, but the core of Evolutionary Analysis is finding a result set.
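The inner workings of Evolutionary Analysis aren’t published here, but the general shape of an evolutionary threshold search can be sketched on synthetic data. Everything below is an assumption for illustration: a single made-up feature, a simple wins-minus-losses fitness function, and a keep-the-fittest-half mutation scheme. The real tool searches combinations of many criteria at once.

```python
import random

random.seed(0)

# Synthetic training data: (feature_value, lower_seed_won) pairs. On this
# invented feature, upset winners cluster near 0 and losers near 6.
games = [(random.gauss(0.0 if won else 6.0, 3.0), won)
         for won in [True] * 30 + [False] * 90]

def fitness(bounds):
    """Lower-seed wins captured minus lower-seed losses admitted."""
    lo, hi = bounds
    passed = [won for x, won in games if lo <= x <= hi]
    return sum(passed) - (len(passed) - sum(passed))

def evolve(generations=200, pop_size=20):
    """Keep the fittest half each generation; refill with mutated copies."""
    pop = [(random.uniform(-10, 5), random.uniform(0, 15))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        mutants = [(lo + random.gauss(0, 1), hi + random.gauss(0, 1))
                   for lo, hi in survivors]
        pop = survivors + mutants
    return max(pop, key=fitness)

best_lo, best_hi = evolve()
```
The evolved interval ends up admitting most games where the lower seed won while excluding most where it lost, which is exactly the “maximize wins, minimize losses” behavior described above.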

The result set was then applied as a filter to the 2016 tournament to confirm that it is predictive; it was possible that a result set fit on 2012–2015 wouldn’t actually predict 2016 results. In fact, the 2016 games that passed our filter produced 47% lower-seed wins, against a historic average of 26%. By chance alone, a 47% underdog-win rate would occur only about 3.4% of the time, so our result set is very likely a genuinely predictive filter.
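The style of this check is a standard binomial tail probability. The article doesn’t state how many 2016 games passed the filter, so the 3.4% figure can’t be reproduced exactly here; the sketch below assumes a hypothetical holdout of 15 filtered games with 7 lower-seed wins (47%).

```python
from math import comb

def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more successes."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical holdout: 15 filtered 2016 games, 7 of them upsets, tested
# against the historic 26% base rate for lower-seed wins.
p_value = binom_tail(7, 15, 0.26)  # small value => unlikely to be luck
```
The smaller this tail probability, the less plausible it is that the filter’s 2016 performance was random variation around the 26% base rate.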

 

The last step in the process is to look at the chosen filter criteria and check whether they are believable. For example, one of the chosen criteria was Defensive Efficiency Rank. Evolutionary Analysis chose a lower limit of… well, it set a lower limit; let’s just say that. This makes sense: if a lower seed’s defense is ranked far inferior to the higher seed’s, it is unlikely to prevail. As a counterexample, blocks per game was not chosen. In fact, most of the 200+ criteria were not used, but the handful of around ten that were form the filter that selects a population of games more likely to contain lower-seed wins.

 

And that is one of the powerful aspects of this type of analysis: you don’t get just one key driver, or even two correlated metrics. You get a whole set of filters that points to a collection of results deviating from the “normal.”

 

Please join us as we test our result set this year. We’ll see if we get around 47%. Should be interesting!

 

If you have questions on this type of analysis or machine learning in general, please don’t hesitate to contact Gordon Summers of Cabri Group (Gordon.Summers@CabriGroup.com) or Nate Watson at CAN (nate@canworksmart.com).

**Disclaimer: Any sports-handicapping odds information contained herein is for entertainment purposes only. Neither CAN nor Cabri Group condones using this information to contravene any law or statute; it’s up to you to determine whether gambling is legal in your jurisdiction. This information is not associated with nor endorsed by any professional or collegiate league, association, or team. Machine Learning can be done by anyone, but is done best with professional guidance.

 

 

 



The first run of the Omaha Data Science Academy proved successful: 4 of its 6 graduates have already found jobs in a related field. Silicon Prairie News found this noteworthy and published an article about it here.

Silicon Prairie News is a newsroom and community forum focused on start-ups in the Great Plains/MidWest region. Silicon Prairie is a venture of AIM, a non-profit organization centered around building community through technology.

For more information on the Oma-DSA, contact nate@canworksmart.com.



Every week CAN will highlight a past or present CAN employee as part of a CAN alumni network series. This week we feature Grant Stanley.

Grant Stanley founded Contemporary Analysis in 2008. For 6 years he served as CEO and president before handing the company off to Nate Watson to pursue new ventures. In 2014, Stanley launched Bric, a management software system designed specifically for creative agencies. Today we highlight a post from Bric’s blog about the art of time tracking and the importance of the data it collects:

https://getbric.com/time-tracking-needs-new-purpose/



The Omaha Data Science Academy: What is it?

In 2008, Contemporary Analysis (CAN) began helping companies build predictive analytics capabilities, mostly through project-based work. Over time, CAN recognized a rising need: more and more, businesses needed to bring predictive analytics capabilities in house but lacked the staff to do so. So, in mid-2015, CAN switched from project-based work to a staff augmentation model. Over the last year the need has grown exponentially, as CAN has been asked to train staff for more companies, sometimes two at a time. CAN decided it needed a better, standardized way to train individuals to be part of a data science team.

In July of 2016, Contemporary Analysis (CAN) announced open enrollment for the Omaha Data Science Academy (Oma-DSA), whose ultimate goal is to train a data scientist for every company in Omaha. With the help of Interface Web School and the CONNECT re-education grant, the Oma-DSA was born.

Nebraska’s role in the development of the Oma-DSA:

 

Coursework in the Oma-DSA is designed for people who already have business acumen and don’t need another degree just to qualify for an entry-level data science job; they really only need skill-based training. This goal led CAN to partner with the CONNECT Grant in Nebraska, a federal grant that provides Nebraska’s underemployed workforce with skill training and financial support to begin IT careers with companies throughout the state. The partnership was a natural fit, as both CONNECT and CAN seek to bolster Nebraska’s professional workforce with more highly trained individuals.

 

Interface’s role in development and administration:

 

In search of an example of how to run an academy, CAN connected with Shonna Dorsey of the Interface Web School. Interface offers courses that bolster technology and online-software skills to help strengthen the workforce. It appeals most to people who may already have degrees and careers but are looking for new opportunities, and class schedules are flexible for busy lives.

 

CAN was excited because Interface is both a platform for learning and a platform for teaching. It offers students an immersive learning program led by industry experts and a professional network that connects students and businesses throughout the Midwest.

 

“We understand,” commented Shonna, “that first and foremost it takes talented people to build talented people.”

 

This was in complete agreement with how CAN thought and how it wanted to run the data science academy, and the partnership was set.

 

“Interface is helping us set up the platform and teaching us the very detailed structure that goes into running an academy such as the DSA,” commented Nate Watson, president of CAN and administrator of the Oma-DSA. “Without them, we would still be back at step one.”

Who teaches the Oma-DSA?

 

The answer to this question sets CAN apart from many other data science courses: the Oma-DSA is taught by the data scientists who work at CAN. A professor may spend the day solving a problem for a client and then teach students those same techniques and solutions. There are no textbooks; students are taught scenarios that are sometimes only hours old.

 

What was the outcome of the first iteration?

 

In December of 2016, the DSA graduated 6 entry-level data scientists. Four have already been hired by local companies looking to implement data science into their daily operations. Multiple other companies have shown interest in the graduates, and many are excited to see what the next group will have to offer.

 

What did CAN learn from the first run of the academy?

 

Although the first run was successful, CAN is building improvements into the second run of the Oma-DSA, starting in January. The eighteen-week course will be divided into 4 modules: Python programming, statistics and mathematical modeling, database design, and data visualization using Tableau. The modules can be taken individually in any order; when all four are completed, the graduate receives a Fundamentals of Data Science Certificate.

 

The modular system also matters because it allows a student, or a company enrolling an employee, to take just one module. Someone who only wants Tableau, and not the entire certificate, can enroll in that module alone. Students can also test out of a module: a database administrator, for example, can skip the database design class, complete the other three, and still receive the certificate.

 

What is the future for the Oma-DSA?

 

In the second half of 2017, CAN hopes to offer masters-level classes in Tableau and machine learning to continue education beyond the Fundamentals certificate. CAN is also researching customized classes for vertical-specific problems and solutions.

The next class begins January 23, 2017. You can apply here.

There is nothing else like the Oma-DSA in Omaha, NE, or the wider Great Plains area. Omaha therefore has the potential to become known internationally as a hub for budding data scientists, and companies in Omaha gain an enormous advantage from their proximity to highly educated and expertly trained data scientists.



Every week CAN will highlight a past or present CAN employee as part of a CAN alumni network series. First up to bat is Eric Burns.

Eric Burns is a former employee of Contemporary Analysis. In 2011, he brought on CAN’s first international clients. Today he is the CEO and founder of Gazella Wifi Marketing, which turns restaurant guest information into a marketing tool. He continues to be an active member of CAN’s alumni network.

Here are his thoughts on analyzing wifi marketing:

http://blog.gazellawifi.com/10000-visits-to-a-coffee-shop-wifi-marketing-data



This summer, long-time employee Nate Watson took over as president and owner of Contemporary Analysis. Now settled in his new role, Watson has impressive plans for CAN’s future.

Since 2008, Contemporary Analysis (CAN) has helped over 100 companies use predictive analytics to find patterns in their business data. CAN works with data businesses already collect, exploring those patterns to figure out what is likely to happen next. CAN has worked with some of the largest companies in the Midwest, including Kiewit, Gavilon, Mutual of Omaha, Blue Cross/Blue Shield, and West, and holds a reputation for solving the hardest problems in the Omaha data science field.

In mid-2014, after 150+ completed projects, co-founder and CEO, Grant Stanley decided it was time for a new leader to run the company. Stanley appointed then Senior Project Manager Nate Watson to run daily operations while he worked on a new project implementing machine learning into project planning and time management. Stanley’s new company, Bric, launched in late 2014.

Over the next year, Watson kept his eyes open for new ideas on how to make CAN’s culture and ideology work in today’s world. The idea for the new staff-augmentation model (see below) grew from the need to offer companies a solution that didn’t require massive political buy-in and budgets just to build a proof of concept. The idea struck a chord with two friends of Watson’s, who decided to invest in the new ideology and buy out Stanley.

CAN’s motto is “Empower the great to build something greater.” Watson chose two investors who believe in empowering CAN to be something greater.

 

Through the transition, the mission of the company remained unaltered, albeit expanded. Watson partnered with two investors to help him with the buyout.

CAN’s new investors are Nick and Carrie Rosenberry. Both Nebraska natives, the Rosenberrys recently moved back to Nebraska after a stint in Minnesota. They bought into the business because they see a promising future in the data science industry.

“We were looking for a company poised to be on the bleeding edge of a bleeding edge industry. CAN completely fit the bill,” said Carrie Rosenberry.

Carrie is from Tekamah, Nebraska. She received her BS in Mechanical Engineering from UNL while also participating in the Raikes School. She then attended University of Minnesota Law School, where she graduated Magna Cum Laude. She will serve as General Counsel for CAN.

Watson remarked, “Having a lawyer on your team means we can build the ideology behind both the investment group and the agri-tech incubator (scheduled for development next year) using someone who understands the ultimate goal of CAN.”

Nick Rosenberry hails from Scottsbluff, Nebraska. He graduated from UNO with a bachelor’s and a master’s in Architectural Engineering before earning his MBA from the Carlson School of Management at the University of Minnesota. He serves as Chairman of the Board and as a source of general business-management wisdom for CAN.

With a lawyer on the team and an MBA on the board, Watson believes he has built the team to bring data science to every company, regardless of vertical or size.

With the Rosenberrys on his side, Watson unveils a new business plan.

 

While not drastically changing its core business, CAN wants to change how companies interact with data science consultants by shifting its main business model from a project model to a staff augmentation model. Previously, when a company needed a project done, it hired CAN; CAN did the job, the company paid, and CAN moved on to a different project.

A staff augmentation model, on the other hand, means that CAN provides a data scientist to work directly for the client. By giving businesses the option of hiring a part-time data scientist, companies no longer need to scope one-off projects and create extra budgets. Instead, a company can test how a data scientist works within its culture, figure out how to implement the ideology, and create the roadmap for success long after CAN’s data scientist has been replaced with its own.

This, however, has created a new problem: how and where to recruit the talent needed to continue data science initiatives after CAN, as a consultant, has left?

CAN believes the answer lies in one of its new creations, the Omaha Data Science Academy (Oma-DSA). The Oma-DSA is a twelve-week course designed to train entry-level data scientists who have business acumen but lack a few of the key skills needed to take on corporate projects.

The Oma-DSA is designed to augment a person’s existing degree with advice and training from real data science experts in the field, providing companies with entry-level data science talent to hire and run their new capabilities.

The first run of the Oma-DSA is this September.

Nate Watson and everyone involved with Contemporary Analysis is ecstatic about these new ventures.

 

CAN has always empowered the great. Under Nate Watson’s new ownership, CAN now has the time and resources to empower greatness within itself.

For more information on the Oma-DSA, or anything else in this article, contact Nate Watson below.



As a data scientist at CAN, you will have the unique experience of working at a startup while also consulting with Fortune 500 companies. You get to solve critical problems that other people have not been able to solve. The work environment is fast paced, and you will be expected to work independently.

Data Scientists have to embrace complexity. They have to understand math, computer science, and business. They have to be able to work with data to find patterns, and use those patterns to create value for the business.

As a data scientist you have to figure out how to solve the problem, find the tools you need, and build the solution. This takes a level of determination that is rare, but required. As a data scientist you will have to be willing to get dirty sifting, sorting, structuring, categorizing, analyzing and presenting data.


Winning Political Campaigns with Predictive Analytics

 

Elections are won through the individual decisions of thousands of voters. With Predictive Analytics, even small campaigns can now micro-target the voters they need, talk about the issues those voters care about, and excite them enough to turn out and vote, all without exciting the voters who would turn out and vote against the candidate.

 

Download our eBook to learn how Contemporary Analysis (CAN) is helping campaigns implement Predictive Analytics, and how this slight change gives campaigns a distinct competitive advantage to win!

 


