At Contemporary Analysis we know data inside and out. However, if you don’t know what could be done with data you’ll forever be in the dark.
Decision makers at companies that haven’t already embraced data don’t have the time or the desire to dive deep into data-driven decisions. At best, they need a way to understand the high-level concept. The people who are already working with data need that buy-in to help fund projects.
Enter Data Hierarchy, a series designed to be educational and even a little entertaining. The series helps decision makers understand the value of doing things right from the start. This includes having a vision of where you’re going to help avoid costly mistakes.
Originally released on our blog, Data Hierarchy is available as an E-Book for free. To get a copy simply fill out the information below and you’ll be taken to the download page.
At Contemporary Analysis we do Data Science Consulting. A large number of the people reading this just said “huh?” and that’s fine…that’s what this is for.
Data Science is an integral part of everyday life at this point and you just don’t know it. As a society we’re generating more data than ever before. Smart businesses are tapping into that data to do things that were previously unheard of.
Take Facebook for example. 20 years ago Facebook didn’t exist; now people are addicted to it and seemingly can’t live without it. But even then, people are still wary of the dreaded “Facebook algorithm” that cuts 50% of the posts you might want to see. That algorithm is data science at work.
That’s right, you’ve generated enough data that Facebook wrote some code to cut 50% of your friends out of your life. You didn’t interact with them enough, they didn’t post enough; there are hundreds of reasons why that system feels like your college roommate’s buddy from down the hall with the cat doesn’t need to be at the top of your feed. It also looks at what you read on a regular basis and then tries to predict what you would want to read next.
So to help people truly understand what we do as a company (and, let’s be honest, to help you hire us), we put together a series on the sophistication of data usage as businesses mature that we call the Business Data Hierarchy. The goal of this series is to help people and companies understand where they are now, and where they could go with data-driven decision making.
We’ve written the series to be informative and insightful, with a splash of humor mixed in to keep you awake through the whole process. If you like it or if you feel like someone needs to read this…we ask that you share the info or…better yet…get them in touch with us and we’ll bring the show to you! The pyrotechnic guys tell us we’ll need a 25’ ceiling for the fire and lasers…Hey, it’s a good show.
…this will also be the longest post of the entire series, don’t worry!
When you look at Data, and what it can do for you and your company, there are six different levels of Data Hierarchy. It’s a hierarchy because each level is codependent on another.
These levels are important to understand because jumping from one to another, without a long term goal, can be cost prohibitive. This is even more devastating when you finally get your executive level to believe in the power of data, and it breaks the bank in the execution.
There are consultants with lovely summer and winter homes who have paid for them “skipping” to the end and then back billing/building the solutions.
To insulate against catastrophic failure of a data-driven initiative, we at Contemporary Analysis (CAN) have created a Data Hierarchy to help companies understand where they are and, more importantly, where they are going. This understanding helps drive the strategy and vision needed to be successful. These levels are:
Reporting: Tracking and “What happened?”
Business Intelligence: “What just happened?”
Descriptive Data: “Why did that happen?”
Predictive Data: “What is going to happen next?”
Prescriptive Data: “What should we do to make it happen?”
Omnipotent AI (Skynet): “Automated Doing of its own recommendations” a.k.a. “Terminator Movies”
Every business is trying to move “forward”. If you work for a company whose response is anything but “forward” or “more”, start polishing up your resume; you’ll need it sooner rather than later.
Most companies are so focused on today’s business they don’t know what the path to the future looks like.
Imagine you tell a CEO you’re going to walk a mile to get another 1 million in sales. Most CEOs would look at the distance and agree that a short distance is worth the time and effort to get the additional revenue.
You and your team(s) work feverishly to get from point A to point B as quickly as possible. You cross the finish line and there’s your 1 million. The CEO checks the box and there it is, project complete.
Now imagine if you told a CEO you’re going to get 20 million in sales. After the confused look and possible laughing subsides you tell them how. Instead of a mile, you have to walk 15 miles. But you’re not going to do them all in 1 year. Instead you’re going to walk that distance over 5-6 years. You’ll measure success with each mile you pass and each mile will result in ROI for the company.
You also let them know that you can cover the ground when and how you want to. If one mile is too tough to work in the time and effort this year, you postpone it to the next. If, as you’re walking, a business need changes and you need to walk a completely different direction you can. The steps remain the same but the road you use to get there is slightly different.
Understanding the long-term goal allows you and your team(s) to work smarter, not harder. You’re building toward the vision at every turn, so you have little to no wasted effort. And, because you’re building over time, you can staff accordingly for each mile and access the right talent at the right time.
Part of CAN’s role is being that “Data Visionary” that helps you see over the horizon with possibilities. The hardest part of this whole process is getting the decision makers in an organization to embrace the culture of change.
“We’ve done it this way for __X__ years and it works just fine” is becoming the leading indicator of a dying business. If you’re 40 years old, the technology available today wasn’t even conceptualized when you were in grade school. “We’ve done it this way for 50 years…” means you’re already behind the curve.
The posts that follow will walk you through each level of the Business Data Hierarchy concept. We’ll be sure to include examples that are relatable. The subject matter can be a bit dry, so we’ll also make sure we include some humor along the way to keep things lively. We’re a Data Science Consulting firm, not monsters, after all.
Posts will be made to our blog and to our LinkedIn page. We encourage you to share, re-post, forward, email, and contact us if you have questions at any time.
CAN partners with a number of software providers to best serve our clients. We will build in whatever works best for your particular case. While we have no preferences, we certainly have favorites. We sell, support, and teach how to use:
Contemporary Analysis (CAN) and Cabri Group have teamed up again to use Machine Learning to predict the 2018 NCAA Men’s Basketball Tournament. This is different from last year, as we are picking the entire 2018 bracket instead of just upsets.
Historically, only 26% of the tournament’s games end in an upset (this includes games from all rounds). That’s 17 out of 64 games. Last year we did really well, failing to predict only 3 upsets and getting 50% of our predictions right. We are going to need to improve a bunch to win that $1M/year for life from Berkshire Hathaway, including that wee bit about having to work for Berkshire Hathaway to be eligible. This year we added far more variables and used an ensemble model. Will we be perfect? Probably not. Here is the problem with using Machine Learning to try and predict a perfect bracket:
A). Error propagates itself through the bracket. This is why the odds of a perfect bracket are around 1:128 billion. If you pick San Diego State to upset Houston and then Houston actually wins, you will lose the entire region. Perfection may come down to a 6/11 game that no one would normally care about, except it’s the tournament, and everyone cares about every game.
Side note: The machine learning is, in fact, picking Houston by the slimmest of margins. However, if San Diego State wins, the machine learning is actually picking them to go on to beat Michigan, Providence, and then Ohio State to win the entire region.
B). Machine Learning and Predictive Analytics aren’t about being 100% accurate. You wouldn’t want to pay for that kind of accuracy even if it were possible. We are trying to be less wrong for companies. This is why predicting upsets made sense, and why predicting the whole 2018 NCAA bracket is so hard. Figuring out who is most likely to be an outlier (churn) is something we do all the time. And we can afford to err on the side of being wrong. We would just tell you to call both Houston and San Diego State (in this instance), because calling them to talk about staying at your company has no ill effect (i.e., there is very little cost to being wrong in this example). There is a huge cost to being wrong in the later rounds of the tournament, as you are predicting the next game based on your assumption of correctly predicting the last game.
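To put some rough numbers on point A, here is a minimal Python sketch of how per-game accuracy compounds across a bracket. It rests on two simplifying assumptions that are ours, not part of the model: a 64-team bracket has 63 games, and every pick is treated as equally likely to be right and independent of the others. The function names are made up for illustration.

```python
import math

# Assumption: a 64-team bracket has 63 games, and each pick is an
# independent event with the same chance of being right.
GAMES_IN_BRACKET = 63


def perfect_bracket_probability(per_game_accuracy, games=GAMES_IN_BRACKET):
    """Chance that every single pick is right if each pick is right
    with probability `per_game_accuracy`."""
    return per_game_accuracy ** games


def accuracy_needed(odds_against, games=GAMES_IN_BRACKET):
    """Per-game accuracy that would produce 1-in-`odds_against` odds
    of a perfect bracket under the same assumptions."""
    return math.exp(-math.log(odds_against) / games)


if __name__ == "__main__":
    for accuracy in (0.50, 0.70, 0.80, 0.90):
        chance = perfect_bracket_probability(accuracy)
        print(f"{accuracy:.0%} per game -> 1 in {1 / chance:,.0f}")

    implied = accuracy_needed(128e9)
    print(f"1 in 128 billion implies about {implied:.1%} per game")
```

Under these assumptions, odds of 1:128 billion correspond to getting roughly two out of every three games right. Even a picker who is right 90% of the time is still a long shot for perfection, which is the whole point about error propagating.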
Without further ado, here is what the Machine Learning algorithm predicted as the bracket:
If you have questions on this type of analysis or machine learning in general (or if we are perfect and you would like to congratulate us), please don’t hesitate to contact:
Gordon Summers of Cabri Group (Gordon.Summers@CabriGroup.com), or
Nate Watson at CAN (firstname.lastname@example.org).
Now for some disclaimers:
Understand that the technique that finds a group of winners (or losers) in the 2018 NCAA bracket can be based on any metric. Our analysis isn’t meant to support gambling, but to open people’s minds to the possibilities of leveraging Machine Learning for their businesses. If we can predict things as seemingly complex as a basketball tournament (something that has never been correctly predicted), then imagine what we could do with the data that drives your decisions.
We will be keeping score using the very traditional 1,2,4,8,16 point process.
**Any handicapping sports odds information contained herein is for entertainment purposes only. Neither CAN nor Cabri Group condone using this information to contravene any law or statute; it’s up to you to determine whether gambling is legal in your jurisdiction. This information is not associated with nor is it endorsed by any professional or collegiate league, association or team. Machine Learning can be done by anyone, but is done best with professional guidance.
At the beginning of the project, we set out to show how the 2017 NCAA College Basketball Tournament could be a proving ground for Machine Learning analysis. There are very few places in the world where we can use the same model to predict multiple outcomes in a short period of time, have a ready-made scorecard (Vegas), have the general public understand what we are trying to do, and have a chance to “beat” the algorithm with their own knowledge.
You could say our findings have been a “Slam Dunk” (I couldn’t help myself).
Before diving into the results, I wanted the reader to understand what we were up against. It’s easy to pick chalk (always picking the better seed). In fact, that is how the games are supposed to work. The 8 seed is supposed to beat the 9. And for the most part, the NCAA does a decent job. Historically, only 26% of the tournament’s games end in an upset (this includes games from all rounds). That’s 17 out of 64 games. This was never going to be easy.
We predicted 20 upsets and got 10 right (50%). We only missed predicting 3 upsets.
Using Vegas as a scorecard and having bet $100 “dollars” on each predicted upset, we would have ended up +$2,605 off our simulated bets (a 30% ROI)–the majority of this coming from long shot underdogs.
Think about this. If we had bet all chalk on games except the ones the algorithm predicted as upsets, then out of 61 games we would have missed only 13. That’s 79% accurate!
Let’s look at this another way. Our algorithm predicted 77% (10/13) of something that is only 26% likely to happen in the first place. Now think about what you would do if you could identify an unlikely event in your business with 77% accuracy.
What would you do if you knew 77% of the customers who were going to leave before they left?
What would you do if you knew 77% of failed batches before they happened?
What would you do if you knew 77% of your plant’s machine failures before they happened?
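For anyone who wants to check the math behind the 50%, 77%, and 79% figures, here is a short Python sketch using only the counts quoted above. The variable names and the "precision" and "recall" labels are ours; they are the usual terms for these two ratios.

```python
# Counts quoted in this post.
predicted_upsets = 20   # upsets the algorithm called
correct_upsets = 10     # called upsets that actually happened
missed_upsets = 3       # upsets that happened but were not called
games_played = 61       # games played so far

# Every upset that actually happened: the ones we caught plus the ones we missed.
actual_upsets = correct_upsets + missed_upsets

# Precision: of the upsets we called, how many happened? (10 / 20)
precision = correct_upsets / predicted_upsets

# Recall: of the upsets that happened, how many did we call? (10 / 13)
recall = correct_upsets / actual_upsets

# Chalk everywhere except our upset picks: wrong on every bad upset call
# and on every upset we failed to call.
wrong_picks = (predicted_upsets - correct_upsets) + missed_upsets
accuracy = (games_played - wrong_picks) / games_played

print(f"precision: {precision:.0%}")
print(f"recall:    {recall:.0%}")
print(f"accuracy:  {accuracy:.0%}")
```

The 77% is the recall number: how much of a rare event was caught ahead of time. That is the figure that matters in the churn, failed batch, and machine failure questions above.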
You have a theory that some of your clients would buy more “product” if they were called and offered an upgraded deal. However, you don’t want to call all of your clients because you have so many. What you do have is a dataset of past customers who successfully responded to this type of nudge. Using your data, our machine learning algorithm could predict a set of your clients that would be 77% likely to purchase more product if called.
Game changer right?
Why this is huge
Our Machine Learning project set out to predict, as accurately as we could, lower-seeded teams winning in the NCAA tournament. Our stated goal from the beginning was to get 47% of our picks correct and a mere 10% ROI. We beat both of those goals. Our Machine Learning algorithm, which uses a custom optimization engine called Evolutionary Analysis, compared 207 different metrics of college basketball teams and their results in prior tournaments. It selected ranges of those 207 measures that best matched up with historic wins by lower-seeded teams. We then confirmed that the ranges were predictive by testing the selected ranges against a “clean” historic data set. This comparison is how we got our goal percent and ROI. We then published our forecasts before each round was played. The results speak for themselves.
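To give a feel for the "pick a range, then test it on clean data" idea, here is a deliberately tiny Python sketch. It is not the Evolutionary Analysis engine: it swaps in a plain brute-force search over a single made-up metric, and every number in the toy data sets is invented purely for illustration.

```python
def hit_rate(games, low, high):
    """Return (upset rate, game count) for games whose metric is in [low, high]."""
    picked = [g for g in games if low <= g["metric"] <= high]
    if not picked:
        return 0.0, 0
    return sum(g["upset"] for g in picked) / len(picked), len(picked)


def best_range(train, min_games=3):
    """Brute-force stand-in for the optimizer: try every [low, high] pair of
    observed values and keep the one with the highest upset rate, as long as
    it covers at least `min_games` historic games."""
    values = sorted({g["metric"] for g in train})
    best = (0.0, None, None)
    for i, low in enumerate(values):
        for high in values[i:]:
            rate, count = hit_rate(train, low, high)
            if count >= min_games and rate > best[0]:
                best = (rate, low, high)
    return best


# Invented toy data: one hypothetical metric per game, plus whether the
# lower seed won.
train = [
    {"metric": 0.1, "upset": False}, {"metric": 0.2, "upset": False},
    {"metric": 0.3, "upset": False}, {"metric": 0.4, "upset": True},
    {"metric": 0.5, "upset": True}, {"metric": 0.6, "upset": True},
    {"metric": 0.7, "upset": False}, {"metric": 0.8, "upset": False},
]
# "Clean" games the search never saw, used only to check the chosen range.
holdout = [
    {"metric": 0.45, "upset": True}, {"metric": 0.55, "upset": False},
    {"metric": 0.5, "upset": True}, {"metric": 0.9, "upset": False},
]

train_rate, low, high = best_range(train)
holdout_rate, holdout_count = hit_rate(holdout, low, high)
print(f"range [{low}, {high}]: {train_rate:.0%} on training data, "
      f"{holdout_rate:.0%} on {holdout_count} clean games")
```

Notice that the range looks perfect on the data it was tuned on and less impressive on the clean games. That gap is why goals should come from the clean data set and not from the data used to choose the ranges.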
While we still have 3 games to go, our initial point that Machine Learning can help you be better at making decisions from your data has been proven. Implementing Machine Learning isn’t hard so long as your business has these three characteristics:
A data set with a large number of characteristics
A measure of success to optimize upon
A desire to learn from data to make changes in your organization
If this sounds like something that your business could use, please contact Nate Watson of CAN (Nate@CanWorkSmart.com) or Gordon Summers of Cabri Group (Gordon.Summers@CabriGroup.com) today.
Here is a summary of our picks from the beginning of the project ($ indicates our successful pick where “money” was made):
East Tennessee St. over Florida
$ Xavier over Maryland
Vermont over Purdue
Florida Gulf Coast over Florida St.
Nevada over Iowa St.
$ Rhode Island over Creighton
$ Wichita St. over Dayton
$ USC over SMU
$ Wisconsin over Villanova
$ Xavier over Florida St.
Rhode Island over Oregon (tied with a minute to go)
Middle Tennessee over Butler
Wichita St. over Kentucky (tied with a minute to go)
Wisconsin over Florida (OT last second shot)
$ South Carolina over Baylor
$ Xavier over Arizona
Purdue over Kansas
Butler over North Carolina
$ South Carolina over Florida
$ Oregon over Kansas
And for those who are curious, our algorithm has detected one Final Four upset for this weekend:
The Tableau data visualization above, found at Tableau Public, shows the “Top 100 Songs of All Time Lyrics”. Click here to hover over each square and see what words were used in which lyrics. Tableau is software that converts data into graphs, charts, and images.
CAN’s data scientists love sorting through piles and piles of spreadsheets and numerical data, but it’s not for everyone. There are some amazing tools that convert raw data into visualizations. They help bring out the story of data, so everyone can understand it.
Here’s an old favorite from our blog about the importance of visualization. It’s a way for us at CAN to gear up for the next round of Tableau students at the Omaha Data Science Academy!
We are still accepting applicants for the third round of the Oma-DSA! You can apply here. We accept applications until three weeks before the start date, and start a waiting list after the spots are filled.
Thinking about buying sales leads? Here are 10 questions that you should ask first.
1. What is the minimum purchase?
List brokers try to capture as much of your marketing budget as possible. They do this by setting minimum purchase amounts and charging for filtering: both encourage larger purchases. So while you might find a broker with low minimum purchases, there is a good chance they charge high fees to filter their lists.
The key is to find balance. Often, buying an extra thousand sales leads won’t cost as much as the first thousand. However, you might not want to use them all. You want to avoid using sales leads that don’t fit your target audience, because interrupting the wrong people is a good way to erode the credibility of your brand (and is a waste of your time and resources). Buying names and contact information is the cheapest part of marketing and selling. You should only use the leads that are the best fit for what you sell; even if that means not using every name. Read more…
Every sales organization requires three things: sales managers, salespeople, and sales leads. In principle, the formula is simple: the sales team will meet their quota if the sales manager focuses the salespeople on the right sales leads.
Most sales organizations know how to find salespeople and sales managers, leaving sales leads. There are 4 sources of sales leads: 1.) referrals, 2.) conferences and trade shows, 3.) inbound marketing and 4.) proactive sales. Each source has its pros and cons: the key is selecting the right sources for what you sell.
For example, there are businesses where referrals are often the best or the only way to grow. These “word-of-mouth” businesses tend to offer services that are intimate, offer solutions to frequent problems, and have limited marketing resources.
However, most businesses need more than one source of leads to maximize revenue. Not having the right combination of sources stagnates growth and increases your cost of client acquisition. Different lead sources vary in the amount of upfront investment, sophistication required, and payback period.
What if you knew which prospects to focus on for the best results?
Working with a Top 10 Online University, CAN used predictive analytics and data science to find patterns in their admissions data to help them make better decisions and focus their efforts.
We developed a model showing which prospects were most likely to convert, which needed extra attention, and which were unlikely to enroll at all. Armed with these insights, they are able to put their most valuable resources, time and money, towards building relationships with the prospects that mattered, instead of wasting their efforts trying to engage uninterested individuals. Read more…