From the archives "Why Become a Data Scientist?"

Did you know CAN’s blog is full of sound data science related advice dating back to the beginning of CAN? In case you didn’t, we make it a habit to regularly re-post our favorites. What follows are reasons why you should consider becoming a data scientist. If it grabs you – check out the Omaha Data Science Academy. It might be the first step in your data science career.

Why Become a Data Scientist?

Too few of today’s college students realize they want to be data scientists when they graduate. We believe that data scientists are the future, and that we are on the edge of a data science revolution. So we decided to explain why you should become a data scientist.

1. As a data scientist, you have incredible access across the business.

Your job of modeling specific business strategies and forecasts requires you to have broad access across your company. People look to you to bridge the gap between business theory and relevant data.

This is a tough role because you have to build consensus before the results of your work can be implemented. Since the days of the English Luddites (the anti-technology loom weavers) there have been people who are against technological progress and the efficiency it brings to the economy. The best data scientists will be able to manage the political and social change that comes from their work. Data science success isn’t only about making work more productive; it is also about helping other people adjust and succeed.

2. Being a Data Scientist is a specialized field.

The requirements to be a data scientist are long, because the decisions they make impact thousands of people. Data scientists usually have a 3.5 GPA or higher. They must have the ability to learn and share several different forms of knowledge, including principles of computer science, high business acumen, and complex math. Learn more about how to become a data scientist.

3. You have the opportunity to work with top level management extremely early in your career.

While this sounds great, it is also challenging. You need to be comfortable giving board room presentations to people who don’t understand what you’re talking about. A specific aspect of your position is to clearly articulate why your results are useful and valid — and do it without math speak. Learn more about presenting business intelligence.  

4. The best data scientists never settle, and question everything.

Whereas a statistician starts with a data set and a problem, a data scientist has a more difficult task. A really great data scientist will constantly ask, “are we solving the right problem?” Often the perceived problem won’t match your data, requiring you to look at everything from a new perspective.

A data scientist spends energy asking machines questions and then trying to validate the answers, instead of spending energy trying to address the question directly. This requires a different work process, one that requires humility and understanding. Data scientists know that while they are the ‘go to’ person in the organization, they don’t have all the answers. However, at the end of the day they are still responsible for finding the answers, which is why they get paid the big bucks. Data science is a complex science as opposed to a simple science.

5. You use artificial intelligence to automate the most routine, frustrating jobs known to mankind.

Instead of doing routine tasks, you can be responsible for automating the most tedious aspects of business, while saving your customers money and making enterprise more efficient.

If you are interested in learning more about Data Science and Predictive Analytics, download our free eBook — Predictive Analytics: The Future of Business Intelligence.

Happy Fourth of July from CAN

Cheers to a happy and safe Fourth of July! For our celebration, we’re sharing this Tableau visualization about the growth of our United States.
Data + history = fun. Check it out here.


Python or R – CAN’s Advice on How to Choose

The age-old Python or R debate always rages here at CAN. While we have a pretty impressive staff of data scientists who all have their individual quirks (some like to run in their spare time, some bird watch, some binge-watch obscure sci-fi), they have something in common. They work hard, around the clock if they have to, to accomplish projects and put their best foot forward for clients.

But they do differ in one big way. Some use Python and some use R.
So, today, we let them debate: Python v. R — which one is for you?

If you’re completely new to the computer programming discussion

Webopedia defines computer programming language as “A vocabulary and set of grammatical rules for instructing a computer to perform specific tasks.” How does one talk to computers? In code. It gets tricky, however, because there are a lot of different codes that computers can understand. There are not just 10, 20, or 30 different computer languages. There are hundreds and hundreds of them. You can browse a full list here. Python and R are just two of the most popular for data science.

For some additional help, we’ve compiled a list of terms that will help you understand the background of this topic (inspired by LinkedIn).

Programmatic thinking. It’s exactly what it sounds like. It’s a way of thinking that you have to turn on when you learn computer programming. It means seeing the large problem as a series of smaller steps. It also requires being able to transcribe ideas into a code that computers understand.
Compiled and interpreted languages. Compiled languages must be translated into machine code by a compiler before the program can run. Interpreted languages are run directly by an interpreter, with no separate compile step.
API. API stands for application programming interface. Basically, it’s the documented set of functions and rules that designers publish so other programs can use the features of a language or piece of software.
Pseudocode. It’s like code, but not. It’s shorthand for standard code and helps programmers with outlining before they dig into bigger coding tasks.
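To make programmatic thinking and pseudocode a little more concrete, here is a minimal sketch in Python. The task (averaging monthly sales), the function name, and the numbers are all made up for illustration. The comments at the top are the pseudocode outline; the function below is one way to transcribe it into real code.

```python
# Pseudocode outline (the "series of smaller steps"):
#   for each month of sales:
#       add the month's total to a running sum
#   divide the running sum by the number of months
#   report the average


def average_monthly_sales(monthly_totals):
    """Return the mean of a list of monthly sales totals."""
    if not monthly_totals:
        raise ValueError("need at least one month of data")
    running_sum = 0.0
    for total in monthly_totals:
        running_sum += total
    return running_sum / len(monthly_totals)


# Hypothetical figures, for illustration only.
example_totals = [1200.0, 950.0, 1430.0]
print(average_monthly_sales(example_totals))
```

Each line of the outline maps to one or two lines of code, which is the whole point of sketching in pseudocode first.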
Armed with a few definitions, let’s jump into the debate.

Python v. R: Where to Start

First, we’re going to hit you with the hard truth. In order to succeed in the data science world, you need to be familiar with both languages (or at least good at one and familiar with the other). Particularly in Omaha, where CAN is headquartered and data analyst jobs are highly competitive, knowledge of both languages gives you a leg up on the competition. In fact, we have training classes through the Omaha Data Science Academy that teach both.
But that’s not what you want to hear, we know that. So we’re still going to break the two down and tear them apart in comparison.

Both Python and R are good at . . .

Python and R are both free to download, and the learning curve is about the same once you’ve already mastered some basic programming skills. They’re both impressive to master, so in that way you can’t go wrong. No one will shame you for mastering one and not the other.

Python Positives

Python is known for data munging, data wrangling, website scraping, web app building, and data engineering.
Let’s say you’re tackling a project with a lot of disparate data. Maybe you’re collecting sales data from the past 5 years for a company to help them predict new trends. The problem is that the company has had several turns in management, and that data is stored in multiple locations. Python would be more helpful in this situation. It succeeds as a software for gathering data from many databases and making it one.
If you already know Java or C, Python is going to come more naturally to you, because many of the core concepts carry over.
It is an object-oriented programming language, so it lends itself to writing large-scale, robust code. Some people also say there is data showing that more business owners are looking for people proficient in Python than in other languages.
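The scattered sales data scenario above can be sketched in a few lines. This is only an illustration under stated assumptions: the CSV export, the in-memory SQLite database, the table and column names, and every number are hypothetical stand-ins for whatever systems a real company would have. It uses only Python's standard library (`csv`, `io`, `sqlite3`).

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export left behind by one management team.
CSV_EXPORT = """year,region,sales
2013,East,100
2014,East,120
"""


def load_csv_rows(text):
    """Parse CSV text into dicts with consistent field types."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"year": int(r["year"]), "region": r["region"], "sales": float(r["sales"])}
        for r in reader
    ]


def build_example_db():
    """Create an in-memory database standing in for a second system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, sales REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(2015, "West", 90.0), (2016, "West", 140.0)],
    )
    return conn


def load_db_rows(conn):
    """Read the same fields from the (hypothetical) sales table."""
    cursor = conn.execute("SELECT year, region, sales FROM sales ORDER BY year")
    return [
        {"year": year, "region": region, "sales": float(sales)}
        for year, region, sales in cursor
    ]


def combine_sources(csv_text, conn):
    """Merge both sources into one list sorted by year."""
    rows = load_csv_rows(csv_text) + load_db_rows(conn)
    return sorted(rows, key=lambda r: r["year"])


combined = combine_sources(CSV_EXPORT, build_example_db())
for row in combined:
    print(row)
```

In practice many teams would reach for a library such as pandas for this kind of work; the standard library is used here only to keep the sketch self-contained.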

Positives of R

R has better visualization tools than Python. Its roots in the S language go back decades, which means there are plenty of online support communities for it. There are over 5,000 packages you can find on the internet to run alongside R and boost its capabilities.
R is known for being great at statistical modeling, graphing, and converting math to code.
Perhaps you’re working on a project for a company that has a nice and neat database. The problem is, it’s difficult for most people to look at a bunch of numbers and understand trends. R is the most helpful for these situations, as it can successfully take data and make it into graphs and pictures for others to understand it.

Let’s talk to CAN

In attempt to settle this debate, we’ve brought in some professional opinions.
Matt Hoover, Director of Data Visualization, Flywheel: Matt sees R used as a more efficient math language, emphasis on the word “math”. It can achieve in one line of code what Python needs several lines to accomplish. R’s specialty is research, statistics, and data analysis, so it’s more efficient on the stats side. He continues, “Python is way more flexible as a language overall and can be used to do a wider range of things.” Matt sees R used more in learning settings than in the field, and sees Python used for more high-level data science.
Essentially, R is easier to learn and better on the math/statistics side, but overall Python has more capabilities.
Gordon Summers, Senior Data Scientist, CAN: Gordon’s advice is a bit more far-reaching. He says, “The hardest thing about picking between Python and R isn’t choosing which one to start learning, it is in choosing when it is time to stop learning it”. Basically, Gordon’s advice is to not focus so much on which language to master, but instead realize that something new could come along at any time, so don’t invest too much time in one.

In summation

If you work consistently with clean data, and your goal is to dissect the data and create visualizations from it, go with R. If you have messy data that you need to “wrangle,” Python is more helpful.
Still stuck? Answer the following questions to help you navigate the Python v. R world.

  • What are your teammates using? Maybe you just got a job in data science and can’t decide which one to learn. Look around – what are your friends and fellow employees using? Are they successful in their work?
  • What are the data trends of your job market? It wouldn’t be inappropriate for you to call up a company that just posted a data science job and ask what they would prefer. Get a feel for the market, and decide from there.
  • Whose data are you working with? Is the data messy and needs to be gathered? Python is your answer. Is your data clean and needs to be visualized? Go with R.

You can’t go wrong

Neither Python nor R is perfect. Both have drawbacks, but packages exist to ease those pains. Examples of such libraries can be found at https://elitedatascience.com/r-vs-python-for-data-science.
To summarize more thoughts by Gordon Summers, the IT world is changing. He says, “To do development is to use the application and to use the application is to do development. There is no IT person and no business user. The person is both a developer and a business user. One of the reasons that larger organization have struggled to embrace Python and R is that frequently there is an organizational barrier between IT and Business.” When you enter the programming language, data science, or IT world, be ready to be flexible. Businesses are still struggling to figure out where IT fits in their company. The best advice is to be adaptable and to understand where you are going so you can understand the best way to get there.

Oh, and not to complicate the entire argument, but about the time we get the R v. Python debate settled, Scala might just come from the back of the pack to win the whole thing. After all, Twitter is in part written in Scala, and Apache Spark is written in Scala. Social media speed and big data prowess? Perhaps this dark horse isn’t such a long shot after all.

Bloggers Writing About Tableau

Tableau is a data visualization software that CAN uses daily with our customers. We even have our own Tableau expert on staff: Matt Hoover.
But Tableau isn’t just for those who pursue predictive analytics, like us. Tableau is a really awesome tool for anyone who has interesting data: bird watchers, marathon runners, dog walkers, etc. It takes data and makes it look pretty, so anyone can understand it.
It’s fascinating what people can do with Tableau. Are you hooked on the Tableau world? We found a cool article on Data Science Central by Kenneth Black called “Top 10 Bloggers Writing About Tableau”. We’ve re-published that top 10 list below for the Tableau lovers who read our blog.
Tableau bloggers to check out:

Companies:

Individuals:

If you find some helpful or interesting information on one of these Tableau blogs, comment below! We’d love to hear it.

From the CAN Vault: History of Predictive Analytics Since 1689

History is a fascination for us at CAN for two reasons. The first is that we find our own history pretty fascinating. Did you know that CAN has been around for 9 years? Pretty cool.
The second reason is that we want the world to know that predictive analytics isn’t a new field of the 21st century. It’s been around for a long, long time in some form or another. Intrigued? Check out a piece we wrote back in 2013: “The History of Predictive Analytics: Since 1689”.
This post is part of our “From the CAN Vault” series that highlights some of the gems of our blog from the past 9 years. These articles are written by current staff but also members of our alumni network. This week’s throwback was written by Tadd Wood, who was a data scientist at CAN for 7 years and now lives in Silicon Valley.

iPhone v. Galaxy

The iPhone versus Galaxy debate. There just doesn’t seem to be a clear way to compare them. Until now.
We found a data visualization on Tableau Public by Sarah Lewin that breaks down the two smartphones so buyers can make an educated choice, or just finally understand their different capabilities. Check out “The Smartphone Breakdown” here. Scroll over blocks for more details.
Tableau is a data visualization software that data scientists at CAN use daily. In fact, we even have a Tableau expert on our team — Matt Hoover. If you have any questions or ideas about Tableau, talk to Matt at matt@canworksmart.com.
Tableau Public is an extension of Tableau, where the public posts their projects for all to see, be amazed at, and enjoy.

The Midwest Does Tech

Do you think the Midwest is just a bunch of old barns and prairie grass? Think again.
Inc. Magazine just reviewed some of the reasons why tech startups are flourishing in Chicago. (You can read the full article here.) It’s true that the Midwest does tech just as well as anywhere else in the country. Low cost and high quality of life add up to be a winning formula for nurturing new tech companies.
This knowledge strengthens our resolve that Omaha can be and will be great at startups; we just need to find our culture and voice. We wrote about this very thing last week in our article titled “why CAN loves Omaha so much.”
The parallels between Omaha and Chicago are evident. In Zoë Henry’s Chicago article she writes:

“The overall quality of life in Chicago may make up for its perceived sluggishness, at least compared to  New York City. A recent ‘Livability’ report, published by the Economist Intelligence Unit (E.I.U.), finds that Chicago ranks No. 33 for its stability, which is also the second-highest of any U.S. city.

Larkins, for his part, nods to the overall ‘ho, hum’ attitude taken by Midwesterners. Windy though Chicago may be, there’s little bluster. ‘We believe in karma and doing favors,’ he says.”

 
Karma and doing good favors? Sounds a lot like our homesteader mentality. Like Omaha, Chicago is a city of do-ers, of people who don’t aspire for fame, but a job well done. Chicago produces more tech talent than any other Midwest city — but the problem is getting that talent to stay local and not hit the coasts. Omaha suffers the same malady: the brain drain.
That’s why articles like the one linked above and our musings from last week are so important to the Great Plains economy. The truth is Chicago, Omaha, Kansas City, Des Moines, St. Louis, Minneapolis, and many others are full of investors looking to fund start-ups. With lower costs of living and diversity in industry, start-ups have a higher chance of success here than on the coasts.
But let’s keep one thing straight about the Chicago and Omaha comparison. Chicago may be bigger, but more people doesn’t equal better. Omaha’s cost of living is a lot lower than Chicago’s and our attitude is right. Where do you want your money to go?
Join the conversation. What are your thoughts on tech start-ups in the Midwest and Midwest tech in general? Comment below.

Where in the world CAN you find us?

In the next few years, CAN is predicted to be among the nation’s leaders in data science. We have an impressive resume to back this up. We’ve worked with multiple Fortune 500 companies and many more Fortune 1000 companies all over the globe, and we have built a solid reputation among local Omahans for producing experts in data science and IT.
A year ago we joined up with Interface Web School and created the Omaha Data Science Academy (Oma-DSA). Our dedication to educating our community — from recent college graduates to single moms looking for a new career — has bolstered our reputation as not only an upstanding professional service but also a down-to-earth, human-to-human, educational source.
We also have an impressive network of CAN alums (Oma-DSA grads and former CAN employees) to prove this educational dedication. When you work for CAN, we don’t just want you to improve our company. We want you to improve yourself. That’s why so many of our graduates and employees have moved on to work for businesses all over the country, and to start their own businesses.
We have AIM and The Startup Collaborative to thank for our continued success. As CAN turns a year older, a lot of people are stopping us to ask this question:
 

Will CAN stay in Omaha?

Of course we will.
Does that shock you? To some it might seem like the next logical step for CAN is to move to a bigger city where there are more businesses, take on investment, and find more talent. But that’s not how we see it. We think Omaha is perfect.
Let’s examine the mottos of the area’s two cities: Omaha and Council Bluffs (Omaha’s sister city across the river).
Omaha’s motto is “We Don’t Coast”. This has two meanings. The first is literal. Obviously, we don’t have coasts. And to add to that, we don’t mountain either. But the metaphorical meaning is even more true: we’re a city of do-ers. We don’t slack. We don’t waste time. We act on our ideas. We collaborate. We make waves.
Council Bluffs’ motto is “Unlike Anywhere Else. On Purpose.” Our friendly, work-hard, be-nice attitude does not come naturally; rather, it’s something we strive for. We know it puts us ahead. We don’t have to try to be hip and trendy; our values are more genuine. We’re happy with our identity.
What is Omaha’s identity? Let’s dive a little deeper.
 

Omaha’s bragging rights

Every town in the country has that one special thing that draws tourists, like a giant ball of yarn or a natural history museum. Omaha has more than gimmicks, however. We have the whole package.

We have the people.

We spoke a little about our personality in reference to our mottos. One phrase that CAN likes to use to describe the good people of Omaha is “homesteader mentality.” We’re not afraid to put in hours and work. The 9-5 workday is not always our mode of operation when a job needs to be done. We work overtime to accomplish goals, and we approach problems with the end goal of figuring out a solution, not just putting in hours.
Anecdote: Our original Founder used to tell people we only worked half days…7am to 7pm…
We also have a different kind of people. Because Omaha doesn’t have the density of other big cities, we have had to make our actions “on purpose”. In Omaha, people are willing to give advice and collaborate. Since everyone is connected in some way, with the right idea anyone could have lunch with one of the five billionaires who live in Nebraska. On the coasts, these people would be way too preoccupied to give just any idea the time of day.
Most importantly, the people of Omaha understand good enough. We’re not caught up in the fame game. You could say we’re slow and steady, because we strive for improvement, not perfection. We don’t cut corners, we work hard to build ourselves up. Just like CAN has done in the past 9 years.

We are the place.

We’re not just a small town in the middle of the country. Omaha has a global reach that surpasses that of many bigger cities in the Midwest. Omaha is a business community.
For instance, not so long ago Omaha was the equivalent of Silicon Valley for telecommunication. A lot of these big corporations are still headquartered in Omaha. Omaha currently has 4 Fortune 500 company headquarters.
Our agricultural companies ship grain all over the world, and we manage money for people across all six inhabited continents.
This spirit of success continues at every level. You can see it in the halls of The Startup Collaborative. People young and old come to work on tech-related ideas and products that they believe will boost Omaha’s name even more.
Omaha is also a major transportation hub, with the largest N/S and E/W interstates running through the city and the nation’s largest railroad located downtown. In four hours you can go just about anywhere in the country from our airport, making Omaha a major crossroads of tech and transport.
Not to mention our cost of living is among the lowest in the country. Lower personal expenses mean you can invest more in your business. More investment means more momentum for success.
Still, starting a business in Nebraska has its challenges. Constraints, however, often produce creative solutions. With a state population of 1.8 million, isolation has been CAN’s biggest constraint. Isolation has forced CAN to learn to build a national client base using blogging, social networks, and virtual meetings.
We knew that this results-oriented culture of Omaha would help us create a business that provides real value for our customers and help to keep our business focused on the long-term instead of quick wins.

We have some cool things.

You probably don’t need a whole lot more convincing, but here are a few websites that list awesome social and cultural aspects of Omaha:

  1. “8 Reasons to Move to Omaha, NE”
  2. “7 Reason why Omaha is the Best City in the US to Live In”
  3. “30 Things You Need to Know About Omaha Before You Move There”

Food, fashion, furry things. When we at CAN aren’t busy working on projects, we have unique opportunities to explore Midwest culture.
 

CAN Loves Omaha

Between the travel opportunities, the global reach, the hard-working mentality, and the room for growth, CAN couldn’t choose a better city for its headquarters. Omaha is producing more tech talent every year at half the price of those on the coast. CAN wants to hire this talent, give them real data analytics experience, and send them wherever they need to go. Without the resources and mentality of Omaha, this would not be possible.
In the end, CAN isn’t going anywhere. We believe in our community, we believe in improvement, and we believe in sticking with our roots.

Analyzing Omaha Mayoral Election Data Uncovers Voting Patterns

Last week, incumbent Jean Stothert won re-election in the 2017 Omaha Mayoral Election, defeating challenger Heath Mello by a 53-to-47 percent margin. Let’s take a closer look at how the Republican fared at the polls against her Democratic challenger.

This map was created by the Omaha World-Herald staff shortly after election results were announced on May 9th. It was widely shared on social media in the days following the election and simply represents the results from each precinct — red for Stothert, blue for Mello. It shows a striking political contrast between East and West Omaha. Using this visualization, it appears the city can be divided into two halves, with Republicans on one side and Democrats mostly on the other.
Although the map is technically accurate, it is somewhat misleading because it only shows the data in one dimension. This is not meant to discredit the creator of the map – this is an effective visualization that succinctly tells one story. It is also impressive how fast the map was put together after the election results were announced. With that said, the map doesn’t account for important metrics such as number of voters per precinct or margin of victory. To provide this context and better understand voter behavior throughout the city, I used election data from the Douglas County Election Commission’s website and created some improved, in-depth Omaha Mayoral Election visualizations. Make sure to click on the images if you would like to go to the interactive version of these visualizations.
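As a rough illustration of the two metrics mentioned above (number of voters per precinct and margin of victory), here is a minimal Python sketch. The precinct labels and vote counts are invented for the example and are NOT real Douglas County results, and the function name is hypothetical. It is not how the visualizations in this post were built; it only shows the arithmetic a winner-only map leaves out.

```python
# Hypothetical precinct tallies; NOT real Douglas County results.
precincts = [
    {"precinct": "A", "stothert": 420, "mello": 380},
    {"precinct": "B", "stothert": 150, "mello": 450},
    {"precinct": "C", "stothert": 900, "mello": 300},
]


def summarize_precinct(row):
    """Return winner, total votes, and margin of victory in percentage points."""
    stothert, mello = row["stothert"], row["mello"]
    total = stothert + mello
    if stothert > mello:
        winner = "Stothert"
    elif mello > stothert:
        winner = "Mello"
    else:
        winner = "Tie"
    # Margin = gap between the candidates as a share of all votes cast.
    margin_pct = 100 * abs(stothert - mello) / total if total else 0.0
    return {
        "precinct": row["precinct"],
        "winner": winner,
        "total_votes": total,
        "margin_pct": margin_pct,
    }


results = [summarize_precinct(row) for row in precincts]
for r in results:
    print(f'{r["precinct"]}: {r["winner"]} by {r["margin_pct"]:.1f} points on {r["total_votes"]} votes')
```

In this made-up data, precincts A and C would both be colored red on a winner-only map, even though A is a narrow win and C is a landslide with many more voters.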
 

Red City | Blue City – Omaha Divided


 

Election vs Primary Performance by Precinct


 
Were you surprised by the results of the election? What other metrics would you like to see in these data visualizations?

Check out Data Science Central

Open up any computer at the CAN headquarters and you’ll see our favorite data science website as our homepage: datasciencecentral.com. Articles, webinars, resources, ideas, tools, you name it. If it’s related to data science, it’s there.
Today, we’d like to share some knowledge from Rick Riddle. Check out his article “How can organizations successfully convert big data into real-world decisions?” by clicking on the link.
It’s all about how to apply the stacks and stacks of data your business has on file to real-life decisions. Which is what CAN is all about: using your data to help you. It may seem an overwhelming task, but CAN’s data scientists love challenges. Read over this article and let us know if you think we can help you with your data needs.
Contact Nate Watson at support@canworksmart.com.
