How to Become a Data Scientist – Part 1
I cannot say that data science is the most glamorous job, but I definitely agree that it is one of the most challenging and rewarding.
While there is no standard definition of what a data scientist does, the role involves working with data to identify meaningful patterns and insights that would otherwise stay hidden, with the objective of helping businesses make data-driven decisions.
To do this, a data scientist needs a mix of skills and abilities. The first is strong analytical skill, with a solid aptitude for maths and statistics. To apply statistical techniques to data, a data scientist also needs programming skills and familiarity with tools and languages such as R and Python.
Working with real-world data brings one more challenge: it is rarely clean and organized. In most cases the data will have gaps and missing context. The job therefore also involves finding the right data, retrieving it from multiple systems, and then wrangling it into a form suitable for analysis.
Most of the time, a data scientist works on business problems, such as identifying customer buying behaviour in order to improve the shopping experience, or predicting how customers will behave in the future. For the results to be meaningful, a data scientist needs a problem-solving attitude and a contextual understanding of how the business runs.
Let’s look at some examples of the kinds of problems that data scientists work on.
- Predict Click-through Rate: You work with a mobile advertising company, and you have been asked to predict whether a given mobile ad will be clicked. To do so, you will work with historical ad data that has various attributes, such as ad format, ad size, the category of product being advertised, the mobile device on which the ad is displayed, and many other factors. Using all this data, you will build a model that predicts the chance of a particular mobile ad being clicked.
- Sentiment Analysis: There are many use cases for sentiment analysis. For example, you could use Twitter data to find out what people in the US think about Indian food. In this case, you would analyse tweets to find your answers. The insights from those tweets would help your market research if you were planning to open an Indian restaurant chain in the US. They could also tell you which states are likely to have more demand for Indian food than others, or what people in the US like or dislike about Indian food, for example that it is ‘too spicy’.
- Predict Product Sales: Based on a data set of product features and the product's historical sales, predict the online sales of a consumer product.
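
To make the click-through rate example a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `HISTORY` records, the attribute names, and the `prior_weight` parameter are all made up for this post, and the "model" is just a smoothed historical click rate per ad format and device, a simple baseline rather than the kind of model you would build on real data.

```python
from collections import defaultdict

# Hypothetical historical ad records: (ad_format, device, clicked).
# These values are invented for illustration only.
HISTORY = [
    ("banner", "phone", 0),
    ("banner", "phone", 0),
    ("banner", "tablet", 1),
    ("video", "phone", 1),
    ("video", "phone", 1),
    ("video", "tablet", 0),
    ("banner", "phone", 0),
    ("video", "phone", 1),
]


def fit_click_rates(history, prior_weight=2.0):
    """Estimate a smoothed click probability for each (ad_format, device) pair.

    Each group's rate is pulled towards the overall click rate, so that
    groups with very few records do not end up with extreme estimates.
    """
    overall_rate = sum(clicked for _, _, clicked in history) / len(history)

    shown = defaultdict(int)
    clicks = defaultdict(int)
    for ad_format, device, clicked in history:
        shown[(ad_format, device)] += 1
        clicks[(ad_format, device)] += clicked

    rates = {
        key: (clicks[key] + prior_weight * overall_rate) / (shown[key] + prior_weight)
        for key in shown
    }
    return rates, overall_rate


def predict_click_probability(rates, overall_rate, ad_format, device):
    """Return the estimated chance of a click, falling back to the overall rate."""
    return rates.get((ad_format, device), overall_rate)


rates, overall_rate = fit_click_rates(HISTORY)
for ad_format, device in [("video", "phone"), ("banner", "phone"), ("native", "phone")]:
    p = predict_click_probability(rates, overall_rate, ad_format, device)
    print(f"{ad_format} on {device}: {p:.2f}")
```

With this toy history, video ads on phones get a higher estimated click probability than banner ads on phones, and a combination that never appears in the data (here the hypothetical `native` format) falls back to the overall click rate. A real project would use far more attributes and a proper statistical or machine learning model, but the workflow is the same: learn from historical data, then estimate the chance of a click for a new ad.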
