The Future of Data Science

Author: Rajdeep Arora

“We are moving slowly into an era where Big Data is the starting point, not the end,” writes Pearl Zhu in her book ‘Digital Master’.

In this blog, Rajdeep Arora, a Senior Data Scientist at Tiger, provides valuable insights on the future of data science as a career choice. Rajdeep spends his weekends mentoring students of MIT’s Applied Data Science Program. He shares his personal journey in data science and highlights the importance of mentorship in the evolving field of data science.

Surging Data Science Jobs

The last decade was the era of capturing data, thanks to falling storage costs. That is why Hadoop, Big Data, and ETL developers have stayed relevant in the market. Now, companies are looking to use this data to draw insights and take predictive actions, making data scientists and ML engineers increasingly relevant. Going by current trends, there will soon be a huge demand for data scientists who can manage multiple models and bring relevant ML engineering (MLE) and MLOps skills.

Challenges In Recruitment

The biggest gap lies in understanding the job description of a data scientist. Often, candidates pursue the title of Data Scientist without comprehending its responsibilities. Our challenge is to clearly distinguish the various profiles, such as data engineers, data analysts, DataOps, business analysts, software engineers, data scientists, ML engineers, and MLOps, and then map hiring requirements to them.

Is Data Science the next big trend like coding?

Let’s first differentiate the two kinds of data scientists. Applied data scientists focus on implementing existing algorithms for specific use-cases. Research data scientists look for new solutions to unique problems by reading academic papers and coming up with implementation strategies.

Your career is your choice. Both are exciting.

If business problems fascinate you, pick Applied Science. Applied Scientists are developers-turned-scientists, who spend more than half their time writing and testing ML code. The rest is spent interacting with clients, understanding their issues, converting business problems into reasonable data problems, and discussing model results.

If solving longevity-type problems interests you, then understanding various ML-based approaches and developing intuition is far more important. You need academic excellence, a background in applied mathematics, and extensive reading in the subject to excel as a research data scientist.

Coding is no longer a trend. It’s a requisite skill to excel in IT. This includes the field of data. If you are passionate about solving data problems, then data science is an ideal career for you.

How do I up my data science game?

You will need a variety of sources to master a topic. The first step, understanding the whys and hows, is best covered by academic courses (edX, Lagunita) or books. For the second step, implementation specifics, there are various blogs, articles, official documentation, and online courses (Udemy and Coursera). Books are better sources of information than videos, which are an abstraction and might not fully capture the content.

By way of an example: if I wanted to learn a specific topic (let’s say BERT), I would first read up on the use cases it addresses (question answering or sentiment analysis) and what existed before (RNNs, LSTMs). I would then understand their limitations, how BERT overcame them (attention), and the current state-of-the-art techniques. Once I have created a mental map, I move on to understanding implementation tools and techniques (PyTorch, TensorFlow, John Snow Labs’ Spark NLP, etc.) and select courses or videos accordingly.
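To make the "attention" step of that mental map concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not BERT itself: the token vectors are made up, and the learned projection matrices, multiple heads, and positional information that a real model uses are all left out. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Score this query against every key, not only its neighbours.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

# Three made-up 2-dimensional token vectors (illustrative only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = attention(tokens, tokens, tokens)
```

Every token looks at every other token in a single step, which is the contrast with an RNN or LSTM, where information has to travel through the sequence one position at a time.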

Freshers in Data Science – The Path & Challenges

Honestly, I would suggest you be patient. Data science is the cherry on top of a cake that can be baked in various ways. You could start off as an analyst, understanding relevant data issues and building insights, or as a software/data engineer building data pipelines (the building blocks for any model). You could also begin on the business side and, by acquiring programming skills, switch to data science. Data science is usually an organic career advancement, not an inorganic jump or a starting point.

The biggest challenge is getting freshers to focus on building blocks and processes before jumping to modeling. One great suggested reading for anyone building an ML solution is “Hidden Technical Debt in Machine Learning Systems” by researchers at Google. The reading suggests that building ML models takes 5%-10% of the effort, and the rest goes into building ancillary services.

Understanding the business problem first, along with proper exploratory data analysis, is essential. My recommendation would be to start small, focus on specific goals, attain them, and keep moving forward while delivering incremental value; at the same time, think big about which business problem you are solving. Read extensively: there is a lot to learn from the failure of Zillow’s home-flipping business for anyone starting their career in data science.

Professionals Pivoting into Data Science

I believe you are only as good a data scientist as the problems you know how to solve. Both data problems and your domain of work can be highly nuanced. Finance and Marketing can have vastly different problems. However, under the hood, both might rely on similar classification models. The value lies in building the model construct that can solve these problems, which a lot of freshers struggle with. Having the business acumen to translate a business problem into a data problem is an invaluable skill.

Professionals pivoting into data science need a basic understanding of ML building blocks (regressions, clustering, classification, time series, neural networks, and recommendations), the ability to select the appropriate block based on the use-case, and implementation-focused skills to bring their ML solutions to reality.

Mentoring at MIT – The Experience and Learnings

Mentoring has become the best part of my life! I used to be a teaching assistant at the University of Texas, Dallas. Later, I was an Assistant Tutor with UC Berkeley’s extension and I am now the AI/Data Science Mentor for the MIT program via Byju’s Great Learning. My motivation comes from seeing the amount of focus and effort students put into learning these essential skills. I love the challenge of explaining complicated topics in a relatable manner.

Students come with their own experiences and biases. Sometimes, you have to unlearn biases to learn something relevant. This has been the area of struggle for most. My role as a mentor is to first help them understand the first principles and then build appropriate mental maps to get the intuition behind ML concepts.

I learned a lot on this journey. From my students, I learned how they plan to use ML tools to solve problems in their industry. From the faculty at MIT, I learned about the latest research and how we can cross-pollinate ideas from other areas. For instance, I recently learned how to apply matrix factorization to a completely different problem, sales forecasting, with results that outperform most state-of-the-art approaches.
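For readers who have not met matrix factorization, the toy sketch below shows the general idea on a small store-by-week sales table. It is not the approach referred to above, whose details are not given here: the data is invented, the function name is hypothetical, and the model is deliberately the simplest possible one (a single factor per store and per week, fitted with alternating least squares). The one unobserved cell stands in for the value to be forecast.

```python
def factorize_rank1(matrix, iterations=50):
    """Fit matrix[i][j] ~ row[i] * col[j] on the observed (non-None) cells
    using alternating least squares."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row = [1.0] * n_rows
    col = [1.0] * n_cols
    for _ in range(iterations):
        # Update each row factor with the column factors held fixed.
        for i in range(n_rows):
            seen = [j for j in range(n_cols) if matrix[i][j] is not None]
            row[i] = (sum(matrix[i][j] * col[j] for j in seen)
                      / sum(col[j] ** 2 for j in seen))
        # Update each column factor with the row factors held fixed.
        for j in range(n_cols):
            seen = [i for i in range(n_rows) if matrix[i][j] is not None]
            col[j] = (sum(matrix[i][j] * row[i] for i in seen)
                      / sum(row[i] ** 2 for i in seen))
    return row, col

# Invented weekly sales for three stores; None marks the week to forecast.
sales = [
    [10.0, 20.0, 30.0, 40.0],
    [20.0, 40.0, 60.0, 80.0],
    [30.0, 60.0, 90.0, None],
]
row, col = factorize_rank1(sales)
forecast = row[2] * col[3]  # estimate for store 3 in week 4
```

Because this invented table follows an exact "store size times weekly pattern" structure, the estimate lands close to 120. Real sales data is noisier and would typically need more factors, regularization, and a proper hold-out evaluation.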

I also learned about people’s perceptions of data science and the gaps in their understanding. As a mentor, I try to bridge these gaps and make realistic plans for individuals so they can succeed in their data science careers. I share my mistakes frequently so my students can relate to them and learn from them.

This often makes me reflect on my past. If I could give my past self, when I was fresh off the boat, some advice, it would be:

– Prioritize tasks and build a first-principles understanding
– Prioritize relationships with the important people in your life

Maintaining both and improving every day with incremental enhancements can ensure success in this field.

In Conclusion

Data is now an essential fabric of the global economy. In the words of scientist Andrew McAfee, “The world is one big data problem.” And the answers lie in data science.

The boom in the data science job market is accompanied by various challenges as elaborated in this blog.

We at Tiger Analytics believe the key to a better future lies in clearly understanding the various aspects of this science by fostering a culture of learning and mentorship at the individual and organizational levels.

 


©2022 Tiger Analytics. All rights reserved.
