According to Wikipedia, "data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics." In other words, data science combines different methods for analyzing large volumes of data and provides tools and artificial intelligence (AI) applications to facilitate and visualize data analysis. Since the early 2010s, data science has been considered one of the most promising and highest-paid jobs in IT.

Data science is closely related to machine learning, which is defined as a field of computer science that gives computers the ability to learn without being explicitly programmed (according to Arthur Samuel).

As such, the ultimate goal of machine learning is to teach computers and devices to solve complex tasks. Instead of following static program instructions, machine learning algorithms learn from data and make predictions on data: they build a model from sample inputs and use it to enable data-driven business intelligence and other outputs. Face and object recognition, self-driving car technologies, IBM Watson, text understanding, voice recognition, sales prediction, and book and movie recommendations based on user behavior and preferences are just some of the real-life applications of machine learning.
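To make "designing a model from sample inputs" more concrete, here is a minimal sketch in plain Python. It is only an illustration, not a production approach: the sales figures are made up, the names `fit_line` and `predict` are hypothetical helpers, and ordinary least squares is just one of the simplest possible learning methods.

```python
# Hypothetical sample inputs (invented for illustration):
# monthly ad spend in $1,000s and the number of units sold that month.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 19.0, 31.0, 42.0, 48.0]


def fit_line(xs, ys):
    """Learn a slope and an intercept from sample data (ordinary least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to an input it has not seen before."""
    slope, intercept = model
    return slope * x + intercept


# No sales rule is hard-coded: the relationship is estimated from the data.
model = fit_line(ad_spend, units_sold)
forecast = predict(model, 6.0)
print(f"Forecast for $6,000 of ad spend: {forecast:.1f} units")
```

If the sample data changes, the same code produces a different model without anyone rewriting the program logic, which is the key difference from static program instructions.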

Although machine learning algorithms enable intelligent, or smart, data analysis and AI applications for data science, data science's use cases and the formats of data it processes are much broader.

So, what do data scientists and machine learning engineers do?

Data scientist

If we take a look at most job openings today, we'll see a huge variety of tasks and responsibilities assigned to data scientists / Big Data specialists by different companies. Yet, we can distinguish some common requirements for data science specialists.

Standard Responsibilities:

  • Highlight, aggregate and synthesize data from various structured and unstructured sources
  • Explore, develop and apply intelligent learning to real-world data, draw important conclusions and create relevant use cases based on them
  • Analyze and present data collected by your organization through different sources
  • Design, build and deploy new processes for data modeling and analysis
  • Create prototypes, algorithms, and predictive models
  • Fulfill data analysis requests and report outcomes to respective organizational departments

Domain / Industry Specific Skills:

  • Discrete mathematics, statistics and statistical analysis
  • Machine learning algorithms
  • Data warehousing skills (relational and non-relational databases), SQL and other query languages
  • Data analysis and modeling tools:
    • R
    • Python (NumPy/SciPy)
    • MATLAB
    • SPSS/SAS
  • Hadoop and related technologies including Pig, Hive, etc.
  • Java
  • Data discovery and visualization
  • Domain knowledge and subject matter expertise (of critical importance!)
  • Strong social and communication skills


Note that a data scientist isn't required to program; knowledge of MATLAB, SPSS and SAS suffices in most cases. Therefore, this job is often sought by business analysts and data analysts rather than software developers. However, additional skills such as software programming, Python, Java, Hadoop and data warehousing are appreciated and add 5% to 14% to the average salary, according to Payscale.com.

As such, a data scientist's job can be interesting for both programmers and applied math / statistics specialists.

Salaries

Here are the average data scientist salaries by country (3+ years of experience, gross per annum):

Ukraine: $18,000 - $30,000

United States: $60,408 - $141,500

United States (Chicago): $55,000 - $125,000

United Kingdom: $40,000 - $60,000

Germany: $70,000 - $91,000

Norway: $64,000 - $80,000

Machine Learning (ML) Engineer

Compared to data science, machine learning is a more technical job that is rather close to classical software engineering; it has more in common with software development than data science does.

Required skills and responsibilities:

  • Strong skills in one or several programming languages (e.g. Java, R and / or Python) and databases (SQL, Hadoop)
  • Smaller focus on data analytics and larger focus on machine learning algorithms
  • Data modeling (MATLAB, SPSS and SAS)
  • Ability to use available libraries for different stacks, such as Mahout and Lucene for Java or NumPy/SciPy for Python
  • Ability to build distributed applications using Hadoop and other solutions

Additionally, you may need:

  • Skills in Natural Language Processing (NLP), Computational Linguistics and Sentiment Analysis for text processing, understanding and assessment
  • Computer Vision for image and video recognition
  • Digital Signal Processing for work with sounds, sensor data and other signals
  • Skills in building Recommender Systems

Such requirements are rare when it comes to data scientists.

As you can see, an ML engineer's job requires software engineering skills and knowledge and, thus, is well suited for experienced developers. Regular programmers often encounter machine learning tasks in the course of project development, and that's how they migrate to the machine learning domain.

Pros and Cons

First and foremost, it's very exciting to build apps that go beyond applied programming. This job makes your brain work faster and more efficiently by having you conduct numerous experiments, read scientific journals, and seek non-trivial solutions when trying to reach your goal. And let's point out that the outcome isn't always positive.

"Normally, we, developers, have to write 'if - then - else' business logic cases which help software products work much faster than a human brain. That's because a PC's computational power outperforms that of human beings. However, using this method, we can't build a program that would be smarter than its developer. In data science, we build systems that are by far smarter than us. We teach them to become teachable and make autonomous decisions based on data analysis results. So, building systems that are smarter than humans is a magic that attracts me most when it comes to data science," says Nick Z., data scientist at Intersog.

Second, machine learning allows companies and startups to build smart products and services that give users more mature tools for solving their tasks, which standard software development methods cannot do. This explains why non-IT businesses are leaning towards data science and machine learning. Companies and their environments generate huge amounts of data, which can give a significant competitive advantage. As such, machine learning specialists are in high demand today, especially in developed economies, and their current supply doesn't meet global demand.

Here's how the average salaries of software engineers, data scientists and machine learning specialists compare in the United States:

Software engineer: $102K per annum

Machine learning specialist: $112K per annum

Data scientist: $117K per annum

How to start a career in machine learning / data science?

If you're interested in pursuing a career in ML or data science and don't know where to start, here are some tips from Intersog:

In addition, there are a lot of free and paid books and resources on the above.

After you've gained skills in R and have understood the building blocks of data analytics and ML, try to improve your skills by participating in Kaggle's data science competitions.

Also, consider attending local events and conferences pertaining to data science and machine learning and learn from subject matter experts.

Are you looking to hire data scientists and / or machine learning specialists for your in-house or offshore project?
Let's talk!

Vik is our Brand Journalist and Head of Online Marketing / PR with 11+ years of international experience in IT B2B. He's also a guest blog contributor to Business2community, SitePoint, Journal of mHealth, Wearable Valley and other IT portals. You can contact him directly on LinkedIn.
