Data Science versus Machine Learning — a brief description.

Sukanya Bag
4 min read · Jun 26, 2020

In this article, I clarify the various roles of the data scientist and how data science compares and overlaps with related fields such as machine learning, deep learning, AI, statistics, IoT, operations research, and applied mathematics. Because data science is a broad discipline, I start by describing the different types of data scientists that one may encounter in a business setting; you might even discover that you are a data scientist yourself without knowing it. As in any scientific discipline, data scientists borrow techniques from related disciplines, though the field has developed its own arsenal, especially techniques and algorithms that handle very large unstructured data sets automatically, even without human interaction, to perform transactions in real time or to make predictions about the future!

Type A (Analytics) versus Type B (Builder) data scientist:

Type A Data Scientist: Type A data scientists can code well enough to work with data but are not necessarily expert programmers. They may be experts in experimental design, forecasting, modelling, statistical inference, or other things typically taught in statistics departments. At Google, Type A Data Scientists are known variously as Statistician, Quantitative Analyst, Decision Support Engineering Analyst, or Data Scientist, and probably a few more.

Type B Data Scientist: The B is for Building. Type B Data Scientists share some statistical background with Type A, but they are also very strong coders and may be trained software engineers. The Type B Data Scientist is mainly interested in using data “in production.” They build models which interact with users, often serving recommendations.

The way I explain the heavy term “Data Science” is simple, and it may help you understand it as well!

It’s simply A + B + C = D, that is, Analytics + Business + Computer Science = Data Science.

In a startup, data scientists generally wear several hats, such as executive, data miner, data engineer or architect, researcher, statistician, modeler (as in predictive modeling) or developer.

While the data scientist is generally portrayed as a coder experienced in R, Python, SQL, Hadoop and statistics, this is just the tip of the iceberg, made popular by data camps focusing on teaching some elements of data science.

MACHINE LEARNING VS. DEEP LEARNING:

Before digging deeper into the link between data science and machine learning, let’s briefly discuss machine learning and deep learning.

1. The main difference between deep learning and machine learning lies in how data is presented to the system. Machine learning algorithms almost always require structured data, while deep learning networks rely on layers of artificial neural networks (ANNs).

2. Machine learning algorithms are designed to “learn” from labeled data and then apply what they have learned to produce results on new datasets. When a result is incorrect, however, a human needs to step in and “teach” them.

3. Deep learning networks need far less human intervention: their multiple layers arrange data in a hierarchy of concepts, and the network adjusts itself by learning from its own mistakes. However, even they can be wrong if the data quality is not good enough.

4. Data decides everything. It is the quality of the data that ultimately determines the quality of the result.
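
To make point 2 concrete, here is a minimal sketch in plain Python of learning from labeled, structured data and then being "taught" after a mistake. It uses a nearest-centroid classifier, which is my choice for illustration and not a method named in this article; the animal measurements and labels are hypothetical.

```python
from collections import defaultdict
from math import dist


def fit_centroids(rows, labels):
    """Learn one centroid (mean feature vector) per label from labeled rows."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: tuple(sum(col) / len(members) for col in zip(*members))
        for label, members in grouped.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest to the new row."""
    return min(centroids, key=lambda label: dist(centroids[label], row))


# Hypothetical structured data: (height_cm, weight_kg), one label per row.
rows = [(20, 4), (25, 5), (60, 30), (70, 35)]
labels = ["cat", "cat", "dog", "dog"]

centroids = fit_centroids(rows, labels)
new_animal = (42, 17)
first_guess = predict(centroids, new_animal)

# Assume a human reviewer tells us the animal was actually a dog.
# "Teaching" here means adding the corrected example and refitting.
rows.append(new_animal)
labels.append("dog")
centroids = fit_centroids(rows, labels)
second_guess = predict(centroids, new_animal)

print(first_guess, second_guess)
```

With these made-up numbers the first prediction is "cat"; once the corrected row is added and the centroids are refit, the same input is classified as "dog". The model only improved because a person supplied the right label, which is the human "teaching" step described above.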

NOTE IT!

  1. Because classical machine learning algorithms need structured, labeled data, they are often a poor fit for complex problems that involve huge amounts of raw, unstructured data.
  2. Although deep learning can be applied to small problems, its real use is on a much larger scale. Given the number of layers, hierarchies, and concepts that these networks handle, deep learning is best suited to complex tasks rather than simple ones.
  3. Both of these subsets of AI depend on data, which is what makes it possible for them to show a certain form of “intelligence.” However, you should be aware that deep learning usually requires much more data than a traditional machine learning algorithm. The reason is that deep learning networks tend to pick out the different elements in their layers reliably only when trained on very large datasets, often on the order of millions of data points. Traditional machine learning algorithms, on the other hand, can learn from smaller datasets using features and criteria chosen in advance by a human.
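
To give a feel for what "layers" means in the notes above, here is a tiny sketch of a two-layer network in plain Python. The weights are set by hand purely for illustration (a real network would learn them from data), and the XOR task is my own example, not one from this article. It shows that stacking a hidden layer lets the network compute something a single linear layer cannot.

```python
def relu(x):
    """Rectified linear unit: pass positive values, clip negatives to zero."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def xor_network(a, b):
    """Two stacked layers with hand-picked (not learned) weights."""
    # Hidden layer: h1 = relu(a + b), h2 = relu(a + b - 1)
    hidden = dense([a, b], [[1, 1], [1, 1]], [0, -1], relu)
    # Output layer: y = h1 - 2 * h2, with no activation
    (out,) = dense(hidden, [[1, -2]], [0], lambda x: x)
    return out


results = [xor_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(results)
```

The hidden layer builds two intermediate "concepts" (whether at least one input is on, and whether both are on), and the output layer combines them. Real deep learning networks repeat this idea across many layers and many more units, which is part of why they need so much data to train.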

DATA SCIENCE VS. MACHINE LEARNING: A brief discussion

Machine learning and statistics are part of data science. The word learning in machine learning means that the algorithms depend on some data, used as a training set, to fine-tune some model or algorithm parameters. This encompasses many techniques such as regression, naive Bayes or supervised clustering. But not all techniques fit in this category. For instance, unsupervised clustering — a statistical and data science technique — aims at detecting clusters and cluster structures without any prior knowledge or training set to help the classification algorithm. A human being is needed to label the clusters found. Some techniques are hybrid, such as semi-supervised classification. Some pattern detection or density estimation techniques fit in this category.
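
The contrast in the paragraph above can be sketched in a few lines of plain Python. The first function uses a training set to tune model parameters (simple least-squares regression); the second finds clusters with no labels at all (a basic one-dimensional k-means, my choice of clustering method for illustration). All numbers, starting centers, and cluster names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised: use a training set of (x, y) pairs to tune slope and intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


def kmeans_1d(values, centers, iterations=10):
    """Unsupervised: group unlabeled values around the given starting centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Supervised: the known ys act as the training signal.
xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)

# Unsupervised: no labels, only raw values and an initial guess for the centers.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])

# The algorithm only returns groups; a human still has to name them.
cluster_names = {0: "small orders", 1: "large orders"}

print(slope, intercept)
print(clusters)
```

Here the regression recovers the line y = 2x + 1 from its training pairs, while the clustering step separates the values near 1 from the values near 10 without being told what either group means. The `cluster_names` dictionary stands in for the human labeling step mentioned above.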

Data science is much more than just machine learning. To become a strong practitioner, you can’t focus on only one part of the process. Dive in and learn; you will be amazed!

Until next time!

Stay tuned.
