Can You Master Machine Learning If You Suck at Math?

Sukanya Bag
Jul 31, 2020

Good news for everyone reading this blog :P

“THE REAL PREREQUISITE FOR MACHINE LEARNING ISN’T MATH, IT’S DATA ANALYSIS.”

When beginners get started with machine learning, the inevitable question is “what are the prerequisites? What do I need to know to get started?”

And once they start researching, beginners frequently find well-intentioned but disheartening advice, like the following:

You need to master math. You need all of the following:
– Vector Calculus
– Differential equations
– Mathematical statistics
– Optimization
– Algorithm analysis
– and
– and
– and ……..oh crap I am not into this!

A list like this is enough to intimidate anyone but a person with an advanced math degree.

It’s unfortunate, because I think a lot of beginners lose heart and are scared away by this advice.

If you’re intimidated by the math (I WAS TOO, AND I KINDA SUCKED AT MATH), I have some good news for you: to get started building machine learning models (as opposed to doing machine learning theory), you need less math background than you think, and almost certainly less than you’ve been told you need. If you’re interested in being a machine learning practitioner, you don’t need a lot of advanced mathematics to get started.

But you’re not entirely off the hook.

There are still prerequisites. In fact, even if you can get by without having a masterful understanding of calculus and linear algebra, there are other prerequisites that you absolutely need to know (thankfully, the real prerequisites are much easier to master).

MASTERING MATHEMATICS OR STATISTICS IS NOT THE PRIMARY PREREQUISITE FOR MACHINE LEARNING!!

If you’re a beginner and your goal is to work in industry or business, math is not the primary prerequisite for machine learning. That probably stands in opposition to what you’ve heard in the past, so let me explain.

Most advice on machine learning is from people who learned data science in an academic environment.

Before I go on, I want to emphasize that this is not a jab. Using the term “academic” is not meant to be an insult. People who work in academia frequently build the tools that people in industry use. And through research, they also push the field forward. I admire these people.

In an academic environment, individuals are rewarded (largely) for producing novel research, and in the context of ML, that truly does require a deep understanding of the mathematics that underlies machine learning and statistics.

In industry, though, the primary rewards in most cases aren’t for innovation and novelty. You’re rewarded for creating business value. In most cases, particularly at entry level, this means applying existing, “off the shelf” tools. The critical fact here is that almost all of these existing tools take care of the math for you.

By now you know that you have a friend called scikit-learn (sklearn), right?

Cool!

MANY “OFF THE SHELF” TOOLS TAKE CARE OF THE MATH FOR YOU!

Almost all of the common machine learning libraries and tools take care of the hard math for you. This includes R’s caret package as well as Python’s scikit-learn. This means that it’s not absolutely necessary to know linear algebra and calculus to get them to work.

This is great news for a beginning data scientist who wants to get started with machine learning. You can call an R function from caret or a function from Python’s scikit-learn and it will take care of all of the mathematics for you. Knowing how all that mathematics works “under the hood” is neither necessary nor sufficient for building predictive models as a beginner.
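To make that concrete, here is a minimal sketch of what "letting the library do the math" can look like with scikit-learn. The choice of dataset (the small iris dataset bundled with scikit-learn), the model (logistic regression), and the 25% hold-out split are my assumptions for illustration, not a recommendation for any particular problem.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small example dataset that ships with scikit-learn (no download needed).
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows so the model is scored on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# fit() runs the numerical optimization internally; we write no calculus ourselves.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# score() returns the fraction of held-out rows that were classified correctly.
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

Notice that nothing here asks you to derive a gradient or invert a matrix. The effort goes into choosing the data, the split, and the model.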

To be clear, I’m not suggesting that these tools do all the work for you. You still need to be well-practiced at applying them. You need to have a solid understanding of the heuristics, best practices, and rules of thumb associated with making them work well. Again though, much of the knowledge required to make these tools perform well does not require matrix algebra and calculus.

MOST DATA SCIENTISTS DON’T DO MUCH MATH

YES, YOU HEARD THAT RIGHT!!

I think many beginners have an inaccurate image in their minds of what data scientists actually do. They imagine that data scientists spend their days pensively standing at a whiteboard, scribbling math equations between sips of coffee.

I am a beginner too, and I used to spend sleepless nights worrying about these things!

So how much math does a data scientist actually do?

If we’re talking about entry level data scientists to intermediate level data scientists, I’d estimate that they spend less than 5% of their time actually doing mathematics. And quite frankly, 5% is probably a bit generous.

Even if we talk about machine learning only, you’ll still spend less than 5% of your time doing math. (And quite frankly, most entry-level data scientists won’t spend much of their time on ML.) When you build a model, you will spend very, very little time doing any math.

The reality is that in industry, data scientists just don’t do much higher level math.

But most data scientists do spend a huge amount of their time getting data, cleaning data, and exploring data. This applies both to data science generally, and machine learning specifically; and it particularly applies to beginners.

If you want to get started with machine learning, the real prerequisite skill that you need to learn is EXPLORATORY DATA ANALYSIS.

You absolutely need to know data analysis.

Data analysis is the first skill you need in order to get things done.

It’s the real prerequisite for getting started with machine learning as a practitioner.

(Note that as this blog continues, I’m going to use the term “data analysis” as a shorthand for “getting data, cleaning data, aggregating data, exploring data, and visualizing data.”)
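Here is a small sketch of what that kind of data analysis can look like in Python. I am assuming the pandas library here (this post has not named a data analysis tool), and the tiny customer table, its column names, and its values are all made up for illustration.

```python
import pandas as pd

# Hypothetical raw data: one duplicated row, a missing age, inconsistent city casing.
raw = pd.DataFrame(
    {
        "customer_id": [1, 2, 2, 3, 4],
        "city": ["Pune", "delhi", "delhi", "Pune", "Delhi"],
        "age": [34, 29, 29, None, 41],
        "spend": [120.0, 80.0, 80.0, 50.0, 200.0],
    }
)

# Clean: drop exact duplicate rows, normalize city names, fill the missing age
# with the median age of the remaining rows.
clean = raw.drop_duplicates().copy()
clean["city"] = clean["city"].str.title()
clean["age"] = clean["age"].fillna(clean["age"].median())

# Explore: summary statistics for the numeric columns.
print(clean[["age", "spend"]].describe())

# Aggregate: average spend per city.
spend_by_city = clean.groupby("city")["spend"].mean()
print(spend_by_city)
```

Filling a missing value with the median is just one common rule of thumb. Whether it is the right call depends on your data, and making that kind of judgment is exactly the skill this section is about.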

This is particularly true for beginners. Although at high levels there are some data scientists who need deep mathematical skill, at a beginning level — I repeat — you do not need to know calculus and linear algebra in order to build a model that makes accurate predictions.

HAPPY FOLKS? ;)))))))

80% OF YOUR WORK WILL BE DATA PREPARATION, EDA, AND VISUALIZATION

When you’re building machine learning models, roughly 80% of your time will be spent getting data, exploring it, cleaning it, and analyzing results (using data visualization).

To be a little more blunt about it, if you don’t know calculus and linear algebra, you can still build useful models, but if you aren’t really proficient with data analysis, you’re screwed.

LASTLY….

BEGINNERS DO NEED SOME MATH FOR MACHINE LEARNING

Yes. You don’t have to be a PRO at math or statistics, but of course you have to know the concepts behind the machine learning algorithms: when to use them, why to use them, and which hyperparameter settings give the best predictions from the model you built!

BECAUSE REMEMBER, YOUR MODEL SHOULD GENERALIZE: IT HAS TO WORK WELL ON REAL-WORLD DATA IT HAS NEVER SEEN, NOT JUST ON YOUR TRAINING DATA!
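Hyperparameter tuning and checking generalization can also lean on the library. Below is a rough sketch using scikit-learn's GridSearchCV with 5-fold cross-validation. The model (k-nearest neighbors), the candidate values for n_neighbors, and the bundled iris dataset are my assumptions for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Keep a test split aside; it is never used while tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Candidate values for one hyperparameter: how many neighbors vote on a prediction.
param_grid = {"n_neighbors": [1, 3, 5, 7, 9]}

# 5-fold cross-validation on the training split scores every candidate,
# then the best one is refit on the whole training split.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X_train, y_train)

print("Best setting:", search.best_params_)
print(f"Cross-validated accuracy: {search.best_score_:.2f}")

# The untouched test split estimates how well the tuned model generalizes.
test_accuracy = search.score(X_test, y_test)
print(f"Held-out accuracy: {test_accuracy:.2f}")
```

The concept you need here is not calculus. It is understanding why you tune on one slice of data and judge the model on another.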

However, when people tell you that you absolutely need to know calculus, differential equations, optimization theory, linear algebra, and more just to get started building machine learning models, this is flat out wrong.

I’ll briefly summarize it here: to get started learning practical machine learning, an entry level data scientist needs to have basic comfort working with numbers, calculating percentages, etc. You need at least as much math skill as a college freshman at a good university. You’ll also need knowledge of basic statistics … about as much knowledge as you’d get in a basic “Introduction to Statistics” course. That is, you need to understand concepts like mean, standard deviation, variance, and other things you’d learn in an intro stats class.
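If those terms feel rusty, here is a quick sketch using Python's built-in statistics module. The eight scores are made-up numbers, and I am using the population versions of variance and standard deviation (dividing by the number of values) to keep the arithmetic simple.

```python
import statistics

# Hypothetical data: eight quiz scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Mean: the sum of the values divided by how many there are.
mean_score = statistics.mean(scores)

# Population variance: the average squared distance from the mean.
variance_score = statistics.pvariance(scores)

# Population standard deviation: the square root of the variance.
std_score = statistics.pstdev(scores)

print("mean:", mean_score)
print("variance:", variance_score)
print("standard deviation:", std_score)
```

For these scores the mean is 5, the variance is 4, and the standard deviation is 2. If you can follow why, you already have the level of statistics this post is talking about.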

Also, you should follow these blogs, communities, and YouTube channels (whether you are a beginner or a pro).

THESE ARE A MUST:

  1. TOWARDS DATA SCIENCE
  2. DATA SCIENCE CENTRAL
  3. KAGGLE
  4. KRISH NAIK’S PLAYLISTS ON MACHINE/DEEP LEARNING (YOUTUBE)
  5. ANALYTICS VIDHYA
  6. DATAQUEST
  7. KDNUGGETS
  8. SMARTDATA COLLECTIVE
  9. DATACAMP
  10. AND MEDIUM ITSELF

ALL THE VERY BEST, FUTURE DATA SCIENTIST!

Happy (Machine) Learning!

Until next time..!
