R vs Python
Summary: This is not a technical comparison of R vs Python. Instead, the goal of this chapter is to help you make an informed choice. If you are a beginner programmer, pick Python. If you are experienced, it doesn’t matter.
Do I need to be a programmer
If you are a beginner, the short answer is YES – read on for more.
Before we get into the pros and cons of R vs Python vs MATLAB, do you really need to be a programmer? Can you be a Data Scientist, Machine Learning engineer or Data Analyst without programming?
- Can you be a biologist without knowing how to use a microscope?
- Can you be an army general without knowing how to use a gun?
If you answered YES to the questions above, think again.
Unless you are a domain expert or have a PhD in statistics (in which case you might already know R or MATLAB), you need programming experience. For example, your chances of landing a data scientist job drop by almost 80% if you can’t code. And almost all engineering jobs in AI (machine learning engineer, natural language processing engineer, image processing engineer etc.) need very good programming experience.
Well, we all know how often an army general uses his gun in his everyday life: I am guessing not very often. But as a beginner, you don’t have a choice (unless you are a domain expert, in which case engineers with programming experience help you build the system).
How difficult is programming
I know some folks don’t like programming. If you are reading this section, you are probably one of them. But programming in data science is easy. Why?
If you are comfortable with Excel or some other data processing tool, you will feel right at home with programming in data science. It is all about manipulating data. For example, when you open an Excel file with the median home prices in an area and you are asked to compute stats like the average home price, what would you do? Use some formulas or macros, right?
As a programmer you do the same thing in a programming language. And when I say programming in data science is easy, I mean in comparison to, say, web development or mobile app development. At this point you will just have to trust us and dive in. We promise to make the whole experience as painless as possible if you follow the step-by-step curriculum.
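To make the spreadsheet comparison concrete, here is a minimal sketch in Python of the home price example. The prices and the `summarize` helper are made up for illustration; only Python's built-in `statistics` module is used, and each function plays the role of a spreadsheet formula.

```python
from statistics import mean, median

# Hypothetical home prices in USD (illustrative numbers, not real data).
# In a spreadsheet these would be one column of cells.
home_prices = [350_000, 420_000, 515_000, 610_000, 480_000]


def summarize(prices):
    """Compute the stats you would otherwise get from formulas such as
    AVERAGE, MEDIAN, MIN and MAX in a spreadsheet."""
    return {
        "average": mean(prices),
        "median": median(prices),
        "lowest": min(prices),
        "highest": max(prices),
    }


summary = summarize(home_prices)
for name, value in summary.items():
    # Format with thousands separators and no decimal places.
    print(f"{name}: {value:,.0f}")
```

Each line of the summary is one formula's worth of work, which is why people who are comfortable with spreadsheets tend to find this style of programming familiar.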
Python and R are the top languages when it comes to Data Science. Whether you should choose Python or R is a big debate, and there are pros and cons for each. However, without going too much into the details, if you are a beginner we suggest you choose Python. Python is a general-purpose language, and if you have any kind of programming experience (C, C#, Java etc.), you will feel right at home with it.
When it comes to deep learning (which is a special kind of machine learning method), almost all the popular packages are built for Python first.
That is not to say R is a bad choice. R is a language written by statisticians for statisticians. So, if you have some kind of statistics background, you probably already know R and can stick with it without having to learn Python. If not, just stick with Python. Also, if you are coming from a traditional programming background, R has a bit of a learning curve (or rather, a bit of unlearning).
That being said, almost anything that you can do in Python can be done in R and vice versa. If you want to go from LA to SFO, a Honda Accord will do just as good a job as a Toyota Camry.
MATLAB, Octave, Julia and more
While Python and R are the de facto tools these days for ML and Deep Learning (DL), there are many other tools, like MATLAB, Octave and Julia, that can do the job. However, if you are a beginner, there is really no need to learn any of them. If you already come from one of these backgrounds (for example, a PhD in Math), you might want to pick up R.