Hello Data Scientists,
Let me continue from my last blog, “What skills one need to be a Successful Data Scientist?”, where I discussed why a Data Scientist needs to be an all-rounder: analytical skills, an understanding of mathematics and statistical analysis, command of power tools, domain expertise and, finally, presentation/visualization skills. Taking it forward, in this blog I will delve into why I, as a Data Scientist, prefer R over other tools. Choosing R does not demean the power of other tools like SAS, Python etc. Every tool has its own capabilities and limitations.
“Excellence is never an accident. It is always the result of high intention, sincere effort, and intelligent execution; it represents the wise choice of many alternatives – choice, not chance, determines your destiny.” ― Aristotle
To be proficient, one should know both the theory and the practical aspects of a topic. To be productive, one should be proficient, but the right choice of tool makes one smart. A while ago, when I was exploring which programming language or tool to use, I tried performing an objective comparison. Before I get into the pros and cons, let me make it very clear that this was my personal choice, and I would suggest you evaluate for your own needs. It is in no way an easy choice to pick one language by default for data analysis. This blog will help you objectively decide which language to pick based on different parameters.
| Parameter | Python | R | SAS | SPSS |
|---|---|---|---|---|
| Install | Easy to install | Easy to install | Needs a server and detailed requirements | Needs detailed requirements |
| Easy to learn | Easy-to-understand syntax | There is a learning curve initially; later easy to use | Interactive UI and easy to learn | Good UI |
| Visualization | Good | Excellent graphical output | Excellent visualization | Good |
| Purpose | Programming language with statistical capability | Statistical purpose only | Data mining and statistical analysis | Extended later for statistical purposes |
| Usability | More lines of code to get output | High-level language; fewer lines of code to get output | High-level language like SQL | Pull-down menu driven, with 4GL commands |
| Adopted by | Engineers and software professionals | Researchers and consultants | Enterprise-level software | Enterprise-level software |
| Architecture | Like any programming language | Consumes memory | Well managed on servers | Well managed |
| Graphical interface | Editor driven | R editor by default, but RStudio is an easy-to-use UI | Interactive and accessible UI | Good built-in interactive UI |
| Support | Good support | Very strong open source community | Managed very well because of usage | Decent support because of restricted preference |
It is very clear that no single tool is a clear winner. If one wants to process huge data in TBs, I would recommend picking SAS. For processing a decent amount of data that does not need huge machine power, i.e., where a laptop/desktop will suffice, go for R. Having shared this quantitative and qualitative comparison, I should add that this morning I read an article on LinkedIn which articulated very clearly that Python is grabbing market share because a lot of developers are looking for a programming language that can help them integrate with other applications.
However, let me now list my top 10 reasons to go for R:
- Open source software.
- Easy to install across platforms.
- Runs on standalone machines and individual servers.
- Extensive library of statistical packages.
- Extraordinary data visualization.
- RStudio is a big plus: an easy-to-use IDE.
- Easy to integrate with other packages like Excel, SAS.
- Easy to create scripts and pass on to other stakeholders.
- The trend for R is flying high; it is the in thing in the data and statistics category.
- Higher average salary for R practitioners.
R is the statistical programming language I chose to start with. However, as market dynamics change and we mature, we should be ready to pick up a new tool or language like Python, SAS or SPSS, based on the need. The R language is extensible enough to do complex statistical calculations and is by far the first choice for statisticians.
In my next blog, I will begin with “how to install R”. Though it is a matter of choice, I will pick R for Windows because I have a machine with Windows OS.
Thank you once again for your patience and your precious time in reading through this article. I hope it has been insightful and has aided you in deciding which language to pick to be a successful Data Scientist. Kindly share your valuable opinions. Please do not forget to suggest what you would like to understand and hear from me in my future blogs.
Outstanding Outliers:: “AG”