Skip to main content

What is Difference between Data Science, Machine Learning, Deep Learning, AI and StatisticsI


In this article, I will be telling you about the different roles of data scientist and how data science compares and overlaps with related fields such as Machine Learning, Deep Learning, AI, Applied Mathematics, Linear Algebra and Statistics. Data Science is a broad discipline, I start by describing the different types of data scientists that one may encounter in any business setting. You might discover that you are a data scientist yourself, without knowing it. As with any scientific discipline, Data science may borrow techniques from related disciplines. Pictures are the best way to represent anything, from below given pictorial representation you can understand that what is the overlapping of Different fields with Data science.




1- Different Types  of Data Scientist
Here in this section, I will be talking about types of Data Scientist
♦ The Type A Data Scientist can code well enough to work with data but is not necessarily an expert. The Type A data scientist may be an expert in experimental design, forecasting, modelling, statistical inference, or other things typically taught in statistics departments. Generally speaking though, the work product of a data scientist is not "p-values and confidence intervals" as academic statistics sometimes seems to suggest (and as it sometimes is for traditional statisticians working in the pharmaceutical industry, for example). At Google, Type A Data Scientists are known variously as Statistician, Quantitative Analyst, Decision Support Engineering Analyst, or Data Scientist, and probably a few more. ♦ Type B Data Scientist: The B is for Building. Type B Data Scientists share some statistical background with Type A, but they are also very strong coders and may be trained software engineers. The Type B Data Scientist is mainly interested in using data "in production." They build models which interact with users, often serving recommendations (products, people you may know, ads, movies, search results)
2- Machine Learning versus Deep Learning
♦ Before digging deeper into the link between data science and machine learning, let's briefly discuss machine learning and deep learning. Machine learning is a set of algorithms that train on a data set to make predictions or take actions in order to optimize some systems. For instance, supervised classification algorithms are used to classify potential clients into good or bad prospects, for loan purposes, based on historical data. The techniques involved, for a given task (e.g. supervised clustering), are varied: naive Bayes, SVM, neural nets, ensembles, association rules, decision trees, logistic regression, or a combination of many.

→ AI (Artificial intelligence) is a subfield of computer science, that was created in the 1960s, and it was (is) concerned with solving tasks that are easy for humans, but hard for computers. In particular, a so-called Strong AI would be a system that can do anything a human can (perhaps without purely physical things). This is fairly generic and includes all kinds of tasks, such as planning, moving around in the world, recognizing objects and sounds, speaking, translating, performing social or business transactions, creative work (making art or poetry), etc. .

→ NLP (Natural language processing) is simply the part of AI that has to do with language (usually written).


→Machine learning is concerned with one aspect of this: given some AI problem that can be described in discrete terms (e.g. out of a particular set of actions, which one is the right one), and given a lot of information about the world, figure out what is the “correct” action, without having the programmer program it in. Typically some outside process is needed to judge whether the action was correct or not. In mathematical terms, it’s a function: you feed in some input, and you want it to produce the right output, so the whole problem is simply to build a model of this mathematical function in some automatic way. To draw a distinction with AI, if I can write a very clever program that has human-like behaviour, it can be AI, but unless its parameters are automatically learned from data, it’s not machine learning. Deep learning is one kind of machine learning that’s very popular now. It involves a particular kind of mathematical model that can be thought of as a composition of simple blocks (function composition) of a certain type, and where some of these blocks can be adjusted to better predict the final outcome.

Comments

  1. Do you Want articles on any particular topic ..comment here will try to make on that..

    ReplyDelete

Post a Comment

Leave a Message !

Popular posts from this blog

Complete Tutorial on Business Analytics

CRISP-DM (Cross Industry Standard Process for Data Mining)                                    The framework is made up of 6 steps: Business Issue Understanding Data Understanding Data Preparation Analysis/Modeling Validation Presentation/Visualization The map outlines two main scenarios for a business problem: Data analysis Predictive analysis Data analysis refers to the more standard approaches of blending together data and reporting on trends and statistics and helps answer business questions that involve understanding more about the dataset such as "On average, how many people order coffee and a donut per transaction in my store in any given week?" Predictive analysis will help businesses predict future behavior based on existing data such as "Given the average coffee order, how much coffee can I expect to sell next week if I were to add a new brand of coffee?" Business Issue Understanding "This initial phase focuses on understanding the project objectives a

What is a Type II Error?

  A Type II error is a false negative in a test outcome, where something is falsely inferred to not exist. This usually means incorrectly accepting the null hypothesis (H0), which is the testing statement that whatever is being studied has no statistically significant effect on the problem. An example would be a drug trial that incorrectly concludes the prescribed medication had no effect on the patient’s ailment, when in fact the disease was cured, but subsequent exams caused a false positive showing the patient was still sick. Null Hypothesis and Statistical Significance In practice, the difference between a false positive and a false negative is usually not so clear-cut. Since the tests are most often quantitively rather than qualitatively based, the results tend to be expressed in a confidence interval value less than 100%, rather than a simple Yes/No decision. This question of how likely the results are to be found if the null hypothesis is true is called statistical significance.

The Art of Winning Kaggle Competitions

 What is Kaggle? Learning data science can be overwhelming. Finding a community to share code, data, and ideas can se e m also seem like an overwhelming as well as farfetched task. But, there is one spot where all of these characteristics come together. That place is called Kaggle. Looked at more comprehensively, Kaggle is an online community for data scientists that offers machine learning competitions, datasets, notebooks, access to training accelerators, and education. Anthony Goldbloom (CEO) and Ben Hamner (CTO) founded Kaggle in 2010, and Google acquired the company in 2017. Kaggle competitions have improved the state of the machine learning art in several areas. One is mapping dark matter; another is HIV/AIDS research. Looking at the winners of Kaggle competitions, you’ll see lots of XGBoost models, some Random Forest models, and a few deep neural networks. The Winning Recipe of Kaggle Competitions involves the following steps: Step one   is to start by reading the competition gu