Machine learning has become one of the trendiest technologies in today's world. It is used every day in our lives, for example in virtual assistants, future predictions, video surveillance, social media services, spam mail detection, online customer support, search result prediction, fraud detection, and recommendation systems. Within machine learning, regression is one of the most important topics to learn, and there are different types of regression techniques, which we will cover in this article.
Introduction:
Regression algorithms such as linear regression and logistic regression are among the most important algorithms people learn when they study machine learning. There are numerous forms of regression, each with its own specific features that are applied accordingly. Regression techniques are used to find the relationship between the dependent and independent variables or features. Regression is a part of data analysis used to analyze many variables, and its main aims are forecasting, time series analysis, and modeling.
What is Regression?
Regression is a statistical method, used mainly in finance, investing, sales forecasting, and other business disciplines, that attempts to determine the strength of the relationship among variables.
There are two types of variables in a dataset when applying regression techniques:
The dependent variable, mainly denoted as Y.
The independent variable, denoted as X.
And there are two broad types of regression:
Simple regression: with only a single independent feature/variable.
Multiple regression: with two or more independent features/variables.
Indeed, across regression studies, the following six types of regression techniques are the ones most commonly used for complex problems:
Linear regression
Logistic regression
Polynomial regression
Stepwise Regression
Ridge Regression
Lasso Regression
Linear regression:
It is basically used for predictive analysis, and it is a supervised machine learning algorithm. Linear regression is a linear approach to modeling the relationship between a scalar response and one or more predictor variables. It focuses on the conditional probability distribution of the response given the predictors. The formula for linear regression is Y = mX + c.
Where Y is the target variable, m is the slope of the line, X is the independent feature, and c is the intercept.
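As a quick illustration, here is a minimal sketch of fitting Y = mX + c in Python, assuming the scikit-learn library is available (the data points are made up for demonstration):

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4], [5]])    # single independent feature
y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])   # target variable

model = LinearRegression().fit(X, y)
print(model.coef_[0], model.intercept_)    # m (slope) and c (intercept)
print(model.predict([[6]]))                # prediction for a new value of X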
Additional points on Linear regression:
There should be a linear relationship between the variables.
It is very sensitive to outliers, which can lead to a high-variance or high-bias model.
Multicollinearity can become a problem when there are multiple correlated independent features.
Logistic regression:
It is used for classification problems on linearly separable data. In layman's terms, it is used when the dependent or target variable is binary: 1 or 0, true or false, yes or no. It is suited to deciding whether an occurrence is likely to be either a success or a failure.
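As a rough sketch (again assuming scikit-learn, with made-up data), a binary target can be modeled like this:

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1], [2], [3], [4], [5], [6]])   # e.g. hours studied
y = np.array([0, 0, 0, 1, 1, 1])               # binary target: fail (0) or pass (1)

clf = LogisticRegression().fit(X, y)
print(clf.predict([[2.5], [4.5]]))             # predicted classes
print(clf.predict_proba([[4.5]]))              # probability of each class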
Additional point:
It is used for classification problems.
It does not require any relation between the dependent and independent features.
It can be affected by outliers and can suffer from underfitting and overfitting.
It needs a large sample size to make the estimation more accurate.
It needs to avoid collinearity and multicollinearity.
Polynomial regression:
The polynomial regression technique is used to build a model suited to non-linearly separable data. It fits a curve to the data points rather than a straight line.
Polynomial regression is fitted using the least-squares method. The purpose of regression analysis here is to model the expected value of the dependent variable y in terms of the independent variable x.
The formula for this is Y = β0 + β1x + β2x² + … + βnxⁿ + e
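A minimal sketch of fitting such a curve, assuming scikit-learn and using made-up, roughly quadratic data:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([2.1, 4.9, 10.2, 17.1, 26.3, 37.0])   # roughly quadratic target

poly = PolynomialFeatures(degree=2)                # adds x^0, x^1 and x^2 columns
model = LinearRegression().fit(poly.fit_transform(X), y)
print(model.predict(poly.transform([[7]])))        # prediction from the fitted curve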
Additional features:
Look particularly at the curve towards the ends to see whether those shapes and patterns make logical sense. Higher-degree polynomials can lead to strange extrapolation results.
Step-wise Regression:
It is used to fit regression models for predictive modeling, and the selection of variables is carried out automatically.
At every step, a variable is added to or removed from the set of explanatory variables. The main approaches to stepwise regression are forward selection, backward elimination, and bidirectional elimination.
The formula for this: b(standardized) = b × (s_x / s_y)
Additional points:
This regression provides two things: it can add predictors at each step (forward selection) or remove predictors at each step (backward elimination).
Forward selection starts with the most significant predictor in the ML model and then adds a feature at each step.
Backward elimination starts with all the predictors in the model and then removes the least significant variable at each step.
Ridge Regression:
It is a method used when the dataset has multicollinearity, which means the independent variables are strongly correlated with each other. Although the least-squares estimates are unbiased under multicollinearity, their variances are large; by adding a degree of bias to the regression, ridge regression can reduce the standard errors.
Additional points:
In this regression, the assumptions are the same as for least-squares regression, except that normality is not assumed.
In this regression, the coefficient values can shrink, but they never become exactly zero.
It is a regularization method that uses L2 regularization.
Lasso Regression:
Lasso is an abbreviation of Least Absolute Shrinkage and Selection Operator. It is similar to ridge regression in that it also penalizes the absolute size of the regression coefficients. In addition, it is capable of reducing the variability and improving the accuracy of linear regression models. (A short code sketch contrasting ridge and lasso follows the points below.)
Additional points:
Lasso regression can shrink coefficients all the way to zero, which helps with feature selection when building a proper ML model.
It is also a regularization method that uses l1 regularization.
If there are many correlated features, it picks only one of them and shrinks the others to zero.
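Here is the sketch mentioned above, contrasting ridge (L2) and lasso (L1) shrinkage with scikit-learn; the data is synthetic and only two of the five features actually matter:

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)
print(ridge.coef_)   # all coefficients are shrunk, but none is exactly zero
print(lasso.coef_)   # coefficients of the irrelevant features are driven to exactly zero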
Learnbay provides industry-accredited data science courses in Bangalore. We understand the convergence of technologies in the field of data science, hence we offer significant courses like Machine Learning, TensorFlow, IBM Watson, Google Cloud Platform, Tableau, Hadoop, time series, R, and Python, along with authentic real-time industry projects. Students will be certified by IBM. Hundreds of students have been placed in promising companies for data science roles. By choosing Learnbay you can reach the most aspirational jobs of the present and the future. The Learnbay data science course covers Data Science with Python, Artificial Intelligence with Python, and Deep Learning using TensorFlow. These topics are covered and co-developed with IBM.
1. What are the different types of Sampling? Ans: Some of the common sampling methods are as follows:
Simple random sample: Every member and set of members has an equal chance of being included in the sample. Technology, random number generators, or some other sort of chance process is needed to get a simple random sample.
Example—A teacher puts students’ names in a hat and chooses without looking to get a sample of students.
Why it’s good: Random samples are usually fairly representative since they don’t favor certain members.
Stratified random sample: The population is first split into groups. The overall sample consists of some members of every group. The members of each group are chosen randomly.
Example—A student council surveys 100 students by getting random samples of 25 freshmen, 25 sophomores, 25 juniors, and 25 seniors.
Why it’s good: A stratified sample guarantees that members from each group will be represented in the sample, so this sampling method is good when we want some members from every group.
Cluster random sample: The population is first split into groups. The overall sample consists of every member of the group. The groups are selected at random.
Example—An airline company wants to survey its customers one day, so they randomly select 5 flights that day and survey every passenger on those flights.
Why it’s good: A cluster sample gets every member from some of the groups, so it’s good when each group reflects the population as a whole.
Systematic random sample: Members of the population are put in some order. A starting point is selected at random, and every nth member is selected to be in the sample.
Example—A principal takes an alphabetized list of student names and picks a random starting point. Every 20th student is selected to take a survey.
2. What is the confidence interval? What is its significance?
Ans: A confidence interval, in statistics, refers to the probability that a population parameter will fall between two set values for a certain proportion of times. Confidence intervals measure the degree of uncertainty or certainty in a sampling method. A confidence interval can take any number of probabilities, with the most common being a 95% or 99% confidence level.
3. What are the effects of the width of the confidence interval?
Ans: The confidence interval is used for decision making.
As the confidence level increases, the width of the confidence interval also increases.
As the width of the confidence interval increases, the information it carries becomes less useful.
Wide CI – less useful information.
Narrow CI – higher risk.
4. What is the level of significance (Alpha)?
Ans: The significance level also denoted as alpha or α, is a measure of the strength of the evidence that must be present in your sample before you will reject the null hypothesis and conclude that the effect is statistically significant. The researcher determines the significance level before conducting the experiment.
The significance level is the probability of rejecting the null hypothesis when it is true. For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference. Lower significance levels indicate that you require stronger evidence before you will reject the null hypothesis.
Use significance levels during hypothesis testing to help you determine which hypothesis the data support. Compare your p-value to your significance level. If the p-value is less than your significance level, you can reject the null hypothesis and conclude that the effect is statistically significant. In other words, the evidence in your sample is strong enough to be able to reject the null hypothesis at the population level.
5. What are Skewness and Kurtosis? What does it signify?
Ans: Skewness: It is the degree of distortion from the symmetrical bell curve, or the normal distribution. It measures the lack of symmetry in the data distribution. It differentiates extreme values in one tail versus the other. A symmetrical distribution will have a skewness of 0.
There are two types of Skewness: Positive and Negative
Positive Skewness means when the tail on the right side of the distribution is longer or fatter. The mean and median will be greater than the mode.
Negative Skewness is when the tail of the left side of the distribution is longer or fatter than the tail on the right side. The mean and median will be less than the mode.
So, when is the skewness too much?
The rule of thumb seems to be:
If the skewness is between -0.5 and 0.5, the data are fairly symmetrical.
If the skewness is between -1 and -0.5(negatively skewed) or between 0.5 and 1(positively skewed), the data are moderately skewed.
If the skewness is less than -1(negatively skewed) or greater than 1(positively skewed), the data are highly skewed.
Example
Let us take a very common example of house prices. Suppose we have house values ranging from $100k to $1,000,000 with the average being $500,000.
If the peak of the distribution is to the left of the average value, the distribution shows positive skewness. It would mean that many houses were being sold for less than the average value, i.e. $500k. This could be for many reasons, but we are not going to interpret those reasons here.
If the peak of the distributed data is to the right of the average value, that would mean a negative skew. This would mean that the houses were being sold for more than the average value.
Kurtosis: Kurtosis is all about the tails of the distribution — not the peakedness or flatness. It is used to describe the extreme values in one versus the other tail. It is actually the measure of outliers present in the distribution.
High kurtosis in a data set is an indicator that data has heavy tails or outliers. If there is a high kurtosis, then, we need to investigate why do we have so many outliers. It indicates a lot of things, maybe wrong data entry or other things. Investigate!
Low kurtosis in a data set is an indicator that data has light tails or a lack of outliers. If we get low kurtosis(too good to be true), then also we need to investigate and trim the dataset of unwanted results.
Mesokurtic: This distribution has kurtosis statistics similar to that of the normal distribution. It means that the extreme values of the distribution are similar to that of a normal distribution characteristic. This definition is used so that the standard normal distribution has a kurtosis of three.
Leptokurtic(Kurtosis > 3): Distribution is longer, tails are fatter. The peak is higher and sharper than Mesokurtic, which means that data are heavy-tailed or profusion of outliers.
Outliers stretch the horizontal axis of the histogram graph, which makes the bulk of the data appear in a narrow (“skinny”) vertical range, thereby giving the “skinniness” of a leptokurtic distribution.
Platykurtic: (Kurtosis < 3): Distribution is shorter; tails are thinner than the normal distribution. The peak is lower and broader than Mesokurtic, which means that data are light-tailed or lack of outliers. The reason for this is because the extreme values are less than that of the normal distribution.
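To check these measures on actual numbers, SciPy offers skew and kurtosis functions; here is a small sketch with synthetic, house-price-like data (fisher=False reports kurtosis on the "normal = 3" scale used above):

import numpy as np
from scipy.stats import kurtosis, skew

rng = np.random.default_rng(1)
prices = rng.lognormal(mean=13, sigma=0.4, size=10_000)   # right-skewed, price-like data

print(skew(prices))                     # positive value -> longer right tail
print(kurtosis(prices, fisher=False))   # value above 3 -> leptokurtic, heavy tails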
6. What are Range and IQR? What does it signify?
Ans: Range: The range of a set of data is the difference between the highest and lowest values in the set.
IQR (Interquartile Range): The interquartile range (IQR) is the difference between the third quartile and the first quartile. The formula for this is:
IQR = Q3 – Q1
The range gives us a measurement of how spread out the entire data set is. The interquartile range, which tells us how far apart the first and third quartiles are, indicates how spread out the middle 50% of our data is.
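A quick NumPy illustration with made-up data:

import numpy as np

data = np.array([3, 5, 7, 8, 12, 13, 14, 18, 21])
q1, q3 = np.percentile(data, [25, 75])
print(data.max() - data.min())   # range: spread of the entire data set
print(q3 - q1)                   # IQR = Q3 - Q1: spread of the middle 50%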
7. What is the difference between Variance and Standard Deviation? What is its significance?
Ans: The central tendency measure, the mean, gives you an idea of the average of the data points (i.e. the center of the distribution). Next, you want to know how far your data points are from the mean; this is where variance comes in: it measures how far the data points spread out from the mean.
Standard deviation is simply the square root of the variance, and it also measures the spread of your data points. Why use standard deviation when we already have variance? To keep the calculations in the same units: if the mean is in cm, the variance is in cm², whereas the standard deviation is back in cm, which is why standard deviation is used most often.
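A quick numeric check with NumPy (illustrative heights in cm):

import numpy as np

heights_cm = np.array([150, 160, 165, 170, 175, 180])
variance = heights_cm.var(ddof=1)   # in cm²
std_dev = heights_cm.std(ddof=1)    # square root of the variance, back in cm
print(variance, std_dev)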
8. What is selection Bias? Types of Selection Bias?
Ans: Selection bias is the phenomenon of selecting individuals, groups, or data for analysis in such a way that proper randomization is not achieved, ultimately resulting in a sample that is not representative of the population.
Understanding and identifying selection bias is important because it can significantly skew results and provide false insights about a particular population group.
Types of selection bias include:
Sampling bias: a biased sample caused by non-random sampling
Time interval: selecting a specific time frame that supports the desired conclusion. e.g. conducting a sales analysis near Christmas.
Exposure: includes clinical susceptibility bias, protopathic bias, indication bias. Read more here.
Data: includes cherry-picking, suppressing evidence, and the fallacy of incomplete evidence.
Attrition: attrition bias is similar to survivorship bias, where only those that ‘survived’ a long process are included in an analysis, or failure bias, where only those that ‘failed’ are included.
Observer selection: related to the Anthropic principle, which is a philosophical consideration that any data we collect about the universe is filtered by the fact that, in order for it to be observable, it must be compatible with the conscious and sapient life that observes it.
Handling missing data can make selection bias worse because different methods impact the data in different ways. For example, if you replace null values with the mean of the data, you are adding bias in the sense that you are assuming the data is not as spread out as it might actually be.
9. What are the ways of handling missing Data?
Delete rows with missing data
Mean/Median/Mode imputation
Assigning a unique value
Predicting the missing values using Machine Learning Models
Using an algorithm that supports missing values, like random forests.
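A brief pandas sketch of the first few options above, with a made-up DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [25, np.nan, 32, 41, np.nan],
                   "salary": [40000, 52000, np.nan, 61000, 45000]})

dropped = df.dropna()                              # delete rows with missing data
imputed = df.fillna(df.mean(numeric_only=True))    # mean imputation
flagged = df.fillna(-999)                          # assign a unique value
print(dropped, imputed, flagged, sep="\n\n")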
10. What are the different types of the probability distribution? Explain with example?
Ans: The common Probability Distribution is as follows:
Bernoulli Distribution
Uniform Distribution
Binomial Distribution
Normal Distribution
Poisson Distribution
1. Bernoulli Distribution: A Bernoulli distribution has only two possible outcomes, namely 1 (success) and 0 (failure), and a single trial. So the random variable X which has a Bernoulli distribution can take value 1 with the probability of success, say p, and the value 0 with the probability of failure, say q or 1-p.
Example: whether it’s going to rain tomorrow or not where rain denotes success and no rain denotes failure and Winning (success) or losing (failure) the game.
2. Uniform Distribution: When you roll a fair die, the outcomes are 1 to 6. The probabilities of getting these outcomes are equally likely and that is the basis of a uniform distribution. Unlike Bernoulli Distribution, all the n number of possible outcomes of a uniform distribution are equally likely.
Example: Rolling a fair dice.
3. Binomial Distribution: A distribution where only two outcomes are possible, such as success or failure, gain or loss, win or lose and where the probability of success and failure is the same for all the trials is called a Binomial Distribution.
Each trial is independent.
There are only two possible outcomes in a trial- either a success or a failure.
A total number of n identical trials are conducted.
The probability of success and failure is the same for all trials. (Trials are identical.)
Example: Tossing a coin.
4. Normal Distribution: Normal distribution represents the behavior of most of the situations in the universe (That is why it’s called a “normal” distribution. I guess!). The large sum of (small) random variables often turns out to be normally distributed, contributing to its widespread application. Any distribution is known as Normal distribution if it has the following characteristics:
The mean, median, and mode of the distribution coincide.
The curve of the distribution is bell-shaped and symmetrical about the line x=μ.
The total area under the curve is 1.
Exactly half of the values are to the left of the center and the other half to the right.
5. Poisson Distribution: A distribution is called Poisson distribution when the following assumptions are valid:
Any successful event should not influence the outcome of another successful event.
The probability of success in an interval is proportional to the length of the interval (the rate of success is constant over time).
The probability of success in an interval approaches zero as the interval becomes smaller.
Example: The number of emergency calls recorded at a hospital in a day.
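A short sketch drawing samples from each of these distributions with NumPy (the parameters are illustrative):

import numpy as np

rng = np.random.default_rng(0)
print(rng.binomial(n=1, p=0.6, size=10))    # Bernoulli: one trial, success probability 0.6
print(rng.integers(1, 7, size=10))          # Uniform: rolling a fair die
print(rng.binomial(n=10, p=0.5, size=10))   # Binomial: heads in 10 coin tosses
print(rng.normal(loc=0, scale=1, size=5))   # Normal: mean 0, standard deviation 1
print(rng.poisson(lam=3, size=10))          # Poisson: e.g. emergency calls per day at rate 3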
11. What are the statistical Tests? List Them.
Ans: Statistical tests are used in hypothesis testing. They can be used to:
determine whether a predictor variable has a statistically significant relationship with an outcome variable.
estimate the difference between two or more groups.
Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.
Common Tests in Statistics:
T-Test/Z-Test
ANOVA
Chi-Square Test
MANOVA
12. How do you calculate the sample size required?
Ans: You can use the margin of error (ME) formula, ME = (t/z) × s/√n, and solve it for the sample size n (a short sketch follows the list below), where:
t/z = t/z score used to calculate the confidence interval
ME = the desired margin of error
S = sample standard deviation
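A hedged sketch of the usual rearrangement n = (z × s / ME)², with made-up numbers and SciPy used only to look up the z value:

from math import ceil
from scipy import stats

confidence = 0.95
z = stats.norm.ppf(1 - (1 - confidence) / 2)   # z score for the confidence level (≈ 1.96)
s = 12.0                                       # sample (or pilot) standard deviation
ME = 2.0                                       # desired margin of error

n = (z * s / ME) ** 2
print(ceil(n))                                 # required sample size, rounded up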
13. What are the different Biases associated when we sample?
Ans: Potential biases include the following:
Sampling bias: a biased sample caused by non-random sampling
Under coverage bias: sampling too few observations
Survivorship bias: error of overlooking observations that did not make it past a form of the selection process.
14. How to convert normal distribution to standard normal distribution?
Standardized normal distribution has mean = 0 and standard deviation = 1
To convert a normal distribution to the standard normal distribution we can use the formula: X (standardized) = (x − µ) / σ
15. How to find the mean length of all fishes in a river?
Define the confidence level (most common is 95%)
Take a sample of fishes from the river (to get better results the number of fishes > 30)
Calculate the mean length and standard deviation of the lengths
Calculate t-statistics
Get the confidence interval in which the mean length of all the fishes should be.
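A sketch of these steps using SciPy; the fish lengths below are invented sample data (a sample larger than 30, as suggested above):

import numpy as np
from scipy import stats

lengths = np.array([23.1, 25.4, 24.8, 22.9, 26.0, 24.2, 25.1, 23.7,
                    24.9, 25.6, 23.3, 24.0, 26.3, 24.7, 25.2, 23.8,
                    24.4, 25.8, 23.5, 24.6, 25.0, 24.1, 23.9, 25.3,
                    24.3, 25.7, 23.6, 24.5, 25.5, 26.1, 23.2, 24.8])

n = len(lengths)
mean, s = lengths.mean(), lengths.std(ddof=1)
t_crit = stats.t.ppf(0.975, df=n - 1)          # 95% confidence level
margin = t_crit * s / np.sqrt(n)
print(mean - margin, mean + margin)            # interval for the mean length of all fishes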
16. What do you mean by the degree of freedom?
DF is the number of independent values that are free to vary in a calculation
DF is used with t-distribution and not with Z-distribution
For a series, DF = n-1 (where n is the number of observations in the series)
17. What do you think if DF is more than 30?
As DF increases the t-distribution reaches closer to the normal distribution
At low DF, we have fat tails
If DF > 30, then t-distribution is as good as the normal distribution.
18. When to use t distribution and when to use z distribution?
The following conditions must be satisfied to use Z-distribution
Do we know the population standard deviation?
Is the sample size > 30?
CI = x (bar) – Z*σ/√n to x (bar) + Z*σ/√n
Else we should use t-distribution
CI = x (bar) – t*s/√n to x (bar) + t*s/√n
19. What are H0 and H1? What is H0 and H1 for the two-tail test?
H0 is known as the null hypothesis. It is the normal case/default case.
For one tail test x <= µ
For two-tail test x = µ
H1 is known as an alternate hypothesis. It is the other case.
For one tail test x > µ
For two-tail test x ≠ µ
20. What is the Degree of Freedom?
DF is the number of independent values that are free to vary in a calculation:
DF is used with t-distribution and not with Z-distribution
For a series, DF = n-1 (where n is the number of observations in the series)
21. How to calculate p-Value?
Ans: Calculating p-value:
Using Excel:
Go to the Data tab
Click on Data Analysis
Select Descriptive Statistics
Choose the column
Select summary statistics and confidence level (0.95)
By Manual Method:
Find H0 and H1
Find n, x(bar) and s
Find DF for t-distribution
Find the type of distribution – t or z distribution
Find t or z value (using the look-up table)
Compute the p-value and compare it with the significance level (equivalently, compare the test statistic with the critical value)
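A minimal sketch of the manual steps with SciPy; the sample and the null-hypothesis mean are made up:

import numpy as np
from scipy import stats

sample = np.array([5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4, 5.1])
mu0 = 5.0                                         # H0: the population mean is 5.0

n = len(sample)
t_stat = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))   # DF = n - 1
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)   # two-tailed p-value
print(t_stat, p_value)
print(stats.ttest_1samp(sample, mu0))             # the same test in a single call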
22. What is ANOVA?
Ans: ANOVA, which expands to Analysis of Variance, is a statistical technique used to determine whether the means of two or more populations differ, by examining the amount of variation within the samples relative to the amount of variation between the samples. It splits the total variation in the dataset into two parts: the amount attributed to chance and the amount attributed to specific causes.
It is a method of analyzing the factors that are hypothesized to affect the dependent variable. It can also be used to study the variation amongst different categories within factors that have numerous possible values. It is of two types:
One way ANOVA: When one factor is used to investigate the difference between different categories, having many possible values.
Two way ANOVA: When two factors are investigated simultaneously to measure the interaction of the two factors influencing the values of a variable.
23. What is ANCOVA?
Ans: ANCOVA stands for Analysis of Covariance. It is an extended form of ANOVA that eliminates the effect of one or more interval-scaled extraneous variables from the dependent variable before carrying out the analysis. It is the midpoint between ANOVA and regression analysis, wherein one variable in two or more populations can be compared while accounting for the variability of other variables.
When the set of independent variables consists of both factors (categorical independent variables) and covariates (metric independent variables), the technique used is known as ANCOVA. The differences in the dependent variable due to the covariate are removed by adjusting the dependent variable's mean value within each treatment condition.
This technique is appropriate when the metric independent variable is linearly associated with the dependent variable and not to the other factors. It is based on certain assumptions which are:
There is some relationship between the dependent and uncontrolled variables.
The relationship is linear and is identical from one group to another.
Various treatment groups are picked up at random from the population.
Groups are homogeneous in variability.
24. What is the difference between ANOVA and ANCOVA?
Ans: The points given below are substantial so far as the difference between ANOVA and ANCOVA is concerned:
The technique of identifying the variance among the means of multiple groups for homogeneity is known as Analysis of Variance or ANOVA. A statistical process which is used to take off the impact of one or more metric-scaled undesirable variable from the dependent variable before undertaking research is known as ANCOVA.
ANOVA can use both linear and non-linear models, whereas ANCOVA uses only a linear model.
ANOVA entails only categorical independent variables, i.e. factor. As against this, ANCOVA encompasses a categorical and a metric independent variable.
A covariate is not taken into account, in ANOVA, but considered in ANCOVA.
ANOVA attributes between-group variation exclusively to the treatment. In contrast, ANCOVA divides between-group variation between the treatment and the covariate.
ANOVA treats within-group variation as individual differences only, whereas ANCOVA splits within-group variance into individual differences and the covariate.
25. What are t and z scores? Give Details.
T-Score vs. Z-Score: Overview: A z-score and a t score are both used in hypothesis testing.
T-score vs. z-score: When to use a t score:
The general rule of thumb for when to use a t score is when your sample:
Has a sample size below 30,
Has an unknown population standard deviation.
You must know the standard deviation of the population and your sample size should be above 30 in order for you to be able to use the z-score. Otherwise, use the t-score.
Z-score
Technically, z-scores are a conversion of individual scores into a standard form. The conversion allows you to more easily compare different data. A z-score tells you how many standard deviations from the mean your result is. You can use your knowledge of normal distributions (like the 68-95-99.7 rule) or the z-table to determine what percentage of the population will fall below or above your result.
The z-score is calculated using the formula:
z = (X-μ)/σ
Where:
σ is the population standard deviation and
μ is the population mean.
The z-score formula doesn’t say anything about sample size; The rule of thumb applies that your sample size should be above 30 to use it.
T-score
Like z-scores, t-scores are also a conversion of individual scores into a standard form. However, t-scores are used when you don’t know the population standard deviation; You make an estimate by using your sample.
T = (X – μ) / [ s/√(n) ]
Where:
s is the standard deviation of the sample.
If you have a larger sample (over 30), the t-distribution and z-distribution look pretty much the same.
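A small numeric illustration of the two formulas, assuming SciPy is installed and using made-up values:

import numpy as np
from scipy import stats

# z-score of a single observation when the population parameters are known
X, mu, sigma = 75.0, 70.0, 5.0
z = (X - mu) / sigma
print(z, stats.norm.cdf(z))     # 1.0, and the share of the population below this result

# t-score when sigma is unknown and estimated from the sample
sample = np.array([72.0, 68.0, 75.0, 71.0, 69.0, 74.0])
t = (sample.mean() - mu) / (sample.std(ddof=1) / np.sqrt(len(sample)))
print(t)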
To know more about Data Science, Artificial Intelligence, Machine Learning, and Deep Learning programs visit our website www.learnbay.co
Watch our Live Session Recordings to precisely understand statistics, probability, calculus, linear algebra, and other math concepts used in data science.
To get updates on Data Science and AI Seminars/Webinars – Follow our Meetup group.
Machine learning works with “models” and “algorithms”, and both play an important role: the algorithm describes the learning process, and the model is built by following those rules.
Algorithms were derived by statisticians and mathematicians long ago, and those algorithms are studied and applied by practitioners for their business purposes.
A model in machine learning is nothing but a function that takes certain input, performs on it the operations prescribed by the algorithm as best it can, and gives a suitable output.
Some of the machine learning algorithms are:
Linear regression
Logistic regression
Decision tree
Random forest
K-nearest neighbor
K-means clustering
What is an algorithm in Machine learning?
An algorithm is a step by step approach powered by statistics that guides the machine learning in its learning process. An algorithm is nothing but one of the several components that constitute a model.
There are several characteristics of machine learning algorithms:
Machine learning algorithms can be represented by the use of mathematics and pseudo code.
The effectiveness of machine learning algorithms can be measured and represented.
With any of the popular programming languages, machine learning algorithms can be implemented.
What is the Model in Machine learning?
The model depends on factors such as feature selection, tuning parameters, and cost functions, along with the algorithm; it is not fully dependent on the algorithm alone.
A model is the result of an algorithm: we obtain it when we implement the algorithm in code and train it on real data. A model is something that tells you what your program learned from the data by following the rules of the algorithm, and it is then used to predict future results based on what was learned from the training data.
Model = Data + Algorithm
Building a model involves four major steps:
Data preprocessing
Feature engineering
Data management
Performance measurement
How do the model and the algorithm work together in machine learning?
For example:
y = mx + c is the equation of a line, where m is the slope and c is the y-intercept; this is nothing but linear regression with one variable. Similarly, decision trees and random forests use measures such as the Gini index, and K-nearest neighbors uses the Euclidean distance formula.
So take the linear regression algorithm:
Start with a training set with features x1, x2, …, and target y.
Initialize the parameters c0, c1, c2 with random values.
Choose a learning rate alpha.
Then repeatedly apply updates of the form c0 = c0 − alpha × (h(x) − y), and similarly for c1 and c2 (each scaled by its corresponding feature value).
Repeat these updates until the parameters converge.
When you employ this algorithm, you are employing these exact five steps in your model without changing them; your model is initiated by the algorithm and treats all of the dataset in the same way.
When you apply the algorithm, the model finds the values of m and c that we do not know. How? Suppose you have three variables with values of x and y; your model will then find the values of the slopes m1, m2, m3 and the intercept c. The model works with these slopes and the intercept to fit the dataset and predict future values.
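Here is a minimal sketch of those steps for y = mx + c with a single feature; the data and learning rate are made up for illustration:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])    # generated from y = 2x + 1

m, c = 0.0, 0.0    # initial parameter values
alpha = 0.05       # learning rate

for _ in range(2000):                   # repeat the updates until (roughly) converged
    h = m * x + c                       # current predictions
    m -= alpha * np.mean((h - y) * x)   # update the slope
    c -= alpha * np.mean(h - y)         # update the intercept

print(round(m, 2), round(c, 2))         # close to 2.0 and 1.0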
The “algorithm” might be treating all the data the same but it is the “model” that actually solves the problems. An algorithm is something that you use to train the model on the data.
After building a model, a data scientist tests it to measure its accuracy and fine-tunes it to improve the results.
This article may help you understand algorithms and models in machine learning. In summary, an algorithm is a process or technique that we follow to get a result or to find the solution to a problem, and a model is a computation or formula formed as the output of an algorithm that takes some input; so you can say that you build a model using a given algorithm.
COVID-19 is an unfortunate and unavoidable situation, and each one of us has to fight it from home; I just hope it all ends soon. Many lives have changed since corona came into existence: people have died, families have suffered, the economy has collapsed, companies are temporarily shut, jobs have been lost, and productivity has stopped. None of us can help in any way other than by maintaining personal hygiene and staying at home. Sounds hopeless, doesn't it? But what if I told you that by slightly changing your perspective towards this situation you can find hope that this pandemic will end, and hope for a brighter-than-ever future? This is all about how to get the most out of the COVID-19 pandemic.

We all used to whine about going to the office and convinced ourselves that the reason for being not-so-productive was the little time left after work. Now that excuse is no longer available, because we are literally at home with plenty of personal time. This is the best opportunity to build a habit, learn new things, and renovate your lifestyle. The best way to utilize this abundance of time is by learning something new, so why not learn something that will not only become a good new habit but also help you level up in your professional life? Something that is in demand, popular, trendy, and, yes, the sexiest: Data Science.

Learning Data Science is the best thing to do, especially right now. Firstly, Data Science is not easy to learn; you must dedicate enough time to studying its vast concepts and methods, so what better time than right now? You will have enough time to properly understand how it works.

Job opportunities? Let me start by saying that Data Science is one of the highest-paying fields; data scientists are paid very well, and the pay keeps increasing month after month. Because the field demands maximum dedication from data scientists, who work with the huge amounts of data generated every day, it does not forget to pay its hardworking employees well. Do not think that "Data Scientist" is the only role you can reach by studying Data Science; refer to this blog to learn about the different job opportunities Data Science opens up.

Data Science is a great career option, but please think it through before choosing it, because the field demands discipline and persistence, which not everyone can manage to provide. Talk to professional counsellors about whether your profile is suitable for becoming a data scientist. Do not hesitate to pursue Data Science if you are a non-programmer or from a non-technical background; many successful data scientists come from purely non-technical backgrounds, so you can too.

How will Data Science help you? I have said enough about the present popularity of Data Science, so let me tell you about its future. Data Science is a field that will not stop growing, because it runs on data, something we all generate every day. Currently only about 65% of the world's population has access to the internet, and even that share of people creates around 1.5 quintillion bytes of data every day, which is already a huge amount for data scientists to handle. It is estimated that within five years at least 80% of the world's population will have internet access, which will of course further increase the amount of data generated every day.
So as each day rolls by, the number of internet users keeps increasing, resulting in a proportional increase in the data generated every day. Data scientists will therefore always have more than enough data to work with, and their business will only keep growing. Due to the "corona effect" many industries have already faced losses, and more companies are yet to face them in the near future, but reports suggest that the IT field will have the opportunity to recover quickly from the pandemic's effects. This makes sense because the fuel for IT is virtual data, which keeps increasing every day, so the field has the least chance of a downfall.

How to study Data Science? One of the best ways to study Data Science (especially now) is through a good Data Science course. But first, be aware that not every Data Science course will get you ready for the most desired field of the decade, so it is important to choose a course that will actually make you a data scientist. Learnbay is a Bangalore-based institute offering a Data Science course that has been helping students realize their dream of becoming data scientists at a very reasonable price. Students are certified by IBM and benefit from many other helpful features, so do check that out. There are many Data Science courses on the market, but unfortunately only some manage to turn their students into real data scientists, so it is necessary to choose the right one.

CONCLUSION: Even though corona has created a discomforting environment, it has at least given us the opportunity to rebuild our lifestyles; this is the best time to prepare to accomplish great things. Include Data Science in the list of your practices, and it will be worth every ounce of your effort.
Human language is an unsolved problem: there are more than 6,500 languages worldwide. Tons of data are generated every day as we speak, text, and tweet, and through voice-to-text on every social application, and to get insights from this text data we need a technology such as NLP. There are two types of data: structured and unstructured. Structured data is used for machine learning models, while unstructured data is handled with natural language processing. Only about 21% of the available data is structured, so you can estimate how much NLP is needed to handle unstructured data.
To get insights from unstructured data, we need to extract the important information from it. An important technique for analyzing text data is text mining. Text mining extracts useful information from unstructured data by identifying and exploring large amounts of text; in other words, it is used to convert unstructured data into a structured dataset.
Normalization, lemmatization, stemming, and tokenization are the techniques used in NLP to draw insights from the data.
Now let us see how text stemming works.
Stemming is the process of reducing inflected words to their “root” form, mapping a group of words to the same stem. Stemmed words are what remain after the suffixes and prefixes added to a root word are removed; stemming produces the grammatical variants of root words. Stemming is performed by NLP algorithms called stemming algorithms, or stemmers, which remove the affixes from a word. For example, eats, eating, and eatery are all made from the root word “eat”, so the stemmer removes s, ing, and ery from these words to recover the idea that the sentence is about eating; the words are just different inflected forms of the same verb.
This is the general idea to reduce the different forms of the word to their root word. Words that are derived from one another can be mapped to a base word or symbol, especially if they have the same meaning.
Since we cannot be sure that stemming will give a 100% correct result, there are two types of errors in stemming: over-stemming and under-stemming.
Over-stemming occurs when too much of a word is cut off. This can produce non-sensical stems, where the meaning of the word is lost, or it can fail to distinguish between words that should resolve to different stems, resolving them to the same stem when they should differ from each other.
For example, take the four words university, universities, universal, and universe. A stemmer that resolves all four to “univers” is over-stemming. Instead, universe and universal should be stemmed together, and university and universities stemmed together; all four do not fit a single stem.
Under-stemming is the opposite of over-stemming. It occurs when we have different words that actually are forms of one another. It would be nice for them all to resolve to the same stem, but unfortunately they do not.
This can be seen if we have a stemming algorithm that stems the words data and datum to “dat” and “datu”. You might be thinking, well, just resolve both of these to “dat”. However, then what do we do with “date”? And is there a good general rule? This is how under-stemming occurs.
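A minimal sketch of stemming with NLTK's PorterStemmer (this assumes the nltk package is installed; the word list is illustrative):

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["eats", "eating", "eatery", "universal", "universities", "data", "datum"]:
    print(word, "->", stemmer.stem(word))   # inspect where the stemmer over- or under-stems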
A Gaussian distribution is a bell-shaped curve; it follows the normal distribution, with an equal number of measurements to the right and to the left of the mean value. The mean sits at the center of the curve: values to the right of the mean are greater than the mean and values to the left are smaller. It is used for the mean, median, and mode of continuous values. You already know the basic meaning of mean, median, and mode: the mean is the average of the values, the median is the center value of the distribution, and the mode is the value that occurs most frequently. In a normal distribution, the mean, median, and mode are all the same. If the values show skewness, the data is not normally distributed. The normal distribution is very important in statistics because it fits many natural phenomena, such as heights, blood pressure, measurement error, and many other numerical values.
A Gaussian distribution and a normal distribution are the same thing in statistical theory; the Gaussian distribution is also known as the normal distribution. The curve is drawn with the help of the probability density function over random values: f(x) is the PDF and x is the value of the random variable, used to represent real-valued random variables whose distribution is unknown.
There is a property of the Gaussian distribution known as the empirical rule, which shows which confidence interval a value falls into. The standard normal distribution has a mean of 0 and a standard deviation of 1.
The empirical rule also referred to as the three-sigma rule or 68-95-99.7 rule, is a statistical rule which states that for a normal distribution, almost all data falls within three standard deviations (denoted by σ) of the mean (denoted by µ). Broken down, the empirical rule shows that 68% falls within the first standard deviation (µ ± σ), 95% within the first two standard deviations (µ ± 2σ), and 99.7% within the first three standard deviations (µ ± 3σ).
Python code for plotting the gaussian graph:
import math

import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

mu = 0                       # mean of the distribution
variance = 1
sigma = math.sqrt(variance)  # standard deviation

# evaluate the normal PDF over three standard deviations on either side of the mean
x = np.linspace(mu - 3 * sigma, mu + 3 * sigma, 100)
plt.plot(x, stats.norm.pdf(x, mu, sigma))
plt.show()
The above code plots a Gaussian distribution with mean 0 and standard deviation 1, over the range of three standard deviations on either side of the mean, which covers about 99.7% of the distribution.
The supervised learning algorithm is widely used in industry to predict business outcomes and to forecast results on the basis of historical data. The output of any supervised learning model depends on the target variable. It allows numerical, categorical, discrete, and linear datasets to be used to build a machine learning model. The target variable is known while building the model, and the model predicts the outcome for any new data point on the basis of that target variable.
The supervised learning model is used to teach the machine to predict results for unseen input. A known dataset is used to train the machine and to measure its performance during training, and the trained model then predicts responses for the test data that is fed to it. Different machine learning models are suitable for different kinds of datasets. Supervised algorithms use regression and classification techniques to build predictive models.
For example, suppose you have a bucket of fruits containing different types of fruit. You need to separate the fruits according to their features, and you know the name of each fruit along with its corresponding features: the features of the fruits are the independent variables and the names of the fruits are the dependent variable, i.e. our target variable. We can build a predictive model to determine the fruit name.
There are various types of Supervised learning:
Linear regression
Logistic regression
Decision tree
Random forest
Support vector machine
K-nearest neighbors
Linear and logistic regression are used when we have continuous input data. Linear regression defines the relationship between the variables, where we have independent and dependent variables. For example, what will a student's performance percentage be after studying a certain number of hours? The number of hours studied is the independent feature and the student's performance is the dependent feature. Linear regression is further categorized into simple linear regression, multiple linear regression, and polynomial regression.
Classification algorithms help to classify the categorical values. It is used for the categorical values, discrete values, or the values which belong to a particular class. Decision tree and Random forest and KNN all are used for the categorical dataset. Popular or major applications of classification include bank credit scoring, medical imaging, and speech recognition. Also, handwriting recognition uses classification to recognize letters and numbers, to check whether an email is genuine or spam, or even to detect whether a tumor is benign or cancerous and for recommender systems.
The support vector machine is used for both classification and regression problems. It creates a hyperplane to separate the categories of the data points. For example, sentiment analysis of a statement can be carried out with an SVM to determine whether the statement is positive or negative.
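A toy sketch of the fruit example with a decision tree, assuming scikit-learn; the feature values and labels are made up for illustration:

from sklearn.tree import DecisionTreeClassifier

# features: [weight_in_grams, colour_code]  (0 = red, 1 = orange, 2 = yellow)
X = [[150, 0], [170, 0], [140, 1], [130, 1], [120, 2], [115, 2]]
y = ["apple", "apple", "orange", "orange", "banana", "banana"]

clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([[160, 0], [125, 2]]))   # predicted fruit names for new data points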
Unsupervised learning algorithms
Unsupervised learning is a technique in which we do not need to supervise the model, as there is no target variable or labeled dataset. The model discovers patterns on its own to produce its output, and it is used for unlabeled datasets. Unsupervised learning algorithms allow you to perform more complex processing tasks compared to supervised learning, although unsupervised learning can be more unpredictable than other learning methods. It is easier to get unlabeled data from a computer than labeled data, which needs manual intervention.
For example, we have a bucket of fruits and we need to separate them accordingly, but there is no target variable available to tell us whether a fruit is an apple, an orange, or a banana. Unsupervised learning groups these fruits so that a prediction can be made when new data comes in.
Types of unsupervised learning:
Hierarchical clustering
K-means clustering
K-NN (k nearest neighbors)
Principal Component Analysis
Singular Value Decomposition
Independent Component Analysis
Hierarchical clustering is an algorithm that builds a hierarchy of clusters. It begins with every data point assigned to a cluster of its own. Then, at each step, the two closest clusters are merged into the same cluster, and the algorithm ends when only one cluster is left.
K-means is also a clustering method used to group the dataset. It is an iterative method in which you select the number of clusters k, and each point is assigned to its nearest cluster center at every iteration; you need to define k well to build a good predictive model. K-nearest neighbors, by contrast, is the simplest of all machine learning classifiers. It differs from other machine learning techniques in that it does not produce a model: it is a simple algorithm that stores all available cases and classifies new instances based on a similarity measure.
PCA (Principal Component Analysis) is a dimensionality reduction algorithm. For example, if you have a dataset with 200 features/columns, you may need to reduce the number of features for the model to only the important ones. PCA does this while retaining as much of the information in the dataset as possible.
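A minimal dimensionality-reduction sketch with scikit-learn; the 200-feature dataset here is random, just to show the shapes:

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 200)                  # 100 samples with 200 features
pca = PCA(n_components=10)                    # keep the 10 components with the most variance
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)                        # (100, 10)
print(pca.explained_variance_ratio_.sum())    # share of variance retained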
Reinforcement learning is also a type of Machine learning algorithm. It provides a suitable action in a particular situation, and it is used to maximize the reward. The reward could be positive or negative based on the behavior of the object. Reinforcement learning is employed by various software and machines to find the best possible behavior in a situation.
Main points in Reinforcement learning –
Input: The input should be an initial state from which the model will start
Output: There are many possible outputs, as there are a variety of solutions to a particular problem.
Training: The training is based upon the input; the model will return a state, and the user will decide to reward or punish the model based on its output.
The model keeps learning continuously.
The best solution is decided based on the maximum reward.
XGBoost is a machine learning algorithm that is applied to structured and tabular data. It is an implementation of gradient-boosted decision trees designed for speed and performance. XGBoost stands for extreme gradient boosting, and it is a large algorithm with many parts. It works well with large, complicated datasets and is an ensemble modeling technique.
What is ensemble modeling?
XGBoost is an ensemble learning method. Sometimes, it may not be sufficient to rely upon the results of just one machine learning model. Ensemble learning offers a systematic solution to combine the predictive power of multiple learners. The resultant is a single model that gives the aggregated output from several models.
The models that form the ensemble, also known as base learners, could be either from the same learning algorithm or different learning algorithms. Bagging and boosting are two widely used ensemble learners. Though these two techniques can be used with several statistical models, the most predominant usage has been with decision trees.
Unique features of XGBoost:
XGBoost is a popular implementation of gradient boosting. Let’s discuss some features of XGBoost that make it so interesting.
Regularization: XGBoost has an option to penalize complex models through both L1 and L2 regularization. Regularization helps in preventing overfitting
Handling sparse data: Missing values or data processing steps like one-hot encoding make data sparse. XGBoost incorporates a sparsity-aware split finding algorithm to handle different types of sparsity patterns in the data
Weighted quantile sketch: Most existing tree based algorithms can find the split points when the data points are of equal weights (using a quantile sketch algorithm). However, they are not equipped to handle weighted data. XGBoost has a distributed weighted quantile sketch algorithm to effectively handle weighted data
Block structure for parallel learning: For faster computing, XGBoost can make use of multiple cores on the CPU. This is possible because of a block structure in its system design. Data is sorted and stored in in-memory units called blocks. Unlike other algorithms, this enables the data layout to be reused by subsequent iterations, instead of computing it again. This feature also serves useful for steps like split finding and column sub-sampling
Cache awareness: In XGBoost, non-continuous memory access is required to get the gradient statistics by row index. Hence, XGBoost has been designed to make optimal use of hardware. This is done by allocating internal buffers in each thread, where the gradient statistics can be stored
Out-of-core computing: This feature optimizes the available disk space and maximizes its usage when handling huge datasets that do not fit into memory.
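A minimal usage sketch, assuming the xgboost and scikit-learn packages are installed; the dataset is synthetic:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# reg_lambda is the L2 regularization term (the λ used in the similarity score below)
model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1, reg_lambda=1.0)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))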
Solve the XGBoost mathematically:
Here we will use simple Training Data, which has a Drug dosage on the x-axis and Drug effectiveness in the y-axis. These above two observations(6.5, 7.5) have a relatively large value for Drug Effectiveness and that means that the drug was helpful and these below two observations(-10.5, -7.5) have a relatively negative value for Drug Effectiveness, and that means that the drug did more harm than good.
The very 1st step in fitting XGBoost to the training data is to make an initial prediction. This prediction could be anything but by default, it is 0.5, regardless of whether you are using XGBoost for Regression or Classification.
The prediction 0.5 corresponds to the thick black horizontal line.
Unlike ordinary gradient boosting, which typically uses regular off-the-shelf regression trees, XGBoost uses a unique kind of regression tree called an XGBoost tree.
Now we need to calculate the Quality score, or Similarity score, for the residuals:
Similarity Score = (Sum of Residuals)^2 / (Number of Residuals + λ)
Here λ is a regularization parameter.
So we split the observations into two groups, based on whether or not the Dosage<15.
Only one observation has Dosage<15, so its residual goes to the leaf on the left; all of the other residuals go to the leaf on the right.
When we calculate the Similarity score for the root, which contains all four residuals (-10.5, -7.5, 6.5, 7.5), with λ = 0 we get:
Similarity = (-10.5 - 7.5 + 6.5 + 7.5)^2 / (4 + 0) = (-4)^2 / 4 = 4
Hence the result we get is a root Similarity score of 4. Each leaf's Similarity score is computed in the same way from the residuals it contains, and the quality of the split is measured by its Gain:
Gain = Left leaf Similarity + Right leaf Similarity - Root Similarity
The split with the largest Gain is the one used to grow the tree.
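As a quick sanity check on these numbers, here is a small Python sketch of the calculation. The split assignment is an assumption made for illustration (the original plot is not reproduced here): we take the single observation with Dosage<15 to be the one with residual -10.5.
import numpy as np

def similarity(residuals, lam=0.0):
    # Similarity Score = (sum of residuals)^2 / (number of residuals + lambda)
    residuals = np.asarray(residuals, dtype=float)
    return residuals.sum() ** 2 / (len(residuals) + lam)

residuals = [-10.5, -7.5, 6.5, 7.5]

root = similarity(residuals)              # (-4)^2 / 4 = 4.0
left = similarity([-10.5])                # assumed left leaf: 110.25
right = similarity([-7.5, 6.5, 7.5])      # assumed right leaf: ~14.08

gain = left + right - root                # ~120.33
print(root, left, right, gain)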
As Data Science holds the grand title of sexiest job of the decade, the idea of pursuing it can feel intimidating, especially with the widespread notion that "Data Science is the toughest field to build a career in". In fact, it is much more straightforward to become a part of it than people assume; all you need is smartness and consistency in your efforts. Let us look at it briefly in this blog.
Quick fact: A Data Scientist is not obliged to know every programming language that has ever existed. People think a data scientist must know everything, but that is too fictitious to be true. You will be expected to have expert knowledge of at least 2-3 languages and of the tools that are frequently used in the field. Get confident, with solid knowledge, in the concepts you choose at the beginning, and move into other concepts slowly and steadily. The languages and concepts you start with act as the foundation of your knowledge of the field.
The insights given below come directly from Data Scientists in various interviews; merging this information, we can understand the responsibilities, environment, and requirements of the field of Data Science.
Data Science for interns
Getting into the field of Data Science:
When asked how to get into the field easily, a data scientist explained that there are many ways to do so, by taking up any one of the field's important processes. All that is required is the right knowledge of whichever process you choose, and the ability to keep pace as that process evolves. There are Data Science courses available in various colleges and education centers that will not only help you learn the fundamentals of the field but also support you in finding a deserving company.
Being an intern in Data Science:
Because of the high popularity the field has gained, even the brightest scholar can get nervous under the pressure of being the best. But all one must do is relax, because Data Science work happens in a team, and everything an intern really needs is solid knowledge of the language and tools they have chosen. A Data Scientist explained how daunting it can be, especially as an intern: no matter how many languages you learn, the insecurity that someone else is more talented will haunt you now and then, perhaps because of the dynamic nature of the field. Every Data Scientist is put in a team, for the simple reason that it is impossible for any individual to be skilled in all programming languages.
If we interpret the importance of teamwork as a Venn diagram, there will be intersections and overlaps of various languages among the team members; this way, the team as a whole covers all the required languages. Always be humble about what you do not know; social etiquette is also necessary. As an intern, your focus must be on following how the work is actually done and on analysing which language is most appropriate to learn, because one's journey of learning Data Science does not end when they get a job, it starts there.
Ideal education background to become a Data Scientist:
Another concern comes from aspirants whose education background is nowhere related to a technical field. Addressing this, a Data Scientist revealed that there are many data scientists from different backgrounds, such as biology, physics, psychology, and business, who are still triumphing in the field. Even if you are from a technical background, you will still have to study Data Science, because its concepts differ from those of other technical fields; there are specific concepts that must be learnt and practiced. Also, since the field is dynamic, its requirements and essentials change regularly, so it is necessary to be well prepared before stepping in.
People from a domain or education background unrelated to the field should find a training center that will ease them into it by teaching the concepts from the very beginning. Getting into the field is easy, but to sustain yourself in it you will have to step up your game by handling the tough parts and by keeping yourself updated with current trends.
Useful sources, shared by a Data Scientist, where you can learn Data Science easily and efficiently:
Podcasts: Data Skeptic
Blogs: Data Science Central
Exploratory Data Analysis (EDA) refers to the critical process of performing initial investigations on data so as to discover patterns, spot anomalies, test hypotheses, and check assumptions, with the help of summary statistics and graphical representations.
It is always good to explore and compare a dataset with multiple exploratory techniques. After exploratory data analysis, you gain enough confidence in your data to engage a machine learning algorithm; another benefit of EDA is that it guides the selection of the feature variables that will be used later for machine learning. In this blog, we take the Iris dataset and walk through the process of EDA.
Importing libraries:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Loading the Iris data:
iris_data = pd.read_csv("Iris.csv")
Understand the data:
iris_data.shape
(150,5)
iris_data['species'].value_counts()
setosa        50
virginica     50
versicolor    50
Name: species, dtype: int64
iris_data.columns
Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'], dtype='object')
1D scatter plot of the iris data:
iris_setosa = iris_data.loc[iris_data["species"] == "setosa"]
iris_virginica = iris_data.loc[iris_data["species"] == "virginica"]
iris_versicolor = iris_data.loc[iris_data["species"] == "versicolor"]
plt.plot(iris_setosa["petal_length"], np.zeros_like(iris_setosa["petal_length"]), 'o')
plt.plot(iris_versicolor["petal_length"], np.zeros_like(iris_versicolor["petal_length"]), 'o')
plt.plot(iris_virginica["petal_length"], np.zeros_like(iris_virginica["petal_length"]), 'o')
plt.grid()
plt.show()
2D scatter plot:
iris_data.plot(kind="scatter", x="sepal_length", y="sepal_width")
plt.show()
2D scatter plot with the seaborn library:
import seaborn as sns
sns.set_style("whitegrid")
sns.FacetGrid(iris_data, hue="species", height=4) \
   .map(plt.scatter, "sepal_length", "sepal_width") \
   .add_legend()
plt.show()
Conclusion
Blue points can be easily separated from red and green by drawing a line.
But red and green data points cannot be easily separated.
Using sepal_length and sepal_width features, we can distinguish Setosa flowers from others.
Separating Versicolor from Virginica is much harder, as they have considerable overlap.
Pair Plot:
A pairs plot allows us to see both the distribution of single variables and relationships between two variables. For example, let’s say we have four features ‘sepal length’, ‘sepal width’, ‘petal length’ and ‘petal width’ in our iris dataset. In that case, we will have 4C2 plots i.e. 6 unique plots. The pairs, in this case, will be :
Sepal length, sepal width
sepal length, petal length
sepal length, petal width
sepal width, petal length
sepal width, petal width
petal length, petal width
So, instead of trying to visualize four dimensions at once, which is not possible, we look at these six 2D plots and try to understand the 4-dimensional data in the form of a matrix of plots.
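With seaborn, such a pair plot can be drawn in a single call. The sketch below assumes the same iris_data DataFrame and column names used in the earlier snippets:
import seaborn as sns
import matplotlib.pyplot as plt

# Assumes iris_data has already been loaded, e.g. iris_data = pd.read_csv("Iris.csv")
# Pair plot: a scatter plot for every pair of features, coloured by species,
# with the distribution of each single feature on the diagonal.
sns.pairplot(iris_data, hue="species",
             vars=["sepal_length", "sepal_width", "petal_length", "petal_width"])
plt.show()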
Learnbay provides industry-accredited data science courses in Bangalore. We understand how deeply technology is woven into the field of Data Science, and hence we offer in-depth courses on Machine Learning, TensorFlow, IBM Watson, Google Cloud Platform, Tableau, Hadoop, time series, R and Python, along with authentic real-time industry projects. Students gain an edge by being certified by IBM, and hundreds of students have been placed in promising companies in data science roles. By choosing Learnbay you can reach one of the most aspirational jobs of the present and the future. The Learnbay data science course covers Data Science with Python, Artificial Intelligence with Python, and Deep Learning using TensorFlow; these topics are covered and co-developed with IBM.