
Logistic Regression, LDA & KNN in Python - Predictive Modeling

A free online course about the classification techniques involved in logistic regression, LDA and KNN in Python.

A preliminary univariate analysis of the data is an essential step before running a classification model. In this free online course, you will learn how to solve business problems using the logistic regression model, linear discriminant analysis and the k-nearest neighbors technique in Python. Build your knowledge of classification techniques, and your skill in applying them, by studying this comprehensive course.







Are you looking to become an expert in solving real-life problems using different classification algorithms in Python? This course will give you the ability to interpret the outcomes of a logistic regression model in Python, so that you can use these results when making strategic decisions in your organization. Gain insight into the measures of dispersion, namely range, standard deviation and variance, which will help you understand the spread of a data set. Discover that the mean is not always the best measure of central tendency because outliers heavily influence it, which is the principal reason why the median is often preferred over the mean. You will be taught about the advantages of using the mode to measure centres, including the fact that it can be calculated for both quantitative and qualitative data, whereas the mean and median can only be used for quantitative data. The course will also introduce you to the main Python libraries it uses, which are Pandas, NumPy and Seaborn.
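As an illustration of the measures described above, here is a minimal sketch using Pandas. The sales figures and colour labels are hypothetical values invented for this example, not data from the course. The last sales figure is a deliberate outlier, so the mean is pulled well above the median.

```python
import pandas as pd

# Hypothetical sales figures (assumption); 120 is a deliberate outlier.
sales = pd.Series([12, 15, 14, 13, 16, 15, 120])

# Measures of central tendency
mean_value = sales.mean()            # pulled upwards by the outlier
median_value = sales.median()        # robust to the outlier
mode_values = sales.mode().tolist()  # most frequent value(s)

# Measures of dispersion
value_range = sales.max() - sales.min()
variance = sales.var()               # sample variance (ddof=1 by default)
std_dev = sales.std()                # square root of the sample variance

# The mode can also be computed for qualitative data,
# where a mean or median would not be meaningful.
colours = pd.Series(["red", "blue", "red", "green"])
colour_mode = colours.mode().tolist()

print(f"mean={mean_value:.2f}, median={median_value}, mode={mode_values}")
print(f"range={value_range}, variance={variance:.2f}, std={std_dev:.2f}")
print(f"most common colour: {colour_mode}")
```

Note that `Series.mode()` returns every value tied for the highest frequency, which is why the result is handled as a list here.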

Discover the first key steps in building a machine learning model, where you convert your business problem into a statistical problem, define the dependent and independent variables, and identify whether you want to predict or infer. You will learn about training data and testing data: training data is the information used to train an algorithm, while testing data is held back from training and is used to assess the accuracy of the resulting model, that is, the predictor function built from the training data. Uncover the importance of handling missing values in real-world data appropriately, since many machine learning algorithms do not support data sets with missing values. You will then study the most common methods of imputing missing values, which are segment-based imputation, imputing with zero, and imputing with the median, mean or mode.
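The imputation methods and the training/testing split described above can be sketched as follows. This is a minimal example under stated assumptions: the data frame, its column names (`segment`, `income`, `purchased`) and the 75/25 split are hypothetical choices made for illustration, and scikit-learn's `train_test_split` is used here as one common way to create the split.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data set (assumption): two incomes are missing.
df = pd.DataFrame({
    "segment":   ["A", "A", "A", "B", "B", "B", "A", "B"],
    "income":    [40.0, np.nan, 44.0, 70.0, 74.0, np.nan, 42.0, 72.0],
    "purchased": [0, 0, 1, 1, 1, 0, 0, 1],
})

# Impute with zero
df["income_zero"] = df["income"].fillna(0)

# Impute with the overall median (mean or mode work the same way)
df["income_median"] = df["income"].fillna(df["income"].median())

# Segment-based imputation: fill each gap with the mean of its own segment
segment_means = df.groupby("segment")["income"].transform("mean")
df["income_segment"] = df["income"].fillna(segment_means)

# Independent variable(s) X and dependent variable y
X = df[["income_segment"]]
y = df["purchased"]

# Hold back 25% of the rows as testing data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

print(df[["segment", "income", "income_zero", "income_median", "income_segment"]])
print(f"training rows: {len(X_train)}, testing rows: {len(X_test)}")
```

For simplicity this sketch imputes before splitting. In practice it is usually safer to compute the imputation values from the training data only, so that no information from the testing data leaks into the model.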

Next, you will learn about the linear discriminant analysis technique, which is based on Bayes’ theorem and is the preferred method when the response variable has more than two classes. Discover how, for a given set of predictor values, you can use this technique to calculate the probability of an observation belonging to each group and then assign the observation to the group with the highest probability. You will then identify the drawbacks of the k-nearest neighbors technique, including that it does not describe the relationship between each predictor variable and the response variable. Finally, you will learn about interpreting the results of classification models, creating a confusion matrix in Python, evaluating model performance, and creating dummy variables in Python. This course will be of interest to data scientists, executives or students interested in learning about classification techniques. Why wait? Start this course today and become an expert in classification models and problem-solving.
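The ideas above can be sketched with scikit-learn as follows. This is an illustrative example, not the course's own material: the iris data set bundled with scikit-learn, the 70/30 split, `n_neighbors=5` and the small `regions` data frame are all assumptions chosen to keep the example short and self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Three-class response variable (assumption: the bundled iris data set)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Linear discriminant analysis: one posterior probability per class,
# then assign each observation to the class with the highest probability.
lda = LinearDiscriminantAnalysis().fit(X_train, y_train)
posteriors = lda.predict_proba(X_test)            # shape: (n_samples, n_classes)
lda_pred = lda.classes_[posteriors.argmax(axis=1)]

# k-nearest neighbors: classifies by majority vote among the 5 closest
# training points; it yields no coefficients describing how each
# predictor relates to the response.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
knn_pred = knn.predict(X_test)

# Confusion matrix: rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_test, lda_pred)
lda_acc = accuracy_score(y_test, lda_pred)
knn_acc = accuracy_score(y_test, knn_pred)

print(cm)
print(f"LDA accuracy: {lda_acc:.3f}, KNN accuracy: {knn_acc:.3f}")

# Dummy variable creation for a qualitative predictor (hypothetical data).
# drop_first=True drops one category to avoid redundant columns.
regions = pd.DataFrame({"region": ["north", "south", "east", "south"]})
dummies = pd.get_dummies(regions, columns=["region"], drop_first=True)
print(dummies)
```

The diagonal of the confusion matrix counts the correctly classified observations, so dividing its sum by the total number of test observations gives the accuracy.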
