Data Preprocessing - Lesson Summary

To create a model, you need sound business knowledge of the problem you are trying to solve.
Use the acquired business knowledge to gather relevant data.
Univariate analysis examines a single variable at a time. It describes that variable's distribution and does not deal with causes or relationships.
Bivariate analysis is the simultaneous analysis of two variables (attributes). It explores the relationship between the two variables.
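A correlation coefficient puts a number on that relationship. It ranges from -1 to +1: the sign gives the direction of the association and the magnitude gives its strength.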
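
As a rough illustration of bivariate analysis, the sketch below computes the Pearson correlation coefficient by hand. The lesson does not say which coefficient it has in mind, so Pearson is an assumption here, and the function name `pearson_r` and the sample figures are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variance numerators)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: advertising spend and units sold for five periods
ad_spend = [10, 20, 30, 40, 50]
units_sold = [12, 25, 31, 48, 55]
r = pearson_r(ad_spend, units_sold)
print(f"r = {r:.3f}")  # close to +1: strong positive linear relationship
```

A value near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one. It does not by itself say anything about cause.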
A dummy variable (or indicator variable) is an artificial variable, usually coded 0 or 1, created to represent an attribute with two or more distinct categories/levels.
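
A minimal sketch of dummy coding, assuming a 0/1 encoding with one indicator per level. The helper `dummy_encode`, the `is_` prefix, and the sample data are hypothetical and not taken from the lesson.

```python
def dummy_encode(values, drop_first=False):
    """Turn a categorical column into one 0/1 indicator per level.

    With drop_first=True the first level (in sorted order) becomes the
    reference category and gets no column of its own.
    """
    levels = sorted(set(values))
    if drop_first:
        levels = levels[1:]
    return [
        {f"is_{level}": int(value == level) for level in levels}
        for value in values
    ]

# Hypothetical attribute with three levels
colors = ["red", "green", "blue", "green"]

for row in dummy_encode(colors):
    print(row)

# Using "blue" as the reference level leaves two indicator columns
reduced = dummy_encode(colors, drop_first=True)
```

Dropping one level is a common choice when the indicators feed a regression with an intercept, since the full set of indicators always sums to 1 and would be perfectly collinear with it.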
The presence of the following factors in the data increases the error variance and reduces the power of statistical tests:
Missing values
Outliers
Seasonality
Non-usable variables
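
The first two factors in the list above can be sketched as follows. The lesson does not prescribe a treatment, so median imputation and the 1.5 x IQR rule are assumptions chosen for illustration, as are the function names and the sample values. Seasonality and non-usable variables are not covered here.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical measurements with one missing value and one extreme value
raw = [12, 15, None, 14, 13, 98, 16]

clean = impute_median(raw)
flagged = iqr_outliers(clean)
print("imputed:", clean)
print("flagged as outliers:", flagged)
```

The sketch only flags outliers. Whether to drop, cap, or keep them depends on the business context.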