(If you haven't read Part 1 yet, check it out here.)
Missing data is a recurring problem in time series analysis.
As we explored in Part 1, simple imputation techniques, or even regression-based models such as linear regression and decision trees, can take us very far.
But what if we need to handle more subtle patterns and capture fine-grained fluctuations in complex time series data?
In this article we will explore K-nearest neighbors. One of this model's strengths is that it makes few assumptions about the relationships in your data, including nonlinear ones, which makes it a versatile and robust option for imputing missing data.
We will be using the same simulated energy production dataset that you have already seen in Part 1, with 10% of its values removed at random.
We will impute the missing data using a dataset that you can easily generate yourself, so you can follow along and apply the techniques as you work through the process step by step.
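If you do not have the Part 1 dataset at hand, the sketch below shows one way to build a comparable one. It is only an illustration: the column names, date range, 10-minute frequency, and the shape of the daily cycle are assumptions made here, not necessarily the settings used in Part 1. The one detail taken from the text is that 10% of the values are removed at random.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Part 1 dataset: all names, dates,
# and signal parameters below are assumptions for illustration.
rng = np.random.default_rng(42)

# Assumed sampling: one reading every 10 minutes for 30 days.
index = pd.date_range("2023-01-01", periods=30 * 24 * 6, freq="10min")

# Assumed daily cycle: production rises after 06:00, peaks at midday,
# and is zero overnight.
hours = index.hour.to_numpy() + index.minute.to_numpy() / 60
daily_cycle = np.clip(np.sin((hours - 6) / 12 * np.pi), 0, None)

# Add Gaussian noise and keep production non-negative.
energy = np.clip(100 * daily_cycle + rng.normal(0, 5, len(index)), 0, None)

df = pd.DataFrame({"energy_production": energy}, index=index)

# Remove 10% of the values at randomly chosen positions.
n_missing = int(0.10 * len(df))
mask = np.zeros(len(df), dtype=bool)
mask[rng.choice(len(df), size=n_missing, replace=False)] = True
df["energy_with_missing"] = df["energy_production"].mask(mask)

# Share of missing values in the new column.
print(df["energy_with_missing"].isna().mean())
```

Keeping the complete `energy_production` column next to `energy_with_missing` is a deliberate choice: it lets you compare any imputed values against the true ones later.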