This is a guest post co-authored by Nafi Ahmet Turgut, Mutlu Polatcan, Pınar Baki, Mehmet İkbal Özmen, Hasan Burak Yel and Hamza Akyıldız from Getir.
Getir is the pioneer of ultra-fast grocery delivery. The tech company has revolutionized last-mile delivery with its “groceries in minutes” delivery proposition. Getir was founded in 2015 and operates in Turkey, the UK, the Netherlands, Germany, France, Spain, Italy, Portugal and the US. Today, Getir is a conglomerate that incorporates nine verticals under the same brand.
Predicting future demand is one of the most important insights for Getir and one of the biggest challenges we face. Getir relies heavily on accurate demand forecasts at the SKU level when making business decisions in a wide range of areas, including marketing, production, inventory, and finance. Accurate forecasts are necessary to support inventory holding and replenishment decisions. Having a clear and reliable picture of expected demand for the day or week ahead allows us to adjust our strategy and increase our ability to meet sales and revenue targets.
Getir used Amazon Forecast, a fully managed service that uses machine learning (ML) algorithms to deliver highly accurate time series forecasts, to increase revenue by four percent and reduce waste costs by 50 percent. In this post, we describe how we use Forecast to achieve these benefits. We explain how we built an automated demand forecasting pipeline that uses Forecast and is orchestrated by AWS Step Functions to predict daily demand for SKUs. This solution led to highly accurate forecasts for more than 10,000 SKUs in all the countries where we operate and contributed significantly to our ability to develop highly scalable internal supply chain processes.
Forecast automates much of the time series forecasting process, allowing you to focus on preparing your data sets and interpreting your forecasts.
Step Functions is a fully managed service that makes it easy to orchestrate distributed application components and microservices through visual workflows. Building applications from individual components, each performing a discrete function, helps you scale more easily and change applications more quickly. Step Functions automatically triggers and tracks each step and retries when there are errors, so your application runs in order and as expected.
Solution Overview
Six people from Getir’s data science team and infrastructure team worked together on this project. The project was completed in 3 months and deployed to production after 2 months of testing.
The following diagram shows the architecture of the solution.
The model pipeline runs separately for each country. The architecture includes four Airflow cron jobs that run on a defined schedule. The pipeline begins with feature creation, which first creates the features and loads them into Amazon Redshift. A feature processing job then prepares the daily features stored in Amazon Redshift and unloads the time series data to Amazon Simple Storage Service (Amazon S3). A second Airflow job is responsible for triggering the Forecast pipeline through Amazon EventBridge. The pipeline consists of AWS Lambda functions, which create predictors and forecasts based on parameters stored in Amazon S3. Forecast reads data from Amazon S3, trains the model with hyperparameter optimization (HPO) to optimize model performance, and produces future predictions for product sales. The Step Functions “WaitInProgress” pipeline is then triggered for each country, which enables parallel execution of a pipeline for each country.
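To make the wait-and-poll idea concrete, here is a minimal sketch of what a polling state machine could look like in Amazon States Language, built as a Python dictionary. Only the state name “WaitInProgress” comes from our pipeline as described above; the other state names, the Lambda ARN, the 60-second interval, and the assumption that the status-checking Lambda returns a JSON object with a `status` field are all illustrative placeholders, not our production definition.

```python
import json

# Hypothetical ARN of a Lambda function that reports the status of a
# Forecast resource (predictor or forecast). Replace with a real one.
CHECK_STATUS_LAMBDA_ARN = (
    "arn:aws:lambda:eu-west-1:123456789012:function:check-forecast-status"
)


def build_wait_loop_definition(check_lambda_arn: str, wait_seconds: int = 60) -> dict:
    """Return a state machine definition that polls until ACTIVE or failure.

    Assumes the Lambda output looks like {"status": "CREATE_IN_PROGRESS"}.
    """
    return {
        "Comment": "Poll a Forecast resource until it is ready (illustrative).",
        "StartAt": "CheckStatus",
        "States": {
            "CheckStatus": {
                "Type": "Task",
                "Resource": check_lambda_arn,
                "Next": "IsFinished",
            },
            "IsFinished": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.status", "StringEquals": "ACTIVE", "Next": "Done"},
                    {
                        "Variable": "$.status",
                        "StringEquals": "CREATE_FAILED",
                        "Next": "Failed",
                    },
                ],
                # Any other status means the job is still running.
                "Default": "WaitInProgress",
            },
            "WaitInProgress": {
                "Type": "Wait",
                "Seconds": wait_seconds,
                "Next": "CheckStatus",
            },
            "Done": {"Type": "Succeed"},
            "Failed": {"Type": "Fail", "Error": "ForecastResourceFailed"},
        },
    }


definition = build_wait_loop_definition(CHECK_STATUS_LAMBDA_ARN)
definition_json = json.dumps(definition, indent=2)
```

Running one execution of such a state machine per country is one way to get the parallel, per-country behavior described above, because each execution waits on its own Forecast job independently.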
Algorithm selection
Amazon Forecast has six built-in algorithms (ARIMA, ETS, NPTS, Prophet, DeepAR+, CNN-QR), which fall into two groups: statistical and deep/neural network. Of these, the deep/neural network algorithms are better suited to e-commerce forecasting problems because they support item metadata features, forward-looking features for marketing campaigns and activities, and, most importantly, related time series features. Deep/neural network algorithms also perform very well on sparse datasets and in cold start scenarios (the introduction of new items).
In general, in our experiments, we found that the neural/deep network models performed significantly better than the statistical models. Therefore, we focused our deep-dive tests on DeepAR+ and CNN-QR.
One of the most important benefits of Amazon Forecast is that it scales while still delivering accurate results across many combinations of products and countries. In our tests, both the DeepAR+ and CNN-QR algorithms were successful in capturing trends and seasonality, which gave us good results even on products whose demand changes very frequently.
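Comparisons like the ones above are usually scored on backtest windows. As an illustration, the following sketch computes the weighted quantile loss (wQL), one of the accuracy metrics Forecast reports for predictors, for two sets of candidate predictions. The function name and the sample numbers are made up for this example and are not results from our experiments.

```python
from typing import Sequence


def weighted_quantile_loss(
    actuals: Sequence[float], predictions: Sequence[float], tau: float
) -> float:
    """wQL at quantile tau: 2 * sum(quantile loss) / sum(|actual|)."""
    if len(actuals) != len(predictions):
        raise ValueError("actuals and predictions must have the same length")
    total_demand = sum(abs(y) for y in actuals)
    if total_demand == 0:
        raise ValueError("wQL is undefined when total actual demand is zero")
    loss = sum(
        # Under-forecasts are weighted by tau, over-forecasts by (1 - tau).
        tau * max(y - q, 0.0) + (1.0 - tau) * max(q - y, 0.0)
        for y, q in zip(actuals, predictions)
    )
    return 2.0 * loss / total_demand


# Made-up backtest window: daily units sold and two candidate forecasts.
actual_units = [10.0, 20.0, 30.0]
candidate_a = [12.0, 18.0, 33.0]
candidate_b = [20.0, 10.0, 45.0]

wql_a = weighted_quantile_loss(actual_units, candidate_a, tau=0.5)
wql_b = weighted_quantile_loss(actual_units, candidate_b, tau=0.5)
better = "A" if wql_a < wql_b else "B"
```

A lower wQL means the predicted quantile tracks actual demand more closely, so in this toy example candidate A would be preferred.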
Deep AutoRegressive Plus (DeepAR+) is a supervised univariate forecasting algorithm based on recurrent neural networks (RNNs), created by Amazon Research. Its main advantages are that it scales easily, can incorporate relevant covariates in the data (such as related data and metadata), and can forecast cold start items. Instead of fitting a separate model for each time series, it builds a global model from related time series and handles widely varying scales through rescaling and velocity-based sampling. The RNN architecture incorporates a binomial likelihood to produce probabilistic forecasts, as described by the authors of DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks.
In the end, we selected the Amazon CNN-QR (Convolutional Neural Network – Quantile Regression) algorithm for our forecasts because of its high performance in the backtest process. CNN-QR is a proprietary ML algorithm developed by Amazon for forecasting scalar (one-dimensional) time series using causal convolutional neural networks (CNNs).
As mentioned above, CNN-QR can use related time series and metadata about the items being forecasted. The metadata must include an entry for every unique item in the target time series, which in our case are the products for which we forecast demand. To improve accuracy, we used category and subcategory metadata, which helped the model understand the relationships between certain products, including complements and substitutes. For example, for drinks, we provide an additional flag for snacks, because the two categories are complementary to each other.
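The coverage requirement above is easy to check before uploading data. The following is a minimal sketch, assuming a small in-memory catalog; the SKU IDs, the column names, and the `complement_category` column are hypothetical stand-ins for the category, subcategory, and complement flag described above, not our actual schema.

```python
import csv
import io
from typing import Dict, Iterable

# Hypothetical catalog: one metadata record per SKU.
CATALOG: Dict[str, Dict[str, str]] = {
    "sku-001": {"category": "drinks", "subcategory": "soft_drinks", "complement_category": "snacks"},
    "sku-002": {"category": "snacks", "subcategory": "chips", "complement_category": "drinks"},
    "sku-003": {"category": "dairy", "subcategory": "milk", "complement_category": ""},
}

COLUMNS = ["item_id", "category", "subcategory", "complement_category"]


def build_item_metadata_csv(
    target_item_ids: Iterable[str], catalog: Dict[str, Dict[str, str]]
) -> str:
    """Return CSV text with exactly one metadata row per target item.

    Raises ValueError if any item in the target time series has no metadata.
    """
    item_ids = sorted(set(target_item_ids))
    missing = [item_id for item_id in item_ids if item_id not in catalog]
    if missing:
        raise ValueError(f"Items missing metadata: {missing}")

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row is included for readability; align it with the dataset
    # schema you define for the import.
    writer.writerow(COLUMNS)
    for item_id in item_ids:
        record = catalog[item_id]
        writer.writerow([item_id] + [record[name] for name in COLUMNS[1:]])
    return buffer.getvalue()


# Item IDs as they might appear (with repeats) in the target time series.
metadata_csv = build_item_metadata_csv(
    ["sku-002", "sku-001", "sku-003", "sku-001"], CATALOG
)
```

Failing fast on missing items keeps an incomplete metadata file from reaching the import step.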
A significant advantage of CNN-QR is its ability to forecast without future related time series, which is important when you cannot provide related features for the forecast window. This capability, coupled with its forecasting accuracy, meant that CNN-QR produced the best results for our data and use case.
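For readers who want to try a similar setup, the sketch below assembles the request parameters a Lambda function might pass to the Forecast `CreatePredictor` API to train CNN-QR with HPO at a daily frequency. The predictor naming scheme, the dataset group ARN, and the seven-day horizon are assumptions for illustration; this is not our production configuration.

```python
from typing import Any, Dict

# ARN of the built-in CNN-QR algorithm in Amazon Forecast.
CNN_QR_ALGORITHM_ARN = "arn:aws:forecast:::algorithm/CNN-QR"


def build_predictor_request(
    country_code: str, dataset_group_arn: str, horizon_days: int = 7
) -> Dict[str, Any]:
    """Build keyword arguments for a CreatePredictor call (one per country)."""
    if horizon_days <= 0:
        raise ValueError("horizon_days must be positive")
    return {
        # Hypothetical naming scheme: one predictor per country.
        "PredictorName": f"demand_cnn_qr_{country_code.lower()}",
        "AlgorithmArn": CNN_QR_ALGORITHM_ARN,
        "ForecastHorizon": horizon_days,  # assumed: one week of daily points
        "PerformAutoML": False,           # the algorithm is chosen explicitly
        "PerformHPO": True,               # hyperparameter optimization
        "InputDataConfig": {"DatasetGroupArn": dataset_group_arn},
        "FeaturizationConfig": {"ForecastFrequency": "D"},  # daily forecasts
    }


# Hypothetical dataset group ARN for one country.
request = build_predictor_request(
    "TR", "arn:aws:forecast:eu-west-1:123456789012:dataset-group/demand_tr"
)
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("forecast").create_predictor(**request)`. Keeping the parameters in a plain dictionary also makes it straightforward to store them in Amazon S3, as the pipeline described earlier does.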
Forecast output
Forecasts created through the system are received per country and written to separate S3 buckets. Daily jobs then write the forecasts to Amazon Redshift by SKU and country. Finally, we plan product stock daily based on these forecasts.
On an ongoing basis, we calculate mean absolute percentage error (MAPE) ratios on product-level data and optimize our feature and model ingestion processes.
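As a reference for how such a metric is computed, here is a minimal MAPE sketch for a single product. The sample numbers are invented, and skipping days with zero actual demand (where the percentage error is undefined) is one possible convention rather than a description of our exact calculation.

```python
from typing import Sequence


def mape(actuals: Sequence[float], forecasts: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Days with zero actual demand are skipped because the percentage
    error is undefined there (an assumption of this sketch).
    """
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    errors = [
        abs(actual - forecast) / abs(actual)
        for actual, forecast in zip(actuals, forecasts)
        if actual != 0
    ]
    if not errors:
        raise ValueError("MAPE is undefined when all actuals are zero")
    return 100.0 * sum(errors) / len(errors)


# Invented daily demand for one product and the matching forecasts.
actual_units = [100.0, 50.0, 0.0, 20.0]
forecast_units = [90.0, 55.0, 3.0, 25.0]

product_mape = mape(actual_units, forecast_units)  # roughly 15 percent
```

Computing the metric per product and then aggregating by country is one way to arrive at the country-level figures discussed in the conclusion.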
Conclusion
In this post, we walked through an automated demand forecasting pipeline that we built using Amazon Forecast and AWS Step Functions.
With Amazon Forecast, we improved our country-specific MAPE by 10 percent. This drove a four percent revenue increase and reduced our waste costs by 50 percent. In terms of scalability, we also achieved an 80 percent improvement in our daily forecast training times. We can now forecast over 10,000 SKUs daily in all the countries we serve.
To learn more about building your own pipelines with Forecast, see Amazon Forecast Resources. You can also visit AWS Step Functions to learn more about building automated processes and orchestrating ML pipelines. Happy forecasting, and start improving your business today!
About the authors
Nafi Ahmet Turgut finished his master’s degree in Electrical and Electronic Engineering and worked as a graduate research scientist. His focus was building machine learning algorithms to simulate neural network abnormalities. He joined Getir in 2019 and currently works as Senior Data Science and Analytics Manager. His team is responsible for designing, implementing and maintaining end-to-end machine learning algorithms and data-driven solutions for Getir.
Mutlu Polatcan is a Staff Data Engineer at Getir, specializing in designing and building cloud-native data platforms. He loves to combine open source projects with cloud services.
Pınar Baki received her master’s degree from the Department of Computer Engineering at Boğaziçi University. She worked as a data scientist at Arcelik, focusing on parts recommendation models and age, gender and emotion analysis from voice data. She then joined Getir in 2022 as a Senior Data Scientist working on forecasting and search engine projects.
Mehmet İkbal Özmen received his master’s degree in economics and worked as a graduate research assistant. His research area was mainly time series economic modeling, Markov simulations and recession forecasting. He then joined Getir in 2019 and currently works as a Data Science & Analytics Manager. His team is responsible for forecasting and optimization algorithms to solve the complex problems experienced by supply chain and operation businesses.
Hasan Burak Yel received his Bachelor’s Degree in Electrical and Electronic Engineering from Boğaziçi University. He worked at Turkcell, mainly focused on time series forecasting, data visualization, and network automation. He joined Getir in 2021 and currently works as a Lead Data Scientist with responsibility for Search and Recommendation Engine and Customer Behavior Models.
Hamza Akyıldız received his Bachelor’s Degree in Mathematics and Computer Engineering at Boğaziçi University. He focuses on optimizing machine learning algorithms with his mathematical background. He joined Getir in 2021 and has been working as a data scientist. He has worked on projects related to Personalization and Supply Chain.
Esra Kayabali is a Senior Solutions Architect at AWS specializing in the analytics domain, including data warehousing, data lakes, big data analytics, real-time and batch data streaming, and data integration, with 12 years of experience in architecture and software development and a passion for learning and teaching cloud technologies.