A weekend ago, AWS experienced some service outages. Here are some technical details on how we handled them:
# FlyData buffers your data and retries until it loads successfully
We noticed the outage when our monitoring systems started alerting us. We did not lose any customer data, and we finished loading data into each customer’s Redshift cluster after AWS recovered. This is how we handle AWS outages on our SaaS backend:
1. Buffer data to multiple endpoints right after fetching it from the MySQL data source
Once we fetch the data from the MySQL binary log, FlyData immediately buffers it to multiple endpoints. This means that even if our data server instances go down, we won’t lose any data. Every time we process data, we tag each piece that comes through, so that we can make sure the data is consistent between the source and the destination.
2. Retry until the next component receives the data successfully
Our internal system is separated into layers. Each layer does some data processing before passing the data on toward the final destination, the Redshift cluster. Because each layer has retry logic built in, we can ensure that each step maintains the integrity of the data along the data path. This also means that if a Redshift cluster is down for whatever reason, we keep retrying the load. Once the cluster becomes available again, the data flows through as usual.
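
To make the two steps above more concrete, here is a minimal Python sketch of the general pattern: tag each record, buffer it to more than one place, and retry delivery until the next component accepts it. This is an illustration only, not FlyData's actual code. The names (`TaggedRecord`, `tag`, `buffer_to_all`, `send_with_retry`), the sequence-number-plus-checksum tag, the in-memory lists standing in for buffer endpoints, and the exponential backoff are all assumptions made for the example.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class TaggedRecord:
    """A record plus the tag used to check it later (hypothetical layout)."""
    seq: int        # position of the record in the source stream
    payload: bytes  # the raw data fetched from the source
    checksum: str   # SHA-256 of the payload, for integrity checks downstream


def tag(seq: int, payload: bytes) -> TaggedRecord:
    """Attach a sequence number and checksum to a piece of data."""
    return TaggedRecord(seq, payload, hashlib.sha256(payload).hexdigest())


def buffer_to_all(record: TaggedRecord, buffers: List[list]) -> None:
    """Write the record to every buffer so that losing one copy is not fatal.

    Plain lists stand in for real buffer endpoints here.
    """
    for buf in buffers:
        buf.append(record)


def send_with_retry(
    record: TaggedRecord,
    send: Callable[[TaggedRecord], None],
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it succeeds and return the number of attempts.

    `send` is assumed to raise ConnectionError while the next component
    (for example, a cluster that is down) is unavailable. The wait between
    attempts doubles each time, capped at `max_delay` seconds.
    """
    attempts = 0
    delay = base_delay
    while True:
        attempts += 1
        try:
            send(record)
            return attempts
        except ConnectionError:
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

In this sketch, a layer would call `buffer_to_all` as soon as a record arrives and then hand it to `send_with_retry`. Because the record stays in the buffers while delivery is being retried, an outage in the next component delays the load but does not drop the data.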
# FlyData helps make your life easier
So what does this mean for you? You won’t need to wake up in the middle of the night for an AWS outage if you’re using FlyData! We automatically resume your data upload process once an outage is over. Our service can help ease the burden of using Redshift by providing an end-to-end data integration process.