What Is Data Repatriation?

The steep cost of keeping and using data in a public cloud is increasingly becoming a dealbreaker for companies. Organizations are looking for more cost-effective alternatives, which is why we're seeing more and more cases of data repatriation (the act of pulling data currently in the public cloud and rehosting it on-site or on bare metal).

This article is an intro to data repatriation and the effects (both positive and negative) of leaving the public cloud in favor of on-prem bare metal storage. Read on to learn about the main drivers behind data repatriation and see if pulling back cloud-based data is a sound move for your bottom line.

Companies are not only pulling back data from the public cloud—a rising number of organizations are deciding to rehost cloud-based workloads and apps on-site. Learn more about this trend and why it's gaining steam in our cloud repatriation article.

What Is Data Repatriation?

Data repatriation is the process of moving data from the public cloud to self-managed storage (such as an on-site dedicated server or a private cloud). Depending on how much data a company decides to rehost, the repatriation is either full (all data leaves the public cloud) or partial (only selected data sets move).

Data repatriation is becoming more and more common as organizations realize the steep costs of keeping large amounts of data in the public cloud. When you have a massive storage need (for example, if you have several petabytes of unstructured data you regularly access), cloud-based storage is not as cost-effective as more traditional solutions.

On average, using cloud-based storage costs twice as much as on-site data hosting. This figure holds even when we account for the overhead required for on-site storage, which includes the price of:

The main reason behind the drastic difference in price tags is egress cost. Providers typically do not charge you for uploading data, but you pay for the capacity you use and for outbound data transfers. Transfers that send data outside the provider's infrastructure (egress) are costly and often make up the biggest part of the monthly cloud bill.
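As a rough illustration of why egress tends to dominate the bill, the sketch below splits a monthly charge into a capacity part and a transfer-out part. The function name and the per-GB rates are hypothetical placeholders chosen for this example, not any provider's actual pricing, and uploads are assumed to be free.

```python
def monthly_cloud_bill(stored_gb, egress_gb, storage_rate_per_gb, egress_rate_per_gb):
    """Split a monthly cloud storage bill into capacity and egress charges.

    All rates are hypothetical inputs; ingress (upload) is assumed to be free.
    """
    storage_cost = stored_gb * storage_rate_per_gb
    egress_cost = egress_gb * egress_rate_per_gb
    return {
        "storage": storage_cost,
        "egress": egress_cost,
        "total": storage_cost + egress_cost,
    }


# Hypothetical example: 100 TB stored, 50 TB read back out each month.
bill = monthly_cloud_bill(
    stored_gb=100_000,
    egress_gb=50_000,
    storage_rate_per_gb=0.02,  # assumed rate, USD per GB-month
    egress_rate_per_gb=0.09,   # assumed rate, USD per GB transferred out
)
egress_share = bill["egress"] / bill["total"]
print(f"storage={bill['storage']:.2f} egress={bill['egress']:.2f} "
      f"egress share={egress_share:.0%}")
```

With these made-up rates, the transfer charge outweighs the capacity charge even though only half of the stored data leaves the provider each month.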

Cost is not the only reason companies choose to repatriate data. Other common causes include:

The hybrid cloud enables you to combine multiple IT environments, so there's no need to fully move away from the public cloud even if you're looking to rehost data on-site. Here are some resources to get you more familiar with this deployment strategy:

Advantages of Data Repatriation

Like most IT decisions, opting for data repatriation has both pros and cons. Let's take a closer look at the most prominent benefits of pulling back public cloud data.

Cost Reductions and Better ROI

Cost is the primary advantage and the leading reason for data repatriation. While an on-prem data center is expensive to set up, public cloud costs start to add up over time as you pay monthly for:

Sooner or later, your total cloud spending will reach the price of on-site hosting equipment. However, by that point, you have spent those funds on the operational expenses of cloud computing and own no hardware despite the investment. From that standpoint, on-site hosting has a far superior ROI.
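The break-even point described here can be estimated with a few lines of arithmetic. The sketch below is a simplified model, not a full TCO calculation: it assumes a one-time hardware purchase, a flat monthly on-site running cost, and a flat monthly cloud bill. The function name and all figures are hypothetical.

```python
def break_even_month(hardware_cost, onprem_monthly_cost, cloud_monthly_cost, max_months=120):
    """Return the first month in which cumulative cloud spend reaches
    cumulative on-site spend (hardware plus running costs), or None if
    that does not happen within max_months.
    """
    cloud_total = 0.0
    onprem_total = float(hardware_cost)  # up-front purchase (assumed one-time)
    for month in range(1, max_months + 1):
        cloud_total += cloud_monthly_cost
        onprem_total += onprem_monthly_cost
        if cloud_total >= onprem_total:
            return month
    return None


# Hypothetical figures: 120,000 up front and 1,500 per month on-site,
# versus a flat 6,500 per month in the public cloud.
month = break_even_month(
    hardware_cost=120_000,
    onprem_monthly_cost=1_500,
    cloud_monthly_cost=6_500,
)
print(month)
```

With these assumed numbers, cumulative cloud spend catches up with the on-site total in month 24. If the cloud bill is lower than the on-site running cost, the function returns None, which signals that repatriation never pays off under this model.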

Keep in mind that cloud-based storage fees are also inconsistent and hard to predict. Projected costs quickly exceed the budget since:

With an on-site storage system, costs do not change based on what you do with your data. If your IT needs grow, you'll need to invest in more hardware, but you'll never go "overboard" with a bill the way you can with cloud-based storage.

Here's a real-life example of how much money a business can save with data repatriation—in 2015, Dropbox pulled 600 petabytes of its data off the public cloud and rehosted it at a private data center. As a result, the company has saved an estimated $74.6 million in data storage expenses.

Hands-On Security Over Your Data

Hosting data in a public cloud means the provider is responsible for storing and keeping info safe. This arrangement is a godsend for some, but it might be a dealbreaker for companies looking for a more hands-on approach to data security.

There are also several unique concerns when you keep data in a public cloud:

While failures by providers are rare, public cloud users must know they are a possibility. In August 2018, an AWS error exposed business-critical data of about 31,000 systems belonging to GoDaddy. Had the company kept the data on-site, this incident would not have happened.

Repatriating data enables a level of proximity to and physical control of data the public cloud cannot offer. You also limit the attack surface by reducing the number of things that can go wrong with your data. Think of it as keeping money in a safe at home versus a safe at a bank: the bank is secure, but you have no say in how it protects its safes, and it is a prime target for robberies. From that standpoint, there's a strong case that your money is safer at home.

While direct control over data is vital for some use cases, providers go to great lengths to protect cloud-based data. Our article on cloud storage security offers an in-depth look at all the typical measures vendors use to keep customer data safe.

No Risk of Vendor Lock-In

Vendor lock-in occurs when a company becomes too dependent on a cloud provider. If you store data with a vendor for too long and build an app architecture around that storage, lock-in naturally grows over time. You are then unable to switch to another platform (whether in-house or belonging to another provider) without high switching costs.

Data repatriation ensures your storage depends on your in-house team rather than on an outside provider. Your staff manages the data set, and there is no risk of "getting stuck" with any third party.

Better Latency

While the public cloud provides almost limitless storage capacity, your ability to access and use cloud-based data depends on your Internet connection. Operations lag if you perform processing in-house and the connection to public cloud data is slow.

Lag may not be a problem for some use cases (such as backup and recovery or email operations), but it is detrimental for some workloads, like: 

If you have a latency-sensitive app that relies heavily on a data set, hosting info on-site (or using an edge server) provides much less lag than using a public cloud. You shorten the communication path, plus the in-house team has an opportunity to fine-tune storage, compute, and networking resources to suit the app.
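To get a feel for how much the communication path matters, the sketch below estimates the time an app spends on sequential reads, where each request pays one network round trip plus the transfer time of its payload. This is a deliberately simple model (no parallelism, caching, or protocol overhead), and the round-trip times and bandwidths are assumed values for illustration only.

```python
def total_fetch_seconds(num_requests, round_trip_ms, payload_mb, bandwidth_mbps):
    """Estimate total time for sequential reads.

    Each request costs one round trip plus payload transfer time.
    payload_mb is in megabytes, bandwidth_mbps in megabits per second.
    """
    per_request = round_trip_ms / 1000 + (payload_mb * 8) / bandwidth_mbps
    return num_requests * per_request


# Assumed figures: 1,000 sequential reads of 1 MB each.
cloud_seconds = total_fetch_seconds(
    num_requests=1_000, round_trip_ms=40, payload_mb=1, bandwidth_mbps=1_000
)
onsite_seconds = total_fetch_seconds(
    num_requests=1_000, round_trip_ms=0.5, payload_mb=1, bandwidth_mbps=10_000
)
print(f"cloud: {cloud_seconds:.1f} s, on-site: {onsite_seconds:.1f} s")
```

Under these assumptions, most of the cloud-side time comes from the round trips rather than from bandwidth, which is why shortening the path helps chatty, latency-sensitive workloads the most.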

Easier Compliance with Data Regulations

Public cloud providers (especially hyperscalers) work hard to meet government and industry requirements like HIPAA and PCI. However, there's a major concern with meeting regulations in the public cloud: data location. If your business falls under a statute requiring data hosting in a specific region, using the public cloud could land you in a world of legal (and financial) trouble.

Instead of setting up cloud servers in specific regions and relying on a third party to not move info, some organizations prefer to take full control and relocate data to an on-prem system.
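Before deciding what to relocate, it helps to know which data sets currently sit outside the regions a regulation allows. The sketch below is one minimal way to run that check against an inventory. The data set names, region labels, and function name are all hypothetical, and a real audit would pull the inventory from your provider's tooling rather than a hard-coded dictionary.

```python
def find_residency_violations(dataset_regions, allowed_regions):
    """Return the sorted names of data sets stored outside the allowed regions.

    dataset_regions maps a data set name to the region where it is hosted.
    """
    allowed = set(allowed_regions)
    return sorted(
        name for name, region in dataset_regions.items() if region not in allowed
    )


# Hypothetical inventory and an assumed rule that data must stay in two regions.
inventory = {
    "customer_records": "eu-central",
    "billing_archive": "us-east",
    "web_logs": "eu-west",
}
violations = find_residency_violations(
    inventory, allowed_regions={"eu-central", "eu-west"}
)
print(violations)
```

Any data set the check flags is a candidate for either moving to a compliant cloud region or repatriating to an on-prem system.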

Are you storing the PII (personally identifiable information) of your clients? If yes, there's a chance your company must adhere to data-related compliance—check out our GDPR vs CCPA article for an in-depth comparison of the two most prominent data regulations.

Disadvantages of Data Repatriation

Here are the most noticeable challenges and drawbacks of opting for data repatriation:

Unable to assess whether a workload, app, or data set belongs in the cloud or on-site? Our article on on-prem vs cloud hosting helps you pick an optimal environment for your software.

How to Repatriate Data?

Monitor the use of cloud resources and periodically compare those costs to alternative storage methods. If it becomes evident that another type of storage offers a higher ROI for your use case, it's time for data repatriation. This process looks like this:

Looking to pull back public cloud data but concerned the team might have difficulty adjusting to a lack of cloud agility? Enter Bare Metal Cloud—BMC enables you to store data on a dedicated physical server yet manage the environment with cloud-like speed and simplicity.

When to Repatriate Data?

You should repatriate data when it becomes evident that moving away from the public cloud would benefit one (or more) of the following business fronts:

Here are a few common scenarios in which data repatriation is the right business move:

Data repatriation is just one of the rising trends in the cloud world. Learn what else is going on in our article on cloud computing trends.

Don't Fear Data Repatriation and Always Go for the Optimal Storage Solution

Data repatriation is about pursuing new IT opportunities, optimizing spending, and improving app performance. These three factors come before your commitment to the public cloud, so ensure your team always considers hosting alternatives for every database, workload, and service running in the cloud.