At Vinted, our long-term strategy is to build an international marketplace where users can trade between countries.
Initially, we had separate legal entities, core service instances and databases to serve our users in the countries we operate in.
We called them portals. For example, www.vinted.lt and www.kleiderkreisel.de were distinct portals.
To achieve our goal of building an international marketplace, we needed to merge them into a single entity.
This presented several challenges for us at Vinted engineering:
- How to create a seamless flow to help our users migrate
- How to migrate the data between our portals
Migrating data between our portals was an interesting engineering challenge.
In this post, I will share some details on how we did it and what we have learned.
Data migration solution
Figure 1. Abstract diagram of the data migration solution
At its core, the data migration was a three-step process:
- Extract and serialise data from the source portal
- Push data to an intermediary database. We chose MySQL for this purpose. This was the job of the Serializer classes
- Fetch data from the intermediary database, transform it and insert it into the target database. This was the job of the Creator classes.
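To make the flow concrete, here is a minimal sketch of the three-step process. All class names, method names and the in-memory hashes standing in for the three databases are illustrative assumptions, not Vinted's actual implementation:

```ruby
# Stand-ins for the three databases involved (illustrative only).
SOURCE_DB       = { users: [{ id: 1, login: "anna" }] }
INTERMEDIARY_DB = { users: [] } # the Portal Merge Database
TARGET_DB       = { users: [] }

# Steps 1 and 2: extract data from the source portal, serialise it,
# and push it to the intermediary database.
class UserSerializer
  def call
    SOURCE_DB[:users].each do |row|
      INTERMEDIARY_DB[:users] << { source_id: row[:id], payload: row.to_s }
    end
  end
end

# Step 3: fetch from the intermediary database, transform the records,
# and insert them into the target database under newly assigned IDs.
class UserCreator
  def call
    INTERMEDIARY_DB[:users].each do |record|
      TARGET_DB[:users] << { id: next_target_id, migrated_from: record[:source_id] }
    end
  end

  private

  def next_target_id
    TARGET_DB[:users].size + 1
  end
end

UserSerializer.new.call
UserCreator.new.call
```

In the real system these two halves run as background jobs in different portals, with the intermediary database as the only shared state between them.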
The Portal Merge Database is needed because:
- The migration happens asynchronously. Our background jobs fetch and insert data into it.
- It supports migration job retries. We track migration progress for every model being transferred.
- It’s a place to store ID mapping between source and target portals when migrating related data such as user followers, liked items and more.
- It stores other data related to the migration. This solves various edge cases, such as tracking sold migrated items, invalid migration records, etc.
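The ID mapping mentioned above is the key piece when migrating related data: records get new IDs in the target portal, so every foreign key must be translated. The sketch below shows the idea with a hypothetical in-memory mapping; the actual mapping lives as a table in the Portal Merge Database:

```ruby
# Hypothetical ID mapping between source and target portals.
# In production this would be a table in the Portal Merge Database.
class IdMapping
  @mappings = {}

  class << self
    # Record the target-portal ID assigned to a migrated source record.
    def record(model, source_id, target_id)
      @mappings[[model, source_id]] = target_id
    end

    # Translate a source-portal ID to its target-portal counterpart.
    def target_id_for(model, source_id)
      @mappings.fetch([model, source_id])
    end
  end
end

# Two users receive new IDs in the target portal:
IdMapping.record(:user, 101, 1)
IdMapping.record(:user, 202, 2)

# Later, when migrating the follower relation, both sides are remapped:
follow = { follower_id: 202, followed_id: 101 }
migrated_follow = {
  follower_id: IdMapping.target_id_for(:user, follow[:follower_id]),
  followed_id: IdMapping.target_id_for(:user, follow[:followed_id])
}
```

Without this lookup, relations such as followers or liked items would point at source-portal IDs that mean nothing in the target database.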
Why solve these at the application layer?
Because our portals are operated as separate legal entities, we had to ask our users’ permission to migrate their data.
This is due to data privacy laws in the EU. Our users agreed to migrate all or certain parts of their data in a form like this:
Figure 2. The data migration consent form
Migration states
- nil. Default value. Meaningful in the target portal until data serialisation has been completed.
- Pending. Migration object ready to be picked up by the data serialisation job in the source portal and the data migration job in the target portal.
- In progress. Respective job in progress.
- Completed. All user models are successfully transferred to the target portal database.
- Failed. After exhausting the retry limit, a migration is marked as failed.
Moving users between these states allowed us to track how well our system behaved.
Our system was also built to be re-entrant. That means we could run the migration as many times as needed.
In case of failures, we would move a user from Failed to Pending for a restart.
This field was also used to display feedback to our users.
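The state field and its transitions can be sketched as follows. The state names come from the post; the class, the retry limit value and the method names are assumptions made for illustration:

```ruby
# Illustrative sketch of the migration state field and its transitions.
class MigrationStatus
  RETRY_LIMIT = 3 # hypothetical value

  attr_reader :state, :attempts

  def initialize
    @state = nil # default value until serialisation is ready
    @attempts = 0
  end

  def enqueue
    @state = :pending
  end

  def start
    @attempts += 1
    @state = :in_progress
  end

  def complete
    @state = :completed
  end

  # After exhausting the retry limit, the migration is marked as failed;
  # otherwise it goes back to pending for another attempt.
  def fail_attempt
    @state = @attempts >= RETRY_LIMIT ? :failed : :pending
  end

  # Re-entrancy: a failed migration can be moved back to pending
  # and the whole migration re-run from scratch.
  def restart
    @attempts = 0
    @state = :pending
  end
end
```

Modelling the transitions this way keeps retries and manual restarts going through the same Pending state, which is what makes the system safe to re-run.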
We chose to have a shared database between the portals because it was the simplest short-term approach.
As code is shared between the instances, it saved us precious development time so we could get this solution to production as soon as possible.
For example, we did not need to build an API and HTTP clients for fetching and posting data.
This tightly coupled our core service instances and introduced a problem when migrating the database schema: multiple core instances would try to run migration scripts on the same database and crash.
To avoid this problem, we patched our migrations to run only on the target portal production and staging environments:
ActiveSupport.on_load(:active_record) do
  if Rails.env.production?
    module SharedDbPatch
      def with_connection_of(model)
        return if model.is_a?(SharedDbModel) && %i(target target_sandbox).exclude?(PORTAL_IDENTIFIER)

        super
      end
    end

    ActiveRecord::Migration.prepend(SharedDbPatch)
  end
end
Access control for this database was based on the migration statuses belonging to each core instance.