I’m going to be working on a BIG data migration project this year and I want to learn from the community. I have worked on a similar project before when we moved from a data center to the cloud; however, it was a while ago!
I had a good read of some articles on data migration here on MoT. I’m more wondering about your experience - how did it go? What were the challenges and tooling you used? Is there anything you would do differently when testing a data migration?
Thanks in advance and looking forward to your responses.
In my experience it can be really time consuming to set up the basis from where your migration starts. Having one or multiple images that can be applied before you start your migration test might save a lot of time.
Of course - like always - it depends totally on your circumstances.
I’m currently also on a database migration project. Luckily we can start the new system mostly from scratch. The product is about processing a lot of data every day with nearly no history.
Therefore the data migration itself isn’t a big problem.
As our customers partly develop some SQL of their own, we sometimes say that our product is an advanced database editor. Just to give you an idea of what it is about.
Which leads to my focus and big learnings of the last months:
It’s one thing to migrate data; it is another thing to ensure that the product still works as before (or we find differences we have to accept, but must communicate).
We have a lot of SQL which needs to be converted.
One risk is so-called “false friends”: functions which are written identically on both systems but produce different results.
Another is that Oracle automatically converts empty strings into NULL, while the other database treats them differently.
I mostly test by processing the same input data (multiple GBs) on an old and a new system and comparing the output.
I had to spend a lot of effort on getting the data into a comparable format. I dump tables into CSV files and compare them. For easy cases it is sufficient to do this with the normal DB tool (but dump both DBs with the same tool so that the same sorting is applied).
Partly I develop my own tools for dumping and comparing the data. Especially the sorting is important and not that easy to do by hand (X000 tables).
My generic approach is to first sort a table by column names (a->z) and then by rows (each as a concatenated string), after removing generic IDs.
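A rough sketch of that normalise-and-compare step, in Python, assuming the tables are already dumped to CSV with a header row; the ID column names and file names are placeholders:

```python
import csv

# Columns treated as generic surrogate IDs and ignored in the comparison
# (placeholder names).
ID_COLUMNS = {"id", "created_by_id"}

def normalise(path):
    """Read a CSV dump, drop ID columns, sort columns a->z, then sort rows."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = sorted(c for c in reader.fieldnames if c not in ID_COLUMNS)
        # "or ''" also flattens the empty-string vs NULL difference mentioned above.
        rows = sorted("|".join(row[c] or "" for c in columns) for row in reader)
    return columns, rows

def compare(old_path, new_path):
    old_cols, old_rows = normalise(old_path)
    new_cols, new_rows = normalise(new_path)
    if old_cols != new_cols:
        print("column mismatch:", sorted(set(old_cols) ^ set(new_cols)))
    for row in sorted(set(old_rows) - set(new_rows)):
        print("only in old system:", row)
    for row in sorted(set(new_rows) - set(old_rows)):
        print("only in new system:", row)

# Example usage (hypothetical dump file names):
# compare("customers_old.csv", "customers_new.csv")
```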
From time to time I had to transfer changes from the new system back to the old one to keep the data comparable, e.g. renaming of columns.
To get theoretically the same data on both systems you need to know and execute all dependencies in the same order.
For some parts I first had to explore and document features more deeply than before, to be sure that I do the exact same execution on both systems.
Also I sometimes accidentally executed a wrong command, screwed up one of my test systems and needed hours to get back to the same point.
I do everything on my powerful laptop. Our test servers are way too weak for that.
I can check out every branch, build, deploy and migrate locally.
Is there a code change happening at the same time, beyond what’s needed to support the data migration? I.e. will there be new features as well as data in a new home?
Is it possible to run the pre-migration state and the post-migration state in parallel for a transition period? I.e. you have two copies of the system running in parallel (pre and post-migration). You keep them in sync e.g. all requests get copied to both. You start with the old system being the system of record, and then once you have confidence in the new one it becomes the system of record.
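As a rough sketch of that dual-write idea (the client objects and their save() method here are placeholders, not a real API):

```python
import logging

log = logging.getLogger("dual_write")

class DualWriter:
    """Send every write to both systems; the old system stays the system of record."""

    def __init__(self, old_client, new_client):
        self.old = old_client
        self.new = new_client

    def save(self, record):
        old_result = self.old.save(record)  # system of record: its result is what callers see
        try:
            new_result = self.new.save(record)
            if new_result != old_result:
                log.warning("divergence for %r: old=%r new=%r", record, old_result, new_result)
        except Exception:
            log.exception("new system rejected %r", record)
        return old_result
```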
A wrinkle on this is that instead of switching the whole thing over in one go, you migrate the data piecemeal, e.g. by batches of customers or whatever makes sense for your world. The new system becomes the system of record for the migrated customers, while the old system remains it for the customers that haven’t been migrated yet.
Instead of or as well as some kind of functional way of splitting data into batches (e.g. by customer) you can split by time. I.e. you start by migrating only the most recent data plus any reference data needed for that to make sense, and then bring the older data gradually. E.g. you first bring in recent customer orders, plus the data about all products as they’re referred to by the orders. Then you bring in older orders.
A nasty issue is integrations with third parties e.g. payment systems. It’s often well worth the effort to test the new system with stub integrations as early as possible.
I realise that most of this is about the migration itself rather than testing it, but I see them as all ways of mitigating risk. I.e. you do the migration in a way that reduces risk e.g. old and new systems in parallel, and you do testing on top to confirm things are going to plan. The risk reducing steps are likely to make the testing smaller or simpler in some way - e.g. you have old and new to compare, or you have only new data migrated across to start with so you have a smaller system to test etc.
I had a series of semi-planned articles on NFR testing, and this was a topic. Never got around to it - although ironically I scrum mastered a big data migration project.
I’m assuming this is just moving from one DB to another in an identical form. It gets a little more tricky if you’re talking about a transform (sometimes it’s not just moving the data, but reshaping it as it moves).
Here is an email I got a while ago from a colleague who was doing something similar to you:
Hi Mike
Was having a think about what you might want to cover in the testing and outlined it below. This is assuming that it’s a simple lift and shift operation, as in no filtering/data conversions etc
Target table testing
This can be done once the target table has been created and before the migration has been done
Check all fields match the spec (or the source table if not spec’d). For this check (see also the sketch after this list):
All columns are present
All columns have the correct data type, e.g. VarChar, nVarChar etc
Column lengths match
Null/Not Null values match
Primary and Foreign keys match
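If you want to automate those column checks, something along these lines could work, assuming both databases expose the ANSI information_schema and you have DB-API connections (e.g. psycopg2; the placeholder style and the exact type names differ per database, so a type mapping may be needed):

```python
# Compare the column definitions of a table between source and target.
COLUMN_QUERY = """
    SELECT column_name, data_type, character_maximum_length, is_nullable
    FROM information_schema.columns
    WHERE table_name = %s
"""

def column_definitions(conn, table):
    with conn.cursor() as cur:
        cur.execute(COLUMN_QUERY, (table,))
        return {name: (dtype, length, nullable)
                for name, dtype, length, nullable in cur.fetchall()}

def compare_table(source_conn, target_conn, table):
    source = column_definitions(source_conn, table)
    target = column_definitions(target_conn, table)
    for column in sorted(set(source) | set(target)):
        if column not in target:
            print(f"{table}.{column}: missing in target")
        elif column not in source:
            print(f"{table}.{column}: extra column in target")
        elif source[column] != target[column]:
            print(f"{table}.{column}: source {source[column]} vs target {target[column]}")
    # Primary and foreign keys can be compared the same way via
    # information_schema.table_constraints and key_column_usage.
```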
Analyse the source data
Again, this can be done before any migration has been done. The idea here is basically to look for any data that may be risky. Given that you said it is a user table, the type of thing to look for may be hyphens, apostrophes, macrons etc. Create a test case to verify these after migration
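A possible sketch for profiling the source table for such risky values (table and column names are supplied by the caller; the character set is just a starting point):

```python
import re

# Characters that tend to cause trouble in migrations: apostrophes, hyphens
# and anything non-ASCII (macrons, umlauts, etc.). Extend as needed.
RISKY = re.compile(r"[^\x00-\x7F]|['\-]")

def risky_values(conn, table, columns):
    """Yield source rows containing characters worth a dedicated test case."""
    with conn.cursor() as cur:
        cur.execute(f"SELECT {', '.join(columns)} FROM {table}")
        for row in cur.fetchall():
            if any(value and RISKY.search(str(value)) for value in row):
                yield row
```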
Reconciliation
Do a total reconciliation
Do a partial recon, e.g. Select count (*) from firstname where name = ‘Alan’
Do a fuzzy recon, e.g. Select count (*) from firstname where name like ‘Al%’
Do a recon on multiple fields, e.g. Select count (*) from firstname where name = ‘Alan’ and birthyear = ‘1980’
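Those reconciliation queries are easy to run against both databases and compare automatically; a sketch, with table and column names as placeholders:

```python
# Each query is run against source and target and the counts are compared.
RECON_QUERIES = {
    "total":   "SELECT count(*) FROM users",
    "partial": "SELECT count(*) FROM users WHERE firstname = 'Alan'",
    "fuzzy":   "SELECT count(*) FROM users WHERE firstname LIKE 'Al%'",
    "multi":   "SELECT count(*) FROM users WHERE firstname = 'Alan' AND birthyear = 1980",
}

def reconcile(source_conn, target_conn):
    for name, query in RECON_QUERIES.items():
        counts = []
        for conn in (source_conn, target_conn):
            with conn.cursor() as cur:
                cur.execute(query)
                counts.append(cur.fetchone()[0])
        status = "OK" if counts[0] == counts[1] else "MISMATCH"
        print(f"{name}: source={counts[0]} target={counts[1]} {status}")
```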
Verify some of the records migrated
This ensures that the mapping is correct. Best to take a record with all fields populated in the source. The idea here is to check for things like first name being mapped to firstname and not lastname. Dates are another one to watch out for
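A sketch of such a field-by-field spot check for one record, assuming a shared business key (here an email address; the field names are placeholders) exists on both sides:

```python
FIELDS = ["firstname", "lastname", "birthdate", "email"]

def spot_check(source_conn, target_conn, email):
    """Compare one fully populated record field by field across both systems."""
    query = f"SELECT {', '.join(FIELDS)} FROM users WHERE email = %s"
    records = {}
    for label, conn in (("source", source_conn), ("target", target_conn)):
        with conn.cursor() as cur:
            cur.execute(query, (email,))
            records[label] = dict(zip(FIELDS, cur.fetchone()))
    for field in FIELDS:
        if records["source"][field] != records["target"][field]:
            print(f"{field}: source={records['source'][field]!r} "
                  f"target={records['target'][field]!r}")
```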
Functional testing
Verify that the system behaves OK with the migrated data. Assuming it’s a user table, log in, search for a user, update the user and save. Looking in the DB, find someone that has all UI display fields migrated and check they display as expected on the UI. (Be aware these details may appear on multiple UIs)
Functional testing is a must. Even if the data is not converted, then configuration files or other environmental factors can lead to problems.
For one big migration I did a test of the backup and restore procedure after a talk with the system administrator. If something goes wrong during the backup, is the data still usable?
In the past I noticed that the planning of the actual migration was underestimated:
how long does it take to move the data from the source environment to the target environment? Did you measure it?
at which points of the migration, which tests take place?
are rollback procedures in place in case of a failed migration?
when and how are users notified of any disruptions in the services?
Etcetera
During two big data migrations I made a deployment plan with people in different roles. It was a series of one-on-one talks. For a web site conversion and upgrades of interacting components I organised a big meeting with all involved parties. Here are some blog posts about having a meeting with more than 10 people:
There are probably other more interesting risks. Data warehouse migration is an OPS problem in my mind, but things like a full system account migration/upgrade that we are going to do soon where I work will rewrite every record in the system over the course of a year. So doing it incrementally is a huge relief to all.
It’s a huge dataset, and well, I think we all know we have broken and corrupted data in our databases. Surely a test that merely verifies broken data replicates to another database will just cost effort if the broken data is not sanitized first?
I suppose this covers both implementation and test. I only ever do big bang. It’s most annoying (I never check my blood pressure either side of a migration).
On the technical side, lots of summing, totalling, cross checking, massaging (where there are differences in the number of decimal places, for example), extracting and diffing. Be aware of static (e.g. a service class) and dynamic (e.g. a customer record with changing balances) data. Check if the client is amenable to a data freeze of the static data prior to the full migration (and get that out of the way beforehand).
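For the summing and cross checking, a small sketch that compares column totals rounded to a common number of decimal places (table and column names are placeholders):

```python
from decimal import ROUND_HALF_UP, Decimal

def column_total(conn, table, column):
    with conn.cursor() as cur:
        cur.execute(f"SELECT COALESCE(SUM({column}), 0) FROM {table}")
        return Decimal(str(cur.fetchone()[0]))

def totals_match(old_conn, new_conn, table, column, places=2):
    """Compare column totals, rounded to the coarser precision of the two systems."""
    quantum = Decimal(10) ** -places
    old_total = column_total(old_conn, table, column).quantize(quantum, ROUND_HALF_UP)
    new_total = column_total(new_conn, table, column).quantize(quantum, ROUND_HALF_UP)
    if old_total != new_total:
        print(f"{table}.{column}: old={old_total} new={new_total}")
    return old_total == new_total
```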
Ideally on a full copy of the client’s data. Otherwise a good, controlled subset is vital. The tester should be deeply familiar with dealing with metadata and all the little command line tools for slicing and dicing data.
If the client has data that is deleted after a period (e.g. 30 days worth of client transactions), it is also worth checking if the client is happy to access these on the old system. It’s potentially quite a lot less work to implement and test.
In general, the bigger the data set you can get your hands on (preferably a full copy of the client’s data, if feasible*), the better, since you prove the robustness of the process, learn the timings and know exactly what you’re dealing with once at the very pointy end of the job.
Oh, and from a test perspective you can’t be paranoid enough. Sorry.
That really does depend on scale though. There are terabytes. Then there are petabytes… Then there are the myriad of privacy laws and regulations to work around.
I’m in the midst of trying to lead non-test professionals (business analysts & users) through a total data migration from one CRM system to another. There have been two test runs of the conversion scripts so far (which are being written by consultants). It’s going badly in the sense that the second test migration had more errors in it than the first.
That means another test run of the migration has to happen before we get to the go-live, and I’ve been asked, “isn’t there a better way to run these conversion tests?” Well, we were doing a partial reconciliation based on the UI, because we have non-technical folks to do the testing (not SQL users). Then we added in functional tests to start to check out workflows. But there are so many errors, just the bug writing is bogging everything down. I don’t really know how to come up with a magical “better way” to test the migration. Any suggestions?