This post was extensively updated and incorporated into my new book, The Data Conversion Cycle: A guide to migrating transactions and other records for system implementation teams. Now available on Amazon.com in both Kindle format for $4.49 and paperback for $6.99.
This is the last article in the series on the data conversion cycle. As is frequently noted, you can’t manage what you don’t measure. However, as both the Hawthorne experiments and Werner Heisenberg found in the 1920s, the act of measuring a phenomenon influences the object under observation. The trick, then, is to measure carefully, so that any influence your measurement has is at least neutral, and preferably desirable. Consequently, I’m going to close out this (admittedly interminable) series on the data conversion cycle with considerations for assessing data conversion process quality, as the team “learns how to move data.”
As previously noted, a beneficial side effect of an iterative approach to data conversion is that the team eventually gets good at it. But what constitutes “goodness?” For most projects, “good” would be defined as error-free, fast, and predictable. The trick is expressing those attributes in such a way as to make them measurable, without driving one at the expense of the others. To that end:
- Error rate: the number of corrections to be made in the target system subsequent to the load, divided by the number of records loaded. This ignores “learning” errors in the mapping or extraction processes in order to concentrate on outcomes.
- Extraction time: total elapsed time (as opposed to work hours) from the copy of the source system, through the extraction and formatting of records, to the transfer of the extracted data to the load team.
- Load time: total time to load the formatted records to the target system.
- Validation time: total time required for validation of the load.
- Predictability: the sum of Extraction time, Load time, and Validation time, divided by the predicted time for all three. A value of 1.0 means the process is perfectly predictable; deviations from 1.0 indicate the degree of uncertainty.
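These definitions are straightforward to compute. As a minimal sketch (the function names and sample figures below are hypothetical, not drawn from any actual project):

```python
def error_rate(corrections, records_loaded):
    """Post-load corrections required in the target system, per record loaded."""
    return corrections / records_loaded

def predictability(extraction_hrs, load_hrs, validation_hrs, predicted_hrs):
    """Actual elapsed time over predicted time; 1.0 means perfectly predictable."""
    return (extraction_hrs + load_hrs + validation_hrs) / predicted_hrs

# Hypothetical cycle: 12 corrections across 4,000 records loaded;
# 14 + 6 + 10 = 30 elapsed hours against a 25-hour prediction.
print(error_rate(12, 4000))           # 0.003
print(predictability(14, 6, 10, 25))  # 1.2
```

Note that a predictability of 1.2 (a 20% overrun) and one of 0.8 (a 20% underrun) are equally far from 1.0 — both represent uncertainty in the schedule, not just lateness.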
Plainly, the error rate is critical to the users of the system, as they will have to make any needed corrections. Also, the more time it takes for the extraction, load and validation, the longer the users will be unable to enter transactions, and the more transactions will accumulate for entry once the target system is finally available to the users. But predictability is vital to both the users and the conversion team, as a tight, accurate cutover schedule is in everyone’s best interest. The ability to minimize the unknowns (read: risks) in the cutover to production is largely a function of the predictability of the process.
Tracking these metrics in each cycle will give the project team the ability to measure improvements, and will also guide decision making on where to expend resources. On most projects, improvements in the validation processes will reduce validation time, with the side benefit of improving predictability. Driving automation of the extraction processes will usually produce the same benefits, frequently with a reduced error rate as well. But in order to get the best return on investment, it is useful to analyze the metrics from each conversion, so that efforts to reduce the error rate don’t increase the extraction and load times more than necessary. Measurements allow for trade-offs, so you don’t go past the point of diminishing returns on any one metric.
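The cycle-over-cycle tracking described above can be as simple as a small log reviewed after each conversion. A sketch of the idea (all figures are illustrative, not real project data):

```python
# Hypothetical metric log for three conversion cycles.
cycles = [
    {"cycle": 1, "error_rate": 0.012, "actual_hrs": 42, "predicted_hrs": 30},
    {"cycle": 2, "error_rate": 0.007, "actual_hrs": 34, "predicted_hrs": 32},
    {"cycle": 3, "error_rate": 0.003, "actual_hrs": 30, "predicted_hrs": 29},
]

# Report each cycle's error rate and predictability, so the team can see
# whether both are trending toward their targets (0.0 and 1.0 respectively).
for c in cycles:
    pred = c["actual_hrs"] / c["predicted_hrs"]
    print(f"Cycle {c['cycle']}: error rate {c['error_rate']:.3f}, "
          f"predictability {pred:.2f}")
```

Laying the numbers out this way makes the trade-offs visible: if cycle 4 cut the error rate further but pushed predictability back up, that would be the signal to stop optimizing one metric at the expense of the other.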
Thanks for reading through all of these posts over the last two months. As previously mentioned, I plan to consolidate these posts into a Kindle book. Special thanks to Samad Aidane for the “blog a book” idea!