This post was extensively updated and incorporated into my new book, The Data Conversion Cycle: A guide to migrating transactions and other records for system implementation teams. Now available on Amazon.com in both Kindle format for $4.49 and paperback for $6.99.
Last week, I continued this series on the data conversion cycle with a look at how to develop legacy system extraction processes. This week, we’ll look at actually executing those processes, and then loading the resulting files into the target system.
Once the extraction and translation processes have been developed and unit tested, and the sequence of execution scripted, you are ready to create the load files. First, however, determine whether this load will require a refresh of the source from production. Remember: always extract data from a static copy of production, taken at the correct point in the production cycle. In most cases, it matters which activities are in process. You don’t want to extract payroll data from a copy taken in the middle of a payroll calculation, for example.
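The refresh decision can be reduced to a simple pre-flight check. The sketch below is illustrative only: the age threshold, and the idea that in-flight batch activity is countable at snapshot time, are assumptions about your environment, not features of any particular system.

```python
from datetime import datetime, timedelta

def snapshot_is_usable(snapshot_taken, max_age_days=7, in_flight_batches=0):
    """Pre-flight check before extraction (names and threshold are assumptions).

    A snapshot is usable only if it is recent enough for this conversion
    cycle AND no long-running activity (e.g. a payroll calculation) was in
    process when the copy was taken.
    """
    fresh = datetime.now() - snapshot_taken <= timedelta(days=max_age_days)
    quiescent = in_flight_batches == 0
    return fresh and quiescent

# A 3-day-old, quiescent snapshot passes; a stale or busy one forces a refresh.
ok = snapshot_is_usable(datetime.now() - timedelta(days=3))
stale = snapshot_is_usable(datetime.now() - timedelta(days=10))
busy = snapshot_is_usable(datetime.now() - timedelta(days=1), in_flight_batches=2)
```

If either test fails, request a fresh copy from production before running the extraction script.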
During execution of the extraction script, note any exceptions, error conditions, or failures. It may be necessary to halt the process, correct the problem, and start over. If so, be sure to alert the developers of the extraction and conversion processes and, if corrections to the data are required, the people responsible for maintaining the data in the production system. This feedback is how lessons learned in one cycle are implemented in the next. As previously noted, it is important to maintain an appropriate level of security for the data records, both in transit and at rest. Don’t just email files to the next person in the chain! This includes any exception or error reports. Just because the information is no longer in the system of record doesn’t mean it isn’t subject to the same controls.
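A halt-on-first-failure runner makes the "note exceptions, stop, correct, restart" discipline concrete. This is a minimal sketch, assuming each extraction step can be wrapped as a callable; the step names and the failing step are invented for illustration.

```python
def run_extraction_steps(steps):
    """Run each extraction step in order. `steps` is an ordered list of
    (name, callable). On the first failure, stop and report which step
    failed so the problem can be corrected and the script restarted from
    a clean state. Returns (completed_step_names, failure_or_None)."""
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            return completed, (name, str(exc))  # halt; alert the developers
        completed.append(name)
    return completed, None

def extract_payroll():  # stand-in for a real extraction step
    raise ValueError("pay period still open")

done, failure = run_extraction_steps([
    ("employees", lambda: None),   # succeeds
    ("payroll", extract_payroll),  # fails; execution halts here
    ("benefits", lambda: None),    # never runs
])
```

The returned failure tuple is exactly what goes into the feedback loop: which step broke, and why, so the next cycle can prevent it.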
Once the extraction and conversion process is complete, transfer control of the resulting load files to the team members responsible for running the loading tool. Note that the loading process will typically require a certain sequence to be followed, and it won’t necessarily be the sequence in which the load files were created. Consequently, while it might be tempting to begin loading as soon as possible, keep in mind that rework by the extraction team might make it impossible to complete the load. Especially in the first conversion cycle, be alert to differences in dependencies.
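Load-file sequencing is a dependency-ordering problem, which a topological sort handles directly. The file names and dependencies below are assumptions for illustration; the point is that order is derived from dependencies, not from extraction order.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each entry maps a load file to the files that must be loaded before it,
# e.g. customers must exist before the invoices that reference them.
dependencies = {
    "customers": set(),
    "items": set(),
    "invoices": {"customers", "items"},
    "payments": {"invoices"},
}

load_order = list(TopologicalSorter(dependencies).static_order())
```

Any valid order places customers and items before invoices, and invoices before payments, regardless of when those files came out of the extraction process.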
Typically, the loading tool will validate the records to be loaded against other records and configurations already in the system. You should have a protocol in place for handling these exceptions, updated for each conversion cycle. Any manual corrections to the load files need to be communicated back to whoever is responsible for preventing the error in the next cycle: source production system owners, extraction process developers, or the configuration team. If some records can’t be loaded, determine the impact of the failure and how to proceed with subsequent load files. You should also continuously track and report loading progress. Note execution and load times and the number of records loaded, to facilitate planning of subsequent loads.
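Capturing per-file timing and record counts can be as simple as a wrapper around each load-tool invocation. A minimal sketch, assuming the tool exposes some callable interface that reports loaded and rejected counts; `fake_load` and the field names are placeholders.

```python
import time

def timed_load(file_name, records, load_fn):
    """Wrap one load-tool invocation, capturing elapsed time and record
    counts so subsequent loads (and the final cutover) can be planned from
    measured numbers. `load_fn` stands in for whatever interface the real
    loading tool provides; it must return (loaded_count, rejected_count)."""
    start = time.perf_counter()
    loaded, rejected = load_fn(records)
    return {"file": file_name, "loaded": loaded, "rejected": rejected,
            "seconds": round(time.perf_counter() - start, 2)}

# Illustrative stand-in: accept everything except records missing an id.
def fake_load(records):
    good = [r for r in records if "id" in r]
    return len(good), len(records) - len(good)

stats = timed_load("customers.csv", [{"id": 1}, {"id": 2}, {}], fake_load)
```

Rejected counts feed the exception protocol above; elapsed times feed the schedule for the next cycle.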
At the conclusion of the loading script, ensure appropriate audit reports are run to identify issues such as orphaned, missing, or incomplete records. Note that this is different from validating the load: the goal here is assessing readiness for processing, while load validation addresses the completeness and accuracy of the records loaded into the system, as compared to their representation in the source system. I’ll address the validation process next week. In the meantime, if you have lessons learned from past data conversion projects, please share them in a comment below.
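An orphaned-record audit, for example, is just a check for child records whose foreign key matches no parent. The record shapes and field names here are assumptions for illustration; in practice this would be a query against the target system.

```python
def find_orphans(children, parents, fk):
    """Audit sketch: return child records whose foreign key has no
    matching parent record. Field names are illustrative."""
    parent_ids = {p["id"] for p in parents}
    return [c for c in children if c[fk] not in parent_ids]

invoices = [{"id": 1, "customer_id": 10},
            {"id": 2, "customer_id": 99}]  # customer 99 was never loaded
customers = [{"id": 10}]

orphans = find_orphans(invoices, customers, "customer_id")  # invoice 2 is orphaned
```

A non-empty result means the system is not ready for processing, even if every individual file reported a successful load.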