This post was extensively updated and incorporated into my new book, The Data Conversion Cycle: A guide to migrating transactions and other records for system implementation teams. Now available on Amazon.com in both Kindle format for $4.49 and paperback for $6.99.
If you are replacing a legacy system with a bright, shiny new system, one question almost always come up – what do we do with our historical data? It’s a pretty obvious decision to move over transactions in progress, or recent transactions that will be used in processing future transactions. But what about the other records in the system that won’t be used in future processing, but aren’t “old enough” to simply delete, based on your record retention schedule? What is the rationale to apply for converting, retaining, or otherwise dealing with historical data?
First, consider the possible use cases for these records. Auditing, where you might be spot checking for compliance with processes, or for accuracy? Internal checks on single records, in response to an executive request for information on a transaction? External requests, possibly in response to a subpoena or a request from a regulatory agency? Is it possible someone might need to conduct queries of a class of records, or perform a statistical analysis? For each use case, consider the frequency of the usage events, the required response time, and how much money your organization is willing to spend to provide that service.
Second, ask who has a legitimate need to access the records, and under what circumstances? The obvious answer is, “Whoever has access to them in the legacy system,” but that might not be the case. Is your operations model changing, along with the system? Are you changing business processes, moving to a shared services or other new model? It might be that the historical records in question can (or should) be accessed separately from current records. If so, consider the incremental costs associated with converting the records for storage in the new system, as opposed to storing them off-line, or in the legacy system, in read-only mode.
Third, determine the costs associated with retaining the old application in a production environment, including server, license, and other support costs. If you are replacing the old system at least in part in order to get the operating costs off your budget, it might not be feasible to leave the data in place. What are the costs you’re will to accept, as part of the project, to eliminate the need to keep the old system “plugged in?”
Finally, consider storage alternatives to either leaving them in place or migrating them to the new system. If you don’t expect to conduct broad queries in support of statistical or other analyses, do you even need them to be stored in an indexed, tabular form? Can you simply export the data you need to retain as reports, and preserve them in PDF or another searchable format stored on a file server? If you do need it in a form that can be queried using SQL, can you store the critical records in a less costly system, like de-normalized reporting tables in a database on a production server, outside the old application? Or even in Access? If you expect to need the records for statistical analysis, would it make more sense to simply import them into your analysis package?
Note that you might not have to move the data to the “final resting place” at the same time you move the new system into production. It is fairly common for legacy systems and their data to be left on line in “read-only” mode during the initial few days or weeks after cutover, to support “sanity checks” of the current records. In any case, be sure you include the disposition of the legacy system and the data that won’t be converted in your project plan, so that you don’t leave a mess for the users and the production support folks.