What is Data Warehouse Testing?

Data warehouse testing is the process of inspecting and verifying the integrity of data held in a data warehouse or similar storage system. The goal is to confirm that the data has not been corrupted and remains complete and retrievable whenever it is needed. Regular testing makes it possible to identify developing issues and correct them before the stored data becomes so corrupted that it can only be partially reconstructed through a data recovery process.

In many ways, data warehouse testing is similar to any testing done to ensure the integrity of information stored on a computer hard drive or a remote storage device. The data in the warehouse is systematically checked by a software program that reads each file or other data source to make sure it remains fully intact and accessible. Some data warehouse testing software can correct a limited range of errors as part of the testing process. Other programs simply compile a list of the exceptions, allowing the user to evaluate each one individually before any action is taken.
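As a rough illustration of the "read each file and list the exceptions" approach, the following Python sketch tries to read every file it is given and reports the ones that are empty or cannot be opened. The function name scan_files, the file names, and the two exception labels are hypothetical and chosen only for this example; real testing software applies far more checks.

```python
import tempfile
from pathlib import Path


def scan_files(paths):
    """Try to read every file; return a list of (file name, problem) pairs.

    Hypothetical helper for illustration: it only reports problems and
    never attempts a repair.
    """
    exceptions = []
    for path in paths:
        path = Path(path)
        try:
            data = path.read_bytes()
        except OSError as err:
            # The file is missing or cannot be opened.
            exceptions.append((path.name, f"unreadable: {type(err).__name__}"))
            continue
        if not data:
            # The file opens but contains nothing.
            exceptions.append((path.name, "empty"))
    return exceptions


# Demo with made-up files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    good = Path(folder) / "sales.csv"
    good.write_text("id,amount\n1,100\n")
    empty = Path(folder) / "returns.csv"
    empty.write_text("")
    missing = Path(folder) / "refunds.csv"  # never created on purpose
    report = scan_files([good, empty, missing])

for name, problem in report:
    print(name, "->", problem)
```

This version only compiles the listing and leaves every decision to the user, which corresponds to the second kind of software described above.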

Data warehouse testing usually uses a system-triggered model built around the process known as ETL, or extract, transform, load, which moves data from its sources into the warehouse. The idea is to compare the current condition of the data with its condition when it was first warehoused. If any errors are identified, the data is flagged for further review. In most cases, the errors or exceptions are minor and can be repaired with relatively little effort, either by protocols built into the testing software or through review by an analyst, who can approve the repair or dismiss the exception as not really being corruption.
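One common way to compare the current data with its condition at load time is to store a checksum for each record when it is warehoused and recompute it later. The Python sketch below assumes each record is a dictionary with an "id" field; the function names snapshot and flag_exceptions and the sample values are hypothetical and not taken from any particular testing product.

```python
import hashlib


def row_checksum(row):
    """Return a SHA-256 digest of a row's fields in a stable key order."""
    canonical = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def snapshot(rows, key="id"):
    """Map each record id to its checksum, as recorded at load time."""
    return {row[key]: row_checksum(row) for row in rows}


def flag_exceptions(baseline, current_rows, key="id"):
    """Compare current rows with the baseline and flag any differences."""
    current = snapshot(current_rows, key)
    flagged = []
    for record_id, digest in baseline.items():
        if record_id not in current:
            flagged.append((record_id, "missing"))
        elif current[record_id] != digest:
            flagged.append((record_id, "changed"))
    return flagged


# Data as it looked when first warehoused (made-up values).
loaded = [{"id": 1, "amount": 100}, {"id": 2, "amount": 250}, {"id": 3, "amount": 75}]
baseline = snapshot(loaded)

# Data as it looks at test time: record 2 was altered, record 3 is gone.
later = [{"id": 1, "amount": 100}, {"id": 2, "amount": 999}]
exceptions = flag_exceptions(baseline, later)
print(exceptions)  # [(2, 'changed'), (3, 'missing')]
```

The flagged records are only marked for review here. Whether a change is genuine corruption or a legitimate update is the judgment an analyst would still have to make.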

The basic process of data warehouse testing is much like testing any electronic transaction of information. The information is examined in blocks or cells, and each block is either cleared or noted for any exceptions the software has identified before the test moves on to the next block. Once the process is complete, a recap of the testing is compiled, including the types of exceptions found and whether those exceptions were corrected during the testing or are waiting for manual review. As with any type of system testing, it is a good idea to conduct data warehouse testing on a regular basis to ensure the information remains complete and free of corruption.
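The block-by-block pass and the closing recap can be sketched in Python as follows. This is a minimal example under stated assumptions: the two checks (a missing amount, or an amount stored as text), the block size, and the names check_record, try_repair, and check_in_blocks are all hypothetical.

```python
from collections import Counter


def check_record(record):
    """Return an exception type for a record, or None if it is clean."""
    if record.get("amount") is None:
        return "missing_value"
    if isinstance(record["amount"], str):
        return "wrong_type"
    return None


def try_repair(record, kind):
    """Repair what can be fixed automatically; return True on success."""
    if kind == "wrong_type":
        try:
            record["amount"] = float(record["amount"])
            return True
        except ValueError:
            return False
    return False


def check_in_blocks(records, block_size=2):
    """Examine records one block at a time and compile a recap."""
    recap = {"blocks": 0, "found": Counter(), "corrected": 0, "pending_review": []}
    for start in range(0, len(records), block_size):
        block = records[start:start + block_size]
        recap["blocks"] += 1
        for record in block:
            kind = check_record(record)
            if kind is None:
                continue  # record is cleared
            recap["found"][kind] += 1
            if try_repair(record, kind):
                recap["corrected"] += 1
            else:
                recap["pending_review"].append(record["id"])
    return recap


# Made-up sample data: two records need attention that cannot be automated.
records = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": "12.5"},
    {"id": 3, "amount": None},
    {"id": 4, "amount": "n/a"},
    {"id": 5, "amount": 7.25},
]
recap = check_in_blocks(records)
print(recap)
```

With this sample data, the recap shows three blocks examined, one exception corrected automatically (record 2), and records 3 and 4 left waiting for manual review.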