What Is Data Validation?

Data validation is the process of checking the data in a program or database to ensure that they are accurate, consistently formatted, and conform to the expected rules. The applicable rules depend on the programming language in use and on the type of information collected and stored. The process can be simple or very complex, and a variety of tests are available to carry it out. If data are not regularly validated, security problems can follow, because malformed or non-standard data are easier for attackers to exploit.

There is no universal standard for data validation; the standard depends on what information is being validated. For example, some programming languages allow underscores in names to join several words in place of spaces, while others do not. Because of such differences, validation rules must be specific to the data and to the environment in which they are used; otherwise, applying the wrong rules can reject valid data or let inconsistent data through.

Data validation can be simple or complex. A simple procedure would be checking a database of phone numbers to ensure that no entry contains letters or non-standard symbols, such as a percent sign or a dollar sign. More complex procedures check that programs reference the correct files and that the program contains no corrupted code.
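The simple phone number check described above can be sketched in Python. This is only an illustration under stated assumptions: the function name is_valid_phone, the set of allowed separator characters, and the 7 to 15 digit limits are hypothetical choices, not a standard, and real phone formats vary by country.

```python
import re

# Assumed rule (hypothetical): an optional leading "+", then only digits,
# spaces, parentheses, dots, and hyphens. Letters and symbols such as
# "%" or "$" are rejected.
ALLOWED_CHARS = re.compile(r"\+?[0-9 ().-]+")


def is_valid_phone(value: str, min_digits: int = 7, max_digits: int = 15) -> bool:
    """Return True if value contains only allowed characters and a plausible digit count."""
    if not ALLOWED_CHARS.fullmatch(value):
        return False
    digit_count = sum(ch.isdigit() for ch in value)
    return min_digits <= digit_count <= max_digits


if __name__ == "__main__":
    for entry in ["555-867-5309", "555-867-53O9", "$555 1234", "12"]:
        print(entry, is_valid_phone(entry))
```

An entry containing the letter "O" in place of a zero, or a dollar sign, fails the character check, while an entry with too few digits fails the length check.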

Many tests can be used to validate data, and most are handled by a validation program. For example, a consistency check ensures that all records follow the same structure; if a record is supposed to have a name followed by a phone number, the check confirms that every record follows this order. Limit and range checks examine numeric values in the program or database and ensure that each falls between the permitted minimum and maximum. For databases and programs that cannot contain redundant data, a uniqueness check makes sure that each record is unique.
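The three kinds of checks named above might look like the following in Python. This is a minimal sketch rather than how any particular validation program works: the record layout (name, phone, age), the 0 to 120 age range, the choice of phone as the unique key, and all function names are assumptions made for illustration.

```python
from collections import Counter

# Assumed layout (hypothetical): every record has exactly these fields, in this order.
EXPECTED_FIELDS = ("name", "phone", "age")


def consistency_errors(records):
    """Return indexes of records whose field names or order differ from EXPECTED_FIELDS."""
    return [i for i, rec in enumerate(records) if tuple(rec.keys()) != EXPECTED_FIELDS]


def range_errors(records, field="age", low=0, high=120):
    """Return indexes of records whose field is non-numeric, missing, or outside [low, high]."""
    bad = []
    for i, rec in enumerate(records):
        value = rec.get(field)
        if not isinstance(value, (int, float)) or not low <= value <= high:
            bad.append(i)
    return bad


def uniqueness_errors(records, key="phone"):
    """Return the key values that appear in more than one record."""
    counts = Counter(rec.get(key) for rec in records)
    return [value for value, n in counts.items() if n > 1]


# Made-up sample data for demonstration only.
records = [
    {"name": "Ada", "phone": "555-0100", "age": 36},
    {"phone": "555-0101", "name": "Grace", "age": 45},   # fields out of order
    {"name": "Alan", "phone": "555-0100", "age": 212},   # duplicate phone, age out of range
]

if __name__ == "__main__":
    print("consistency:", consistency_errors(records))
    print("range:", range_errors(records))
    print("uniqueness:", uniqueness_errors(records))
```

Each function reports the offending records instead of stopping at the first problem, so a single pass over the data can list everything that needs correcting.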

Aside from making data work better and ensuring standard coding or inputs, data validation also helps protect against hackers. When data are disorganized or non-standard, they are more likely to be corrupted and to cause the system to work poorly. An attacker can then get into the system more easily than if all the data were valid. For example, once code or data have become corrupt, further unauthorized changes are harder to notice; an attacker can alter code to open holes or steal information without being easily detected.