ID elements and their check digits
The cornerstone of your entire compliance framework is going to be how you track and identify the individual elements in your lists. Seriously.
Most folks don't think long term when they start putting together compliance programs. They don't think of the stack of compliance regulations on their desk as a list of regulations. They don't think of the stack of audit questions as a list of audit questions.
We do. Which is probably why we built the Unified Compliance Framework™. So let's start with the ID elements.
IDs must be consistently formatted numbers
Think of your Social Security Number or passport number. The NNN-NN-NNNN format for US Social Security Numbers is the same for every person who was ever issued a number. Some number sets will start with a zero, such as the last set in 312-98-0422. And that's okay, as there is as much information in the pattern as there is in the number itself.
To a computer system, 00009 is very different from 9. If your compliance program is going to be designed to hold a million records, then your ID number system should begin with 0000000, not 0 and definitely not 1. Why begin with all zeros? You have to have a root record in any well-formed system. That root record should always be your equivalent of a null value.
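As a minimal sketch of the fixed-width idea, here is one way it might look in Python. The names ID_WIDTH, format_id, and ROOT_ID are hypothetical, and the seven-digit width is simply taken from the 0000000 example above. The key assumption is that IDs are stored as text, since a numeric field would silently drop the leading zeros.

```python
ID_WIDTH = 7  # assumption: seven digits, matching the 0000000 example


def format_id(n: int, width: int = ID_WIDTH) -> str:
    """Return n as a zero-padded, fixed-width ID string."""
    if n < 0 or n >= 10 ** width:
        raise ValueError(f"{n} does not fit in {width} digits")
    return f"{n:0{width}d}"


# The all-zeros ID is reserved for the root (null-equivalent) record.
ROOT_ID = format_id(0)

print(ROOT_ID)       # 0000000
print(format_id(9))  # 0000009
```

Keeping the ID as a string means "0000009" and "9" never compare equal, which is exactly the distinction the pattern is meant to preserve.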
IDs must be validated
When entering numbers, we humans have a tendency to screw up the entry or copying of those numbers. A Dutch mathematician named Jacobus Verhoeff conducted a study of 12,000 numerical errors and, from that, proposed a check digit calculation scheme that catches all single-digit errors as well as all adjacent transpositions and most other errors.
To ensure that the IDs assigned by the system have integrity during input as well as distribution while being transferred into various formats (such as Excel, Word, Text, XML), each ID should have its own checksum value stored in a checksum field. Currently, the best methodology we can find for creating and verifying the checksum follows the Verhoeff calculation format.
Here's the Verhoeff calculation as a formula you can use in your database:
Format
Verhoeff ( numericString ; index ; checkSum )
Parameters
numericString - a string of numeric characters (digits) or field containing numeric characters
index - indicates the digit position of the current iteration - needs to be initialized to zero (0) when calling the function
checkSum - indicates the check digit of the current iteration - needs to be initialized to zero (0) when calling the function
Data type returned = number
Description
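Returns the Verhoeff dihedral checksum of numericString. Use this function to verify a numeric string protected by a Verhoeff check digit: a string that ends in its correct check digit returns 0. The same calculation is the basis for generating the check digit for a given numeric string, but generation needs one more step: run it over the unprotected string with index initialized to 1 (position 0 is reserved for the check digit itself) and take the inverse of the result in the dihedral group.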
Calculation
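// Consumes numericString from right to left, one digit per recursive call.
Let ( [
    n = Right ( numericString ; 1 ) ;    // current (rightmost) digit
    // p: permute the digit according to its position (8 rows of 10 digits)
    p = Let ( [
        array = "01234567891576283094580379614289160435279453126870428657390127938064157046913258" ;
        start = 10 * Mod ( index ; 8 ) + n + 1
    ] ;
        Middle ( array ; start ; 1 )
    ) ;
    // d: combine the running checkSum with p using the dihedral group table (10 rows of 10 digits)
    d = Let ( [
        array = "0123456789123406789523401789563401289567401239567859876043216598710432765982104387659321049876543210" ;
        start = 10 * checkSum + p + 1
    ] ;
        Middle ( array ; start ; 1 )
    ) ;
    len = Length ( numericString ) ;
    nextString = Left ( numericString ; len - 1 )    // drop the digit just processed
] ;
    // Recurse while digits remain; otherwise d is the final checksum.
    Case ( len > 1 ; Verhoeff ( nextString ; index + 1 ; d ) ; d )
)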
With this function in place, we can create a check digit for our ID field, or, after editing or copying an ID to another location, call the function to confirm that the ID is still valid using its Verhoeff check digit:
Not Verhoeff ( ID & CheckDigit ; 0 ; 0 )    // returns 1 (true) when the ID and its check digit are valid
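For readers who want to cross-check the FileMaker formula outside the database, the following is a rough Python sketch of the same calculation. The two lookup tables are the same digit strings used in the formula above, split into rows. The INV table and the function names verhoeff_checksum, check_digit, and is_valid are not part of the original function; they are assumptions added here to show both halves of the job, generating a check digit and verifying one.

```python
# Dihedral group multiplication table (the second array in the formula above).
D = [
    "0123456789", "1234067895", "2340178956", "3401289567", "4012395678",
    "5987604321", "6598710432", "7659821043", "8765932104", "9876543210",
]
# Position-dependent permutation table (the first array in the formula above).
P = [
    "0123456789", "1576283094", "5803796142", "8916043527",
    "9453126870", "4286573901", "2793806415", "7046913258",
]
# Inverse of each digit under D; only needed when generating a check digit.
INV = "0432156789"


def verhoeff_checksum(numeric_string: str, start_index: int = 0) -> int:
    """Fold the digits right to left, as the recursive formula does."""
    c = 0
    for i, ch in enumerate(reversed(numeric_string), start=start_index):
        c = int(D[c][int(P[i % 8][int(ch)])])
    return c


def check_digit(id_string: str) -> str:
    """Generate the check digit for an ID that does not have one yet."""
    # Positions start at 1 because position 0 is reserved for the check digit.
    return INV[verhoeff_checksum(id_string, start_index=1)]


def is_valid(id_with_check: str) -> bool:
    """A protected string is valid when its checksum folds to zero."""
    return verhoeff_checksum(id_with_check) == 0


print(check_digit("236"))  # 3
print(is_valid("2363"))    # True
print(is_valid("2633"))    # False (adjacent digits transposed)
```

Storing the check digit in its own field, as described above, then amounts to saving check_digit(ID) when the ID is issued and calling is_valid(ID + CheckDigit) whenever the ID comes back from a spreadsheet, document, or export.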
IDs must be unique and persistent
Some might think this goes without saying, but we've seen instances in systems where this was not the case. Yes, once an ID number has been assigned to any record in any list, it should never be reused.
On a similar note, once you've assigned an ID to any record in your system, you should never delete that ID. Does this mean you can't get rid of the records? Yes and no. You will need to deprecate records, meaning that you keep the ID in the system and notify anyone who lands on the record with that ID that the record is no longer available. The reason to deprecate instead of deleting the ID is that databases don't have an easy way of looking for items that aren't there any longer. Keeping the ID, but marking the record as deprecated, solves this problem.
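To make the "never reuse, never delete" rule concrete, here is a small in-memory sketch in Python. The Record and Registry classes and their method names are hypothetical, and a real system would keep this in database tables rather than a dictionary. The point is only to show the shape of the rule: the counter moves forward only, and removal is a flag on the record, not a deletion.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str
    title: str
    deprecated: bool = False


class Registry:
    """Issues fixed-width IDs that are never reused and never deleted."""

    def __init__(self, width: int = 7):
        self._width = width
        self._records = {}  # maps record_id -> Record
        self._next = 0      # the first record issued is the all-zeros root

    def add(self, title: str) -> Record:
        record_id = f"{self._next:0{self._width}d}"
        self._next += 1  # only ever moves forward, so IDs are never reused
        record = Record(record_id, title)
        self._records[record_id] = record
        return record

    def deprecate(self, record_id: str) -> None:
        # Keep the ID and the record; just mark it as no longer available.
        self._records[record_id].deprecated = True

    def lookup(self, record_id: str) -> str:
        record = self._records.get(record_id)
        if record is None:
            return f"{record_id}: never issued"
        if record.deprecated:
            return f"{record_id}: deprecated, no longer available"
        return f"{record_id}: {record.title}"


registry = Registry()
root = registry.add("root")
old = registry.add("Retired audit question")
registry.deprecate(old.record_id)
new = registry.add("Replacement audit question")

print(registry.lookup(old.record_id))  # 0000001: deprecated, no longer available
print(new.record_id)                   # 0000002
```

Because the deprecated record is still present, a lookup can tell the difference between an ID that was retired and one that was never issued, which is the distinction a plain delete would lose.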
