
Dataleaf Technologies, Inc:  HR System migration, Archiving, Analysis, Modeling, Interim HRIS

[Document ©2001-2005 Dataleaf Technologies, Inc, 486 Concord Street, Carlisle MA 01741. Contact georgestalker@dataleaf.net, (978) 369-7472]


Data Repair

The process of 'data cleaning' — whether in support of a system migration, or as part of a data engineering program — can be a huge sink of client staff resources.

traditional limitations on 'data cleanup'

We have noticed that 'in-house' data repair or data cleanup initiatives have certain common shortcomings:

  • In some instances (e.g., prior to a system migration) 'total data cleanup' is mandated by management but no heuristics are specified and no resources or technologies are made available for the process.
  • Sometimes no auditable method for data quality improvement is available, except 'editing' through existing user interfaces, which are designed with very different operations in mind.
  • Usually the assumption is made that '100%' cleanup is possible. This goal may be unachievable in the presence of 'hard' field re-use -- the use of a field in a way that was not intended at design time because no other field is available to carry some necessary information.
  • Data inconsistencies are recognized as a problem, but are not exploited by automated processes as a means for discovering and resolving data issues.

Dataleaf data repair

Dataleaf's data repair methods are sharply different. In a Dataleaf system migration project (or data quality project), (1) all values of all legacy data elements are tabulated up front; (2) data cleaning is integrated with fit analysis and data conversion if applicable.
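As an illustration of step (1), tabulating every value of every data element might be sketched as follows. This is a minimal sketch and not Dataleaf's tooling: the function name, the field names, and the sample records are all invented for the example, and the legacy data is assumed to be available as simple field/value records.

```python
from collections import Counter, defaultdict


def tabulate_values(records):
    """Count every distinct value of every field across a list of dict records."""
    tables = defaultdict(Counter)
    for record in records:
        for field, value in record.items():
            tables[field][value] += 1
    return tables


# Hypothetical legacy HR records (invented for illustration only).
legacy_records = [
    {"emp_id": "1001", "status": "A", "dept": "FIN"},
    {"emp_id": "1002", "status": "a", "dept": "FIN"},
    {"emp_id": "1003", "status": "T", "dept": "Fin."},
    {"emp_id": "1004", "status": "A", "dept": ""},
]

value_tables = tabulate_values(legacy_records)
for field, counts in value_tables.items():
    # Most frequent values first; rare variants and blanks stand out at the tail.
    print(field, counts.most_common())
```

A tabulation like this surfaces variant spellings ('FIN' versus 'Fin.'), case differences, and blanks before any cleanup rules are written.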

Data repair is iterative, rule-driven, and cyclic. The normal user interface is often not used. Users NEVER need to make the same manual data change on multiple records. A very complete audit trail is always generated, tagged with the cleanup heuristics and user decisions behind each change (in the case of mass corrections, these decisions are usually entered by filling in blanks on an Excel spreadsheet). Dataleaf data repair operations are always reversible in every detail.
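The combination of rule-driven mass correction, a tagged audit trail, and reversibility can be sketched roughly as below. This is a hedged illustration under stated assumptions, not Dataleaf's implementation: the names AuditEntry, apply_rule, and undo_all are hypothetical, records are assumed to be held in memory keyed by an identifier, and the value mapping stands in for decisions a user might supply on a spreadsheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEntry:
    record_key: str
    field: str
    old_value: str
    new_value: str
    rule: str  # the heuristic or user decision that justified the change


def apply_rule(records, field, mapping, rule, audit):
    """Apply one value-mapping rule to every matching record, logging each change."""
    for key, record in records.items():
        old = record.get(field)
        if old in mapping and mapping[old] != old:
            record[field] = mapping[old]
            audit.append(AuditEntry(key, field, old, mapping[old], rule))


def undo_all(records, audit):
    """Reverse every logged change, newest first, and empty the audit trail."""
    while audit:
        entry = audit.pop()
        records[entry.record_key][entry.field] = entry.old_value


# Hypothetical records and rule (invented for illustration only).
records = {
    "1001": {"dept": "FIN"},
    "1002": {"dept": "Fin."},
    "1003": {"dept": "fin"},
}
original = {key: dict(rec) for key, rec in records.items()}
audit = []

# One decision corrects every affected record; nobody edits records one by one.
apply_rule(records, "dept", {"Fin.": "FIN", "fin": "FIN"}, "R1: normalize dept codes", audit)
repaired = {key: dict(rec) for key, rec in records.items()}
change_count = len(audit)

# Because every change was logged with its old value, the repair can be rolled back.
undo_all(records, audit)
```

Recording the old value alongside the rule tag is the design choice that makes each change both explainable and reversible.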

A large class of automated tools called 'Consistency Reporting Tools' -- developed through tens of system migrations -- is deployed by Dataleaf from the very beginning of the project.
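The page does not describe how these tools work internally. As one assumed example of what a consistency report could look like, the sketch below flags values of one field that appear with more than one value of a field that ought to depend on it. The function name, field names, and sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def consistency_report(records, key_field, dependent_field):
    """Return keys that appear with more than one dependent value, with counts."""
    seen = defaultdict(Counter)
    for record in records:
        seen[record[key_field]][record[dependent_field]] += 1
    # Keep only the keys whose dependent value is not unique.
    return {
        key: counts.most_common()
        for key, counts in seen.items()
        if len(counts) > 1
    }


# Hypothetical rows (invented for illustration only).
rows = [
    {"dept_code": "FIN", "dept_name": "Finance"},
    {"dept_code": "FIN", "dept_name": "Finance"},
    {"dept_code": "FIN", "dept_name": "Finanse"},
    {"dept_code": "HR", "dept_name": "Human Resources"},
]

report = consistency_report(rows, "dept_code", "dept_name")
print(report)
```

In a report of this kind the majority value often suggests the likely correction, which is one way an inconsistency can be used to resolve a data issue rather than merely noted.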



removing jokers from the deck...


When Dataleaf's data-cleaning mechanisms are used, much less staff effort is required and a much more controlled, more auditable result is achieved.

The goal is to "eliminate jokers" from the data.

