Missing Data. Corrupted Data: 4 Tactics to Keep Your Data Clean

SAP Licensing and Authorizations Managers: How do you know that your final report is not relying on corrupted data? Maybe you have a software tool that analyses the data for you – but is any data missing or corrupted to begin with? How do you know?

John Doe is Requesting Your Approval to View Invoices

Recently, some of our customers acknowledged that having a superior tool for licensing or user authorization monitoring is not enough on its own. As the saying goes, “the devil is in the details”, and they discovered that some of their details are inaccurate and incomplete. This is no secret, of course; you’ve heard it if you’ve been in the IT business for a while. But it becomes extremely critical when you combine user data across different systems or grant user authorizations.

Imagine trying to combine 10,000 user records across different applications without having the full email address for each one. “John Smith” and “Zhang Wei” will probably each appear several times, but do those entries belong to several people, or to a single person who should be joined into one employee? You’ll never know if you don’t have a complete and accurate email address.
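
To make the point concrete, here is a minimal Python sketch of joining accounts on email address. Everything in it is an assumption for illustration: the field names (`system`, `user`, `name`, `email`), the sample records, and the helper functions are hypothetical and not taken from any SAP tool. It only shows the idea that records with a usable email can be grouped, while records without one have to be set aside for manual review.

```python
from collections import defaultdict


def normalize_email(email):
    """Return a trimmed, lowercase email, or None if it is missing or has no '@'."""
    if not email:
        return None
    cleaned = email.strip().lower()
    return cleaned if "@" in cleaned else None


def group_accounts_by_email(accounts):
    """Group account records by normalized email.

    Returns (groups, unmatched): groups maps email -> list of accounts,
    unmatched holds accounts that cannot be safely joined to anyone.
    """
    groups = defaultdict(list)
    unmatched = []
    for account in accounts:
        key = normalize_email(account.get("email"))
        if key is None:
            unmatched.append(account)
        else:
            groups[key].append(account)
    return dict(groups), unmatched


# Hypothetical sample records from three systems.
accounts = [
    {"system": "ERP", "user": "JSMITH", "name": "John Smith", "email": "John.Smith@example.com"},
    {"system": "CRM", "user": "SMITHJ", "name": "John Smith", "email": " john.smith@example.com "},
    {"system": "BW", "user": "JOHNS", "name": "John", "email": ""},
]

groups, unmatched = group_accounts_by_email(accounts)
for email, members in groups.items():
    print(email, [m["user"] for m in members])
print("needs manual review:", [a["user"] for a in unmatched])
```

In this sample the two “John Smith” accounts collapse into one employee because their emails match after trimming and lowercasing, while “JOHNS” stays unresolved. Matching on name alone would be a guess, which is exactly the risk described above.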

Or how about when you get a request for additional user authorizations to view financial statements – the user account is “JOHNS”, the first name is “John” and there’s no last name or email address. How can you identify who’s really behind it?

Missing Data. Incorrect Data

From Google: “Data cleansing, data cleaning or data scrubbing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database.”

In order to work well with data, the data needs to be accurate and complete. Here are four pieces of advice that we’ve collected from customers who’ve gone through the tedious process of data cleansing.

  • Most important data goes first: Identify which data fields matter most to your process and start there. For example, a successful licensing process that includes combining user accounts across different SAP applications needs user email addresses first, and then first and last names. To identify who a person is in an authorization-related process, you also need their department and position. First identify the most important data, and then focus on completing it.
  • Share the task, but be prepared to take over: You will probably need other people to help you complete missing data, especially if you need data about offsite locations. Take the time to identify the people who can assist you, but expect that some won’t get to it fast enough, and in that case you’ll have to do the work yourself. For example, if you need to complete email addresses, you can ask the department managers for their employee data, but you’ll probably discover that they passed the task to someone who doesn’t understand it, or that they’re too busy to even reply. Either way, it’s your responsibility, so if this happens, take over and do it yourself. Don’t forget that this is just the data cleansing step on the way to sound SAP licensing or an efficient authorization process.
  • It’s not all-or-nothing, 90% is much better than 50%: Eliminating dirty data is a continuous challenge that will probably not end when your project does. Accepting this makes it easier to avoid frustration when you discover that there’s more data to clean and it starts to feel like a never-ending story. Make an effort to cleanse a good amount of data, but don’t forget to set a time limit and a target. It is much more important to keep to the timeline and reach the goal (better licensing, implementation of a workflow) than to have all of the data clean and accurate. After a successful first project, you can always plan a second one.
  • Tools are great, but not always a must-have: If your organization is not very big, or the amount of data is not huge, you can rely on the human eye to spot missing and corrupted data. The human eye can easily scan 1,000 records, and even 10,000 data items are not beyond its reach. Of course, there are very good tools that can spot corrupted data using data-rule sets and duplicate-tracing, but you might discover that your management doesn’t want to spend money on them. In that case, good practice is to narrow down the most important data objects and do the work manually. You will be surprised by the ability of the human eye.
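
For readers who want something between a commercial tool and a purely manual scan, the tips above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, their priority order, the sample records, and the 90% target are hypothetical choices drawn loosely from the advice above, not a prescribed rule set. The sketch reports how complete each priority field is and lists the records to fix, most important gap first.

```python
# Hypothetical priority order based on the first tip: email first,
# then names, then department and position.
PRIORITY_FIELDS = ["email", "first_name", "last_name", "department", "position"]
TARGET = 0.90  # assumed completeness goal, not a fixed rule


def is_filled(value):
    """A value counts as filled if it is a non-blank string."""
    return isinstance(value, str) and value.strip() != ""


def completeness_by_field(records, fields=PRIORITY_FIELDS):
    """Return {field: share of records with that field filled}, from 0.0 to 1.0."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(is_filled(r.get(field)) for r in records) / total
        for field in fields
    }


def records_to_fix(records, fields=PRIORITY_FIELDS):
    """List (record, missing_fields) pairs, most important missing field first."""
    flagged = []
    for record in records:
        missing = [f for f in fields if not is_filled(record.get(f))]
        if missing:
            flagged.append((record, missing))
    # missing[0] is the highest-priority gap because fields is in priority order.
    flagged.sort(key=lambda pair: fields.index(pair[1][0]))
    return flagged


# Hypothetical sample records.
records = [
    {"user": "ZWEI", "email": "zhang.wei@example.com", "first_name": "Zhang",
     "last_name": "Wei", "department": "", "position": ""},
    {"user": "JOHNS", "email": "", "first_name": "John",
     "last_name": "", "department": "Finance", "position": "Analyst"},
    {"user": "ADOE", "email": "a.doe@example.com", "first_name": "A",
     "last_name": "Doe", "department": "IT", "position": "Admin"},
    {"user": "BROE", "email": "b.roe@example.com", "first_name": "B",
     "last_name": "Roe", "department": "IT", "position": "Admin"},
]

report = completeness_by_field(records)
below_target = [field for field, share in report.items() if share < TARGET]
print("fields below target:", below_target)
for record, missing in records_to_fix(records):
    print(record["user"], "is missing", missing)
```

The per-field percentages give you a stopping rule in the spirit of “90% is much better than 50%”, and the sorted list is a natural work queue to split among department managers or to work through yourself.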
