The customer is a government organization responsible for issuing and tracking new business licenses. It collects the nature of each business and its activities (the business license activities allowed by the economic department) from various business license authorities. Because the data comes from multiple sources, the authorities do not assign activity codes and descriptions according to the standard ISIC codes. As a result, the customer could not calculate metrics such as company count by activity, or roll up metrics to count companies by ISIC level. Generating the reports that must be published on its website involved manual updates and cleanup. In addition, a number of companies went out of business during the COVID-19 situation, and the business impact of COVID-19 could not be measured because standard ISIC codes had not been assigned.
We addressed this issue with multiple approaches:
1. Automatic update of ISIC code using matching
We created a match process in Informatica Data Quality that cleanses and matches the business activities to form groups based on similarity. If a group contains a valid ISIC code, that code is assigned to all records in the group that match above the threshold (90% match).
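The match-and-propagate logic above can be sketched outside Informatica as follows. This is a minimal illustration, not the Informatica Data Quality implementation: it swaps in Python's `difflib.SequenceMatcher` for the Informatica match strategy, uses a simple greedy grouping, and the function names, record layout, and sample data are all assumptions made for the example.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.90  # the 90% match threshold mentioned in the text


def normalize(description):
    """Cleanse step (assumed): lowercase, drop punctuation, collapse spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in description.lower())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Stand-in similarity score in [0, 1]; not Informatica's algorithm."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster(records, threshold=THRESHOLD):
    """Greedy grouping: a record joins the first group whose first
    record it matches at or above the threshold, else starts a new group."""
    groups = []
    for rec in records:
        for group in groups:
            if similarity(rec["description"], group[0]["description"]) >= threshold:
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups


def assign_isic(records, valid_codes, threshold=THRESHOLD):
    """Propagate a group's valid ISIC code to every record in the group.
    Groups with no valid code, or with conflicting codes, are left
    unchanged so they can go to manual review."""
    updated = []
    for group in cluster(records, threshold):
        codes = {r["isic"] for r in group if r["isic"] in valid_codes}
        code = next(iter(codes)) if len(codes) == 1 else None
        for rec in group:
            new_rec = dict(rec)
            if code is not None:
                new_rec["isic"] = code
            updated.append(new_rec)
    return updated


# Illustrative sample data only.
sample = [
    {"description": "Retail sale of clothing", "isic": "4771"},
    {"description": "Retail sale of clothing.", "isic": None},
    {"description": "Retail sale of clothings", "isic": None},
    {"description": "Software development", "isic": None},
]
result = {r["description"]: r["isic"] for r in assign_isic(sample, {"4771"})}
```

In this sketch, the two near-duplicate descriptions inherit the code of the record that already carries a valid one, while the unrelated record stays unassigned and would fall through to the manual steps described next.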
2. Manual review
We created a process that forms clusters/groups based on similarity using Informatica matching techniques and posts the groups to Informatica Analyst for manual correction (Exception Process).
Business SMEs access the clusters/groups using the Informatica Analyst tool to:
- Review and assign an ISIC code to the master record; a process then assigns the same code to all the records in the cluster
- Review and move activities to another group; assign the ISIC code to the master record of both groups/clusters
- Review and move activities to a new group; assign the ISIC code to the master record
User input is processed to update the existing Datamart tables, and Qlik reports show the revised metrics.
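The three reviewer actions above can be modelled with a small in-memory structure. This is a hedged sketch only: the cluster dictionary layout, the cluster IDs, and the functions `assign_code` and `move_activity` are hypothetical and do not represent the Informatica Analyst exception tables or API.

```python
def assign_code(clusters, cluster_id, isic_code):
    """Set the ISIC code on the cluster's master record and return the
    code propagated to every activity in that cluster."""
    group = clusters[cluster_id]
    group["isic"] = isic_code
    return {activity: isic_code for activity in group["members"]}


def move_activity(clusters, activity, source_id, target_id):
    """Move one activity to another cluster, creating the target
    cluster if it does not exist yet (the 'new group' case)."""
    clusters[source_id]["members"].remove(activity)
    target = clusters.setdefault(target_id, {"isic": None, "members": []})
    target["members"].append(activity)
    if not clusters[source_id]["members"]:
        del clusters[source_id]  # drop a cluster that became empty


# Illustrative sample data only.
clusters = {
    "c1": {"isic": None, "members": ["Retail sale of clothing", "Tailoring"]},
}
move_activity(clusters, "Tailoring", "c1", "c2")   # move to a new group
propagated = assign_code(clusters, "c1", "4771")   # code for the master record
```

After these calls, each cluster holds only the activities the reviewer left in it, and `propagated` lists the per-activity codes that a downstream step could write back to the Datamart tables.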
3. Stand-alone record correction:
Records that do not belong to any group/cluster are loaded into the Informatica Bad Exception framework, where users can review them and assign a valid ISIC code through Informatica Analyst (UI tool).
If similar records that users have already reviewed and updated arrive as part of the daily load, they are auto-assigned a valid ISIC code using an Informatica reference table.
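The reference-table behaviour described above amounts to a lookup keyed on the cleansed activity description. The sketch below assumes an exact match on a normalized description. A plain dictionary stands in for the Informatica reference table, and the function names and sample values are hypothetical.

```python
def normalize(description):
    """Cleanse step (assumed): lowercase, drop punctuation, collapse spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in description.lower())
    return " ".join(cleaned.split())


def record_correction(reference, description, isic_code):
    """Store a reviewer's correction so later loads can reuse it."""
    reference[normalize(description)] = isic_code


def auto_assign(reference, daily_descriptions):
    """Split a daily load into records resolved from the reference
    table and records that still need manual review."""
    assigned, pending = [], []
    for description in daily_descriptions:
        code = reference.get(normalize(description))
        if code is None:
            pending.append(description)
        else:
            assigned.append((description, code))
    return assigned, pending


# Illustrative sample data only.
reference = {}
record_correction(reference, "Retail sale of clothing", "4771")
assigned, pending = auto_assign(
    reference, ["RETAIL SALE OF CLOTHING.", "Software development"]
)
```

Here the first daily record is resolved from the stored correction despite the difference in case and punctuation, and the second is returned for review.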
Business Process flow
- Reduced manual effort required to exchange and cleanse the data
- Timely availability of data for reporting to the media
- Uniformity and accuracy of information
- Improved data quality