Have you tried using the HANA Smart Data Quality (SDQ), and found the transforms aren’t working? Well this blog should help, there are country specific files required for both geocoding (latitude, longitude) and address cleansing.
The steps covered below are
1. Download the files
2. Extract the zip file
3. Verify existing configuration
4. Update Configuration
5. Validate Configuration
6. Build SDI Flowgraph
7. Errors
Locate the required files from SAP Software Downloads.
Addresses and Geocoding are both within the “Addreess Directories & Reference Data” category, and data is held a country level, I am using some UK data so have downloaded the UK GEO DIRECTORY, this will be needed for the geocode SDQ transform.
The steps covered below are
1. Download the files
2. Extract the zip file
3. Verify existing configuration
4. Update Configuration
5. Validate Configuration
6. Build SDI Flowgraph
7. Errors
1. Download the files
Locate the required files from SAP Software Downloads.
Addresses and Geocoding are both within the “Addreess Directories & Reference Data” category, and data is held a country level, I am using some UK data so have downloaded the UK GEO DIRECTORY, this will be needed for the geocode SDQ transform.
You usually want the latest file available
I download these directly onto the HANA box as they can be quite large.
2. Extract the zip file
The official way to install these file is using the hdblcm tool, but you can also do this manually. Which can be handy if your hdblcm is not working. The files need to be owned by the HANA SID ADM user (ih2adm) in my case.
3. Verify existing configuration
We can check to see if there’s an existing configuration of the SDQ reference data files
select *
from SYS.M_INIFILE_CONTENTS
where FILE_NAME = 'scriptserver.ini'
and SECTION = 'adapter_framework'
and KEY = 'dq_reference_data_path';
This should not return any rows.
The same is true in HANA Studio, Administration -> Configuration
4. Update Configuration (dq_reference_data_path)
We can go ahead and update our configuration as below, but replacing with your directory path.
-- EXECUTE ON SYSTEMDB
ALTER SYSTEM ALTER CONFIGURATION ('scriptserver.ini', 'SYSTEM')
SET ('adapter_framework', 'dq_reference_data_path') = '/hana/SDQ_Reference/' WITH RECONFIGURE
After doing this change we need to restart the ScriptServer.
5. Validate Configuration
Repeating step 3 we should now see our new configuration in place.
6. Build SDI Flowgraph
We can now switch to the WebIDE and create an SDI (Smart Data Integration) Flowgraph.
Here I have just created a dummy .hdbview that doesn’t contain any real personal data.
We can then use this in a simple flowgraph as below
DataSource
Geocode
Here we can see the SDQ has automatically guessed correctly the Content Types, this is from the data that is being fed in.
It makes sense to add our input data as output columns
We can now save, build and Execute the Flowgraph
Before we celebrate the sucessful execution, we should check our target table.
All appears to be good, for one final verification we can put the LAT, LONG in to our favourite mapping tool and check the location looks correct.
Errors
If you receive this error below then you address and/or geocode reference data is likely missing.
15:57:12 (/Sample/db) Execution Failed : (dberror) [686]: start task error: “SAMPLE”.”UK_GEOCODE_SP”: line 5 col 0 (at pos 85): [2620] executor: plan operation failed;Execution of Adapter operation within node GeocodeUK_GEOCODE_TEMPLATE_ failed: exception 141005: Failed to create or initialize HanaTransform object
Looking at the diagnosis files in HANA Studio will give you more information. Check the scriptserver_alert_<hostname>.trc
Here it appears as though I am out of memory, but this is actually caused by a permissions issue. Make sure the geo or address files are owned by you HANA <SID>ADM OS user.
[65085]{200563}[2/66728980] 2019-04-04 16:08:16.664458 e af_core hanaLogging.cpp(00198) : <GEO0018>: GEO0018GEO_ERROR_DIR_OUT_OF_MEMORY: Out of memory during directory initialization: hana/SDQ_Reference/geo_gb_nt.dir.
[65085]{200563}[2/66728980] 2019-04-04 16:08:16.740559 e af_core hanaTransform.cpp(01240) : GeocodeUK_GEOCODE_TEMPLATE_:1: In HanaTransform, Failed to create transform
[65085]{200563}[2/66728980] 2019-04-04 16:08:16.742133 e CalcEngine cePopAdapter.cpp(00338) : Execution of Adapter operation within node GeocodeUK_GEOCODE_TEMPLATE_ failed: exception 141005: Failed to create or initialize HanaTransform object