Execute, Extract, Transform and Load Functionality
The Contextual Fusion Hub (CFH) of the Genix platform includes a library of built-in adapters for connecting to different source systems and extracting data. If the built-in adapters do not support a new source system, you can develop a new adapter and plug it into the platform to extract data from that source system.
CFH allows you to select a network (category) and a source (adapter) to establish a connection. Each adapter provides a list of parameters or fields that must be filled in. The Metadata Mapping section of CFH allows you to map the metadata and source fields to the Industry Cognitive Model (ICM). The target objects are tables: you select a target table and map the source fields of the connection's source objects to the fields of the target table.
Adapters connect to source systems, extract data, and load it into the CDL. The Extract, Transform, and Load (ETL) functionality of the Genix platform then further processes this data and loads it from the CDL into the ICM as per the mapping defined in the CFH.
The following sections explain how to run the ETL functionality and load data from the CDL into the ICM as per the mapping defined in the CFH. They include the procedure for batch processing using the on-premises functionality, which is metadata driven. Groups are created for the various modules in the metadata table, and these groups contain information about the sources and targets of the data. The source and target tables are mapped using the JSON mappings available in the Cosmos DB, keyed by the source connector codes.
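The exact shape of these mapping documents is internal to the platform, but as a rough sketch, a mapping keyed by a source connector code might carry the group, the source object, the target ICM table, and a field map. Every name in the example below is an illustrative assumption, not the actual schema:

```python
# Illustrative only: a hypothetical shape for a JSON mapping document
# stored in Cosmos DB, keyed by a source connector code. All field
# names below are assumptions, not the platform's actual schema.
mapping_document = {
    "sourceConnectorCode": "SAP_PM",   # assumed connector code
    "group": "maintenance",            # module group in the metadata table
    "sourceObject": "work_orders",     # source object extracted into the CDL
    "targetTable": "icm_work_order",   # target ICM table
    "fieldMap": {                      # source field -> ICM field
        "AUFNR": "work_order_id",
        "KTEXT": "description",
        "ERDAT": "created_date",
    },
}
```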
The calling of the functions through the REST API is metadata driven. There are two ways of executing the functions:
- Using the Rundeck scheduler by creating schedules according to the modules.
- Calling the REST API with runtime parameters, as shown in the sketch after this list.
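For orientation, a minimal sketch of such a call is shown below, assuming the platform exposes an HTTP endpoint that accepts the group and load type as runtime parameters. The base URL, endpoint path, parameter names, and authentication scheme are all placeholders, not the platform's documented API:

```python
import requests

# Minimal sketch of invoking an ETL function through the REST API with
# runtime parameters. Endpoint, parameters, and auth are assumptions.
BASE_URL = "https://genix.example.com/etl"  # assumed base URL
TOKEN = "<api-token>"                       # assumed auth token

response = requests.post(
    f"{BASE_URL}/api/v1/run",               # hypothetical endpoint
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "group": "maintenance",             # metadata group to process
        "loadType": "incremental",          # e.g., full or incremental
    },
    timeout=300,
)
response.raise_for_status()
print(response.json())
```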
At a high level, the execution of the ETL functionality involves the following steps:
- Load the metadata: You must load the application metadata before loading data into the tables. The metadata contains the information about the tables that are part of the data load, and it is loaded according to group names, by which the tables are divided according to functionality.
- Update the metadata: Once the metadata is loaded, you can update it according to the tables and the application requirements by using the various functions.
- Load data from the adapters: Load the data coming from the various IT and OT adapters into the ICM database.
- Create the Rundeck schedule: To load incremental data, you must create scheduled jobs that load the data from the source into the ICM tables at the configured frequency.
- Load the data using the Rundeck schedule: After creating the Rundeck schedule, perform this procedure to manually load the data for the created schedule. This step is optional (see the first sketch after this list).
- Load custom data using the REST API: Load data for selected tables in a particular group (see the second sketch after this list).
- Load data in ICM tables with any additional configurations: Load the hierarchy data into the reporting hierarchy table.
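For the optional manual load, a run of the corresponding Rundeck job can be triggered through Rundeck's own REST API instead of waiting for the schedule. The sketch below assumes you have the job's UUID and a Rundeck API token, and that the job takes the metadata group as a job option; the option name is an assumption:

```python
import requests

# Sketch of manually triggering a run of an existing Rundeck job via
# Rundeck's REST API. The job option name ("group") is an assumption.
RUNDECK_URL = "https://rundeck.example.com"  # assumed Rundeck server URL
JOB_ID = "<job-uuid>"                        # UUID of the created schedule job
API_TOKEN = "<rundeck-api-token>"

response = requests.post(
    f"{RUNDECK_URL}/api/18/job/{JOB_ID}/run",
    headers={
        "X-Rundeck-Auth-Token": API_TOKEN,
        "Content-Type": "application/json",
        "Accept": "application/json",
    },
    json={"options": {"group": "maintenance"}},  # assumed job option
    timeout=60,
)
response.raise_for_status()
print("Execution id:", response.json()["id"])
```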
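Similarly, a custom data load for selected tables in a particular group might be invoked as sketched below; the endpoint path and parameter names are assumptions, in the same spirit as the earlier sketch:

```python
import requests

# Sketch of a custom data load for selected tables in one group through
# the REST API. Endpoint path and parameter names are assumptions.
BASE_URL = "https://genix.example.com/etl"  # assumed base URL
TOKEN = "<api-token>"

response = requests.post(
    f"{BASE_URL}/api/v1/load",              # hypothetical endpoint
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "group": "maintenance",                     # metadata group
        "tables": ["icm_work_order", "icm_asset"],  # selected target tables
    },
    timeout=600,
)
response.raise_for_status()
```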