Configure transformation and enrichment plugins

There are two types of plugins: transformer and enrichment. When you choose plugins for a pipeline, you can have only one transformer and zero or more enrichment plugins.
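The selection rule above can be sketched in a few lines of Java. This is only an illustration, not the solution's own validation code: the `PluginSelection`, `Plugin`, and `PluginType` names are hypothetical, and the sketch assumes the rule means "at most one transformer" without enforcing a minimum.

```java
import java.util.Arrays;
import java.util.List;

public class PluginSelection {
    // Hypothetical plugin kinds, mirroring the two types described above.
    public enum PluginType { TRANSFORMER, ENRICHMENT }

    // Hypothetical value type; the solution's own model may differ.
    public static final class Plugin {
        final String name;
        final PluginType type;

        public Plugin(String name, PluginType type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Returns true when the selection contains at most one transformer. */
    public static boolean isValidSelection(List<Plugin> plugins) {
        long transformers = plugins.stream()
                .filter(p -> p.type == PluginType.TRANSFORMER)
                .count();
        // Any number of enrichment plugins is allowed, so only transformers are counted.
        return transformers <= 1;
    }

    public static void main(String[] args) {
        List<Plugin> selection = Arrays.asList(
                new Plugin("CustomTransformer", PluginType.TRANSFORMER),
                new Plugin("UAEnrichment", PluginType.ENRICHMENT),
                new Plugin("IpEnrichment", PluginType.ENRICHMENT));
        System.out.println(isValidSelection(selection)); // prints "true"
    }
}
```

A selection with two transformers would be rejected by this check, while a selection with only enrichment plugins would pass.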

Built-in Plugins

The following plugins are provided by Clickstream Analytics on AWS.

Plugin name | Type | Description
UAEnrichment | enrichment | User-agent enrichment. Uses the ua_parser Java library to enrich the User-Agent HTTP header into ua_browser, ua_browser_version, ua_os, ua_os_version, ua_device
IpEnrichment | enrichment | IP address enrichment. Uses GeoLite2 data by MaxMind to enrich the IP address into city, continent, country

The UAEnrichment plugin uses UA Parser to parse the User-Agent in the HTTP header.

The IpEnrichment plugin uses GeoLite2-City data created by MaxMind, available from https://www.maxmind.com.

Custom Plugins

You can add custom plugins to transform raw event data or enrich the data to suit your needs.

Note

To add custom plugins, you must first develop them; see Develop Custom Plugins.

To add a plugin, click the Add Plugin button. A new window opens in which you can upload your plugin.

  1. Enter the plugin Name and Description.

  2. Choose the Plugin Type:
     • Enrichment: a plugin that adds fields to event data collected by an SDK (either the Clickstream SDK or a third-party SDK).
     • Transformation: a plugin that transforms a third-party SDK's raw data into the solution's built-in schema.

  3. Upload the plugin's Java JAR file.

  4. (Optional) Upload the dependency files, if any.

  5. Main function class: enter the fully qualified class name of your plugin, e.g. com.example.sol.CustomTransformer.

Develop Custom Plugins

The simplest way to develop custom plugins is to modify our example project.

  1. Clone or fork the example project.
git clone https://github.com/awslabs/clickstream-analytics-on-aws.git

cd clickstream-analytics-on-aws/examples/custom-plugins
  • For an enrichment plugin, refer to the example custom-enrich/.
  • For a transformer plugin, refer to the example custom-sdk-transformer/.

  2. Change the package and class names as desired.

  3. Implement the method public Dataset<Row> transform(Dataset<Row> dataset) to perform the transformation or enrichment.

  4. (Optional) Write test code.

  5. Run Gradle to package the code into a JAR: ./gradlew clean build.

  6. Get the JAR file from the build output directory ./build/libs/.
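
As a rough illustration of the transform method described in the steps above, the sketch below adds one constant column to every event row. It is not taken from the example project: the class name `CustomEnrich`, the column name `plugin_tag`, and its value are hypothetical, and the real examples in custom-enrich/ and custom-sdk-transformer/ may require additional structure or a specific output schema. It assumes the Apache Spark SQL library is on the classpath.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

import static org.apache.spark.sql.functions.lit;

// Hypothetical enrichment plugin; the class and column names are illustrative only.
public class CustomEnrich {

    /**
     * Receives the event dataset and returns it with one extra column.
     * The column name and value are assumptions made for this sketch.
     */
    public Dataset<Row> transform(Dataset<Row> dataset) {
        // withColumn returns a new Dataset; the input dataset is left unchanged.
        return dataset.withColumn("plugin_tag", lit("custom-enrich"));
    }
}
```

When you register a class like this in the Add Plugin window, the Main function class field takes its fully qualified name, including whatever package you declare for it.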