Components

The pipeline of data processing, model training, and deployment

(Figure: architecture of model training)

The solution uses an AWS Step Functions workflow to orchestrate the pipeline: downloading the raw IEEE-CIS dataset, processing it into graph data, training the GNN model, and deploying the inference endpoint. The workflow steps are detailed below (a code sketch follows the list):

  1. An AWS Lambda function downloads the dataset to an Amazon S3 bucket
  2. An AWS Glue crawler builds the Glue Data Catalog from the dataset
  3. An AWS Glue ETL job processes the raw data, converts the tabular data into graph-structured data, and writes it to the S3 bucket
  4. Amazon SageMaker trains the GNN model with DGL
  5. After training, the graph-structured data is loaded into the Amazon Neptune graph database
  6. The custom inference code is packaged with the trained model
  7. Amazon SageMaker creates the model, configures the endpoint configuration, and deploys the inference endpoint
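
As a rough, non-authoritative illustration of steps 4 and 7, the sketch below uses the SageMaker Python SDK to launch a DGL-based training job and then deploy the resulting model as a real-time inference endpoint. The script name `train_gnn.py`, the S3 prefix, instance types, hyperparameters, and the endpoint name are assumptions for this example, not the solution's actual configuration.

```python
# Sketch only: script names, S3 prefixes, instance types, hyperparameters,
# and the endpoint name below are assumptions, not the solution's settings.
import sagemaker
from sagemaker.pytorch import PyTorch

session = sagemaker.Session()
role = sagemaker.get_execution_role()  # assumes a SageMaker execution role is available

# Step 4: train the GNN model with DGL on the graph data written by the Glue ETL job
estimator = PyTorch(
    entry_point="train_gnn.py",        # hypothetical DGL training script
    source_dir="src",                  # hypothetical source directory
    role=role,
    framework_version="1.13",
    py_version="py39",
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    hyperparameters={"n-epochs": 100, "n-hidden": 64},  # illustrative values
)
estimator.fit({"train": f"s3://{session.default_bucket()}/processed-graph-data/"})

# Steps 6-7: package the inference code with the model and deploy a real-time endpoint
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.xlarge",
    endpoint_name="fraud-detection-gnn",  # hypothetical endpoint name
)
```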

Real-time fraud detection and business monitoring system

(Figure: architecture of real-time inference and business dashboard)

Real-time fraud detection

The solution implements real-time fraud detection with the following steps (a code sketch follows the list):

  1. Process the online transaction data into graph-structured data
  2. Insert the graph data (vertices, edges, and relationships) into the Neptune graph database
  3. Query the subgraph of the current transaction vertex and its neighbors within two hops
  4. Send the subgraph data to the inference endpoint to get the probability that the transaction is fraudulent, then publish the transaction and its fraud probability to an Amazon SQS queue
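
The sketch below illustrates steps 3 and 4 using the Gremlin Python client, the SageMaker runtime, and Amazon SQS. The Neptune endpoint, the Gremlin traversal, the payload format expected by the endpoint, the endpoint name, and the queue URL are all placeholders, not the solution's actual values.

```python
# Sketch only: the Neptune endpoint, traversal, endpoint name, payload format,
# and queue URL are placeholders, not the solution's actual values.
import json
import boto3
from gremlin_python.driver import client, serializer

tx_id = "t-1234567890"  # hypothetical transaction vertex id

# Step 3: query the subgraph of the transaction vertex and its neighbors within two hops
neptune = client.Client(
    "wss://your-neptune-endpoint:8182/gremlin",  # placeholder Neptune cluster endpoint
    "g",
    message_serializer=serializer.GraphSONSerializersV2d0(),
)
result_set = neptune.submit(
    f"g.V('{tx_id}').repeat(both().simplePath()).times(2).emit().dedup().valueMap(true)"
)
subgraph = result_set.all().result()
neptune.close()

# Step 4: score the subgraph on the inference endpoint ...
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="fraud-detection-gnn",          # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps({"subgraph": subgraph}, default=str),
)
fraud_probability = json.loads(response["Body"].read())

# ... then publish the transaction and its fraud probability to the SQS queue
sqs = boto3.client("sqs")
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/111122223333/fraud-results",  # placeholder
    MessageBody=json.dumps({"transaction_id": tx_id, "fraud_probability": fraud_probability}),
)
```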

Business monitoring system

The solution uses the following services to build the monitoring system for fraudulent transactions: