Class MetricsMetadataHandler

  • All Implemented Interfaces:
    com.amazonaws.services.lambda.runtime.RequestStreamHandler

    public class MetricsMetadataHandler
    extends MetadataHandler
    Handles metadata requests for the Athena Cloudwatch Metrics Connector.

    For more detail, please see the module's README.md. Notable characteristics of this class include:

    1. Provides two tables (metrics and metric_samples) for accessing Cloudwatch Metrics data via the "default" schema.
    2. Supports Predicate Pushdown into Cloudwatch Metrics for most fields.
    3. If multiple Metrics (namespace, metric, dimension(s), and statistic) are requested, they can be read in parallel.

    • Constructor Detail

      • MetricsMetadataHandler

        public MetricsMetadataHandler​(Map<String,​String> configOptions)
      • MetricsMetadataHandler

        protected MetricsMetadataHandler​(software.amazon.awssdk.services.cloudwatch.CloudWatchClient metrics,
                                         EncryptionKeyFactory keyFactory,
                                         software.amazon.awssdk.services.secretsmanager.SecretsManagerClient secretsManager,
                                         software.amazon.awssdk.services.athena.AthenaClient athena,
                                         String spillBucket,
                                         String spillPrefix,
                                         Map<String,​String> configOptions)
    • Method Detail

      • doListSchemaNames

        public ListSchemasResponse doListSchemaNames​(BlockAllocator blockAllocator,
                                                     ListSchemasRequest listSchemasRequest)
        Supports only a single static schema, defined by SCHEMA_NAME.
        Specified by:
        doListSchemaNames in class MetadataHandler
        Parameters:
        blockAllocator - Tool for creating and managing Apache Arrow Blocks.
        listSchemasRequest - Provides details on who made the request and which Athena catalog they are querying.
        Returns:
        A ListSchemasResponse which primarily contains a Set of schema names and a catalog name corresponding to the Athena catalog that was queried.
        See Also:
        MetadataHandler
      • doListTables

        public ListTablesResponse doListTables​(BlockAllocator blockAllocator,
                                               ListTablesRequest listTablesRequest)
        Supports a set of static tables defined by TABLES.
        Specified by:
        doListTables in class MetadataHandler
        Parameters:
        blockAllocator - Tool for creating and managing Apache Arrow Blocks.
        listTablesRequest - Provides details on who made the request and which Athena catalog and database they are querying.
        Returns:
        A ListTablesResponse which primarily contains a List enumerating the tables in this (catalog, database) tuple. It also contains the catalog name corresponding to the Athena catalog that was queried.
        See Also:
        MetadataHandler
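
As a rough illustration of the static catalog described above, the following sketch models a single fixed schema and a fixed set of tables. The schema name "default" and the table names come from the class description. The class name StaticCatalogSketch and its methods are hypothetical stand-ins that return plain Java collections, not the SDK's ListSchemasResponse or ListTablesResponse types.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the handler's static catalog; not the connector's actual code.
final class StaticCatalogSketch {
    // The class description states that both tables live in the "default" schema.
    static final String SCHEMA_NAME = "default";

    // Table names taken from the class description.
    static final Set<String> TABLES;

    static {
        Set<String> tables = new LinkedHashSet<>();
        tables.add("metrics");
        tables.add("metric_samples");
        TABLES = Collections.unmodifiableSet(tables);
    }

    // Idea behind doListSchemaNames: the answer never depends on the request.
    static Set<String> listSchemaNames() {
        return Collections.singleton(SCHEMA_NAME);
    }

    // Idea behind doListTables: the table set is fixed at class-load time.
    static Set<String> listTables() {
        return TABLES;
    }

    public static void main(String[] args) {
        System.out.println("schemas: " + listSchemaNames());
        System.out.println("tables:  " + listTables());
    }
}
```

Because nothing here is discovered at request time, listing schemas or tables requires no call to Cloudwatch in this sketch.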
      • doGetTable

        public GetTableResponse doGetTable​(BlockAllocator blockAllocator,
                                           GetTableRequest getTableRequest)
        Returns the details of the requested static table.
        Specified by:
        doGetTable in class MetadataHandler
        Parameters:
        blockAllocator - Tool for creating and managing Apache Arrow Blocks.
        getTableRequest - Provides details on who made the request and which Athena catalog, database, and table they are querying.
        Returns:
        A GetTableResponse which primarily contains:
        1. An Apache Arrow Schema object describing the table's columns, types, and descriptions.
        2. A Set of partition column names (or empty if the table isn't partitioned).
        See Also:
        MetadataHandler
      • getPartitions

        public void getPartitions​(BlockWriter blockWriter,
                                  GetTableLayoutRequest request,
                                  QueryStatusChecker queryStatusChecker)
                           throws Exception
        Our table doesn't support complex layouts or partitioning, so this method is a no-op and the SDK automatically generates a single placeholder partition, since Athena needs at least one partition returned if there is potentially any data to read. We do this because the Cloudwatch Metrics APIs do not support the kind of filtering needed for reasonably scoped partition pruning. Instead, we do the pruning at Split generation time and return a single partition here. The downside of pruning at Split generation time is that we sacrifice parallelized Split generation. However, this is not a significant performance detriment to this connector, since Splits can be generated quickly and easily.
        Specified by:
        getPartitions in class MetadataHandler
        Parameters:
        blockWriter - Used to write rows (partitions) into the Apache Arrow response.
        request - Provides details of the catalog, database, and table being queried as well as any filter predicate.
        queryStatusChecker - A QueryStatusChecker that you can use to stop doing work for a query that has already terminated.
        Throws:
        Exception
        See Also:
        MetadataHandler
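
The trade-off described for getPartitions can be sketched as follows: the partition stage does no filtering, and candidate metrics are narrowed only when splits are generated. This is a minimal sketch under stated assumptions. The Metric class, the pruneForSplits method, and the single namespace-equality filter are all hypothetical simplifications; the connector's actual constraint handling is not shown in this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of pruning at split generation time; not the connector's code.
final class SplitTimePruningSketch {
    // Simplified metric identifier (assumption: only namespace and name are modeled here).
    static final class Metric {
        final String namespace;
        final String name;

        Metric(String namespace, String name) {
            this.namespace = namespace;
            this.name = name;
        }
    }

    // Keeps only metrics that satisfy the filter. A null filter means "no constraint".
    // This runs once, sequentially, which is the lost parallelism the text mentions.
    static List<Metric> pruneForSplits(List<Metric> candidates, String namespaceEquals) {
        List<Metric> kept = new ArrayList<>();
        for (Metric metric : candidates) {
            if (namespaceEquals == null || namespaceEquals.equals(metric.namespace)) {
                kept.add(metric);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Metric> all = new ArrayList<>();
        all.add(new Metric("AWS/EC2", "CPUUtilization"));
        all.add(new Metric("AWS/Lambda", "Duration"));
        System.out.println(pruneForSplits(all, "AWS/EC2").size() + " metric(s) kept");
    }
}
```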
      • doGetSplits

        public GetSplitsResponse doGetSplits​(BlockAllocator blockAllocator,
                                             GetSplitsRequest getSplitsRequest)
                                      throws Exception
        Each 'metric' in Cloudwatch is uniquely identified by a quad of Namespace, list of Dimensions, MetricName, and Statistic. If the query is for the METRIC_TABLE, we return a single split. If the query is for actual metrics data, we start forming batches of metrics here; these batches form the basis of the GetMetricData requests made during readSplits.
        Specified by:
        doGetSplits in class MetadataHandler
        Parameters:
        blockAllocator - Tool for creating and managing Apache Arrow Blocks.
        getSplitsRequest - Provides details of the catalog, database, table, and partition(s) being queried as well as any filter predicate.
        Returns:
        A GetSplitsResponse which primarily contains:
        1. A Set of Splits, each representing a read operation Amazon Athena must perform by calling your read function.
        2. (Optional) A continuation token which allows you to paginate the generation of splits for large queries.
        Throws:
        Exception
        See Also:
        MetadataHandler
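
The batching and pagination described for doGetSplits can be sketched roughly as below. Metrics are grouped into fixed-size batches (one batch per split), and a continuation token marks where the next page should resume. Everything here is an assumption made for illustration: the class SplitBatchingSketch, the Page type, the batchSize and maxBatchesPerPage parameters, and the use of a list offset as the token. The connector's real batch size, token format, and split serialization are not given in this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of batching metrics into splits with pagination; not the connector's code.
final class SplitBatchingSketch {
    // One page of results: the batches to turn into splits, plus a token for the next page.
    static final class Page {
        final List<List<String>> batches;
        final String continuationToken; // null when there is nothing left to page through

        Page(List<List<String>> batches, String continuationToken) {
            this.batches = batches;
            this.continuationToken = continuationToken;
        }
    }

    // Assumption: the token is simply the offset of the next unprocessed metric.
    static Page nextPage(List<String> metrics, String continuationToken,
                         int batchSize, int maxBatchesPerPage) {
        if (batchSize <= 0 || maxBatchesPerPage <= 0) {
            throw new IllegalArgumentException("batchSize and maxBatchesPerPage must be positive");
        }
        int index = continuationToken == null ? 0 : Integer.parseInt(continuationToken);
        List<List<String>> batches = new ArrayList<>();
        while (index < metrics.size() && batches.size() < maxBatchesPerPage) {
            int end = Math.min(index + batchSize, metrics.size());
            batches.add(new ArrayList<>(metrics.subList(index, end))); // one batch per split
            index = end;
        }
        String nextToken = index < metrics.size() ? Integer.toString(index) : null;
        return new Page(batches, nextToken);
    }

    public static void main(String[] args) {
        List<String> metrics = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            metrics.add("metric-" + i);
        }
        String token = null;
        do {
            Page page = nextPage(metrics, token, 2, 2);
            System.out.println(page.batches + " next=" + page.continuationToken);
            token = page.continuationToken;
        } while (token != null);
    }
}
```

In this sketch the caller keeps requesting pages until the token is null, mirroring how an optional continuation token lets split generation be spread over several calls for large queries.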