Class HiveMetadataHandler

  • All Implemented Interfaces:
    com.amazonaws.services.lambda.runtime.RequestStreamHandler

    public class HiveMetadataHandler
    extends JdbcMetadataHandler
    • Constructor Detail

      • HiveMetadataHandler

        public HiveMetadataHandler(Map<String,String> configOptions)
      • HiveMetadataHandler

        protected HiveMetadataHandler(DatabaseConnectionConfig databaseConnectionConfiguration,
                                      software.amazon.awssdk.services.secretsmanager.SecretsManagerClient secretManager,
                                      software.amazon.awssdk.services.athena.AthenaClient athena,
                                      JdbcConnectionFactory jdbcConnectionFactory,
                                      Map<String,String> configOptions)
    • Method Detail

      • getPartitionSchema

        public org.apache.arrow.vector.types.pojo.Schema getPartitionSchema(String catalogName)
        Delegates creation of the partition schema to the database-type implementation.
        Specified by:
        getPartitionSchema in class JdbcMetadataHandler
        Parameters:
        catalogName - Athena-provided Hive catalog name.
        Returns:
        The partition schema. See Schema.
      • getPartitions

        public void getPartitions(BlockWriter blockWriter,
                                  GetTableLayoutRequest getTableLayoutRequest,
                                  QueryStatusChecker queryStatusChecker)
                           throws Exception
        Gets the Hive partitions that must be read from the requested table in order to satisfy the requested predicate.
        Specified by:
        getPartitions in class JdbcMetadataHandler
        Parameters:
        blockWriter - Used to write rows (Hive partitions) into the Apache Arrow response.
        getTableLayoutRequest - Provides details of the catalog, database, and table being queried as well as any filter predicate.
        queryStatusChecker - A QueryStatusChecker that you can use to stop doing work for a query that has already terminated.
        Throws:
        Exception - Thrown for database connection failures, query syntax errors, and so on.
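        The early-termination pattern that queryStatusChecker makes possible can be sketched as below. This is a minimal, self-contained illustration under stated assumptions, not the connector's implementation: PartitionWriterSketch, writePartitions, and the BooleanSupplier standing in for the status check are hypothetical names, and a plain list of partition names replaces the rows that would be written through blockWriter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

class PartitionWriterSketch {

    /**
     * Copies partition names into the output list, stopping as soon as the
     * query is reported as no longer running.
     * Returns the number of partitions written.
     */
    static int writePartitions(List<String> partitionNames,
                               BooleanSupplier queryIsRunning, // hypothetical stand-in for the status check
                               List<String> out) {
        int written = 0;
        for (String name : partitionNames) {
            if (!queryIsRunning.getAsBoolean()) {
                break; // the query has terminated, so further work would be wasted
            }
            out.add(name);
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        int written = writePartitions(Arrays.asList("p0", "p1", "p2"), () -> true, out);
        System.out.println(written + " partitions written: " + out);
    }
}
```

        Checking the status once per partition keeps the cost of the check small while bounding how much work is done after a query is cancelled.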
      • doGetSplits

        public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator,
                                             GetSplitsRequest getSplitsRequest)
        Splits up the reads required to scan the requested batch of partition(s).
        Specified by:
        doGetSplits in class JdbcMetadataHandler
        Parameters:
        blockAllocator - Tool for creating and managing Apache Arrow Blocks.
        getSplitsRequest - Provides details of the Hive catalog, database, table, and partition(s) being queried as well as any filter predicate.
        Returns:
        A GetSplitsResponse which primarily contains:
          1. A Set of Splits which represent read operations Amazon Athena must perform by calling your read function.
          2. (Optional) A continuation token which allows you to paginate the generation of splits for large queries.
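        The idea of paginating split generation with a continuation token can be sketched as below. This is a simplified illustration under stated assumptions: SplitPaginationSketch, Page, and nextPage are hypothetical names, splits are represented as plain strings, and the token is assumed to encode the offset of the next partition. The token format actually used by this handler is not described here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitPaginationSketch {

    /** One page of results: the splits for this call and a token for the next call (null on the last page). */
    static final class Page {
        final List<String> splits;
        final String continuationToken;

        Page(List<String> splits, String continuationToken) {
            this.splits = splits;
            this.continuationToken = continuationToken;
        }
    }

    /**
     * Returns at most pageSize partitions, starting at the offset encoded in the token.
     * A null token means "start from the beginning"; pageSize must be positive.
     */
    static Page nextPage(List<String> partitions, String token, int pageSize) {
        int start = (token == null) ? 0 : Integer.parseInt(token);
        int end = Math.min(start + pageSize, partitions.size());
        List<String> splits = new ArrayList<>(partitions.subList(start, end));
        // Emit a token only while partitions remain; null signals the last page.
        String next = (end < partitions.size()) ? String.valueOf(end) : null;
        return new Page(splits, next);
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("p0", "p1", "p2", "p3", "p4");
        String token = null;
        do {
            Page page = nextPage(partitions, token, 2);
            System.out.println(page.splits + " next=" + page.continuationToken);
            token = page.continuationToken;
        } while (token != null);
    }
}
```

        The caller repeats the request with the returned token until no token comes back, so no single response has to hold every split for a large table.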
      • doGetTable

        public GetTableResponse doGetTable(BlockAllocator blockAllocator,
                                           GetTableRequest getTableRequest)
                                    throws Exception
        Gets the definition (field names, types, descriptions, etc.) of a Hive table.
        Overrides:
        doGetTable in class JdbcMetadataHandler
        Parameters:
        blockAllocator - Tool for creating and managing Apache Arrow Blocks.
        getTableRequest - Provides details on who made the request and which Athena catalog, database, and Hive table they are querying.
        Returns:
        A GetTableResponse which primarily contains:
          1. An Apache Arrow Schema object describing the table's columns, types, and descriptions.
          2. A Set of Strings of partition column names (or empty if the table isn't partitioned).
        Throws:
        Exception