Class DDBTableUtils


  • public final class DDBTableUtils
    extends Object
    Provides utility methods relating to table handling.
    • Field Detail

      • SCHEMA_INFERENCE_NUM_RECORDS

        public static final int SCHEMA_INFERENCE_NUM_RECORDS
        See Also:
        Constant Field Values
    • Method Detail

      • getTable

        public static DynamoDBTable getTable(String tableName,
                                             ThrottlingInvoker invoker,
                                             software.amazon.awssdk.services.dynamodb.DynamoDbClient ddbClient)
                                      throws TimeoutException
        Fetches metadata for a DynamoDB table.
        Parameters:
        tableName - the case-sensitive table name
        invoker - the ThrottlingInvoker used to call DynamoDB
        ddbClient - the DynamoDB client to use
        Returns:
        the table metadata
        Throws:
        TimeoutException
      • peekTableForSchema

        public static org.apache.arrow.vector.types.pojo.Schema peekTableForSchema(String tableName,
                                                                                   ThrottlingInvoker invoker,
                                                                                   software.amazon.awssdk.services.dynamodb.DynamoDbClient ddbClient)
                                                                            throws TimeoutException
        Derives an Arrow Schema for the given table by performing a small table scan and mapping the types of the returned attribute values to Arrow types. If the table is empty, only attributes found in the table's metadata are added to the returned schema.
        Parameters:
        tableName - the table to derive a schema for
        invoker - the ThrottlingInvoker used to call DynamoDB
        ddbClient - the DynamoDB client to use
        Returns:
        the table's derived schema
        Throws:
        TimeoutException
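The peek-and-map idea described for peekTableForSchema can be illustrated with a minimal, dependency-free sketch. It is not the connector's implementation: the class SchemaPeekSketch, its methods, and the type labels are hypothetical, plain Java values stand in for DynamoDB AttributeValue objects, and strings stand in for Arrow types. The "first observed type wins" rule is also an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in the library.
class SchemaPeekSketch {

    // Assumed mapping from a sampled value to a column type label.
    static String typeOf(Object value) {
        if (value instanceof String) return "Utf8";
        if (value instanceof Number) return "Decimal";
        if (value instanceof Boolean) return "Bool";
        if (value instanceof List) return "List";
        if (value instanceof Map) return "Struct";
        return "Unknown";
    }

    // Infers column types from a small sample of items. Attributes declared in
    // the table metadata are added when the sample did not contain them, so an
    // empty sample yields only the metadata attributes.
    static Map<String, String> inferSchema(List<Map<String, Object>> sampledItems,
                                           Map<String, String> metadataAttributes) {
        Map<String, String> schema = new LinkedHashMap<>();
        for (Map<String, Object> item : sampledItems) {
            for (Map.Entry<String, Object> attribute : item.entrySet()) {
                // Assumption: the first type observed for an attribute wins.
                schema.putIfAbsent(attribute.getKey(), typeOf(attribute.getValue()));
            }
        }
        for (Map.Entry<String, String> declared : metadataAttributes.entrySet()) {
            schema.putIfAbsent(declared.getKey(), declared.getValue());
        }
        return schema;
    }

    public static void main(String[] args) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("id", "a1");
        item.put("count", 3);
        System.out.println(inferSchema(List.of(item), Map.of("id", "Utf8")));
    }
}
```

Because the schema comes from a sample, an attribute that appears only in unsampled items would be missing from the result; that limitation is inherent to any peek-based approach.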
      • buildSchemaFromItems

        public static SchemaBuilder buildSchemaFromItems(List<Map<String,software.amazon.awssdk.services.dynamodb.model.AttributeValue>> items)
        Takes a list of DynamoDB items and returns a schema builder derived from them.
        Parameters:
        items - a list of DynamoDB items, each a map from attribute name to AttributeValue
        Returns:
        the resulting schema builder
      • getNumSegments

        public static int getNumSegments(long tableNormalizedReadThroughput,
                                         long currentTableSizeBytes)
        This heuristic uses the table's read capacity and size to determine an optimal segment count for Parallel Scans.
        Parameters:
        tableNormalizedReadThroughput - the provisioned read capacity for the table
        currentTableSizeBytes - the table's approximate size in bytes
        Returns:
        an optimal segment count
        See Also:
        https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html#Scan.ParallelScan
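This page does not document the formula behind getNumSegments, so the following is only a hedged sketch of how a capacity-and-size heuristic of this kind might look. The class SegmentCountSketch, the method estimateSegments, and both per-segment constants are assumptions chosen for illustration and are not taken from the library. The upper bound of 1,000,000 is the maximum TotalSegments value that a DynamoDB Scan accepts.

```java
// Hypothetical illustration only; not the library's getNumSegments logic.
class SegmentCountSketch {

    // Assumed: roughly one segment per 2 GB of table data.
    static final long BYTES_PER_SEGMENT = 2L * 1024 * 1024 * 1024;
    // Assumed: each segment should have at least this many read units available.
    static final long READ_UNITS_PER_SEGMENT = 100L;
    // DynamoDB accepts TotalSegments values from 1 to 1,000,000.
    static final int MAX_SEGMENTS = 1_000_000;

    static int estimateSegments(long readThroughput, long tableSizeBytes) {
        // Ceiling division without the overflow risk of (a + b - 1) / b.
        long bySize = tableSizeBytes / BYTES_PER_SEGMENT
                + (tableSizeBytes % BYTES_PER_SEGMENT == 0 ? 0 : 1);
        long byThroughput = readThroughput / READ_UNITS_PER_SEGMENT;
        // The tighter of the two limits wins, clamped to the valid range.
        long segments = Math.min(bySize, byThroughput);
        return (int) Math.max(1L, Math.min(segments, MAX_SEGMENTS));
    }

    public static void main(String[] args) {
        long tenGigabytes = 10L * 1024 * 1024 * 1024;
        System.out.println(estimateSegments(1000L, tenGigabytes));
    }
}
```

Taking the minimum of the two estimates reflects one plausible design choice: adding segments beyond what the read capacity can serve would mainly produce throttling, while adding segments beyond what the data size justifies would add request overhead without speeding up the scan.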