Class BlockUtils
- java.lang.Object
-
- com.amazonaws.athena.connector.lambda.data.BlockUtils
-
public class BlockUtils extends Object
This utility class abstracts many facets of reading and writing values into Apache Arrow's FieldReader and FieldVector objects.
-
-
Field Summary
Modifier and Type    Field
static ZoneId        UTC_ZONE_ID
-
Method Summary
Modifier and Type, Method, and Description
static int copyRows(Block srcBlock, Block dstBlock, int firstRow, int lastRow)
    Copies an inclusive range of rows from one block to another.
static String fieldToString(org.apache.arrow.vector.complex.reader.FieldReader reader)
    Used to convert a single cell for the given FieldReader to a human-readable string.
static Class getJavaType(org.apache.arrow.vector.types.Types.MinorType minorType)
static boolean isNullRow(Block block, int row)
    Checks if a row is null by checking that all fields in that row are null (aka not set).
static Block newBlock(BlockAllocator allocator, String columnName, org.apache.arrow.vector.types.pojo.ArrowType type, Object... values)
    Creates a new Block with a single column, populated with the provided values.
static Block newBlock(BlockAllocator allocator, String columnName, org.apache.arrow.vector.types.pojo.ArrowType type, Collection<Object> values)
    Creates a new Block with a single column, populated with the provided values.
static Block newEmptyBlock(BlockAllocator allocator, String columnName, org.apache.arrow.vector.types.pojo.ArrowType type)
    Creates a new, empty Block with a single column.
static String rowToString(Block block, int row)
    Used to convert a specific row in the provided Block to a human-readable string.
static void setComplexValue(org.apache.arrow.vector.FieldVector vector, int pos, FieldResolver resolver, Object value)
    Used to set complex values (Struct, List, etc.) on the provided FieldVector.
static void setValue(org.apache.arrow.vector.FieldVector vector, int pos, Object value)
    Used to set values (Int, BigInt, Bit, etc.) on the provided FieldVector.
static void unsetRow(int row, Block block)
    In some filtering situations it can be useful to 'unset' a row as an indication to a later processing stage that the row is irrelevant.
protected static void writeAllValue(org.apache.arrow.vector.complex.writer.FieldWriter writer, org.apache.arrow.vector.types.pojo.Field field, org.apache.arrow.memory.BufferAllocator allocator, int pos, FieldResolver resolver, Object value, boolean fromMapOrStruct)
protected static void writeList(org.apache.arrow.memory.BufferAllocator allocator, org.apache.arrow.vector.complex.writer.FieldWriter writer, org.apache.arrow.vector.types.pojo.Field field, int pos, Iterable value, FieldResolver resolver)
    Used to write a List value.
protected static void writeMap(org.apache.arrow.memory.BufferAllocator allocator, org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter writer, org.apache.arrow.vector.types.pojo.Field field, int pos, Object value, FieldResolver resolver)
    Used to write a Map value.
protected static void writeSimpleValue(org.apache.arrow.vector.complex.writer.FieldWriter writer, org.apache.arrow.vector.types.pojo.Field field, org.apache.arrow.memory.BufferAllocator allocator, Object value, boolean fromMapOrStruct)
    Used to write an individual value into a field. Multiple calls to this method per cell are expected in order to write the N values of a list of size N.
protected static void writeStruct(org.apache.arrow.memory.BufferAllocator allocator, org.apache.arrow.vector.complex.writer.BaseWriter.StructWriter writer, org.apache.arrow.vector.types.pojo.Field field, int pos, Object value, FieldResolver resolver)
    Used to write a Struct value.
-
-
-
Field Detail
-
UTC_ZONE_ID
public static final ZoneId UTC_ZONE_ID
-
-
Method Detail
-
newBlock
public static Block newBlock(BlockAllocator allocator, String columnName, org.apache.arrow.vector.types.pojo.ArrowType type, Object... values)
Creates a new Block with a single column, populated with the provided values.
Parameters:
allocator - The BlockAllocator to use when creating the Block.
columnName - The name of the single column in the Block's Schema.
type - The Apache Arrow Type of the column.
values - The values to write to the new Block. Each value will be its own row.
Returns:
The newly created Block with a single-column Schema, populated with the provided values.
-
newBlock
public static Block newBlock(BlockAllocator allocator, String columnName, org.apache.arrow.vector.types.pojo.ArrowType type, Collection<Object> values)
Creates a new Block with a single column, populated with the provided values.
Parameters:
allocator - The BlockAllocator to use when creating the Block.
columnName - The name of the single column in the Block's Schema.
type - The Apache Arrow Type of the column.
values - The values to write to the new Block. Each value will be its own row.
Returns:
The newly created Block with a single-column Schema, populated with the provided values.
-
newEmptyBlock
public static Block newEmptyBlock(BlockAllocator allocator, String columnName, org.apache.arrow.vector.types.pojo.ArrowType type)
Creates a new, empty Block with a single column.
Parameters:
allocator - The BlockAllocator to use when creating the Block.
columnName - The name of the single column in the Block's Schema.
type - The Apache Arrow Type of the column.
Returns:
The newly created, empty Block with a single-column Schema.
-
setComplexValue
public static void setComplexValue(org.apache.arrow.vector.FieldVector vector, int pos, FieldResolver resolver, Object value)
Used to set complex values (Struct, List, etc.) on the provided FieldVector.
Parameters:
vector - The FieldVector into which we should write the provided value.
pos - The row number that the value should be written to.
resolver - The FieldResolver that can be used to map your value to the complex type (mostly for Structs and Maps).
value - The value to write.
-
setValue
public static void setValue(org.apache.arrow.vector.FieldVector vector, int pos, Object value)
Used to set values (Int, BigInt, Bit, etc.) on the provided FieldVector.
Parameters:
vector - The FieldVector into which we should write the provided value.
pos - The row number that the value should be written to.
value - The value to write.
-
rowToString
public static String rowToString(Block block, int row)
Used to convert a specific row in the provided Block to a human-readable string. This is useful for diagnostic logging.
Parameters:
block - The Block to read the row from.
row - The row number to read.
Returns:
The human-readable String representation of the requested row.
-
fieldToString
public static String fieldToString(org.apache.arrow.vector.complex.reader.FieldReader reader)
Used to convert a single cell for the given FieldReader to a human-readable string.
Parameters:
reader - The FieldReader from which we should read the current cell. The position to be read must have been set on the reader before calling this method.
Returns:
The human-readable String representation of the value at the FieldReader's current position.
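The positioning requirement on the reader can be illustrated with a small stand-in. This is only a sketch of the calling pattern, not the SDK's code: `PositionedReaderSketch` and `CellReader` are hypothetical names, and a plain `List` plays the role of the Arrow vector. The stand-in's `setPosition` and `readObject` are modeled loosely on the methods of the same names on Arrow's FieldReader.

```java
import java.util.List;

final class PositionedReaderSketch {
    /** Hypothetical cursor-style reader; a stand-in for Arrow's FieldReader. */
    static final class CellReader {
        private final List<?> cells;
        private int position;

        CellReader(List<?> cells) {
            this.cells = cells;
        }

        /** Moves the cursor to the given row. */
        void setPosition(int position) {
            this.position = position;
        }

        /** Reads the cell the cursor currently points at. */
        Object readObject() {
            return cells.get(position);
        }
    }

    /** Renders the current cell as text; the caller decides which row that is. */
    static String fieldToString(CellReader reader) {
        return String.valueOf(reader.readObject());
    }

    public static void main(String[] args) {
        CellReader reader = new CellReader(List.of(10, 20, 30));
        reader.setPosition(2);                     // position first...
        System.out.println(fieldToString(reader)); // ...then convert; prints 30
    }
}
```

The point of the pattern is that the conversion method takes no row argument, so forgetting to position the reader silently renders whatever row it last pointed at.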
-
copyRows
public static int copyRows(Block srcBlock, Block dstBlock, int firstRow, int lastRow)
Copies an inclusive range of rows from one block to another.
Parameters:
srcBlock - The source Block to copy the range of rows from.
dstBlock - The destination Block to copy the range of rows to.
firstRow - The first row we'd like to copy.
lastRow - The last row we'd like to copy (inclusive).
Returns:
The number of rows that were copied.
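Because both ends of the range are included, copying rows firstRow through lastRow moves lastRow - firstRow + 1 rows. The sketch below illustrates that contract with plain lists standing in for Blocks. It is not the SDK's implementation: `InclusiveRangeCopySketch` is a hypothetical name, and both the append-to-destination behavior and the exception thrown for an invalid range are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class InclusiveRangeCopySketch {
    private InclusiveRangeCopySketch() {}

    /**
     * Appends rows firstRow..lastRow (both ends included) from src to dst.
     * Returns the number of rows copied, which is lastRow - firstRow + 1.
     */
    static <T> int copyRows(List<T> src, List<T> dst, int firstRow, int lastRow) {
        // Assumption for this sketch: an invalid range is rejected up front.
        if (firstRow < 0 || firstRow > lastRow || lastRow >= src.size()) {
            throw new IllegalArgumentException(
                "invalid range " + firstRow + ".." + lastRow + " for " + src.size() + " rows");
        }
        for (int row = firstRow; row <= lastRow; row++) { // <= because lastRow is included
            dst.add(src.get(row));
        }
        return lastRow - firstRow + 1;
    }

    public static void main(String[] args) {
        List<String> src = List.of("r0", "r1", "r2", "r3", "r4");
        List<String> dst = new ArrayList<>();
        int copied = copyRows(src, dst, 1, 3);
        System.out.println(copied + " rows copied: " + dst); // 3 rows copied: [r1, r2, r3]
    }
}
```

Passing the same value for firstRow and lastRow copies exactly one row, which is the usual off-by-one trap when switching from half-open ranges.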
-
isNullRow
public static boolean isNullRow(Block block, int row)
Checks if a row is null by checking that all fields in that row are null (aka not set).
Parameters:
block - The Block we'd like to check.
row - The row number we'd like to check.
Returns:
True if the entire row is null (aka all fields null/unset), False if any field has a non-null value.
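The "all fields null" rule can be sketched with a column-oriented map standing in for a Block. This is a simplified model under stated assumptions rather than the SDK's code: `NullRowSketch` is a hypothetical name, and each column is represented as a `List` in which `null` means "not set".

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NullRowSketch {
    private NullRowSketch() {}

    /** Returns true only when every column holds null at the given row. */
    static boolean isNullRow(Map<String, List<Object>> columns, int row) {
        for (List<Object> column : columns.values()) {
            if (column.get(row) != null) {
                return false; // one set field is enough to make the row non-null
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, List<Object>> columns = new LinkedHashMap<>();
        columns.put("id", Arrays.<Object>asList(1, null, null));
        columns.put("name", Arrays.<Object>asList("a", null, "c"));

        System.out.println(isNullRow(columns, 0)); // false: both fields set
        System.out.println(isNullRow(columns, 1)); // true: every field is null
        System.out.println(isNullRow(columns, 2)); // false: "name" is set
    }
}
```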
-
writeList
protected static void writeList(org.apache.arrow.memory.BufferAllocator allocator, org.apache.arrow.vector.complex.writer.FieldWriter writer, org.apache.arrow.vector.types.pojo.Field field, int pos, Iterable value, FieldResolver resolver)
Used to write a List value.
Parameters:
allocator - The BufferAllocator which can be used to generate Apache Arrow Buffers for types which require conversion to an Arrow Buffer before they can be written using the FieldWriter.
writer - The FieldWriter for the List field we'd like to write into.
field - The Schema details of the List Field we are writing into.
pos - The position (row) in the Apache Arrow batch we are writing to.
value - An Iterable over the collection of values we want to write into the row.
resolver - The field resolver that can be used to extract individual values from the Iterable.
-
writeStruct
protected static void writeStruct(org.apache.arrow.memory.BufferAllocator allocator, org.apache.arrow.vector.complex.writer.BaseWriter.StructWriter writer, org.apache.arrow.vector.types.pojo.Field field, int pos, Object value, FieldResolver resolver)
Used to write a Struct value.
Parameters:
allocator - The BufferAllocator which can be used to generate Apache Arrow Buffers for types which require conversion to an Arrow Buffer before they can be written using the FieldWriter.
writer - The StructWriter for the Struct field we'd like to write into.
field - The Schema details of the Struct Field we are writing into.
pos - The position (row) in the Apache Arrow batch we are writing to.
value - The value we'd like to write as a Struct.
resolver - The field resolver that can be used to extract individual Struct fields from the value.
-
writeMap
protected static void writeMap(org.apache.arrow.memory.BufferAllocator allocator, org.apache.arrow.vector.complex.writer.BaseWriter.MapWriter writer, org.apache.arrow.vector.types.pojo.Field field, int pos, Object value, FieldResolver resolver)
Used to write a Map value.
Parameters:
allocator - The BufferAllocator which can be used to generate Apache Arrow Buffers for types which require conversion to an Arrow Buffer before they can be written using the FieldWriter.
writer - The MapWriter for the Map field we'd like to write into.
field - The Schema details of the Map Field we are writing into.
pos - The position (row) in the Apache Arrow batch we are writing to.
value - The value we'd like to write as a Map.
resolver - The field resolver that can be used to extract individual Map entries from the value.
-
writeAllValue
protected static void writeAllValue(org.apache.arrow.vector.complex.writer.FieldWriter writer, org.apache.arrow.vector.types.pojo.Field field, org.apache.arrow.memory.BufferAllocator allocator, int pos, FieldResolver resolver, Object value, boolean fromMapOrStruct)
Parameters:
writer - The FieldWriter for the field we'd like to write into.
field - The Schema details of the Field we are writing into.
allocator - The BufferAllocator which can be used to generate Apache Arrow Buffers for types which require conversion to an Arrow Buffer before they can be written using the FieldWriter.
pos - The position (row) in the Apache Arrow batch we are writing to.
resolver - The field resolver that can be used to extract individual fields or entries from the value.
value - The value we'd like to write.
fromMapOrStruct - Whether the field being written belongs to a Map or Struct.
-
writeSimpleValue
protected static void writeSimpleValue(org.apache.arrow.vector.complex.writer.FieldWriter writer, org.apache.arrow.vector.types.pojo.Field field, org.apache.arrow.memory.BufferAllocator allocator, Object value, boolean fromMapOrStruct)
Used to write an individual value into a field. Multiple calls to this method per cell are expected in order to write the N values of a list of size N.
Parameters:
writer - The FieldWriter (already positioned at the row) that we want to write into.
field - The concrete type of the values.
allocator - The BufferAllocator that can be used for allocating Arrow Buffers for fields which require conversion to an Arrow Buffer before being written.
value - The value to write.
fromMapOrStruct - Whether the value belongs to a field inside a Map or Struct (true) or is written as a standalone simple value (false).
-
unsetRow
public static void unsetRow(int row, Block block)
In some filtering situations it can be useful to 'unset' a row as an indication to a later processing stage that the row is irrelevant. The mechanism by which we 'unset' a row is field-type specific, so this method is not supported for all field types.
Parameters:
row - The row number to unset in the provided Block.
block - The Block where we'd like to unset the specified row.
-
getJavaType
public static Class getJavaType(org.apache.arrow.vector.types.Types.MinorType minorType)
-
-