Available on crate feature io_parquet only.
APIs to read from Parquet format.
Re-exports
pub use parquet2::fallible_streaming_iterator;
pub use schema::infer_schema;
Modules
API to perform page-level filtering (also known as indexes)
APIs to handle Parquet <-> Arrow schemas.
APIs exposing parquet2’s statistics as arrow’s statistics.
Structs
A FallibleStreamingIterator that decompresses CompressedPage into DataPage.
Metadata for a column chunk.
A descriptor for leaf-level primitive columns. This encapsulates information such as definition and repetition levels and is used to re-assemble nested data.
A CompressedDataPage is a compressed, encoded representation of a Parquet data page.
It holds actual data and thus cloning it is expensive.
Decompressor that allows re-using the page buffer of PageIterator.
Metadata for a Parquet file.
An iterator of Chunks coming from row groups of a parquet file.
A fallible Iterator of CompressedDataPage. This iterator reads pages back to back until all pages have been consumed.
The pages from this iterator always return None from crate::page::CompressedDataPage::selected_rows(), since filter pushdown is not supported without a pre-computed page index.
A MutStreamingIterator of pre-read column chunks.
Metadata for a row group.
An Iterator<Item=RowGroupDeserializer> from row groups of a parquet file.
Enums
A Page is an uncompressed, encoded representation of a Parquet page. It may hold actual data and thus cloning it may be expensive.
Errors generated by this crate
Representation of a Parquet type describing primitive and nested fields, including the top-level schema of the parquet file.
The set of all physical types representable in Parquet
State of MutStreamingIterator.
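The idea of a state returned by a consuming iterator can be pictured with a small, self-contained model. This is only a sketch under assumptions: the State enum, the Countdown type, and the drain helper below are hypothetical stand-ins written for illustration, not the definitions exposed by this module. The point it tries to show is that an advance taking the iterator by value must hand back either the iterator itself or whatever it owned.

```rust
/// Hypothetical model of the state of an iterator whose `advance`
/// takes `self` by value (not the crate's own definition).
#[derive(Debug)]
enum State<T> {
    /// The iterator advanced and is handed back to the caller.
    Some(T),
    /// The iterator is exhausted and returns its internal buffer for reuse.
    Finished(Vec<u8>),
}

/// Toy iterator that counts down and owns a scratch buffer.
#[derive(Debug)]
struct Countdown {
    remaining: u32,
    buffer: Vec<u8>,
}

impl Countdown {
    /// Consumes the iterator; the caller gets it back only if it advanced.
    fn advance(mut self) -> State<Self> {
        if self.remaining == 0 {
            State::Finished(self.buffer)
        } else {
            self.remaining -= 1;
            self.buffer.push(self.remaining as u8);
            State::Some(self)
        }
    }
}

/// Drives the iterator to completion, returning the number of successful
/// steps and the buffer recovered at the end.
fn drain(mut iter: Countdown) -> (u32, Vec<u8>) {
    let mut steps = 0;
    loop {
        match iter.advance() {
            State::Some(next) => {
                steps += 1;
                iter = next;
            }
            State::Finished(buffer) => return (steps, buffer),
        }
    }
}
```

Because advance moves the iterator, the compiler rejects any use of it after exhaustion, and the buffer it owned can be reused instead of dropped.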
Traits
Trait describing a MutStreamingIterator of column chunks.
A fallible, streaming iterator.
A special kind of fallible streaming iterator where advance consumes the iterator.
Trait describing a FallibleStreamingIterator of Page.
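To make the "fallible, streaming" pattern concrete, here is a minimal sketch using only the standard library. The FallibleStreaming trait, the Upper type, and collect_upper are hypothetical names introduced for illustration; the trait is a simplified local model, not the trait re-exported by this module. It shows the two properties the name refers to: advancing can fail, and items are borrowed from a buffer owned by the iterator rather than returned by value.

```rust
/// Simplified local model of a fallible streaming iterator
/// (an assumption for illustration, not the re-exported trait).
trait FallibleStreaming {
    type Item: ?Sized;
    type Error;
    /// Moves to the next item; this step may fail.
    fn advance(&mut self) -> Result<(), Self::Error>;
    /// Borrows the current item, or `None` once exhausted.
    /// Only meaningful after a successful call to `advance`.
    fn get(&self) -> Option<&Self::Item>;
}

/// Toy source that "decodes" each input chunk into one reused buffer.
struct Upper<'a> {
    chunks: std::slice::Iter<'a, &'a str>,
    buffer: String,
    done: bool,
}

impl<'a> FallibleStreaming for Upper<'a> {
    type Item = str;
    type Error = String;

    fn advance(&mut self) -> Result<(), String> {
        match self.chunks.next() {
            Some(chunk) if chunk.is_ascii() => {
                // Reuse the same allocation for every item.
                self.buffer.clear();
                self.buffer.push_str(&chunk.to_ascii_uppercase());
                Ok(())
            }
            Some(chunk) => Err(format!("non-ASCII chunk: {}", chunk)),
            None => {
                self.done = true;
                Ok(())
            }
        }
    }

    fn get(&self) -> Option<&str> {
        if self.done {
            None
        } else {
            Some(&self.buffer)
        }
    }
}

/// Drives the iterator, copying each borrowed item out before advancing again.
fn collect_upper(chunks: &[&str]) -> Result<Vec<String>, String> {
    let mut iter = Upper {
        chunks: chunks.iter(),
        buffer: String::new(),
        done: false,
    };
    let mut out = Vec::new();
    loop {
        iter.advance()?;
        match iter.get() {
            Some(item) => out.push(item.to_string()),
            None => return Ok(out),
        }
    }
}
```

A std Iterator could not express this, because its items may not borrow from the iterator itself; that restriction is what makes buffer reuse across pages awkward without a streaming design.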
Functions
Reads the column indexes of all ColumnChunkMetaData and deserializes them into Index.
Returns an empty vector if indexes are not available.
Reads a FileMetaData from the reader, located at the end of the file.
Asynchronously reads the file’s metadata.
Decompresses the page, using buffer for decompression.
If page.buffer.len() == 0, there was no decompression and the buffer was moved.
Else, decompression took place.
Returns a ColumnIterator of column chunks corresponding to field.
Returns all ColumnChunkMetaData associated with field_name.
For non-nested parquet types, this returns a single column.
Returns all ColumnChunkMetaData associated with field_name.
For non-nested parquet types, this returns a single column.
Creates a new iterator of compressed pages.
Returns a stream of compressed data pages
Reads all columns that are part of the parquet field field_name
Reads all columns that are part of the parquet field field_name
Returns a vector of iterators of Array corresponding to the top-level parquet fields whose names match the names in fields.
Reads a parquet file’s metadata synchronously.
Reads a parquet file’s metadata asynchronously.
Reads PageLocations from the ColumnChunkMetaData.
Returns an empty vector if indexes are not available.
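Since the metadata is described above as located at the end of the file, the following sketch shows how a reader can find it using only the standard library. It relies on the Parquet file layout, in which a file ends with a 4-byte little-endian metadata length followed by the 4-byte magic PAR1. The metadata_range helper is a hypothetical name for illustration and is not a function of this module; it only locates the serialized metadata and does not decode it.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Magic bytes that open and close a Parquet file.
const MAGIC: &[u8; 4] = b"PAR1";

/// Hypothetical helper: returns `(start_offset, length)` of the serialized
/// file metadata, without deserializing it.
fn metadata_range<R: Read + Seek>(reader: &mut R) -> io::Result<(u64, u64)> {
    let invalid = |msg: &'static str| io::Error::new(io::ErrorKind::InvalidData, msg);

    let file_len = reader.seek(SeekFrom::End(0))?;
    // Leading magic (4 bytes) + metadata length (4 bytes) + trailing magic (4 bytes).
    if file_len < 12 {
        return Err(invalid("file too small to be Parquet"));
    }

    // The last 8 bytes are the metadata length followed by the magic.
    reader.seek(SeekFrom::End(-8))?;
    let mut trailer = [0u8; 8];
    reader.read_exact(&mut trailer)?;
    if trailer[4..] != MAGIC[..] {
        return Err(invalid("missing trailing PAR1 magic"));
    }

    let len = u32::from_le_bytes([trailer[0], trailer[1], trailer[2], trailer[3]]) as u64;
    // The metadata sits immediately before the 8-byte trailer.
    let start = file_len
        .checked_sub(8 + len)
        .ok_or_else(|| invalid("metadata length exceeds file size"))?;
    Ok((start, len))
}
```

Reading from the end first means a reader can fetch the schema and row-group layout with two small reads before touching any column data.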
Type Definitions
Type declaration for a page filter