pub fn read_filtered_pages<R: Read + Seek, F: Fn(&[FieldPageStatistics], &[Vec<Vec<Interval>>]) -> Vec<Interval>>(
    reader: &mut R,
    row_group: &RowGroupMetaData,
    fields: &[Field],
    predicate: F
) -> Result<Vec<Vec<Vec<FilteredPage>>>, Error>
Available on crate feature io_parquet only.
Reads all page locations and index locations (IO-bound) and uses predicate to compute the set of FilteredPage that fulfill the predicate.
The non-trivial argument of this function is predicate, which controls which pages are selected. It takes 2 arguments:
- 0th argument (indexes): contains one ColumnPageStatistics (page statistics) per field. Use it to evaluate the predicate against each page's statistics.
- 1st argument (intervals): contains one Vec<Vec<Interval>> (row positions) per field. For each field, the outermost vector has one entry per parquet column: a primitive field contains 1 column, a struct field with 2 primitive fields contains 2 columns. The inner Vec<Interval> contains one Interval per page: its length equals the length of ColumnPageStatistics.

It returns a single Vec<Interval> denoting the set of intervals that the predicate selects (over all columns).
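
As a rough illustration of what such a predicate might look like, the sketch below keeps the pages of a single primitive field whose min/max range may contain a target value. It does not use this crate's types: Interval and PageStats here are simplified stand-ins defined for the example (the real page statistics are richer), and pages_containing is a hypothetical helper name.

```rust
// Stand-in types for illustration only; these are NOT this crate's definitions.

/// A contiguous range of rows: `length` rows beginning at row `start`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Interval {
    pub start: usize,
    pub length: usize,
}

/// Hypothetical, simplified page statistics for one i64 column:
/// entry `i` of `min` and `max` describes page `i`.
pub struct PageStats {
    pub min: Vec<i64>,
    pub max: Vec<i64>,
}

/// Builds a predicate that keeps every page of the first field whose
/// [min, max] range may contain `target`.
pub fn pages_containing(
    target: i64,
) -> impl Fn(&[PageStats], &[Vec<Vec<Interval>>]) -> Vec<Interval> {
    move |stats: &[PageStats], intervals: &[Vec<Vec<Interval>>]| {
        // First field; a primitive field has exactly one column.
        let column_stats = &stats[0];
        let pages = &intervals[0][0];
        column_stats
            .min
            .iter()
            .zip(column_stats.max.iter())
            .zip(pages.iter())
            // Keep a page only if `target` lies within its statistics.
            .filter(|((min, max), _)| **min <= target && target <= **max)
            .map(|(_, interval)| *interval)
            .collect()
    }
}
```

Because the statistics and the intervals both have one entry per page, zipping them pairs each page's row range with the statistics used to decide whether to keep it.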
This returns one item per field. For each field, there is one item per column (non-nested types have a single column), and each column holds a Vec<FilteredPage> with the set of selected pages.
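
The nesting of the result (field, then column, then pages) can be walked with two loops. The sketch below assumes a stand-in Page type with a single hypothetical num_rows field in place of FilteredPage, and rows_per_column is an illustrative helper, not part of this crate.

```rust
/// Hypothetical stand-in for a selected page; NOT the real `FilteredPage`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Page {
    pub num_rows: usize,
}

/// Sums the rows of the selected pages of every column, reported as
/// `(field index, column index, rows)`.
pub fn rows_per_column(result: &[Vec<Vec<Page>>]) -> Vec<(usize, usize, usize)> {
    let mut out = Vec::new();
    // Outer level: one entry per field.
    for (field_idx, columns) in result.iter().enumerate() {
        // Middle level: one entry per parquet column of that field.
        for (column_idx, pages) in columns.iter().enumerate() {
            // Inner level: the pages selected for that column.
            let rows: usize = pages.iter().map(|page| page.num_rows).sum();
            out.push((field_idx, column_idx, rows));
        }
    }
    out
}
```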