pub struct Utf8Array<O: Offset> { /* private fields */ }

A Utf8Array is arrow’s semantic equivalent of an immutable Vec<Option<String>>. Cloning and slicing this struct is O(1).

Example

use arrow2::bitmap::Bitmap;
use arrow2::buffer::Buffer;
use arrow2::array::Utf8Array;
let array = Utf8Array::<i32>::from([Some("hi"), None, Some("there")]);
assert_eq!(array.value(0), "hi");
assert_eq!(array.iter().collect::<Vec<_>>(), vec![Some("hi"), None, Some("there")]);
assert_eq!(array.values_iter().collect::<Vec<_>>(), vec!["hi", "", "there"]);
// the underlying representation
assert_eq!(array.validity(), Some(&Bitmap::from([true, false, true])));
assert_eq!(array.values(), &Buffer::from(b"hithere".to_vec()));
assert_eq!(array.offsets(), &Buffer::from(vec![0, 2, 2, 2 + 5]));

Generic parameter

The generic parameter Offset can only be i32 or i64. It trades off the maximum array length against memory usage:

  • the sum of lengths of all elements cannot exceed Offset::MAX
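  • the total size of the underlying data is (array.len() + 1) * size_of::<Offset>() + sum of lengths of all elements, since the offsets hold one more entry than there are elements (the validity bitmap, when present, is extra)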
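To make the trade-off concrete, here is a minimal std-only sketch of the offsets/values layout. It is not arrow2's implementation: the helper names `build_layout` and `buffers_size` are hypothetical, and plain `Vec`s stand in for arrow2's buffers.

```rust
use std::convert::TryFrom;
use std::mem::size_of;

/// Hypothetical helper: concatenates the strings into one byte buffer and
/// records where each string ends, mirroring the layout described above.
fn build_layout(items: &[&str]) -> (Vec<i32>, Vec<u8>) {
    // The offsets always start at 0 and hold one more entry than `items`.
    let mut offsets = vec![0i32];
    let mut values = Vec::new();
    for s in items {
        values.extend_from_slice(s.as_bytes());
        // With i32 offsets the running total must fit in i32::MAX;
        // i64 offsets lift that limit but double the offsets buffer.
        let end = i32::try_from(values.len()).expect("total length exceeds i32::MAX");
        offsets.push(end);
    }
    (offsets, values)
}

/// Bytes used by the offsets and values buffers (validity not included).
fn buffers_size(offsets: &[i32], values: &[u8]) -> usize {
    offsets.len() * size_of::<i32>() + values.len()
}
```

For the three elements "hi", "" and "there", this yields four offsets and seven bytes of values, the same buffers the example at the top of the page asserts on.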

Safety

The following invariants hold:

  • Two consecutive offsets, cast (as) to usize, delimit a valid slice of values.
  • A slice of values taken from two consecutive offsets is valid utf8.
  • len is equal to validity.len(), when defined.

Implementations

Returns a Utf8Array created from its internal representation.

Errors

This function returns an error iff:

  • The offsets are not monotonically increasing.
  • The last offset is not equal to the values’ length.
  • The validity’s length is not equal to offsets.len() - 1.
  • The data_type’s crate::datatypes::PhysicalType is not equal to either Utf8 or LargeUtf8.
  • The values between two consecutive offsets are not valid utf8.

Implementation

This function is O(N): checking monotonicity and utf8 validity is O(N).
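The checks listed above can be sketched with the standard library alone. This is an illustrative stand-in, not arrow2's code: `check_utf8_layout` is a hypothetical name, plain slices replace arrow2's buffers, the validity is reduced to its length, and the data type check is omitted.

```rust
use std::str;

/// Hypothetical, std-only version of the validation described above.
/// Runs in O(N): one pass over the offsets and one pass over the bytes.
fn check_utf8_layout(
    offsets: &[i32],
    values: &[u8],
    validity_len: Option<usize>,
) -> Result<(), String> {
    let (first, last) = match (offsets.first(), offsets.last()) {
        (Some(&f), Some(&l)) => (f, l),
        _ => return Err("offsets must contain at least one entry".to_string()),
    };
    if first < 0 {
        return Err("offsets must be non-negative".to_string());
    }
    // Offsets must be monotonically increasing (equal neighbours are allowed,
    // they describe an empty string).
    if offsets.windows(2).any(|w| w[0] > w[1]) {
        return Err("offsets are not monotonically increasing".to_string());
    }
    // The last offset must equal the values' length.
    if last as usize != values.len() {
        return Err("last offset differs from the values' length".to_string());
    }
    // The validity, when present, must have one entry per element.
    if let Some(len) = validity_len {
        if len != offsets.len() - 1 {
            return Err("validity length differs from offsets.len() - 1".to_string());
        }
    }
    // Every pair of consecutive offsets must delimit valid utf8.
    for w in offsets.windows(2) {
        let (start, end) = (w[0] as usize, w[1] as usize);
        if str::from_utf8(&values[start..end]).is_err() {
            return Err(format!("bytes {}..{} are not valid utf8", start, end));
        }
    }
    Ok(())
}
```

Note that validating the concatenated buffer as a whole would not be enough: an offset that falls in the middle of a multi-byte character produces two invalid slices even though the full buffer is valid utf8.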

Returns a Utf8Array from a slice of &str.

A convenience method that uses Self::from_trusted_len_values_iter.

Returns a new Utf8Array from a slice of Option<&str>.

A convenience method that uses Self::from_trusted_len_iter.

Returns an iterator of Option<&str>

Returns an iterator of &str

Returns the length of this array

Returns the value of the element at index i, ignoring the array’s validity.

Panic

This function panics iff i >= self.len.

Returns the value of the element at index i, ignoring the array’s validity.

Safety

This function is safe iff i < self.len.
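The difference between the checked and unchecked accessors can be illustrated with a std-only sketch. The function names `value_at` and `value_at_unchecked` are hypothetical and operate on plain slices; arrow2's own methods may be implemented differently.

```rust
use std::str;

/// Hypothetical checked accessor: panics when `i` is out of bounds.
/// The validity is ignored, so a null slot simply yields its (usually empty) bytes.
fn value_at<'a>(offsets: &[i32], values: &'a [u8], i: usize) -> &'a str {
    let start = offsets[i] as usize; // bounds-checked
    let end = offsets[i + 1] as usize; // bounds-checked
    str::from_utf8(&values[start..end]).expect("values must be valid utf8")
}

/// Hypothetical unchecked accessor.
///
/// # Safety
/// The caller must ensure that `i + 1 < offsets.len()`, that the offsets are
/// non-negative, monotonically increasing and within `values`, and that each
/// delimited slice is valid utf8.
unsafe fn value_at_unchecked<'a>(offsets: &[i32], values: &'a [u8], i: usize) -> &'a str {
    // SAFETY: all of the following rely on the contract documented above.
    unsafe {
        let start = *offsets.get_unchecked(i) as usize;
        let end = *offsets.get_unchecked(i + 1) as usize;
        str::from_utf8_unchecked(values.get_unchecked(start..end))
    }
}
```

The unchecked variant skips both the bounds checks and the utf8 check, which is why it can only be sound when the array's invariants already hold.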

Returns the DataType of this array.

Returns the values of this Utf8Array.

Returns the offsets of this Utf8Array.

The optional validity.

Returns a slice of this Utf8Array.

Implementation

This operation is O(1), as it essentially amounts to increasing two reference counts.

Panic

This function panics iff offset + length > self.len().

Returns a slice of this Utf8Array.

Implementation

This operation is O(1), as it essentially amounts to increasing two reference counts.

Safety

The caller must ensure that offset + length <= self.len().
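Why slicing can be O(1) is easiest to see in a simplified model. The sketch below assumes a hypothetical `Utf8View` type that shares its buffers through `Arc`; it is not arrow2's actual representation, only an illustration of the idea that a slice copies no string data.

```rust
use std::str;
use std::sync::Arc;

/// Hypothetical, simplified model of a sliceable string array.
#[derive(Clone)]
struct Utf8View {
    offsets: Arc<[i32]>, // shared, never copied on slice
    values: Arc<[u8]>,   // shared, never copied on slice
    start: usize,        // index of the first visible element
    len: usize,          // number of visible elements
}

impl Utf8View {
    fn new(offsets: Vec<i32>, values: Vec<u8>) -> Self {
        assert!(!offsets.is_empty(), "offsets must contain at least one entry");
        let len = offsets.len() - 1;
        Self { offsets: offsets.into(), values: values.into(), start: 0, len }
    }

    /// O(1): bumps two reference counts and adjusts two integers.
    /// Panics if `offset + length > self.len`.
    fn slice(&self, offset: usize, length: usize) -> Self {
        assert!(offset + length <= self.len, "offset + length exceeds the length");
        Self {
            offsets: Arc::clone(&self.offsets),
            values: Arc::clone(&self.values),
            start: self.start + offset,
            len: length,
        }
    }

    /// Value of element `i` of this view; panics if `i >= self.len`.
    fn value(&self, i: usize) -> &str {
        assert!(i < self.len, "index out of bounds");
        let start = self.offsets[self.start + i] as usize;
        let end = self.offsets[self.start + i + 1] as usize;
        str::from_utf8(&self.values[start..end]).expect("values must be valid utf8")
    }
}
```

In this model a slice covering the whole array (offset 0, length equal to `len`) is allowed, which matches the `offset + length <= self.len()` requirement stated for the unchecked variant.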

Boxes self into a Box<dyn Array>.

Boxes self into a std::sync::Arc<dyn Array>.

Returns this Utf8Array with a new validity.

Panics

This function panics iff validity.len() != self.len().

Sets the validity of this Utf8Array.

Panics

This function panics iff validity.len() != self.len().

Tries to convert this Utf8Array to a MutableUtf8Array.

Returns a new empty Utf8Array.

The array is guaranteed to have neither elements nor validity.

Returns a new Utf8Array whose slots are all null / None.

Returns the default DataType of this array, which depends on the generic parameter O: DataType::Utf8 or DataType::LargeUtf8.
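One way such a type-level choice can be expressed is with an associated constant on the offset trait. The sketch below uses hypothetical names (`OffsetWidth`, `StringType`, `default_string_type`) and is only an illustration of the pattern, not arrow2's definition of Offset or DataType.

```rust
/// Hypothetical stand-in for the two string data types.
#[derive(Debug, PartialEq)]
enum StringType {
    Utf8,
    LargeUtf8,
}

/// Hypothetical offset trait: each implementor states whether it is the wide variant.
trait OffsetWidth {
    const IS_LARGE: bool;
}

impl OffsetWidth for i32 {
    const IS_LARGE: bool = false;
}

impl OffsetWidth for i64 {
    const IS_LARGE: bool = true;
}

/// Picks the data type purely from the generic parameter, at compile time.
fn default_string_type<O: OffsetWidth>() -> StringType {
    if O::IS_LARGE {
        StringType::LargeUtf8
    } else {
        StringType::Utf8
    }
}
```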

Creates a new Utf8Array without checking for offsets monotonicity nor utf8-validity.

Errors

This function returns an error iff:

  • The last offset is not equal to the values’ length.
  • The validity’s length is not equal to offsets.len() - 1.
  • The data_type’s crate::datatypes::PhysicalType is not equal to either Utf8 or LargeUtf8.

Safety

This function is unsound iff:

  • The offsets are not monotonically increasing.
  • The values between two consecutive offsets are not valid utf8.

Implementation

This function is O(1).

Creates a new Utf8Array.

Panics

This function panics iff:

  • The offsets are not monotonically increasing.
  • The last offset is not equal to the values’ length.
  • The validity’s length is not equal to offsets.len() - 1.
  • The data_type’s crate::datatypes::PhysicalType is not equal to either Utf8 or LargeUtf8.
  • The values between two consecutive offsets are not valid utf8.

Implementation

This function is O(N): checking monotonicity and utf8 validity is O(N).

Creates a new Utf8Array without checking for offsets monotonicity.

Errors

This function returns an error iff:

  • The last offset is not equal to the values’ length.
  • The validity’s length is not equal to offsets.len() - 1.
  • The data_type’s crate::datatypes::PhysicalType is not equal to either Utf8 or LargeUtf8.

Safety

This function is unsound iff:

  • The offsets are not monotonically increasing.
  • The values between two consecutive offsets are not valid utf8.

Implementation

This function is O(1).

Returns a (non-null) Utf8Array created from a TrustedLen of &str.

Implementation

This function is O(N)

Creates a new Utf8Array from an Iterator of &str.

Creates a Utf8Array from an iterator of trusted length.

Safety

The iterator must be TrustedLen, i.e. its size_hint().1 must correctly report its length.
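The contract can be probed with a small std-only check. `upper_bound_is_exact` is a hypothetical helper written for this illustration; it only tests whether the upper bound of `size_hint()` matches the number of items an iterator actually yields, which is the property the safety requirement relies on.

```rust
/// Hypothetical helper: returns true when `size_hint().1` equals the number
/// of items the iterator really produces.
fn upper_bound_is_exact<I: Iterator>(iter: I) -> bool {
    // Read the hint first, then consume the iterator to count its items.
    let upper = iter.size_hint().1;
    upper == Some(iter.count())
}
```

A slice iterator such as `["a", "b", "c"].iter()` passes this check, whereas the same iterator behind `.filter(...)` generally does not: `filter` can only report how many items it may yield at most. Code that preallocates from the hint and then writes without bounds checks would be unsound with the latter, which is why the constructor above is unsafe.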

Creates a Utf8Array from an iterator of trusted length.

Creates a Utf8Array from a fallible iterator of trusted length.

Safety

The iterator must be TrustedLen, i.e. its size_hint().1 must correctly report its length.

Creates a Utf8Array from a fallible iterator of trusted length.

Alias for new

Alias for Self::new_unchecked

Safety

This function is unsound iff:

  • The offsets are not monotonically increasing.
  • The values between two consecutive offsets are not valid utf8.

Trait Implementations

Converts itself to a reference of Any, which enables downcasting to concrete types.

Converts itself to a mutable reference of Any, which enables mutable downcasting to concrete types.

The length of the Array. Every array has a length corresponding to the number of elements (slots).

The DataType of the Array. In combination with Array::as_any, this can be used to downcast trait objects (dyn Array) to concrete arrays.

The validity of the Array: every array has an optional Bitmap that, when available, specifies whether each array slot is valid or not (null). When the validity is None, all slots are valid.

Slices the Array, returning a new Box<dyn Array>.

Slices the Array, returning a new Box<dyn Array>.

Clones this Array with a newly assigned bitmap.

Clones a &dyn Array to an owned Box<dyn Array>.

Whether the array is empty.

The number of null slots on this Array.

Returns whether slot i is null.

Returns whether slot i is valid.

Returns a copy of the value.

Performs copy-assignment from source.

Formats the value using the given formatter.

Returns the “default value” for a type.
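The validity semantics described above (an optional bitmap where None means "all valid") can be sketched without arrow2. The functions below are hypothetical and use a slice of bool in place of arrow2's Bitmap.

```rust
/// Hypothetical stand-in: `None` means every slot is valid.
fn is_valid(validity: Option<&[bool]>, i: usize) -> bool {
    validity.map_or(true, |bits| bits[i])
}

/// A slot is null exactly when it is not valid.
fn is_null(validity: Option<&[bool]>, i: usize) -> bool {
    !is_valid(validity, i)
}

/// Number of null slots; zero when there is no validity at all.
fn null_count(validity: Option<&[bool]>) -> usize {
    validity.map_or(0, |bits| bits.iter().filter(|&&valid| !valid).count())
}
```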

Converts to this type from the input type.

Converts to this type from the input type.

Creates a value from an iterator.

The values of the array

The offsets of the array

The type of the elements being iterated over.

Which kind of iterator are we turning this into?

Creates an iterator from a value.

This method tests for self and other values to be equal, and is used by ==.

This method tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

This method tests for self and other values to be equal, and is used by ==.

This method tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

This method tests for self and other values to be equal, and is used by ==.

This method tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

Auto Trait Implementations

Blanket Implementations

Gets the TypeId of self.

Immutably borrows from an owned value.

Mutably borrows from an owned value.

Returns the argument unchanged.

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

The resulting type after obtaining ownership.

Creates owned data from borrowed data, usually by cloning.

Uses borrowed data to replace owned data, usually by cloning.

The type returned in the event of a conversion error.

Performs the conversion.

The type returned in the event of a conversion error.

Performs the conversion.