Variable stride arrays

This page presents the concept of variable stride array objects and an utility class to manipulate it.

Rationale

A variable stride array (hereafter VS array) is conceptually an array of \(N\) arrays, where each subarray have a different size but the same datatype. This differs from multidimensional arrays (such as numpy ndarray) which have a rectangular shape.

A typical usage of VS array is the representation of mesh connectivities, since the number of neighbor is not the same for each entity of the mesh. For exemple, if we want to describe the vertex-to-cell connectivity of the following mesh,

../../_images/mesh_vstride.svg

we would have the single value 1 for the first vertex, then the pair of values (1,2) for the second vertex, the pair of values (2,3) for the third vertex, etc. The vertex most connected vertex is n°5 for which we would have the five values (1,2,3,4,5).

A VS array could be implemented as a list of arrays (here [[1], [1,2], [2,3], ..., [5]]), but for performance issues, it is often preferable to store the data in a single memory contiguous buffer. We call it the values of the VS array. Therefore, an additional information is needed to know where starts and ends each subarray: we can use either

  • a counting array counts of size \(N\), where counts[i] store the lenght of the i-th subarray,

  • or a displacements array displs of size \(N+1\), where displs[i] store the starting position of the i-th subarray in the values array.

    #Index    0  1     2     3     4              5     6  7     8
    counts = [1, 2,    2,    2,    5,             2,    1, 2,    1]
    displs = [0, 1,    3,    5     7,             12,   14,15,   17, 18]
    values = [1, 1, 2, 2, 3, 1, 4, 1, 2, 3, 4, 5, 3, 5, 4, 4, 5, 5]
    

Note that counts and displs are redondant: in the data structure described here, it is not permitted to use the displs array to jump over some parts of values or to change the order of elements. Thus, the following properties holds:

  • counts 0 and counts.sum() == values.size

  • displs[i+1] = displs[i] + counts[i], with displs[0] = 0. In particular, displs[N]==values.size.

In the remaining of this page, we will use the term strides to designate this virtual slicing of the VS array. The subarrays, who are thus sections of the values arrays, will be called blocks. We emphasize the fact that an element of the VS array is a whole block.

Note

  • The CGNS standard uses the displs option to describe polyedric elements; this is the purpose of the ElementStartOffset node.

  • In sparse linear algebra, the CSR data structure is quite similar to a VS array, with the difference that CSR stores an additional array containing the column index of its elements.

Python module

Maia provides the vstride module in order to operate on such data structure. This module defines the VStrideArray class as well as some functions operating on VS arrays.

For now, the underlying fields of a VS array are represented by numpy arrays. The values array is allowed to have any supported datatype, while the counts and displs arrays are requested to be integer arrays of same datatype (either int32 or int64).

Tip

In this documentation, the module namespace is denoted vs, while np designate the numpy module:

>>> import numpy as np
>>> from maia.utils import vstride as vs

The VStrideArray object

class VStrideArray

A class representing a variable stride array.

Once created, a VStrideArray instance arr has the following attributes:

displs

view on displs array

counts

view on counts array

values

values array

dsize

size of the underlying values array

dtype

datatype of the underlying values array

In addition, the following operations are supported:

  • Length: len(arr) returns the number of elements N.

  • Basic indexing: arr[i] returns the i-th block, ie a view on the values array.

  • Basic assignement: arr[i] = val replace the \(m_i\) values of the i-th block using the provided input val, which must be an array of relevant size \(m_i\) or a scalar (it will be broadcasted).

Warning

We strongly advise to not iterate over blocks using python loops such as func(blk) for blk in arr. As for numpy arrays, this would lead to poor performances. Several functions or methods are provided to avoid such loops.

  • Unary arithmetic operations -, +, abs() and ~ applies element wise on the values array.

  • Binary arithmetic operations (such as +, -, *, /, %, **, &, etc.) and comparison operation (such as ==, !=, <, >=, etc.) also apply element wise on the values array. The right operand can be:

    • another VStrideArray object. In this case, the strides of the two operand must be equal:

      >>> a = vs.from_counts([2,2,1], [0.2, 1.4, 2.6, 0.5, 1.0])
      >>> b = vs.from_counts([2,2,1], [0.1, 2.3, 1.4, 0.6, 0.9])
      >>> a <= b
      vsarray([
        [False,  True],
        [False,  True],
        [False],
      ], dtype=bool)
      
    • a numpy array of size \(N\). In this case, each value will be repeated \(m_i\) times:

      >>> vs.from_counts([2,2,1], np.arange(5)) + np.arange(3)
      vsarray([
        [0, 1],
        [3, 4],
        [6],
      ], dtype=int64)
      
    • a scalar value. In this case, it will we broadcasted to a constant array of size values.size:

      >>> vs.from_counts([2,2,1], np.arange(5)) * 2
      vsarray([
        [0, 2],
        [4, 6],
        [8],
      ], dtype=int64)
      

    These operations return a new VStrideArray instance. The output dtype of its values array is determined by numpy, to which the operation itself is delegated.

  • Inplace arithmetic operations (such as +=, -=, *=, etc.) are also supported under the same conditions for the right operand.

__init__(displs, counts, values)

Create a new VStrideArray object.

At least one of displs or counts argument must not be None. The argument values must never be None.

Parameters
  • displs (integer ndarray, optional) – array of displs or None

  • counts (integer ndarray, optional) – array of counts or None

  • values (ndarray) – array of values

Example

>>> vs.VStrideArray(None, np.array([3,5,2]),  np.arange(10))
vsarray([
  [0, 1, 2],
  [3, 4, 5, 6, 7],
  [8, 9],
], dtype=int64)
reduce(op)

Apply a reduction operation within each block.

The avalaible operations are the members of the enumeration ReduceOp.

This function returns an array of size \(N\) (one value per block). The datatype of the output array depends on the input datatype and the requested operation: it is generally the same than the input datatype, excepted for logical operations that always return boolean arrays and for ReduceOp.SUM that return int64 when applied to a boolean input.

Note

For empty blocks (ie for the set of i such that counts[i] == 0), the corresponding result out[i] will be initialized with the neutral value of the corresponding operation.

It is still possible to filter the result afterward, eg with out[self.counts > 0].

Parameters

op (ReduceOp) – operation performed to reduce the blocks

Returns

flat ndarray of size N – result of the reduction

Example

>>> a = vs.from_counts([3,5,2], np.arange(10))
>>> a.reduce(vs.ReduceOp.SUM)
array([ 3, 23, 17])
restride(displs=None, counts=None)

Change the strides (counts and displs) of the array, without updating its values.

Eiher a new displs or a new counts array can be provided. If both arguments are None, the object remain unchanged. Note that the new strides must be compatible with the array.dsize attribute.

Example

>>> a = vs.from_counts([1, 2, 5], [0.4, 0.3, 0.5, 0.1, 0.7, 0.2, 0.6, 0.9])
>>> a.restride(counts=[4,4])
>>> a
vsarray([
  [0.4, 0.3, 0.5, 0.1],
  [0.7, 0.2, 0.6, 0.9],
], dtype=float64)
to_array_list()

Return the values as a list of NumPy 1d arrays.

The len of the output list is N, and the size of each element i is counts[i]. Data is copied into the 1d arrays.

Returns

list of ndarray – values as a list of 1d arrays

Example

>>> a = vs.from_counts([0, 2, 5], [0.3, 0.5, 0.1, 0.7, 0.2, 0.6, 0.9])
>>> a.to_array_list()
[array([], dtype=float64), array([0.3, 0.5]), array([0.1, 0.7, 0.2, 0.6, 0.9])]
to_masked_array()

Return the values as a NumPy masked ndarray.

Output is a 2d array of shape (N, max(counts)), where the masked value is used to complete each row i for which counts[i] < max(counts). Data is copied into the masked array.

Returns

masked ndarray – values as a masked array

Example

>>> a = vs.from_displs([0,3,6,6,8,9], [1,3,3,4,5,6,7,8,10])
>>> print(a.to_masked_array())
[[1 3 3]
 [4 5 6]
 [-- -- --]
 [7 8 --]
 [10 -- --]]

Arrays routines

The vstride module provides the following routines in order to create or manipulate VStrideArray instances:

Creation routines : these routines offer shortcuts to create a new VStrideArray from other input data.

array

Create a new instance from the input data.

from_counts

Create a new instance from a 1d array described by its counts.

from_displs

Create a new instance from a 1d array described by its displs.

Indexing routines : these routines create a new VStrideArray by acting on the specified indices of the input array.

take

Take elements from the input VS array.

put

Update the specified elements of input VS array with provided values.

insert

Insert new elements in the input VS array.

delete

Remove elements from the input VS array.

Algorithms : these routines apply an algorithm to VStrideArray object. All these functions take a parameter axis to indicate if the algorithm is applied:

  • independantly on each block of the array (using INNER_AXIS);

  • globally over the elements of the array (using OUTER_AXIS).

flip

Reverse the order of values of the input VS array.

sort

Sort the values of the input VS array.

unique

Find the unique values of the input VS array.

roll

Roll the values of the input VS array.

concatenate

Join a sequence of VS arrays.

Additional operators : few operators that are not part of language keywords, or additional comparison functions.

sign

Create a new VS array storing the sign of array.values.

strides_equal

True if the two input VS arrays have the same strides, False otherwise.

array_equal

True if the two input VS arrays have the same strides and values, False otherwise.

array_close

True if the two input VS arrays have the same strides and close values, False otherwise.


array(data, *, dtype=None)

Create a new instance from the input data.

Input data can be either:

  • a list of objects convertible to a 1d ndarray. This include eg list of lists, list of arrays, but not list of scalars;

  • a 2d masked ndarray, in which case only visible values are selected;

  • another VStrideArray instance.

Input data is always copied to create the new VS array.

Parameters
  • data (object) – see above

  • dtype (data-type, optional) – expected datatype of the values array. If None, datatype is inferred by NumPy from the input data.

Returns

VStrideArray – new VS array

Examples

>>> vs.array([[1,2], [3,4,5], [], [6]]) # From nested lists
vsarray([
  [1, 2],
  [3, 4, 5],
  [],
  [6]
], dtype=int64)
>>> data = np.ma.array([[1, 2, 3], [4,5,6], [7, 8, 9]], # From masked array
...               mask=[[0, 0, 1], [0,0,0], [0,1,1]])
>>> vs.array(data)
vsarray([
  [1, 2],
  [4, 5, 6],
  [7],
], dtype=int64)
from_counts(counts, values, *, dtype=None)

Create a new instance from a 1d array described by its counts.

Arguments are converted to ndarray using np.asarray: this operation is copyless if arguments are already ndarray objects and if values matches the requested datatype.

Parameters
  • counts (array_like) – an object convertible to a 1d integer ndarray

  • values (array_like) – an object convertible to a 1d ndarray

  • dtype (data-type, optional) – override datatype of the values array. If None, datatype is inferred from the input data.

Returns

VStrideArray – new VS array

Example

>>> vs.from_counts([0, 2, 5], [0.3, 0.5, 0.1, 0.7, 0.2, 0.6, 0.9])
vsarray([
  [],
  [0.3, 0.5],
  [0.1, 0.7, 0.2, 0.6, 0.9],
], dtype=float64)
from_displs(displs, values, *, dtype=None)

Create a new instance from a 1d array described by its displs.

Arguments are converted to ndarray using np.asarray: this operation is copyless if arguments are already ndarray objects and if values matches the requested datatype.

Parameters
  • displs (array_like) – an object convertible to a 1d integer ndarray

  • values (array_like) – an object convertible to a 1d ndarray

  • dtype (data-type, optional) – override datatype of the values array. If None, datatype is inferred from the input data.

Returns

VStrideArray – new VS array

Example

>>> vs.from_displs([0, 2, 5], [0.3, 0.5, 0.1, 0.7, 0.2], dtype='f4')
vsarray([
  [0.3, 0.5],
  [0.1, 0.7, 0.2],
], dtype=float32)
take(array, indices)

Take elements from the input VS array.

Indices to extract directly refer to elements number, and must thus be included in [0, len(array)[. A given index can be provided more than once.

A new object is returned.

Parameters
  • array (VStrideArray) – input VS array

  • indices (array of int) – indices of the elements to extract

Returns

VStrideArray – extracted VS array

Example

>>> a = vs.from_displs([0, 2, 4, 6, 9, 10], values=np.arange(10))
>>> vs.take(a, [2, 1, 4, 1])
vsarray([
  [4, 5],
  [2, 3],
  [9],
  [2, 3],
], dtype=int64)
put(array, indices, values)

Update the specified elements of input VS array with provided values.

Indices to update directly refer to elements number, and must thus be included in [0, len(array)[. A given index can be provided more than once; in this case, only the last corresponding value is used.

The new values must be provided as a VStrideArray of lenght len(indices). Its datatype will be converted if needed to match array.dtype.

If indices is a scalar value, then a single 1d array_like object is allowed for values.

Note

Contrary to np.put, this function does not operate inplace, since the strides of the input VS array can be modified. A new object is returned.

Parameters
  • array (VStrideArray) – input VS array

  • indices (int or array of int) – indices of the elements to update

  • values (array_like or VStrideArray) – block(s) to write

Returns

VStrideArray – updated VS array

Example

>>> a = vs.from_counts([2, 3, 1, 3], np.arange(9))
>>> vals = vs.array([[-1,-2,-3,-4], [99]])
>>> vs.put(a, [2,0], vals)
vsarray([
  [99],
  [ 2,  3,  4],
  [-1, -2, -3, -4],
  [ 6,  7,  8],
], dtype=int64)
delete(array, indices)

Remove elements from the input VS array.

Indices to delete directly refer to elements number, and must thus be included in [0, len(array)[.

A new object is returned.

Parameters
  • array (VStrideArray) – input VS array

  • indices (array of int) – indices of the elements to remove

Returns

VStrideArray – filtered VS array

Example

>>> a = vs.from_displs([0, 2, 4, 6, 9, 10], np.arange(10))
>>> vs.delete(a, [2, 1, 4])
vsarray([
  [0, 1],
  [6, 7, 8],
], dtype=int64)
insert(array, indices, values)

Insert new elements in the input VS array.

The indices where the new block(s) are insered must be included in [0, len(array)]. The insered values must be provided as a VStrideArray of length len(indices). Its datatype will be converted if needed to match array.dtype.

If indices is a scalar value, then a single 1d array_like object is expected for values.

A new object is returned.

Parameters
  • array (VStrideArray) – input VS array

  • index (int of array of int) – position where the block(s) should be insered

  • values (array_like or VStrideArray) – block(s) to insert

Returns

VStrideArray – new VS array

Example

>>> a = vs.from_counts([2, 4, 3], np.arange(9))
>>> vs.insert(a, 1, [9,10,11])
vsarray([
  [ 0,  1],
  [ 9, 10, 11],
  [ 2,  3,  4,  5],
  [ 6,  7,  8],
], dtype=int64)
flip(array, axis)

Reverse the order of values of the input VS array.

Depending on the axis argument, the operation reverse:

  • the order of blocks if axis==OUTER_AXIS, which is roughly equivalent to

    vs.array([blk for blk in array][::-1])
    
  • each block independently if axis==INNER_AXIS, which is roughly equivalent to

    vs.array([blk[::-1] for blk in array)]
    

In both cases, a copy is done and a new object is returned.

Parameters
  • array (VStrideArray) – input VS array

  • axis (Axis) – direction used to reverse

Returns

VStrideArray – flipped VS array

Example

>>> a = vs.from_counts([2, 4, 3], np.arange(9))
>>> vs.flip(a, vs.OUTER_AXIS)
vsarray([
  [6, 7, 8],
  [2, 3, 4, 5],
  [0, 1],
], dtype=int64)
>>> vs.flip(a, vs.INNER_AXIS)
vsarray([
  [1, 0],
  [5, 4, 3, 2],
  [8, 7, 6],
], dtype=int64)
sort(array, axis)

Sort the values of the input VS array.

Depending on the axis argument, the operation sort:

  • the elements if axis==OUTER_AXIS, which is roughly equivalent to

    vs.array(sorted([blk for blk in array]))
    

    In this case, lexicographic order is used to compare two blocks.

  • each block if axis==INNER_AXIS, which is roughly equivalent to

    vs.array([sort(blk) for blk in array)]
    

In both cases, a copy is done and a new VStrideArray is returned.

Parameters
  • array (VStrideArray) – input VS array

  • axis (Axis) – direction used to sort

Returns

VStrideArray – sorted VS array

Example

>>> a = vs.from_counts([2, 4, 3], [3,2, 3,1,5,2, 9,5,8])
>>> vs.sort(a, vs.OUTER_AXIS)
vsarray([
  [3, 1, 5, 2],
  [3, 2],
  [9, 5, 8],
], dtype=int64)
>>> vs.sort(a, vs.INNER_AXIS)
vsarray([
  [2, 3],
  [1, 2, 3, 5],
  [5, 8, 9],
], dtype=int64)
unique(array, axis)

Find the unique values of the input VS array.

Depending on the axis argument, the operation applies to:

  • the elements if axis==OUTER_AXIS (not yet implemented), which is roughly equivalent to

    vs.array(unique([blk for blk in array]))
    
  • each block if axis==INNER_AXIS, which is roughly equivalent to

    vs.array([unique(blk) for blk in array)]
    

    In this case, the output array have the same number of elements \(N\), but a different counts array.

Parameters
  • array (VStrideArray) – input VS array

  • axis (Axis) – direction in which algorithm is applied

Returns

VStrideArray – unique VS array

Example

>>> a = vs.from_counts([2, 4, 3], [2,2, 3,1,3,2, 9,5,5])
>>> vs.unique(a, vs.INNER_AXIS)
vsarray([
  [2],
  [3, 1, 2],
  [9, 5],
], dtype=int64)
roll(array, shift, axis)

Roll the values of the input VS array.

Positive values of shift move the elements to the right, while negative values move them to the left. Values leaving the array are reintroduced on the opposite side.

Depending on the axis argument, the operation apply to:

  • the elements if axis==OUTER_AXIS, which is roughly equivalent to

    vs.array(array[-shift:] + array[:shift+1]) # for positive shift
    
  • each block if axis==INNER_AXIS, which is roughly equivalent to

    vs.array([roll(blk, shift) for blk in array)]
    

In both cases, a copy is done and a new VStrideArray is returned.

Parameters
  • array (VStrideArray) – input VS array

  • shift (int) – number of places by which the elements are shifted

  • axis (Axis) – direction used to roll

Returns

VStrideArray – rolled VS array

Example

>>> a = vs.from_counts([2, 3, 5, 4], [1,2, 3,1,1, 2,7,2,5,9, 6,4,4,2])
>>> vs.roll(a, 2, vs.OUTER_AXIS)
vsarray([
  [2, 7, 2, 5, 9],
  [6, 4, 4, 2],
  [1, 2],
  [3, 1, 1],
], dtype=int64)
>>> vs.roll(a, -1, vs.INNER_AXIS)
vsarray([
  [2, 1],
  [1, 1, 3],
  [7, 2, 5, 9, 2],
  [4, 4, 2, 6],
], dtype=int64)
concatenate(array_l, axis)

Join a sequence of VS arrays.

Depending on the axis argument, the concatenation is applied to:

  • the elements if axis==OUTER_AXIS, which is roughly equivalent to

    vs.array([blk for arr in array_l for blk in arr])
    

    In this case, the input arrays can have a different length; the number of elements of the output array is the sum of the input lenghts.

  • each block if axis==INNER_AXIS, which is roughly equivalent to

    vs.array([concatenate(blks) for blks in zip(*array_l)])
    

    In this case, all the input arrays must have the same number of elements \(N\), which is the number of elements of the output array.

Parameters
  • array_l (sequence of VStrideArray) – VS arrays to concatenate

  • axis (Axis) – direction used to concatenate

Returns

VStrideArray – concatenated VS array

Example

>>> a1 = vs.array([[0,1],  [2,3,4], [5,6]],  dtype=int)
>>> a2 = vs.array([[],  [0,1,2,3], [4,6,7]], dtype=int)
>>> vs.concatenate([a1, a2], vs.OUTER_AXIS)
vsarray([
  [0, 1],
  [2, 3, 4],
  [5,6],
  [],
  [0, 1, 2, 3],
  [4, 5, 6],
], dtype=int64)
>>> vs.concatenate([a1, a2], vs.INNER_AXIS)
vsarray([
  [0, 1],
  [2, 3, 4, 0, 1, 2, 3],
  [5, 6, 4, 6, 7],
], dtype=int64)
sign(array, dtype=None)

Create a new VS array storing the sign of array.values. The strides of the output array are identical to the input strides.

Parameters
  • array (VStrideArray) – input VS array

  • dtype (data-type, optional) – desired datatype for the output

Returns

VStrideArray – sign VS array

Example

>>> vs.sign(vs.from_counts([2,3,1], [1,2,-3,4,0,-6]))
vsarray([
  [ 1,  1],
  [-1,  1,  0],
  [-1],
], dtype=int64)
strides_equal(a1, a2)

True if the two input VS arrays have the same strides, False otherwise.

Parameters
Returns

bool – comparison result

Example

>>> vs.array_equal(vs.from_counts([2,3,1], [1,2,3,4,5,6]),
...                vs.from_counts([2,3,1], [6,5,4,3,2,1]))
True
>>> vs.array_equal(vs.from_counts([2,3,1], [1,2,3,4,5,6]),
...                vs.from_counts([3,2,1], [1,2,3,4,5,6]))
False
array_equal(a1, a2)

True if the two input VS arrays have the same strides and values, False otherwise.

Parameters
Returns

bool – comparison result

Example

>>> vs.array_equal(vs.from_counts([2,3,1], [1,2,3,4,5,6]),
...                vs.from_counts([2,3,1], [1,2,3,4,5,6]))
True
>>> vs.array_equal(vs.from_counts([2,3,1], [1,2,3,4,5,6]),
...                vs.from_counts([2,3,1], [6,5,4,3,2,1]))
False
array_close(a1, a2, rtol=1e-05, atol=1e-08)

True if the two input VS arrays have the same strides and close values, False otherwise.

Value comparison is performed by np.isclose. See the related documentation for description of rtol and atol parameters.

Parameters
  • a1 (VStrideArray) – first input

  • a2 (VStrideArray) – second input

  • rtol (float, optional) – relative tolerance for comparison

  • atol (float, optional) – absolute tolerance for comparison

Returns

bool – comparison result

Example

>>> vs.array_close(vs.from_counts([2,3,1], [1.,2,3,4,5,6]),
...                vs.from_counts([2,3,1], [1.,2,3,4,5,6+1e-9]))
True
>>> vs.array_close(vs.from_counts([2,3,1], [1.,2,3,4,5,6]),
...                vs.from_counts([2,3,1], [1.,2,3,4,5,6.2]))
False
Axis : Enum class

Enumeration to indicate the axis on which a function should operate. Members are INNER and OUTER.

ReduceOp : Enum class

Enumeration storing the available operations. Members are SUM, PROD, MIN, MAX, LAND, LOR, BAND and BOR, where ‘L’ stands for logical operations and ‘B’ for bitwise operations.

INNER_AXIS

A shortcut to Axis.INNER

OUTER_AXIS

A shortcut to Axis.OUTER