Arrays base tools


clean_dtype(dtype[, sort])

Remove offsets from dtype, keeping only names and dtype.


remove_structured_offset(array)

Remove offset fields from structured array.

set_or_add_to_structured(array, data[, copy])

Updates existing structured array, either by replacing the data for existing fields, or by adding new fields to the array.


to_structured(arrays)

Casts a list of arrays and names (and optional dtypes) to a numpy structured array.


to_structured(arrays)

Casts a list of arrays and names (and optional dtypes) to a numpy structured array.

See Numpy’s documentation for how to use numpy’s structured arrays.

A pandas DataFrame can be converted into a numpy record array using DataFrame.to_records() (see the pandas to_records documentation and the numpy documentation on record arrays). A record array can then be converted into a structured array with:

>>> recordarr.view(recordarr.dtype.fields, numpy.ndarray)
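As a minimal, numpy-only sketch of the conversion above: the record array here is built with numpy.rec.fromarrays instead of a DataFrame, and the field names 'a' and 'b' are illustrative assumptions.

```python
import numpy

# Build a small record array; DataFrame.to_records() would return the
# same kind of object.
recordarr = numpy.rec.fromarrays(
    [numpy.arange(3), numpy.array([0.5, 1.5, 2.5])],
    names='a,b',
)

# View the same memory as a plain structured ndarray (no data is copied).
structured = recordarr.view(recordarr.dtype.fields, numpy.ndarray)

print(type(structured).__name__)
print(structured.dtype.names)
```

Because this is a view, the resulting structured array shares its memory with the record array.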

Parameters:

  • arrays (list-of-tuple) – list((name, array-like, *dtype-infos)) or {name: array-like}. A lone value is broadcast to an array filled with that value, with the same size as the other arrays




>>> to_structured([
...     ('a', numpy.arange(5), 'uint32'),
...     ('b', 2 * numpy.arange(5), 'float32')
... ])
array([(0, 0.), (1, 2.), (2, 4.), (3, 6.), (4, 8.)],
  dtype=[('a', '<u4'), ('b', '<f4')])

A single value can also be used. It is broadcast to an array filled with that value, with the same size as the other arrays, for instance:

>>> to_structured([
...     ('a', numpy.arange(5), 'uint32'),
...     ('b', 2, 'float32')
... ])
array([(0, 2.), (1, 2.), (2, 2.), (3, 2.), (4, 2.)],
  dtype=[('a', '<u4'), ('b', '<f4')])

Not specifying the dtype in the tuples will cause the function to use the array’s dtype, or to infer it in case of sequences of Python objects.

>>> to_structured([
...     ('a', numpy.arange(5, dtype='uint32')),
...     ('b', [2 * i for i in range(5)])
... ])
array([(0, 0), (1, 2), (2, 4), (3, 6), (4, 8)],
  dtype=[('a', '<u4'), ('b', '<i8')])

Using dictionaries:

>>> to_structured({
...     'a': numpy.arange(5, dtype='uint32'),
...     'b': [2 * i for i in range(5)]
... })
array([(0, 0), (1, 2), (2, 4), (3, 6), (4, 8)],
  dtype=[('a', '<u4'), ('b', '<i8')])

(n, m) 2D numpy arrays are viewed as (n,) arrays of (m,) 1D arrays:

>>> to_structured([
...     ('a', numpy.arange(5)),
...     ('b', numpy.arange(15).reshape(5, 3))
... ])
array([(0, [ 0,  1,  2]), (1, [ 3,  4,  5]), (2, [ 6,  7,  8]),
   (3, [ 9, 10, 11]), (4, [12, 13, 14])],
  dtype=[('a', '<i8'), ('b', '<i8', (3,))])
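
The behaviours shown above (explicit or inferred dtypes, broadcasting of lone values, 2D columns stored as sub-arrays) can be approximated with plain numpy. The function below, to_structured_sketch, is a hypothetical stand-in written for illustration; it is not the library's implementation and makes no attempt at its validation or performance.

```python
import numpy


def to_structured_sketch(columns):
    """Hypothetical stand-in for to_structured, for illustration only."""
    if isinstance(columns, dict):
        columns = list(columns.items())

    # Convert every column to an array, honouring an optional dtype.
    prepared = []
    for name, values, *dtype_info in columns:
        dtype = dtype_info[0] if dtype_info else None
        prepared.append((name, numpy.asarray(values, dtype=dtype)))

    # The output length comes from the first non-scalar column.
    length = next((arr.shape[0] for _, arr in prepared if arr.ndim > 0), 1)

    # (n, m) columns become (m,) sub-array fields.
    descr = []
    for name, arr in prepared:
        if arr.ndim > 1:
            descr.append((name, arr.dtype, arr.shape[1:]))
        else:
            descr.append((name, arr.dtype))

    out = numpy.empty(length, dtype=descr)
    for name, arr in prepared:
        out[name] = arr  # 0-d arrays (lone values) are broadcast here
    return out


result = to_structured_sketch([
    ('a', numpy.arange(5), 'uint32'),
    ('b', 2, 'float32'),
    ('c', numpy.arange(15).reshape(5, 3)),
])
print(result.dtype)
```

Allocating the output once with numpy.empty and filling it field by field keeps the sketch short; the real function may proceed differently.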

set_or_add_to_structured(array, data, copy=True)

Updates existing structured array, either by replacing the data for existing fields, or by adding new fields to the array.

Fast alternative to numpy.lib.recfunctions.append_fields.

Parameters:

  • array (struct-array) – array to update

  • data (list-of-tuple) – list((name, array-or-scalar)) (scalars are broadcast)

  • copy (bool?) – set to False to avoid copy when possible (default: True)
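
For comparison, this is how a field is appended with the numpy function named above, numpy.lib.recfunctions.append_fields. The array contents and the field name 'c' are illustrative assumptions, and the snippet makes no claim about relative speed.

```python
import numpy
from numpy.lib import recfunctions

array = numpy.zeros(5, dtype=[('a', 'uint8'), ('b', 'uint16')])
array['a'] = numpy.arange(5)
new_data = 3 * numpy.arange(5, dtype='float32')

# usemask=False returns a plain structured ndarray rather than a masked array.
appended = recfunctions.append_fields(array, 'c', new_data, usemask=False)
print(appended.dtype.names)
```

Note that append_fields only adds fields; replacing the data of an existing field is a separate assignment in plain numpy.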




Adding a field to an existing structured array:

>>> array = to_structured([
...     ('a', numpy.arange(5, dtype='uint8')),
...     ('b', 2 * numpy.arange(5, dtype='uint16')),
... ])
>>> new_data = 3 * numpy.arange(5, dtype='float32')
>>> updated_array = set_or_add_to_structured(array, [
...     ('c', new_data),
... ])
>>> updated_array
array([(0, 0,  0.), (1, 2,  3.), (2, 4,  6.), (3, 6,  9.), (4, 8, 12.)],
  dtype=[('a', 'u1'), ('b', '<u2'), ('c', '<f4')])

Replacing data in a structured array (the field keeps its existing dtype):

>>> updated_array = set_or_add_to_structured(array, [
...     ('b', new_data)
... ])
>>> updated_array
array([(0,  0), (1,  3), (2,  6), (3,  9), (4, 12)],
  dtype=[('a', 'u1'), ('b', '<u2')])

Or doing both, while adding broadcasted constants:

>>> updated_array = set_or_add_to_structured(array, [
...     ('b', new_data),
...     ('c', 2 * new_data),
...     ('d', 1),
...     ('e', b'1')
... ])
>>> updated_array
array([(0,  0,  0., 1, b'1'),
       (1,  3,  6., 1, b'1'),
       (2,  6, 12., 1, b'1'),
       (3,  9, 18., 1, b'1'),
       (4, 12, 24., 1, b'1')],
  dtype=[('a', 'u1'), ('b', '<u2'), ('c', '<f4'), ('d', '<i8'), ('e', 'S1')])
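
The replace-or-add semantics of these examples can be sketched with plain numpy as follows. set_or_add_sketch is a hypothetical name used for illustration only: it always copies (it does not model copy=False), only handles 1D arrays and scalars as new data, and is not the library's implementation.

```python
import numpy


def set_or_add_sketch(array, data):
    """Hypothetical stand-in for set_or_add_to_structured (always copies)."""
    existing = set(array.dtype.names)

    # Keep the existing fields (without offsets), then append the new ones.
    descr = [(name, array.dtype.fields[name][0]) for name in array.dtype.names]
    for name, values in data:
        if name not in existing:
            descr.append((name, numpy.asarray(values).dtype))

    out = numpy.empty(array.shape, dtype=descr)
    for name in array.dtype.names:
        out[name] = array[name]
    for name, values in data:
        # Existing fields keep their dtype (values are cast); scalars broadcast.
        out[name] = values
    return out


array = numpy.zeros(5, dtype=[('a', 'uint8'), ('b', 'uint16')])
array['a'] = numpy.arange(5)
array['b'] = 2 * numpy.arange(5)
new_data = 3 * numpy.arange(5, dtype='float32')

updated = set_or_add_sketch(array, [('b', new_data), ('c', 2 * new_data), ('d', 1)])
print(updated.dtype.names)
```

The input array is left untouched because the sketch writes into a freshly allocated output.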

clean_dtype(dtype, sort=False)

Remove offsets from dtype, keeping only names and dtype. (See Numpy dtype documentation.)

Parameters:

  • dtype (dtype-descr) – either a numpy.dtype, or a description of it

  • sort (bool?) – whether to sort the fields by name (default: False)

Returns: clean dtype, without offsets, sorted by field names when sort is True


>>> d = numpy.dtype({
...     'names': ['z_col', 'd_col', 'a_col'],
...     'formats': ['i4', 'f4', 'i4'],
...     'offsets': [0, 4, 40]
... })
>>> d
dtype({'names':['z_col','d_col','a_col'], 'formats':['<i4','<f4','<i4'], 'offsets':[0,4,40], 'itemsize':44})
>>> clean_dtype(d, sort=True)
[('a_col', dtype('int32')),
('d_col', dtype('float32')),
('z_col', dtype('int32'))]
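
A minimal numpy-only sketch of the same idea is shown below. clean_dtype_sketch is a hypothetical name; it assumes that "cleaning" means keeping each field's name and base dtype while dropping the offset, as the description above suggests.

```python
import numpy


def clean_dtype_sketch(dtype, sort=False):
    """Hypothetical stand-in for clean_dtype, for illustration only."""
    dtype = numpy.dtype(dtype)
    names = sorted(dtype.names) if sort else list(dtype.names)
    # dtype.fields maps each name to (field dtype, offset[, title]);
    # keeping only the first element discards the offset.
    return [(name, dtype.fields[name][0]) for name in names]


d = numpy.dtype({
    'names': ['z_col', 'd_col', 'a_col'],
    'formats': ['i4', 'f4', 'i4'],
    'offsets': [0, 4, 40],
})
cleaned = clean_dtype_sketch(d, sort=True)
print(cleaned)
print(d.itemsize, numpy.dtype(cleaned).itemsize)
```

Rebuilding a dtype from the cleaned list yields a packed layout, which is why its itemsize is smaller than that of the original dtype.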

remove_structured_offset(array)

Remove offset fields from structured array. Does not copy the data if the dtype does not have offsets.


Parameters:

  • array (array) – structured array

Returns: structured array without offsets


>>> a = numpy.array([(1, 2, 3), (4, 5, 6)], [('a', 'i4'), ('b', 'i4'), ('c', 'i4')])
>>> b = a[['c', 'a']]
>>> b.dtype
dtype({'names':['c','a'], 'formats':['<i4','<i4'], 'offsets':[8,0], 'itemsize':12})
>>> b = remove_structured_offset(b)
>>> b.dtype
dtype([('c', '<i4'), ('a', '<i4')])
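
NumPy itself offers a comparable repacking helper, numpy.lib.recfunctions.repack_fields. The sketch below reproduces the example above with it; whether remove_structured_offset relies on this helper internally is not stated here and should not be assumed.

```python
import numpy
from numpy.lib import recfunctions

a = numpy.array([(1, 2, 3), (4, 5, 6)], [('a', 'i4'), ('b', 'i4'), ('c', 'i4')])

# Multi-field indexing returns a view that keeps the original offsets.
b = a[['c', 'a']]

# repack_fields copies the data into a contiguous layout without padding.
packed = recfunctions.repack_fields(b)
print(b.dtype.itemsize, packed.dtype.itemsize)
print(packed.dtype)
```

The packed copy only carries the two selected fields, so each record takes 8 bytes instead of the 12 bytes retained by the view.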