librt.vecs¶
The librt.vecs module defines the vec type, a low-level, uniform growable array type.
It’s part of the librt package on PyPI.
When constructing a vec, the item type T is always explicitly given via vec[T]:
from librt.vecs import append, vec
v = vec[float]([1.0, 2.5]) # Construct vec[float] with two items
vec supports many sequence operations, though it’s not a full sequence type:
len(v) # 2
v[0] # 1.0
v[-1] # 2.5
for x in v:
print(x)
The length of each vec value is immutable. Appending an item is still a fast operation,
but it returns a new vec value:
v = append(v, -0.5)
print(v) # vec[float]([1.0, 2.5, -0.5])
vec only supports simple, uniform item types. It uses an efficient packed binary encoding
for these value item types:
mypy_extensions.i64(signed 64-bit integer)mypy_extensions.i32(signed 32-bit integer)mypy_extensions.i16(signed 16-bit integer)mypy_extensions.u8(unsigned byte)float(64-bit float)bool
int is not a valid item type, since it has an arbitrary precision, and vec is an
efficiency-focused type. Use one of the fixed-length integer types instead.
Class item types (e.g. str or MyNativeClass) are represented as regular object references.
Optional class item types (e.g. str | None) are supported for convenience, but arbitrary
union types are not supported as item types. Nested vecs are supported, e.g. vec[vec[i64]].
A vec value is often used as an efficient alternative to list or array.array in code
compiled using mypyc. Its primary advantages are an efficient packed memory representation
for value item types and very fast inlined get and set item operations.
Vec instances perform runtime checking of item types. Since values of type variables are not available at runtime (they are erased), type variables can’t be used as item types.
A vec value is effectively an immutable (length, buffer) pair. This means that any operation
that changes the length of a vec, including append as we saw above, returns a modified
value.
Note
An immutable length allows more efficient code to be generated by mypyc, and vec values can be allocated to machine registers effectively. However, vec values must be boxed if used in a non-native context, such as if added to a list or dict.
Here are some examples of valid vec types:
Type |
Item representation |
|---|---|
|
Packed 32-bit integers |
|
Packed 64-bit floats |
|
Object references |
|
Packed vec values |
The vec class¶
- class vec[T](items: Iterable[T] = ..., *, capacity: i64 = ...)¶
A generic growable array type. The runtime type parameter
Tused when callingvec[T](...)determines the element type.The
capacityparameter allows defining the minimum initial capacity of the buffer, some of which may be unused after construction. Unused capacity allows fastappendandextendoperations that don’t need to reallocate the buffer. Actual capacity will be larger thancapacityifitemshas more thancapacityitems.Construction from
listandtupleobjects is optimized. Also, for value item types, construction from an object that implements the buffer protocol is optimized (such asbytes), if the format is compatible with the vec item type.Mypyc treats
vec[T]([x] * n)as a special form. For example,vec[u8]([0] * n)constructs a zero-initialized vec object efficiently, without building an intermediate list. There are also other constructor-related special forms – see Special forms below.It’s an error to construct a
vecobject without providing an item type:vec()raises an exception.- len(v) → i64
Return the length of
v.
- v[i] → T
Return item at index
i(index may be negative).
- v[i:j] → vec[T]
Return a slice. This constructs a new
vecobject.iandjmay be negative.
- v[i] = o
Assign to an item (index may be negative).
- o in v → bool
Return True if
vcontainso.
- for o in v
Iterate over items.
- memoryview(v)
vecimplements the buffer protocol, but only for value item types that use a packed representation.
Functions¶
Since the following operations return a modified value, they are module-level functions instead of methods.
- append(v: vec[T], o: T) vec[T]¶
Return
vwith itemoappended to it. Ifvhas unused capacity, reuse the existing buffer. The time complexity is O(1) on average. Example:v = vec[i32]() v = append(v, 1)
- extend(v: vec[T], it: Iterable[T]) vec[T]¶
Return
vwith all items from iterableitappended to it. Ifvhas sufficient unused capacity, reuse the existing buffer. The time complexity is O(n) on average, where n is the length ofit. Example:v = vec[u8]() v = extend(v, b"foo")
Special forms¶
Certain combinations of operations that would be multiple separate operations in regular Python are guaranteed to be compiled by mypyc to direct operations with no unnecessary temporary objects.
Special form |
Description |
|---|---|
|
Construct empty vec with no buffer. This doesn’t perform any dynamic allocation (at least for non-nested vecs). |
|
Directly construct a vec object with given items, without a temporary list. |
|
Directly construct a vec with length n, without any temporary list. |
|
Vec comprehension creates no temporary list. |
Thread safety¶
In free-threaded Python builds, it’s unsafe to write or modify an item if other
threads might be concurrently accessing the same item. For example, writing v[4]
is not safe to do if another thread might be reading v[4]. Similarly, two
threads concurrently calling append or remove on the same vec object is not safe.
This is different from list objects, since vec is a lower-level type where implicit synchronization would have a significant performance cost. However, since vec lengths are immutable, some race conditions that lists can be susceptible to are not possible with vecs.
Implementation details¶
In a native context, such as in a local variable or a parameter in a native function, or in an attribute of a native class, vec values are implemented as value objects with two fields: length and buffer. The buffer is a normal Python object, but it’s not directly accessible to users. If a vec object is empty, no buffer object is required. This means that empty vecs are particularly efficient in a native context (usually 16 bytes).
A packed representation is used for buffers with supported value item types, including for
nested vecs. The packed representation is much more efficient than a Python list object, and
it’s also significantly more efficient than array.array for small sequences.
Multiple vec values can share the same underlying buffer. For example, assigning a vec to another variable creates an alias that refers to the same buffer:
v = vec[i32]([1, 2, 3], capacity=3)
w = v # v and w share the same buffer
w[0] = 99
print(v[0]) # 99 -- both see the change
However, this sharing is not guaranteed to persist if there are operations that change
the length (such as append). These may reallocate the buffer, breaking the sharing
silently:
v = append(v, 4) # reallocates the buffer since there is no free capacity
v[0] = 0
print(w[0]) # still 99 -- v and w no longer share a buffer
If you need independent copies, use slicing (v[:]) to explicitly create a vec with
its own buffer. It’s not recommended to rely on the details of buffer reallocation,
as these might change between librt releases.
Using vecs outside compiled code¶
vec is fully supported in non-compiled code, but vec values will be boxed in such
non-native contexts. There will be always two objects, a boxed vec object and a buffer object,
whereas in native contexts usually only the buffer is a dynamically allocated object.
vec is primarily useful in code compiled using mypyc, and it’s been heavily optimized
for this use case. There may be no performance benefit in interpreted code over using
list or array.array.