zero.data.iter_batches

zero.data.iter_batches(data, *args, **kwargs)[source]

Efficiently iterate over data in a batchwise manner.

The function is useful when you want to efficiently iterate once over tensor-based data in a batchwise manner. See examples below for typical use cases.

The function is a more efficient alternative to torch.utils.data.DataLoader for in-memory data, because it uses batch-based indexing instead of item-based indexing (DataLoader's behavior). The shuffling logic is fully delegated to the native PyTorch DataLoader, i.e. no custom shuffling logic is performed under the hood.
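To illustrate the difference, here is a minimal sketch in plain PyTorch (it involves no calls to this library): item-based indexing fetches rows one by one and then collates them, while batch-based indexing fetches the whole batch with a single advanced-indexing call.

import torch

X = torch.randn(10_000, 8)

# Item-based indexing (DataLoader's behavior): one __getitem__ call per item,
# followed by collating the items into a batch.
batch_a = torch.stack([X[i] for i in range(256)])

# Batch-based indexing: a single advanced-indexing call fetches the whole batch.
batch_b = X[torch.arange(256)]

assert torch.equal(batch_a, batch_b)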

Parameters

data – the tensor-based data to iterate over
*args / **kwargs – arguments for the underlying torch.utils.data.DataLoader (e.g. batch_size, shuffle)

Returns

Iterator over batches.

Return type

Iterator

Warning

NumPy arrays are not supported because of how they behave when indexed by a torch tensor of size 1. For details, see the issue.
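A possible workaround (a sketch, not part of the function's API: the conversion via torch.as_tensor is an assumption here) is to wrap the array as a tensor before iterating:

import numpy as np
import torch

from zero.data import iter_batches

array = np.random.rand(1000, 4).astype(np.float32)
tensor = torch.as_tensor(array)  # wraps the array without copying when dtype and device allow it
for batch in iter_batches(tensor, 64):
    ...  # process the batch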

Note

If you want to iterate over batches infinitely, wrap the call in a while True: loop.
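For example (a sketch; the data, the batch size and the stopping condition are placeholders for illustration):

import torch

from zero.data import iter_batches

data = torch.randn(1000, 16)
n_steps = 0
while True:
    for batch in iter_batches(data, 64, shuffle=True):
        n_steps += 1  # one training/processing step would go here
    if n_steps >= 10_000:  # stop after enough steps
        break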

See also

concat

Examples

Besides loops over batches, the function can be used in combination with concat:

result = concat(map(fn, iter_batches(dataset_or_tensors_or_whatever, ...)))
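A more concrete sketch of the same pattern (the model is an assumption for illustration; torch.cat is used in place of concat because the per-batch outputs here are plain tensors, and it is assumed that batches of a single input tensor are themselves tensors):

import torch

from zero.data import iter_batches

model = torch.nn.Linear(16, 1)
dataset = torch.randn(1000, 16)

with torch.no_grad():
    # Apply the model batch by batch and concatenate the per-batch predictions.
    predictions = torch.cat(list(map(model, iter_batches(dataset, 64))))

assert predictions.shape == (1000, 1)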

The function can also be used for training:

for epoch in epochs:
    for batch in iter_batches(data, batch_size, shuffle=True):
        ...
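Put together, a minimal end-to-end sketch (the model, loss, optimizer and the tuple-of-tensors input are assumptions for illustration):

import torch
import torch.nn.functional as F

from zero.data import iter_batches

X = torch.randn(1000, 16)
y = torch.randn(1000, 1)
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

for epoch in range(10):
    # Shuffling is delegated to the native PyTorch DataLoader under the hood.
    for batch_x, batch_y in iter_batches((X, y), 64, shuffle=True):
        optimizer.zero_grad()
        loss = F.mse_loss(model(batch_x), batch_y)
        loss.backward()
        optimizer.step()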