
LavenderDataLoader - PyTorch Integration

Call the .torch() method to convert your iteration into a PyTorch DataLoader that can be used directly in a PyTorch training loop. The method accepts the following parameters.

| Parameter | Description | Default |
| --- | --- | --- |
| prefetch_factor | Number of batches to prefetch. | None |
| pin_memory | Pin memory for faster GPU transfer. | False |
| pin_memory_device | Device to pin memory to. | "" |
| in_order | Whether to iterate in order. If False, the order is not guaranteed but iteration can be slightly faster. | True |
| poll_interval | How often to check for new samples. Smaller values can make the iteration faster but can lead to more CPU usage on the server side. | 0.01 |
dataloader = LavenderDataLoader(
    dataset_id=dataset.id,
    shardsets=[shardset.id],
    batch_size=10,
    shuffle=True,
).torch(
    prefetch_factor=4,
    pin_memory=True,
    pin_memory_device="cuda:0",
    in_order=True,
)

for batch in dataloader:
    # Use in PyTorch training loop
    outputs = model(batch)
    loss = criterion(outputs, batch["labels"])
    # ...
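
The trade-off behind in_order can be illustrated with a small standard-library sketch. This is not how LavenderDataLoader is implemented; load_batch, iterate, and the delays are hypothetical names and values chosen only to show why waiting for batches in order can be slower than taking whichever batch is ready first.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def load_batch(index: int, delay: float) -> int:
    """Hypothetical stand-in for fetching one batch; `delay` simulates latency."""
    time.sleep(delay)
    return index


def iterate(delays, in_order: bool):
    """Yield batch indices in submission order, or as soon as each one finishes."""
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        futures = [pool.submit(load_batch, i, d) for i, d in enumerate(delays)]
        if in_order:
            # Wait for each batch in turn, even if later batches are already done.
            for future in futures:
                yield future.result()
        else:
            # Hand over whichever batch completes first; order is not guaranteed.
            for future in as_completed(futures):
                yield future.result()


delays = [0.3, 0.0, 0.15]  # batch 0 is the slowest to arrive
ordered = list(iterate(delays, in_order=True))     # always [0, 1, 2]
unordered = list(iterate(delays, in_order=False))  # same items, completion order
print(ordered, unordered)
```

With in_order=True the consumer is blocked on the slow first batch while the others sit ready; with in_order=False the ready batches are consumed immediately, at the cost of a non-deterministic order.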