delu.data.FnDataset
- class delu.data.FnDataset(fn, args, transform=None)
Create simple PyTorch datasets without classes and inheritance.
FnDataset lets you avoid implementing Dataset classes in simple cases.
Tutorial
First, a quick example. Without FnDataset:

    from PIL import Image
    from torch.utils.data import Dataset

    class ImagesList(Dataset):
        def __init__(self, filenames, transform):
            self.filenames = filenames
            self.transform = transform

        def __len__(self):
            return len(self.filenames)

        def __getitem__(self, index):
            return self.transform(Image.open(self.filenames[index]))

    dataset = ImagesList(filenames, transform)
With FnDataset:

    dataset = delu.data.FnDataset(Image.open, filenames, transform)

    # Cache images after the first load:
    from functools import lru_cache
    dataset = delu.data.FnDataset(lru_cache(None)(Image.open), filenames)
In other words, to create a dataset with vanilla PyTorch, you have to inherit from torch.utils.data.Dataset and implement three methods:
- __init__
- __len__
- __getitem__
With FnDataset, the only thing you may need to implement is the fn argument that powers __getitem__. The easiest way to learn FnDataset is to go through the examples below.
A list of images:
    dataset = delu.data.FnDataset(Image.open, filenames)
    # dataset[i] returns Image.open(filenames[i])
A list of images that are cached after the first load:
    from functools import lru_cache
    dataset = delu.data.FnDataset(lru_cache(None)(Image.open), filenames)
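The effect of the cache can be checked without any image files. The sketch below is only an illustration: load is a hypothetical stand-in for Image.open that records every call, and calling cached_load(filenames[i]) mimics the fn(args[i]) lookup a dataset would perform.

```python
from functools import lru_cache

calls = []  # records every real (uncached) load


def load(filename):
    # Hypothetical stand-in for Image.open: pretend to read a file.
    calls.append(filename)
    return f"<image {filename}>"


# lru_cache(None) builds an unbounded cache keyed by the function argument.
cached_load = lru_cache(None)(load)

filenames = ["a.png", "b.png"]

first = cached_load(filenames[0])   # real load
second = cached_load(filenames[0])  # served from the cache, load is not called
third = cached_load(filenames[1])   # real load for a new argument
```

Because the cache is unbounded, every loaded object stays in memory for the lifetime of the cached function; a finite maxsize may be a better fit when the data is large. Arguments must be hashable, which holds for strings and pathlib.Path objects.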
pathlib.Path is handy for creating datasets that read from files:

    from pathlib import Path

    images_dir = Path(...)
    dataset = delu.data.FnDataset(Image.open, images_dir.iterdir())
If you only need files with specific extensions:
    dataset = delu.data.FnDataset(Image.open, images_dir.glob('*.png'))
If you only need files with specific extensions located in all subfolders, filtered by an arbitrary condition:

    dataset = delu.data.FnDataset(
        Image.open,
        (x for x in images_dir.rglob('*.png') if condition(x)),
    )
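The three ways of listing files used above differ in what they return. The following sketch compares them on a throwaway directory; the file names and the sub folder are made up for the illustration, and empty files stand in for real images.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    images_dir = Path(tmp)
    (images_dir / "sub").mkdir()
    for name in ["a.png", "b.jpg", "sub/c.png"]:
        (images_dir / name).touch()  # empty placeholder files

    # iterdir: every direct child, including the subdirectory itself.
    direct = sorted(p.name for p in images_dir.iterdir())

    # glob: direct children that match the pattern.
    top_level_png = sorted(p.name for p in images_dir.glob("*.png"))

    # rglob: matches in the directory and in all of its subfolders.
    all_png = sorted(
        p.relative_to(images_dir).as_posix() for p in images_dir.rglob("*.png")
    )
```

Note that iterdir also yields subdirectories, which an image loader cannot open, and that pathlib does not guarantee any particular order. Wrapping the iterator in sorted(...) is one way to keep dataset indices stable between runs.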
A segmentation dataset:

    image_filenames = ...
    gt_filenames = ...

    def get(i):
        return Image.open(image_filenames[i]), Image.open(gt_filenames[i])

    dataset = delu.data.FnDataset(get, len(image_filenames))
A dummy dataset that demonstrates how general FnDataset is:

    def f(x):
        return x * 10

    def g(x):
        return x * 2

    dataset = delu.data.FnDataset(f, 3, g)
    # dataset[i] returns g(f(i))
    assert len(dataset) == 3
    assert dataset[0] == 0
    assert dataset[1] == 20
    assert dataset[2] == 40
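The behavior shown in the examples can be summarized with a small pure-Python stand-in. MiniFnDataset is a hypothetical name and this is not the delu implementation; it is a sketch inferred from the examples, assuming that an integer args means the arguments 0, 1, ..., args - 1 and that any other iterable is materialized into a list.

```python
class MiniFnDataset:
    """Hypothetical stand-in for illustration, NOT the delu implementation."""

    def __init__(self, fn, args, transform=None):
        self.fn = fn
        # Assumption: an integer n stands for the arguments 0, 1, ..., n - 1;
        # any other iterable (e.g. a generator) is turned into a list.
        self.args = list(range(args)) if isinstance(args, int) else list(args)
        self.transform = transform

    def __len__(self):
        return len(self.args)

    def __getitem__(self, index):
        value = self.fn(self.args[index])
        # The optional transform is applied to the result of fn.
        return value if self.transform is None else self.transform(value)


def f(x):
    return x * 10


def g(x):
    return x * 2


dataset = MiniFnDataset(f, 3, g)  # dataset[i] returns g(f(i))
from_generator = MiniFnDataset(str.upper, (s for s in ["a", "b"]))
```

Materializing the arguments up front is what makes len() and random access possible even when args is a one-shot generator such as the result of iterdir().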
Methods
- __init__: Initialize self.
- __getitem__: Get value by index.
- __len__: Get the dataset size.