data.dataloader
async dataloader
AsyncDataLoader
- class ding.utils.data.dataloader.AsyncDataLoader(data_source: Union[Callable, dict], batch_size: int, device: str, chunk_size: Optional[int] = None, collate_fn: Optional[Callable] = None, num_workers: int = 0)
- Overview:
An asynchronous dataloader.
- Interface:
__init__, __iter__, __next__, close
- __init__(data_source: Union[Callable, dict], batch_size: int, device: str, chunk_size: Optional[int] = None, collate_fn: Optional[Callable] = None, num_workers: int = 0) → None
- Overview:
Initialize the dataloader with the input parameters. If data_source is a dict, data is only processed in get_data_thread and put into async_train_queue. If data_source is a Callable, data is processed by executing the functions it provides, in one of two modes:
  - num_workers == 0 or 1: Only the main worker processes the data and puts it into async_train_queue.
  - num_workers > 1: The main worker divides a job into several pieces and pushes each piece into job_queue; the slave workers then take jobs and execute them; finally, they push the processed data into async_train_queue.
As the last step, if device contains "cuda", data in async_train_queue is transferred to cuda_queue for the user to access.
- Arguments:
  - data_source (Union[Callable, dict]): The data source, e.g. a function to be executed (Callable) or a replay buffer's real data (dict).
  - batch_size (int): Batch size.
  - device (str): Device. If it contains "cuda", data is passed through cuda_queue before the user accesses it.
  - chunk_size (int): The size of one chunk of a batch. It must divide batch_size exactly and only takes effect when there is more than one worker.
  - collate_fn (Callable): The function used to collate the batch_size samples of a batch into each data field.
  - num_workers (int): Number of extra workers. 0 or 1 means there is only the main worker and no extra ones, i.e. multiprocessing is disabled. More than 1 means that multiple workers, implemented with multiprocessing, each process part of the data.
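The two num_workers modes and the chunk_size constraint described above can be sketched with the standard library alone. This is a simplified illustration, not DI-engine's implementation: threads stand in for the multiprocessing workers, and make_jobs, worker and load_batch are hypothetical names. It also assumes, for illustration only, that the data source produces one zero-argument loader function per sample.

```python
import queue
import threading
from typing import Callable, List

Job = List[Callable[[], int]]  # hypothetical: a job is a list of zero-argument loaders


def make_jobs(batch_size: int) -> Job:
    # Hypothetical data source: one lazy loader per sample in the batch.
    return [lambda i=i: i * i for i in range(batch_size)]


def worker(job_queue: queue.Queue, result_queue: queue.Queue) -> None:
    # Each worker pulls chunks until it sees the None sentinel.
    while True:
        item = job_queue.get()
        if item is None:
            break
        chunk_id, chunk = item
        result_queue.put((chunk_id, [fn() for fn in chunk]))


def load_batch(batch_size: int, chunk_size: int, num_workers: int) -> List[int]:
    if batch_size % chunk_size != 0:
        raise ValueError("chunk_size must divide batch_size exactly")
    jobs = make_jobs(batch_size)
    if num_workers <= 1:
        # Single-worker mode: the main worker processes the whole batch itself.
        return [fn() for fn in jobs]

    # Multi-worker mode: split the batch into chunks and hand them to workers.
    job_queue: queue.Queue = queue.Queue()
    result_queue: queue.Queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(job_queue, result_queue))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    num_chunks = batch_size // chunk_size
    for c in range(num_chunks):
        job_queue.put((c, jobs[c * chunk_size:(c + 1) * chunk_size]))
    for _ in threads:
        job_queue.put(None)  # one sentinel per worker
    for t in threads:
        t.join()

    # Chunks may finish out of order, so reassemble them by chunk id.
    results = dict(result_queue.get() for _ in range(num_chunks))
    return [x for c in range(num_chunks) for x in results[c]]


batch = load_batch(batch_size=8, chunk_size=2, num_workers=4)
```

In this sketch a chunk_size that does not divide batch_size is rejected up front, since otherwise the last chunk would be shorter than the others and the reassembled batch would not have a uniform layout.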
- __iter__() → Iterable
- Overview:
Return the iterable self as an iterator.
- Returns:
  - self (Iterable): Self as an iterator.
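Returning self from __iter__ is the usual Python pattern for an object that is its own iterator, which lets the loader be used directly in a for loop. The class below is a hypothetical stand-in rather than AsyncDataLoader itself: it reads from a pre-filled queue and, as an assumption made so that the example terminates, stops iterating once the queue is empty.

```python
import queue
from typing import Any, Iterable


class QueueBackedLoader:
    """Hypothetical stand-in that yields whatever was put on its internal queue."""

    def __init__(self, batches: Iterable[Any]) -> None:
        self._queue: queue.Queue = queue.Queue()
        for batch in batches:
            self._queue.put(batch)

    def __iter__(self) -> "QueueBackedLoader":
        # Returning self makes the loader its own iterator.
        return self

    def __next__(self) -> Any:
        try:
            return self._queue.get_nowait()
        except queue.Empty:
            # Assumption for this sketch: an empty queue ends iteration.
            raise StopIteration


loader = QueueBackedLoader([[1, 2], [3, 4]])
same_object = iter(loader) is loader
collected = list(loader)
```

Because iter(loader) returns the loader itself, iteration state lives on the object: a second loop over the same instance continues from where the first one stopped instead of starting over.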