Description
Hi, thanks for your attention.
While reading the transformers source code, I can't understand the implementation of `_get_train_sampler` in `trainer.py`. Why is the default data sampler `RandomSampler` rather than `DistributedSampler`? How does the Trainer handle the sampler for data-parallel training?
reference code: https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py#L975
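For context, here is a minimal sketch (not the transformers implementation) of the behavioral difference I am asking about. The toy dataset size and the two-process setup are just assumptions for illustration:

```python
import torch
from torch.utils.data import RandomSampler, DistributedSampler, TensorDataset

# Toy dataset of 8 items; assume a hypothetical 2-process data-parallel run.
dataset = TensorDataset(torch.arange(8))

# RandomSampler: every process that builds this sampler iterates over
# all 8 indices, each in its own locally shuffled order.
random_sampler = RandomSampler(dataset)
print(list(random_sampler))  # e.g. [3, 0, 7, 5, 1, 6, 2, 4]

# DistributedSampler: each rank only sees its own shard of the indices,
# so two ranks split the 8 indices 4/4 with no overlap.
# num_replicas/rank are passed explicitly here so the sketch runs without
# torch.distributed.init_process_group.
sampler_rank0 = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True, seed=0)
sampler_rank1 = DistributedSampler(dataset, num_replicas=2, rank=1, shuffle=True, seed=0)
print(list(sampler_rank0))  # 4 indices seen by rank 0
print(list(sampler_rank1))  # the other 4 indices seen by rank 1
```

Given this difference, I don't see where the per-rank sharding happens when the Trainer defaults to `RandomSampler`.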