Source code for KDD'23 paper: BatchSampler: Sampling Mini-Batches for Contrastive Learning in Vision, Language, and Graphs.
BatchSampler is a simple and general method for sampling mini-batches of hard-to-distinguish instances (i.e., instances that are hard, true negatives of each other), and it can be plugged directly into in-batch contrastive models in vision, language, and graphs.
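As a rough illustration of the idea (not the repo's API; all names below are hypothetical), one way to sample a batch of mutually hard negatives is to build a k-nearest-neighbor proximity graph over instance embeddings and collect the batch with a restarting random walk, so sampled instances are close to each other in embedding space:

```python
import numpy as np

def sample_hard_batch(embeddings, batch_size, k=5, restart_p=0.15, seed=0):
    """Hypothetical sketch: build a kNN proximity graph over embeddings,
    then random-walk on it (with restarts) so the collected batch consists
    of instances that are close, i.e. hard to distinguish, in embedding space."""
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    # Cosine similarity between all pairs of L2-normalized embeddings.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops
    # Each node's k most similar nodes form its neighborhood in the graph.
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    batch, seen = [], set()
    current = int(rng.integers(n))
    while len(batch) < batch_size:
        if current not in seen:
            seen.add(current)
            batch.append(current)
        # Hop to a random neighbor; restart occasionally so the walk
        # cannot get trapped inside one tight clique of the graph.
        if rng.random() < restart_p:
            current = int(rng.integers(n))
        else:
            current = int(rng.choice(neighbors[current]))
    return np.array(batch)
```

The returned indices can then be used to assemble a mini-batch for any in-batch contrastive objective.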
- Python >= 3.7
- PyTorch >= 1.9.0
Taking the vision modality as an example, you can run the code on STL10:

```sh
sh train.sh
```
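For context, here is a minimal sketch of the in-batch (InfoNCE) contrastive loss that such mini-batches feed into; the function name and signature are illustrative, not the repo's API. Every non-matching instance in the batch serves as a negative, which is why the batch's composition matters:

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """In-batch InfoNCE: for each anchor row of view 1, its positive is the
    corresponding row of view 2, and every other instance in the mini-batch
    acts as a negative. Harder (more similar) in-batch negatives produce a
    stronger learning signal, which is what hard-negative batch sampling targets."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature             # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # positives sit on the diagonal
```

The loss is minimized when each anchor is most similar to its own positive, so replacing easy in-batch negatives with hard ones tightens the objective.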
We conduct experiments on datasets across three modalities. For the vision modality, we use one large-scale dataset (ImageNet), two medium-scale datasets (STL10 and ImageNet-100), and two small-scale datasets (CIFAR10 and CIFAR100). For the language modality, we use 7 semantic textual similarity (STS) tasks. For the graph modality, we conduct graph-level classification experiments on 7 benchmark datasets: IMDB-B, IMDB-M, COLLAB, REDDIT-B, PROTEINS, MUTAG, and NCI1.
Vision Modality
Method | 100ep | 400ep | 800ep |
---|---|---|---|
SimCLR | 64.0 | 68.1 | 68.7 |
w/BatchSampler | 64.7 | 68.6 | 69.2 |
MoCo v3 | 68.9 | 73.3 | 73.8 |
w/BatchSampler | 69.5 | 73.7 | 74.2 |
Language Modality
Method | STS12 | STS13 | STS14 | STS15 | STS16 | STS-B | SICK-R | Avg. |
---|---|---|---|---|---|---|---|---|
SimCSE-BERT-base | 68.62 | 80.89 | 73.74 | 80.88 | 77.66 | 77.79 | 69.64 | 75.60 |
w/kNN Sampler | 63.62 | 74.86 | 69.79 | 79.17 | 76.24 | 74.73 | 67.74 | 72.31 |
w/BatchSampler | 72.37 | 82.08 | 75.24 | 83.10 | 78.43 | 77.54 | 68.05 | 76.69 |
DCL-BERT-base | 65.22 | 77.89 | 68.94 | 79.88 | 76.72 | 73.89 | 69.54 | 73.15 |
w/kNN Sampler | 66.34 | 76.66 | 72.60 | 78.30 | 74.86 | 73.65 | 67.92 | 72.90 |
w/BatchSampler | 69.55 | 82.66 | 73.37 | 80.40 | 75.37 | 75.43 | 66.76 | 74.79 |
HCL-BERT-base | 62.57 | 79.12 | 69.70 | 78.00 | 75.11 | 73.38 | 69.74 | 72.52 |
w/kNN Sampler | 61.12 | 75.73 | 68.43 | 76.64 | 74.78 | 71.22 | 68.04 | 70.85 |
w/BatchSampler | 66.87 | 81.38 | 72.96 | 80.11 | 77.99 | 75.95 | 70.89 | 75.16 |
Graphs Modality
Method | IMDB-B | IMDB-M | COLLAB | REDDIT-B | PROTEINS | MUTAG | NCI1 |
---|---|---|---|---|---|---|---|
GraphCL | 70.90±0.53 | 48.48±0.38 | 70.62±0.23 | 90.54±0.25 | 74.39±0.45 | 86.80±1.34 | 77.87±0.41 |
w/kNN Sampler | 70.72±0.35 | 47.97±0.97 | 70.59±0.14 | 90.21±0.74 | 74.17±0.41 | 86.46±0.82 | 77.27±0.37 |
w/BatchSampler | 71.90±0.46 | 48.93±0.28 | 71.48±0.28 | 90.88±0.16 | 75.04±0.67 | 87.78±0.93 | 78.93±0.38 |
DCL | 71.07±0.36 | 48.93±0.32 | 71.06±0.51 | 90.66±0.29 | 74.64±0.48 | 88.09±0.93 | 78.49±0.48 |
w/kNN Sampler | 70.94±0.19 | 48.47±0.35 | 70.49±0.37 | 90.26±1.03 | 74.28±0.17 | 87.13±1.40 | 78.13±0.52 |
w/BatchSampler | 71.32±0.17 | 48.96±0.25 | 70.44±0.35 | 90.73±0.34 | 75.02±0.61 | 89.47±1.43 | 79.03±0.32 |
HCL | 71.24±0.36 | 48.54±0.51 | 71.03±0.45 | 90.40±0.42 | 74.69±0.42 | 87.79±1.10 | 78.83±0.67 |
w/kNN Sampler | 71.14±0.44 | 48.36±0.93 | 70.86±0.74 | 90.64±0.51 | 74.06±0.44 | 87.53±1.37 | 78.66±0.48 |
w/BatchSampler | 71.20±0.38 | 48.76±0.39 | 71.70±0.35 | 91.25±0.25 | 75.11±0.63 | 88.31±1.29 | 79.17±0.27 |
MVGRL | 74.20±0.70 | 51.20±0.50 | - | 84.50±0.60 | - | 89.70±1.10 | - |
w/kNN Sampler | 73.30±0.34 | 50.70±0.36 | - | 82.70±0.67 | - | 85.08±0.66 | - |
w/BatchSampler | 76.70±0.35 | 52.40±0.39 | - | 87.47±0.79 | - | 91.13±0.81 | - |
```bibtex
@article{yang2023batchsampler,
  title={BatchSampler: Sampling Mini-Batches for Contrastive Learning in Vision, Language, and Graphs},
  author={Yang, Zhen and Huang, Tinglin and Ding, Ming and Dong, Yuxiao and Ying, Rex and Cen, Yukuo and Geng, Yangliao and Tang, Jie},
  journal={arXiv preprint arXiv:2306.03355},
  year={2023}
}
```