***************
** Arguments **
***************
T: 1.0
backbone: 
batch_size: 128
config_file: configs/trainers/LoCoOp/vit_b16_ep50.yaml
dataset_config_file: configs/datasets/imagenet.yaml
in_dataset: imagenet
lambda_value: 1
load_epoch: 50
model_dir: output/imagenet/LoCoOp/vit_b16_ep50_16shots/nctx16_cscFalse_ctpend/seed1
opts: ['TRAINER.LOCOOP.N_CTX', '16', 'TRAINER.LOCOOP.CSC', 'False']
output_dir: output/imagenet/LoCoOp/vit_b16_ep50_16shots/nctx16_cscFalse_ctpend/seed1
resume: 
root: data
seed: -1
topk: 200
trainer: LoCoOp
************
** Config **
************
DATALOADER:
  K_TRANSFORMS: 1
  NUM_WORKERS: 8
  RETURN_IMG0: False
  TEST:
    BATCH_SIZE: 100
    SAMPLER: SequentialSampler
  TRAIN_U:
    BATCH_SIZE: 32
    N_DOMAIN: 0
    N_INS: 16
    SAME_AS_X: True
    SAMPLER: RandomSampler
  TRAIN_X:
    BATCH_SIZE: 1
    N_DOMAIN: 0
    N_INS: 16
    SAMPLER: RandomSampler
DATASET:
  ALL_AS_UNLABELED: False
  CIFAR_C_LEVEL: 1
  CIFAR_C_TYPE: 
  NAME: ImageNet
  NUM_LABELED: -1
  NUM_SHOTS: -1
  ROOT: data
  SOURCE_DOMAINS: ()
  STL10_FOLD: -1
  SUBSAMPLE_CLASSES: all
  TARGET_DOMAINS: ()
  VAL_PERCENT: 0.1
INPUT:
  COLORJITTER_B: 0.4
  COLORJITTER_C: 0.4
  COLORJITTER_H: 0.1
  COLORJITTER_S: 0.4
  CROP_PADDING: 4
  CUTOUT_LEN: 16
  CUTOUT_N: 1
  GB_K: 21
  GB_P: 0.5
  GN_MEAN: 0.0
  GN_STD: 0.15
  INTERPOLATION: bicubic
  NO_TRANSFORM: False
  PIXEL_MEAN: [0.48145466, 0.4578275, 0.40821073]
  PIXEL_STD: [0.26862954, 0.26130258, 0.27577711]
  RANDAUGMENT_M: 10
  RANDAUGMENT_N: 2
  RGS_P: 0.2
  RRCROP_SCALE: (0.08, 1.0)
  SIZE: (224, 224)
  TRANSFORMS: ('random_resized_crop', 'random_flip', 'normalize')
MODEL:
  BACKBONE:
    NAME: ViT-B/16
    PRETRAINED: True
  HEAD:
    ACTIVATION: relu
    BN: True
    DROPOUT: 0.0
    HIDDEN_LAYERS: ()
    NAME: 
  INIT_WEIGHTS: 
OPTIM:
  ADAM_BETA1: 0.9
  ADAM_BETA2: 0.999
  BASE_LR_MULT: 0.1
  GAMMA: 0.1
  LR: 0.002
  LR_SCHEDULER: cosine
  MAX_EPOCH: 50
  MOMENTUM: 0.9
  NAME: sgd
  NEW_LAYERS: ()
  RMSPROP_ALPHA: 0.99
  SGD_DAMPNING: 0
  SGD_NESTEROV: False
  STAGED_LR: False
  STEPSIZE: (-1,)
  WARMUP_CONS_LR: 1e-05
  WARMUP_EPOCH: 1
  WARMUP_MIN_LR: 1e-05
  WARMUP_RECOUNT: True
  WARMUP_TYPE: constant
  WEIGHT_DECAY: 0.0005
OUTPUT_DIR: output/imagenet/LoCoOp/vit_b16_ep50_16shots/nctx16_cscFalse_ctpend/seed1
RESUME: 
SEED: -1
TEST:
  COMPUTE_CMAT: False
  EVALUATOR: Classification
  FINAL_MODEL: last_step
  NO_TEST: False
  PER_CLASS_RESULT: False
  SPLIT: test
TRAIN:
  CHECKPOINT_FREQ: 0
  COUNT_ITER: train_x
  PRINT_FREQ: 5
TRAINER:
  CDAC:
    CLASS_LR_MULTI: 10
    P_THRESH: 0.95
    RAMPUP_COEF: 30
    RAMPUP_ITRS: 1000
    STRONG_TRANSFORMS: ()
    TOPK_MATCH: 5
  CROSSGRAD:
    ALPHA_D: 0.5
    ALPHA_F: 0.5
    EPS_D: 1.0
    EPS_F: 1.0
  DAEL:
    CONF_THRE: 0.95
    STRONG_TRANSFORMS: ()
    WEIGHT_U: 0.5
  DAELDG:
    CONF_THRE: 0.95
    STRONG_TRANSFORMS: ()
    WEIGHT_U: 0.5
  DDAIG:
    ALPHA: 0.5
    CLAMP: False
    CLAMP_MAX: 1.0
    CLAMP_MIN: -1.0
    G_ARCH: 
    LMDA: 0.3
    WARMUP: 0
  DOMAINMIX:
    ALPHA: 1.0
    BETA: 1.0
    TYPE: crossdomain
  ENTMIN:
    LMDA: 0.001
  FIXMATCH:
    CONF_THRE: 0.95
    STRONG_TRANSFORMS: ()
    WEIGHT_U: 1.0
  LOCOOP:
    CLASS_TOKEN_POSITION: end
    CSC: False
    CTX_INIT: 
    N_CTX: 16
    PREC: fp16
  M3SDA:
    LMDA: 0.5
    N_STEP_F: 4
  MCD:
    N_STEP_F: 4
  MEANTEACHER:
    EMA_ALPHA: 0.999
    RAMPUP: 5
    WEIGHT_U: 1.0
  MIXMATCH:
    MIXUP_BETA: 0.75
    RAMPUP: 20000
    TEMP: 2.0
    WEIGHT_U: 100.0
  MME:
    LMDA: 0.1
  NAME: LoCoOp
  SE:
    CONF_THRE: 0.95
    EMA_ALPHA: 0.999
    RAMPUP: 300
USE_CUDA: True
VERBOSE: True
VERSION: 1
lambda_value: 1
topk: 200
Collecting env info ...
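The `opts` list in the arguments above overrides config entries by dotted key path on top of the YAML config file (here setting TRAINER.LOCOOP.N_CTX to 16 and TRAINER.LOCOOP.CSC to False). A minimal sketch of how such key/value pairs are merged, using a plain nested dict rather than the yacs-style `CfgNode` the trainer actually uses:

```python
def merge_from_list(cfg: dict, opts: list) -> dict:
    """Apply dotted-path key/value overrides (yacs-style) to a nested dict."""
    assert len(opts) % 2 == 0, "opts must be key/value pairs"
    for key, value in zip(opts[::2], opts[1::2]):
        node = cfg
        *path, leaf = key.split(".")
        for part in path:
            node = node.setdefault(part, {})
        # Coerce the string value to the type of the existing entry, if any.
        old = node.get(leaf)
        if isinstance(old, bool):
            node[leaf] = value == "True"
        elif old is not None:
            node[leaf] = type(old)(value)
        else:
            node[leaf] = value
    return cfg

cfg = {"TRAINER": {"LOCOOP": {"N_CTX": 4, "CSC": True}}}
merge_from_list(cfg, ["TRAINER.LOCOOP.N_CTX", "16", "TRAINER.LOCOOP.CSC", "False"])
# cfg["TRAINER"]["LOCOOP"] is now {"N_CTX": 16, "CSC": False}
```

This mirrors the override behavior visible in the log, where the config dump reflects N_CTX: 16 and CSC: False despite whatever defaults the YAML carries.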
** System info **
PyTorch version: 1.10.0
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.5 2023.03.29 LTS (Cubic 2023-03-30 14:05) (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-91-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 12.0.140
GPU models and configuration:
GPU 0: NVIDIA GeForce RTX 4090
GPU 1: NVIDIA GeForce RTX 4090
GPU 2: NVIDIA GeForce RTX 4090
GPU 3: NVIDIA GeForce RTX 4090
Nvidia driver version: 525.85.12
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.6.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.6.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.6.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.6.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.6.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.6.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.6.0
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.24.3
[pip3] torch==1.10.0
[pip3] torchaudio==0.10.0
[pip3] torchvision==0.11.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h9edb442_10 conda-forge
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py38h95df7f1_0 conda-forge
[conda] mkl_fft 1.3.1 py38h8666266_1 conda-forge
[conda] mkl_random 1.2.2 py38h1abd341_0 conda-forge
[conda] numpy 1.24.3 py38h14f4228_0
[conda] numpy-base 1.24.3 py38h31eccc5_0
[conda] pytorch 1.10.0 py3.8_cuda11.3_cudnn8.2.0_0 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchaudio 0.10.0 py38_cu113 pytorch
[conda] torchvision 0.11.0 py38_cu113 pytorch
Pillow (8.3.2)

Loading trainer: LoCoOp
Loading dataset: ImageNet
Building transform_train
+ random resized crop
(size=(224, 224), scale=(0.08, 1.0))
+ random flip
+ to torch tensor of range [0, 1]
+ normalization (mean=[0.48145466, 0.4578275, 0.40821073], std=[0.26862954, 0.26130258, 0.27577711])
Building transform_test
+ resize the smaller edge to 224
+ 224x224 center crop
+ to torch tensor of range [0, 1]
+ normalization (mean=[0.48145466, 0.4578275, 0.40821073], std=[0.26862954, 0.26130258, 0.27577711])
---------  ---------
Dataset    ImageNet
# classes  1,000
# train_x  1,281,167
# val      50,000
# test     50,000
---------  ---------
Loading CLIP (backbone: ViT-B/16)
Building custom CLIP
Initializing a generic context
Initial context: "X X X X X X X X X X X X X X X X"
Number of context words (tokens): 16
Turning off gradients in both the image and the text encoder
Loading evaluator: Classification
Loading weights to prompt_learner from "output/imagenet/LoCoOp/vit_b16_ep50_16shots/nctx16_cscFalse_ctpend/seed1/prompt_learner/model.pth.tar-50" (epoch = 50)

Evaluating OOD dataset iNaturalist
MCM score
  ID score samples (random sample): [-0.001081 -0.001121 -0.001067]
  OOD score samples: [-0.001066 -0.001061 -0.001075]
  FPR: 0.4061, AUROC: 0.9142617919999999, AUPR: 0.979728856435133
GL-MCM score
  ID score samples (random sample): [-0.002188 -0.002218 -0.002176]
  OOD score samples: [-0.002148 -0.002134 -0.002155]
  FPR: 0.2775, AUROC: 0.935141396, AUPR: 0.9844209777294515

Evaluating OOD dataset SUN
MCM score
  ID score samples (random sample): [-0.001081 -0.001121 -0.001067]
  OOD score samples: [-0.00107 -0.001074 -0.0010395]
  FPR: 0.3949, AUROC: 0.925418654, AUPR: 0.9831906898820529
GL-MCM score
  ID score samples (random sample): [-0.002188 -0.002218 -0.002176]
  OOD score samples: [-0.002169 -0.002165 -0.002117]
  FPR: 0.305, AUROC: 0.935670149, AUPR: 0.9848521450283514

Evaluating OOD dataset places365
MCM score
  ID score samples (random sample): [-0.001081 -0.001121 -0.001067]
  OOD score samples: [-0.001071 -0.001081 -0.001064]
  FPR: 0.4588, AUROC: 0.8973732019999998, AUPR: 0.9742088284349986
GL-MCM score
  ID score samples (random sample): [-0.002188 -0.002218 -0.002176]
  OOD score samples: [-0.00215 -0.002193 -0.00213]
  FPR: 0.3731, AUROC: 0.907325958, AUPR: 0.9760026198427791

Evaluating OOD dataset Texture
MCM score
  ID score samples (random sample): [-0.001081 -0.001121 -0.001067]
  OOD score samples: [-0.001058 -0.001055 -0.001053]
  FPR: 0.46879432624113476, AUROC: 0.8987898368794327, AUPR: 0.986073235471353
GL-MCM score
  ID score samples (random sample): [-0.002188 -0.002218 -0.002176]
  OOD score samples: [-0.002163 -0.002148 -0.002167]
  FPR: 0.5072695035460993, AUROC: 0.8727623102836879, AUPR: 0.9805123197658356

MCM average:    FPR: 0.43214858156028363, AUROC: 0.9089608712198581, AUPR: 0.9808004025558843
GL-MCM average: FPR: 0.36571737588652486, AUROC: 0.912724953320922, AUPR: 0.9814470155916044
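The FPR, AUROC, and AUPR numbers above are standard OOD-detection metrics computed over the ID (ImageNet) and OOD score arrays: FPR is the false positive rate on OOD samples at the threshold that keeps 95% of ID samples, and AUROC is the probability that a randomly chosen ID sample scores higher than a randomly chosen OOD sample. A minimal, pure-Python sketch of these two metrics (the actual evaluation presumably computes them with sklearn over the full score arrays):

```python
def fpr_at_95_tpr(in_scores, out_scores):
    """FPR on OOD samples at the threshold where 95% of ID samples are kept.

    Assumes higher score = more in-distribution (as with MCM/GL-MCM scores,
    which are negative but still ordered this way).
    """
    thr = sorted(in_scores)[int(0.05 * len(in_scores))]  # ~5th percentile of ID scores
    return sum(s >= thr for s in out_scores) / len(out_scores)

def auroc(in_scores, out_scores):
    """Probability a random ID score exceeds a random OOD score (ties count 1/2)."""
    wins = sum((i > o) + 0.5 * (i == o) for i in in_scores for o in out_scores)
    return wins / (len(in_scores) * len(out_scores))

# Toy example: 5 of the 6 ID/OOD pairs are ordered correctly.
print(auroc([0.9, 0.8, 0.7], [0.6, 0.75]))  # 0.8333...
```

A lower FPR and a higher AUROC both indicate better ID/OOD separation, which is why GL-MCM's lower average FPR (0.366 vs. 0.432) in the log corresponds to its higher average AUROC.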