Benchmark results

Digits datasets

We use the same network architecture, number of epochs, and batch_size for all experiments. These are the results obtained with the default configuration in digits_network.json. The feature extractor is a convolutional neural network with 3 convolutional layers; the classifier is a 3-layer fully connected network.
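As a rough illustration, here is a minimal PyTorch sketch of that architecture. The layer widths, kernel sizes, and feature dimension below are illustrative assumptions, not the actual values from digits_network.json:

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """3 convolutional layers, as described above (sizes are assumptions)."""
    def __init__(self, in_channels=3, hidden=64, feature_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(hidden, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(hidden, feature_dim, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # -> (N, feature_dim, 1, 1)
        )

    def forward(self, x):
        return self.conv(x).flatten(1)  # -> (N, feature_dim)

class Classifier(nn.Module):
    """3-layer fully connected classifier on top of the extracted features."""
    def __init__(self, feature_dim=128, hidden=100, n_classes=10):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(feature_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, features):
        return self.fc(features)

if __name__ == "__main__":
    x = torch.randn(8, 3, 32, 32)  # a batch of 32x32 RGB digit images
    logits = Classifier()(FeatureExtractor()(x))
    print(logits.shape)  # torch.Size([8, 10])
```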

MNIST -> MNIST-M (5 runs)

| Method | Source acc | Target acc |
|--------|------------|------------|
| Source | 89.0% ± 2.52 | 34.0% ± 1.71 |
| DANN | 94.2% ± 1.57 | 37.5% ± 2.85 |
| CDAN | 98.7% ± 0.19 | 68.4% ± 1.80 |
| CDAN-E | 98.7% ± 0.12 | 69.6% ± 1.51 |
| DAN | 98.0% ± 0.68 | 47.0% ± 1.85 |
| JAN | 96.4% ± 4.57 | 52.9% ± 2.16 |
| WDGRL | 93.9% ± 2.70 | 52.0% ± 4.82 |

MNIST -> USPS (5 runs)

| Method | Source acc | Target acc |
|--------|------------|------------|
| Source | 99.2% ± 0.08 | 94.2% ± 1.07 |
| DANN | 99.1% ± 0.15 | 93.8% ± 1.06 |
| CDAN | 98.8% ± 0.17 | 90.7% ± 1.17 |
| CDAN-E | 98.9% ± 0.11 | 90.3% ± 0.98 |
| DAN | 99.0% ± 0.14 | 95.0% ± 0.83 |
| JAN | 98.6% ± 0.30 | 89.5% ± 2.00 |
| WDGRL | 98.7% ± 0.13 | 85.7% ± 6.57 |

SVHN -> MNIST (5 runs)

| Method | Source acc | Target acc |
|--------|------------|------------|
| Source | 80.4% ± 1.65 | 60.2% ± 1.98 |
| DANN | 80.7% ± 2.09 | 61.7% ± 2.75 |
| CDAN | 80.5% ± 1.91 | 79.0% ± 3.13 |
| CDAN-E | 82.0% ± 0.74 | 77.9% ± 5.59 |
| DAN | 80.7% ± 1.26 | 54.8% ± 2.76 |
| JAN | 79.6% ± 0.67 | 57.9% ± 2.35 |
| WDGRL | 80.5% ± 1.09 | 59.5% ± 3.61 |

MNIST -> SVHN (5 runs)

This problem is much harder than the others, and results for it are usually not reported in the literature. Indeed, most methods fail to improve performance here; on the contrary, aligning features decreases overall performance.

| Method | Source acc | Target acc |
|--------|------------|------------|
| Source | 91.6% ± 2.28 | 16.4% ± 3.31 |
| DANN | 96.2% ± 0.24 | 19.5% ± 2.60 |
| CDAN | 67.0% ± 12.74 | 11.5% ± 1.62 |
| CDAN-E | 59.8% ± 18.99 | 11.3% ± 1.08 |
| DAN | 93.3% ± 3.94 | 16.7% ± 1.19 |
| JAN | 68.4% ± 12.72 | 11.5% ± 1.53 |
| WDGRL | 77.4% ± 3.11 | 13.8% ± 1.75 |

Office 31 dataset

We use the same network architecture, number of epochs, and batch_size for all experiments. These are the results obtained with the default configuration in office31_network.json. The feature extractor is a ResNet50 with the last layer removed; the task classifier is linear.
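For illustration, a minimal sketch of this setup with torchvision. Assumptions: ImageNet-pretrained weights and the torchvision ≥ 0.13 weights API; the actual settings live in office31_network.json:

```python
import torch
import torch.nn as nn
from torchvision import models

# ResNet50 with its final fully connected layer removed: after global
# average pooling the backbone outputs 2048-dimensional features.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
feature_dim = backbone.fc.in_features  # 2048
backbone.fc = nn.Identity()            # drop the last layer

# Linear task classifier on the 2048-d features; Office 31 has 31 classes.
classifier = nn.Linear(feature_dim, 31)

x = torch.randn(4, 3, 224, 224)        # a batch of Office 31 images
logits = classifier(backbone(x))       # shape (4, 31)
```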

The source accuracies look somewhat lower than those reported in the literature (where I found values ranging from 68% to 80%). Note, however, that the parameters have not been fully fine-tuned, and that the confidence intervals are wider than those reported in the literature.

Amazon -> Webcam (5 runs)

| Method | Source acc | Target acc |
|--------|------------|------------|
| Source | 83.0% ± 1.04 | 59.7% ± 3.67 |
| DANN | 82.7% ± 1.56 | 73.4% ± 5.63 |
| CDAN | 82.0% ± 1.30 | 82.3% ± 3.52 |
| CDAN-E | 82.7% ± 0.74 | 81.6% ± 4.67 |
| DAN | 83.0% ± 1.65 | 68.1% ± 1.96 |
| JAN | 82.0% ± 2.02 | 64.6% ± 3.70 |
| WDGRL | 83.4% ± 1.72 | 75.5% ± 3.13 |

Amazon -> DSLR (5 runs)

| Method | Source acc | Target acc |
|--------|------------|------------|
| Source | 81.9% ± 1.02 | 62.7% ± 3.25 |
| DANN | 80.2% ± 0.99 | 68.8% ± 2.78 |
| CDAN | 81.2% ± 2.06 | 72.0% ± 2.09 |
| CDAN-E | 80.1% ± 1.35 | 70.9% ± 3.24 |
| DAN | 82.3% ± 2.35 | 68.7% ± 1.36 |
| JAN | 78.1% ± 4.77 | 62.3% ± 2.39 |
| WDGRL | 79.6% ± 2.53 | 68.8% ± 1.63 |

DSLR -> Amazon (5 runs)

| Method | Source acc | Target acc |
|--------|------------|------------|
| Source | 95.6% ± 2.08 | 55.6% ± 6.44 |
| DANN | 91.6% ± 2.42 | 53.8% ± 5.61 |
| CDAN | 87.2% ± 3.09 | 56.6% ± 4.87 |
| CDAN-E | 89.1% ± 2.09 | 54.1% ± 6.71 |
| DAN | 95.9% ± 2.05 | 58.1% ± 5.49 |
| JAN | 92.8% ± 2.75 | 54.1% ± 5.92 |
| WDGRL | 88.8% ± 2.08 | 56.6% ± 4.03 |