STS



Official results


The main evaluation metric is the "mean" column. The "rank" column gives each submission's position when all submissions are ordered by their "mean" result, best first.
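The relationship between the per-dataset columns, "mean" and "rank" can be sketched as follows. This is a minimal illustration, not the organizers' scoring script. It assumes that "mean" is an average weighted by the number of sentence pairs in each dataset, and the pair counts in PAIRS (750, 561, 189, 750) are an assumption not stated on this page; with them, the computed values agree with the listed "mean" column to within rounding for the rows used here. The function names are hypothetical.

```python
# Assumed number of sentence pairs per dataset (not stated on this page).
PAIRS = {"headlines": 750, "OnWN": 561, "FNWN": 189, "SMT": 750}


def weighted_mean(scores):
    """Average the per-dataset scores, weighting each by its pair count."""
    total_pairs = sum(PAIRS.values())
    return sum(scores[name] * count for name, count in PAIRS.items()) / total_pairs


def rank_by_mean(rows):
    """Map each team to its 1-based position when sorted by mean, best first."""
    ordered = sorted(rows, key=lambda team: weighted_mean(rows[team]), reverse=True)
    return {team: position for position, team in enumerate(ordered, start=1)}


# Three rows copied from the core task table, deliberately out of order.
rows = {
    "deft-baseline": {"headlines": 0.6532, "OnWN": 0.8431, "FNWN": 0.5083, "SMT": 0.3265},
    "UMBC_EBIQUITY-ParingWords": {"headlines": 0.7642, "OnWN": 0.7529, "FNWN": 0.5818, "SMT": 0.3804},
    "UMBC_EBIQUITY-galactus": {"headlines": 0.7428, "OnWN": 0.7053, "FNWN": 0.5444, "SMT": 0.3705},
}

ranks = rank_by_mean(rows)
for team in sorted(ranks, key=ranks.get):
    print(ranks[team], team, f"{weighted_mean(rows[team]):.4f}")
```

Because the per-dataset scores in the tables are themselves rounded to four decimals, a recomputed mean can differ from the listed one in the last digit.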


Core task (official)

TEAM NAME headlines OnWN FNWN SMT mean rank
aolney-w3c3 0.5248 0.4701 0.1777 0.2744 0.3986 67
baseline-tokencos 0.5399 0.2828 0.2146 0.2861 0.3639 73
BGU-1 0.5075 0.3252 0.0768 0.1843 0.3181 81
BGU-2 0.3608 0.3777 -0.0173 0.0698 0.2363 88
BGU-3 0.3591 0.3360 0.0072 0.2122 0.2748 85
BUAP-RUN1 0.5005 0.2579 0.1766 0.2322 0.3234 78
BUAP-RUN2 0.4860 0.2872 0.2082 0.2117 0.3216 79
BUAP-RUN3 0.4817 0.2711 0.2511 0.1990 0.3156 82
CFILT-1 0.5336 0.2381 0.2261 0.2906 0.3531 75
CLaC-RUN1 0.6774 0.7667 0.3793 0.3068 0.5511 10
CLaC-RUN2 0.6921 0.7366 0.3793 0.3375 0.5587 7
CLaC-RUN3 0.5276 0.6495 0.4158 0.3082 0.4755 47
CNGL-LPSSVR 0.6510 0.6971 0.1180 0.2861 0.4961 36
CNGL-LPSSVRTL 0.6385 0.6756 0.1823 0.3098 0.4998 33
CNGL-LSSVR 0.6552 0.6943 0.2016 0.3005 0.5086 30
CPN-combined.RandSubSpace 0.6771 0.5135 0.3314 0.3369 0.4939 39
CPN-combined.SVM 0.6685 0.5096 0.3621 0.3408 0.4939 38
CPN-individual.RandSubSpace 0.6771 0.5484 0.3314 0.2769 0.4826 45
DeepPurple-length 0.6542 0.5105 0.2507 0.2803 0.4598 56
DeepPurple-linear 0.6878 0.5105 0.2693 0.2787 0.4721 50
DeepPurple-lineara 0.6227 0.5105 0.3265 0.2952 0.4607 55
deft-baseline 0.6532 0.8431 0.5083 0.3265 0.5795 3
deft-baseline2 0.5706 0.8111 0.5503 0.3325 0.5495 13
DLS@CU-char 0.3867 0.2386 0.3726 0.3337 0.3309 76
DLS@CU-charSemantic 0.4669 0.4165 0.3859 0.3411 0.4056 64
DLS@CU-charWordSemantic 0.4921 0.3769 0.4647 0.3492 0.4135 63
ECNUCS-Run1 0.5656 0.2083 0.1725 0.2949 0.3533 74
ECNUCS-Run2 0.7120 0.5388 0.2013 0.2504 0.4720 51
ECNUCS-Run3 0.6799 0.5284 0.2203 0.3595 0.4967 35
HENRY-run1 0.7601 0.4631 0.3516 0.2801 0.4917 41
HENRY-run2 0.7645 0.4631 0.3905 0.3593 0.5229 26
HENRY-run3 0.7103 0.3934 0.3364 0.3308 0.4734 48
IBM_EG-run2 0.7217 0.6110 0.3364 0.3460 0.5365 19
IBM_EG-run5 0.7410 0.5987 0.4133 0.3426 0.5452 15
IBM_EG-run6 0.7447 0.6257 0.4381 0.3275 0.5502 11
ikernels-sys1 0.7352 0.5432 0.3842 0.3180 0.5188 28
ikernels-sys2 0.7465 0.5572 0.3875 0.3409 0.5339 21
ikernels-sys3 0.7395 0.4228 0.3596 0.3294 0.4919 40
INAOE-UPV-run1 0.6392 0.3249 0.2711 0.3491 0.4332 59
INAOE-UPV-run2 0.6390 0.3260 0.2662 0.3457 0.4319 60
INAOE-UPV-run3 0.6468 0.6295 0.4090 0.3047 0.5085 31
KLUE-approach_1 0.6521 0.6507 0.3996 0.3367 0.5254 25
KLUE-approach_2 0.6510 0.6869 0.4189 0.3360 0.5355 20
KnCe2013-all 0.3475 0.3505 0.1073 0.1551 0.2639 86
KnCe2013-diff 0.4028 0.3537 0.1284 0.1804 0.2934 84
KnCe2013-set 0.0462 -0.1526 0.0376 -0.0605 -0.0397 90
LCL_Sapienza-ADW1 0.6943 0.4661 0.3571 0.3311 0.4880 43
LCL_Sapienza-ADW2 0.6520 0.5280 0.3598 0.3681 0.5019 32
LCL_Sapienza-ADW3 0.6205 0.5108 0.4462 0.3838 0.4996 34
LIPN-tAll 0.7063 0.6937 0.4037 0.3005 0.5425 16
LIPN-tSp 0.5791 0.7199 0.3522 0.3721 0.5261 24
MayoClinicNLP-r1wtCDT 0.6584 0.7775 0.3735 0.3605 0.5649 6
MayoClinicNLP-r2CDT 0.6827 0.6612 0.3960 0.3946 0.5572 8
MayoClinicNLP-r3wtCD 0.6440 0.8295 0.3202 0.3561 0.5671 5
NTNU-RUN1 0.7279 0.5952 0.3215 0.4015 0.5519 9
NTNU-RUN2 0.5909 0.1634 0.3650 0.3786 0.3946 68
NTNU-RUN3 0.7274 0.5882 0.3115 0.4035 0.5498 12
PolyUCOMP-RUN1 0.5176 0.1517 0.2496 0.2914 0.3284 77
SOFTCARDINALITY-run1 0.6410 0.7360 0.3442 0.3035 0.5273 23
SOFTCARDINALITY-run2 0.6713 0.7412 0.3838 0.2981 0.5402 18
SOFTCARDINALITY-run3 0.6603 0.7401 0.3347 0.2900 0.5294 22
sriubc-System1* 0.6083 0.2915 0.2790 0.3065 0.4011 66
sriubc-System2* 0.6359 0.3664 0.2713 0.3476 0.4420 57
sriubc-System3* 0.5443 0.2843 0.2705 0.3275 0.3842 70
SXUCFN-run1 0.6806 0.5355 0.3181 0.3980 0.5198 27
SXUCFN-run2 0.4881 0.6146 0.4237 0.3844 0.4797 46
SXUCFN-run3 0.6761 0.6481 0.3025 0.4003 0.5458 14
SXULLL-1 0.4840 0.7146 0.0415 0.1543 0.3944 69
UCam-A 0.5510 0.3099 0.2385 0.1171 0.3200 80
UCam-B 0.6399 0.4440 0.3995 0.3400 0.4709 53
UCam-C 0.4962 0.5639 0.1724 0.3006 0.4207 62
UCSP-NC** 0.1736 0.0853 0.1151 0.1658 0.1441 89
UMBC_EBIQUITY-galactus 0.7428 0.7053 0.5444 0.3705 0.5927 2
UMBC_EBIQUITY-ParingWords 0.7642 0.7529 0.5818 0.3804 0.6181 1
UMBC_EBIQUITY-saiyan 0.7838 0.5593 0.5815 0.3563 0.5683 4
UMCC_DLSI-1 0.5841 0.4847 0.2917 0.2855 0.4352 58
UMCC_DLSI-2 0.6168 0.5557 0.3045 0.3407 0.4833 44
UMCC_DLSI-3 0.3846 0.1342 -0.0065 0.2736 0.2523 87
UNIBA-2STEPSML 0.4255 0.4801 0.1832 0.2710 0.3673 71
UNIBA-DSM_PERM 0.6319 0.4910 0.2717 0.3155 0.4610 54
UNIBA-STACKING 0.6275 0.4658 0.2111 0.2588 0.4293 61
Unimelb_NLP-bahar 0.7119 0.3490 0.3813 0.3507 0.4733 49
Unimelb_NLP-concat 0.7085 0.6790 0.3374 0.3230 0.5415 17
Unimelb_NLP-stacking 0.7064 0.6140 0.1865 0.3144 0.5091 29
Unitor-SVRegressor_run1 0.6353 0.5744 0.3521 0.3285 0.4941 37
Unitor-SVRegressor_run2 0.6511 0.5610 0.3580 0.3096 0.4902 42
Unitor-SVRegressor_run3 0.6027 0.5489 0.3269 0.3192 0.4716 52
UPC-AE 0.6092 0.5679 -0.1268 0.2090 0.4037 65
UPC-AED 0.4136 0.4770 -0.0852 0.1662 0.3050 83
UPC-AED_T 0.5119 0.6386 -0.0464 0.1235 0.3671 72



Core task (without confidence scores)

TEAM NAME headlines OnWN FNWN SMT mean rank
BGU-3 0.3700 0.3305 0.0090 0.1893 0.2696 85
ECNUCS-Run1 0.6178 0.1928 0.1680 0.3546 0.3863 68
ECNUCS-Run2 0.7351 0.5171 0.2078 0.1894 0.4546 54
ECNUCS-Run3 0.7176 0.5108 0.2492 0.3174 0.4933 37
LIPN-tAll 0.6982 0.6838 0.4029 0.2848 0.5320 20
LIPN-tSp 0.5629 0.7112 0.3509 0.3599 0.5144 26
MayoClinicNLP-r1wtCDT 0.6854 0.7751 0.3825 0.3427 0.5681 6
MayoClinicNLP-r2CDT 0.7591 0.7247 0.4017 0.3613 0.5879 3
MayoClinicNLP-r3wtCD 0.6593 0.8225 0.3064 0.3485 0.5668 7
SXUCFN-run1 0.6462 0.3519 0.2721 0.3359 0.4380 56
SXUCFN-run2 0.3658 0.4443 0.3813 0.2959 0.3634 73
SXUCFN-run3 0.6378 0.5095 0.2953 0.3386 0.4773 44
SXULLL-1 0.3282 0.6762 0.1215 0.0942 0.3196 80






Typed-similarity task

TEAM NAME general author people_involved time location event subject description mean rank
baseline-cos 0.6691 0.4278 0.4460 0.5002 0.4835 0.3062 0.5015 0.5810 0.4894 8
BUAP-RUN1 0.6798 0.6166 0.0670 0.2761 0.0163 0.1612 0.5167 0.5283 0.3577 14
BUAP-RUN2 0.6745 0.6093 0.1285 0.3721 0.0163 0.1660 0.5094 0.5546 0.3788 13
BUAP-RUN3 0.6992 0.6345 0.1055 0.1461 0.0000 -0.0668 0.3729 0.5120 0.3004 15
BUT-1 0.3686 0.7468 0.3920 0.5725 0.3604 0.2906 0.2270 0.5882 0.4433 9
ECNUCS-Run1 0.6040 0.7362 0.3663 0.4685 0.3844 0.4057 0.5229 0.6027 0.5113 5
ECNUCS-Run2 0.6064 0.5684 0.3663 0.4685 0.3844 0.4057 0.5563 0.6027 0.4948 7
PolyUCOMP-RUN1 0.4888 0.6940 0.3223 0.3820 0.3621 0.1625 0.3962 0.4816 0.4112 12
PolyUCOMP-RUN2 0.4893 0.6940 0.3253 0.3777 0.3628 0.1968 0.3962 0.4816 0.4155 11
PolyUCOMP-RUN3 0.4915 0.6940 0.3254 0.3737 0.3667 0.2207 0.3962 0.4816 0.4187 10
UBC_UOS-RUN1* 0.7256 0.4568 0.4467 0.5762 0.4858 0.3090 0.5015 0.5810 0.5103 6
UBC_UOS-RUN2* 0.7457 0.6618 0.6518 0.7466 0.7244 0.6533 0.7404 0.7751 0.7124 4
UBC_UOS-RUN3* 0.7461 0.6656 0.6544 0.7411 0.7257 0.6545 0.7417 0.7763 0.7132 3
Unitor-SVRegressor_lin 0.7564 0.8076 0.6758 0.7090 0.7351 0.6623 0.7520 0.7745 0.7341 2
Unitor-SVRegressor_rbf 0.7981 0.8158 0.6922 0.7471 0.7723 0.6835 0.7875 0.7996 0.7620 1




Notes

* marks submissions which involve organizers of the task

** system submitted past the 120 hour window


Some submissions had minor issues:


  • NTNU had a confidence score of 0 for all items; we replaced it with 100
  • DeepPurple had a few NaN values for the SMT dataset, which we replaced with 5
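The two fixes above could be applied along these lines. This is only a sketch under stated assumptions: it treats a submission as two parallel lists of floats (similarity scores and confidence scores), and the function names and default fill values are hypothetical, chosen to mirror the replacements described in the notes. It is not the organizers' actual clean-up code.

```python
import math


def repair_scores(scores, fill=5.0):
    """Replace NaN similarity scores with a fixed fill value."""
    return [fill if math.isnan(score) else score for score in scores]


def repair_confidences(confidences, fill=100.0):
    """If every confidence is 0, replace them all with the fill value."""
    if confidences and all(value == 0 for value in confidences):
        return [fill] * len(confidences)
    # Mixed or non-zero confidences are left untouched.
    return list(confidences)


print(repair_scores([4.2, float("nan"), 0.0]))
print(repair_confidences([0, 0, 0]))
```

Confidences are only rewritten when all of them are zero, so a submission that uses 0 for some items and other values elsewhere is passed through unchanged.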


 

Announcements


- *SEM program (incl. papers)
- *SEM registration open
- System runs available
- Gold standard data now available
- Results now available
- Train data for pilot on typed similarity available
- Trial data available
- STS selected as shared task of *SEM 2013
- Please join the mailing list for updates