Selectional Preferences extracted from Semcor for WordNet 1.6 Synsets (v 1.0)

The selectional preferences distributed here cover WordNet 1.6 noun and verb synsets and are computed from syntactic relations extracted from Semcor. First, we run the Minipar parser [1] over all the examples in Semcor to obtain their dependencies. We then extract [noun-synset, relation, verb-synset] triples for the relations "object" and "subject".

To compute the weight of a triple, we use the weights of all the concepts above the target noun and verb synsets in the WordNet hierarchy. The formula for these weights is based on frequencies estimated from Semcor: for each occurrence of a synset or a triple in Semcor, we distribute the frequency among its ancestors. The formulas and a complete description of this work can be found in [2] and [3]. An application of the method to multilingual analysis is described in [4].
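
The idea of distributing an observed frequency among ancestors can be illustrated with a minimal sketch. It is not the formula from [2] and [3]: the toy hierarchy, the function names, the equal split, and the choice to count the synset itself among the recipients are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy hypernym map (child -> parent); not real WordNet 1.6 data.
TOY_HYPERNYM = {
    "dog": "canine",
    "canine": "animal",
    "cat": "feline",
    "feline": "animal",
    "animal": None,
}

def chain_to_root(synset):
    """Return the synset followed by all of its ancestors up to the root."""
    chain = []
    while synset is not None:
        chain.append(synset)
        synset = TOY_HYPERNYM[synset]
    return chain

def distribute_frequencies(occurrences):
    """Split each observed count equally over the synset and its ancestors.

    The equal split is an assumption; the actual formula is given in [2], [3].
    """
    freq = defaultdict(float)
    for synset, count in occurrences.items():
        chain = chain_to_root(synset)
        share = count / len(chain)
        for node in chain:
            freq[node] += share
    return dict(freq)

estimated = distribute_frequencies({"dog": 3, "cat": 3})
```

In this sketch the total mass is preserved, and a shared ancestor such as "animal" accumulates frequency from every synset observed below it.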

With the frequency estimates obtained from Semcor, we can apply the formula to get the weight of any triple of the form [noun-synset, relation, verb-synset] for the relations "object" and "subject". A "pruning" method is also provided to limit the noun synsets that are linked to verb synsets (we keep only one synset per branch of the WordNet hierarchy).
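
One possible reading of the pruning step is sketched below. The selection criterion (keep the highest-weighted synset on each branch), the toy hierarchy, the weights, and the function names are assumptions for illustration; the method actually used is described in [2] and [3].

```python
# Hypothetical toy hypernym map (child -> parent); not real WordNet 1.6 data.
TOY_TREE = {
    "dog": "canine",
    "canine": "animal",
    "cat": "feline",
    "feline": "animal",
    "animal": None,
}

def strict_ancestors(synset):
    """Return the set of all ancestors of a synset, excluding the synset itself."""
    result = set()
    parent = TOY_TREE[synset]
    while parent is not None:
        result.add(parent)
        parent = TOY_TREE[parent]
    return result

def prune_branches(weights):
    """Keep at most one noun synset per branch, preferring higher weights.

    Synsets are visited from highest to lowest weight; a synset is dropped
    if an already kept synset is its ancestor or its descendant.
    """
    kept = []
    for synset in sorted(weights, key=weights.get, reverse=True):
        on_kept_branch = any(
            k in strict_ancestors(synset) or synset in strict_ancestors(k)
            for k in kept
        )
        if not on_kept_branch:
            kept.append(synset)
    return {s: weights[s] for s in kept}

# Made-up weights for noun synsets linked to a single verb synset.
pruned = prune_branches({"animal": 0.4, "canine": 0.7, "dog": 0.5, "cat": 0.6})
```

Here "canine" is kept and both "dog" and "animal" are dropped because they lie on the same branch, while "cat" survives on a separate branch.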

With this document, we distribute the weights for the selectional preferences extracted from Semcor, as well as for all the combinations formed with the ancestors of the directly extracted triples. The pruning step has not been applied. The weights have been normalized to range between 0 and 1 (the higher the weight, the stronger the relation).

Contact: David Martinez (IXA NLP group)


Syntactic relations in Semcor (646K)
Selectional preferences for Subject relations (950K)
Selectional preferences for Object relations (1.5M)


