Posted on 2008-06-02 13:10:42-07 by tpederse in response to 7998
Re: Compound Words?
Hi Ana, In fact I don't think you are doing anything wrong! :) There are a few different issues in your note, and let me tackle them one by one.

As of version 0.10 we have made compound handling "invisible". That is to say, the user no longer needs to specify a compound file via --compounds; instead we automatically construct the compound file from WordNet. The reason is that it didn't really make sense for us to require the user to input a list of WordNet compounds, since we could construct it ourselves automatically. Also, we felt that a user might be misled by the compounds option - if they input a compound that is not known to WordNet, it still won't be used (because we can only handle compounds that are known to WordNet). Now, fast_food and miami_beach are both known to WordNet as compounds, so they will still be found as compounds even if you don't specify them in the compounds file (along with the other 60,000+ compounds known to WordNet).

As to your questions on compound handling, there are two issues. The first has to do with the different results you see with the different measures. That is generally expected. jcn is only able to measure similarity between pairs of nouns or pairs of verbs - lesk on the other hand can measure relatedness between "mixed" pairs, that is a noun and a verb, or a verb and an adjective, etc. So, in general you should expect to see differences between these measures.

Finally, if you are using tagged or wntagged text, compounds are not identified, or at least we don't intend to identify them. :) You would need to do the compound identification prior to part of speech tagging, and input the text as:

Every/DT year/NN I/PRP usually/RB go/VBP to/TO Miami_Beach/NNP on/IN vacation/NN

The reason for that is that we felt attempting to identify compounds in POS-tagged text and then figure out the tag for the new compound was just a bit beyond the scope of SenseRelate::AllWords.
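
In case it is useful, here is a rough sketch of that pre-processing step (joining known compounds with an underscore before the text goes to the tagger). It is written in Python purely for illustration and is not part of SenseRelate::AllWords; the function name join_compounds, the max_len parameter, and the tiny KNOWN_COMPOUNDS set are all made up for this example. In practice you would fill the set with whatever list of WordNet compounds you have, and you would need to check that your tagger keeps Miami_Beach as a single token.

```python
def join_compounds(tokens, compounds, max_len=3):
    """Greedily merge adjacent tokens that form a known compound.

    tokens    -- list of word tokens from untagged text
    compounds -- set of lowercase compounds written with underscores
    max_len   -- longest compound (in tokens) to look for (an assumption)
    """
    out = []
    i = 0
    while i < len(tokens):
        merged = False
        # Try the longest candidate first so longer compounds win.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = "_".join(tokens[i:i + n])
            if candidate.lower() in compounds:
                out.append(candidate)  # keep the original capitalization
                i += n
                merged = True
                break
        if not merged:
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical stand-in for a compound list taken from WordNet.
KNOWN_COMPOUNDS = {"miami_beach", "fast_food"}

sentence = "Every year I usually go to Miami Beach on vacation"
print(" ".join(join_compounds(sentence.split(), KNOWN_COMPOUNDS)))
# prints: Every year I usually go to Miami_Beach on vacation
```

The output of this step is what you would then run through your part of speech tagger to get text like the example above.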

I hope this helps! Please do let us know if there are other questions, or if this doesn't address your concerns.

Cordially, Ted