User:Alvations/Semeval-unwikified

SemEval
Academics
Disciplines: Natural Language Processing, Computational Linguistics, Semantics
Umbrella organization: ACL-SIGLEX
Workshop overview
Founded (origin): 1998 (Senseval)
Latest: SemEval-2, Summer 2010 (ended), ACL @ Uppsala, Sweden
Upcoming: SemEval-3, Summer 2012 (tentative), ACL @ Jeju Island, Korea
History
Senseval-1: 1998 @ Sussex
Senseval-2: 2001 @ Toulouse
Senseval-3: 2004 @ Barcelona
SemEval-1 / Senseval-4: 2007 @ Prague
SemEval-2: 2010 @ Uppsala

SemEval (originally Senseval) is a series of workshops conducted to evaluate semantic analysis systems. Traditionally, computational semantic analysis focused on Word Sense Disambiguation (WSD) tasks. WSD is an open problem in natural language processing: the task of identifying which sense (i.e. meaning) of a word is used in a sentence when the word has multiple meanings (polysemy).

ACL-SIGLEX (Special Interest Group on the LEXicon of the Association for Computational Linguistics) is the umbrella organization for the SemEval semantic evaluations and the SENSEVAL word-sense evaluation exercises. The first three evaluation workshops, Senseval-1, Senseval-2 and Senseval-3, focused on Word Sense Disambiguation (WSD) systems. More recently, Senseval became SemEval, a series of evaluation exercises for semantic annotation involving a much larger and more diverse set of tasks[1]. Beginning with the 4th workshop, SemEval-1, the nature of the tasks evolved to include semantic analysis tasks outside of word sense disambiguation.

The framework of the SemEval/Senseval evaluation workshops emulates the Message Understanding Conferences (MUCs) and other evaluation workshops run by ARPA (Advanced Research Projects Agency, later renamed the Defense Advanced Research Projects Agency (DARPA)).

SemEval Framework, adapted from MUC introduction

Stages of SemEval/Senseval evaluation workshops[2]

  1. Firstly, all likely participants were invited to express their interest and participate in the exercise design.
  2. A timetable towards a final workshop was worked out.
  3. A plan for selecting evaluation materials was agreed.
  4. 'Gold standards' for the individual tasks were acquired. Human annotations were often used as the gold standard against which the precision and recall of computer systems were measured; these gold standards are what the computational systems strive towards. (In WSD tasks, human annotators were given the task of generating a set of correct WSD answers, i.e. the correct sense for a given word in a given context.)
  5. The gold standard materials, without answers, were released to participants, who then had a short time to run their programs over them and return their sets of answers to the organizers.
  6. The organizers then scored the answers, and the scores were announced and discussed at a workshop.
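
The scoring step can be illustrated with a minimal sketch. It assumes a simplified setting in which each test instance has exactly one gold-standard sense and a system may decline to answer an instance; the function name, instance identifiers and sense labels are hypothetical and are not taken from the Senseval scoring software. Precision is the share of attempted instances answered correctly, and recall is the share of all gold-standard instances answered correctly.

```python
def score_answers(gold, answers):
    """Return (precision, recall) of system answers against a gold standard.

    gold:    dict mapping instance id -> correct sense label
    answers: dict mapping instance id -> sense label proposed by the system
             (instances the system skipped are simply absent)
    """
    # Only answers for instances that exist in the gold standard are scored.
    attempted = [inst for inst in answers if inst in gold]
    correct = sum(1 for inst in attempted if answers[inst] == gold[inst])
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold standard produced by human annotators.
gold = {"bank.1": "river", "bank.2": "finance", "bank.3": "finance", "bank.4": "river"}
# Hypothetical system output; "bank.4" was left unanswered.
answers = {"bank.1": "river", "bank.2": "river", "bank.3": "finance"}

precision, recall = score_answers(gold, answers)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy run the system attempts three of the four instances and gets two right, so precision is higher than recall; a system that answers every instance has equal precision and recall.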


History

"-Eval" Etymology

"-Eval" is a fairly recent morpheme used in the names of conferences, workshops and algorithms related to computational evaluation. It originates from the evaluation metric for computational grammar systems proposed by the Grammar Evaluation Interest Group (GEIG), also known as the Parseval metric[3], a blend of grammatical "pars"ing and system "eval"uation. Progressively, a series of well-intended puns motivated the popular use of the "-eval" morpheme:

  • Parseval coincides with Parseval's theorem (a theorem about Fourier series that most computer scientists are familiar with).

Pre-WSD evaluations

From the earliest days, assessing the quality of WSD algorithms had been primarily a matter of intrinsic evaluation, and “almost no attempts had been made to evaluate embedded WSD components”[4]. Only very recently have extrinsic evaluations begun to provide some evidence for the value of WSD in end-user applications[5]. Until 1990 or so, discussions of the sense disambiguation task focused mainly on illustrative examples rather than comprehensive evaluation. The early 1990s saw the beginnings of more systematic and rigorous intrinsic evaluations, including more formal experimentation on small sets of ambiguous words[6].

Senseval to SemEval

In April 1997, a workshop entitled Tagging with Lexical Semantics: Why, What, and How? was held in conjunction with the Conference on Applied Natural Language Processing[7]. At the time, there was a clear recognition that manually annotated corpora had revolutionized other areas of NLP, such as part-of-speech tagging and parsing, and that corpus-driven approaches had the potential to revolutionize automatic semantic analysis as well[8]. Kilgarriff recalls that there was “a high degree of consensus that the field needed evaluation,” and several practical proposals by Resnik and Yarowsky kicked off a discussion that led to the creation of the Senseval evaluation exercises.[9]

Senseval-1 took place in the summer of 1998 for English, French, and Italian, culminating in a workshop held at Herstmonceux Castle, Sussex, England on September 2–4.

Senseval-2 took place in the summer of 2001, and was followed by a workshop held in July 2001 in Toulouse, in conjunction with ACL 2001. Senseval-2 included tasks for Basque, Chinese, Czech, Danish, Dutch, English, Estonian, Italian, Japanese, Korean, Spanish and Swedish.

Senseval-3 took place in March–April 2004, followed by a workshop held in July 2004 in Barcelona, in conjunction with ACL 2004. Senseval-3 included 14 different tasks for core word sense disambiguation, as well as identification of semantic roles, multilingual annotations, logic forms and subcategorization acquisition.

SemEval-1/Senseval-4 took place in 2007, followed by a workshop held in conjunction with ACL in Prague. SemEval-1 included 18 different tasks targeting the evaluation of systems for the semantic analysis of text.

SemEval-2 took place in 2010, followed by a workshop held in conjunction with ACL in Uppsala. SemEval-2 included 18 different tasks targeting the evaluation of semantic analysis systems.

Senseval & SemEval Tasks

Senseval-1 and Senseval-2 focused on evaluating WSD systems for major languages for which a corpus and a computerized dictionary were available. Senseval-3 looked beyond lexemes and started to evaluate systems in wider areas of semantics, viz. Semantic Roles (technically known as theta roles in formal semantics) and Logic Form Transformation (the semantics of phrases, clauses or sentences is commonly represented in first-order logic forms); Senseval-3 also explored the performance of semantic analysis in Machine Translation.

As the range of computational semantic systems grew beyond the coverage of WSD, Senseval evolved into SemEval, in which more aspects of computational semantic systems were evaluated. The tables below (1) reflect the growth of the workshop from Senseval to SemEval and (2) give an overview of which areas of computational semantics were evaluated across the Senseval/SemEval workshops.

Senseval & SemEval Tasks Overview

Workshop | No. of Tasks | Areas of Study | Languages of Data Evaluated
Senseval-1 | 3 | Word Sense Disambiguation (WSD): Lexical Sample WSD tasks | English, French, Italian
Senseval-2 | 12 | Word Sense Disambiguation (WSD): Lexical Sample, All Words, Translation WSD tasks | Basque, Chinese, Czech, Danish, Dutch, English, Estonian, Italian, Japanese, Korean, Spanish, Swedish
Senseval-3 | 16 (including 2 cancelled tasks) | Logic Form Transformation, Machine Translation (MT) Evaluation, Semantic Role Labelling, WSD | Basque, Catalan, Chinese, English, Italian, Romanian, Spanish
SemEval-1 | 19 (including 1 cancelled task) | Cross-lingual, Frame Extraction, Information Extraction, Lexical Substitution, Lexical Sample, Metonymy, Semantic Annotation, Semantic Relations, Semantic Role Labelling, Sentiment Analysis, Time Expression, WSD | Arabic, Catalan, Chinese, English, Spanish, Turkish
SemEval-2 | 18 (including 1 cancelled task) | Coreference, Cross-lingual, Ellipsis, Information Extraction, Lexical Substitution, Metonymy, Noun Compounds, Parsing, Semantic Relations, Semantic Role Labeling, Sentiment Analysis, Textual Entailment, Time Expressions, WSD | Catalan, Chinese, Dutch, English, French, German, Italian, Japanese, Spanish

Areas of Evaluation

Areas of Study | Brief Description | Senseval-1 | Senseval-2 | Senseval-3 | SemEval-1 | SemEval-2
Coreference | Co-reference occurs when multiple expressions in a sentence or document refer to the same thing; in linguistic jargon, they have the same "referent". The main goal is to perform and evaluate coreference resolution for six different languages with the help of other layers of linguistic information and using different evaluation metrics (MUC, B-CUBED, CEAF and BLANC).
Cross-Lingual | The goal of this task is to provide a framework for the evaluation of systems for cross-lingual lexical substitution. Given a paragraph and a target word, the goal is to provide several correct translations for that word in a given language, with the constraint that the translations fit the given context in the source language.
Ellipsis | Verb Phrase Ellipsis (VPE) occurs in the English language when an auxiliary or modal verb abbreviates an entire verb phrase recoverable from the linguistic context. The study is envisioned in two subtasks: (1) automatically detecting VPE in free text; and (2) selecting the textual antecedent of each found VPE.
Keyphrase Extraction (Information Extraction) | Keyphrases are words that capture the main topic of a document. Given a set of scientific articles, the systems' goal is to produce the keyphrases for each article.
Metonymy | Metonymy is a figure of speech used in rhetoric in which a thing or concept is not called by its own name. Given an argument of a predicate, the goal is to identify whether the entity in that argument position satisfies the type expected by the predicate.
Noun Compounds | A noun compound is a sequence of nouns acting as a single noun. Given a compound and a set of paraphrasing verbs and prepositions, the participants' goal is to provide a ranking that is as close as possible to the one proposed by human raters.
Semantic Relations | The goal is to improve deep semantic analysis through automatic recognition of semantic relations between pairs of words.
Semantic Role Labeling | The goal is to take semantic role labelling (SRL) of nominal and verbal predicates beyond the domain of isolated sentences by linking local semantic argument structures to the wider discourse context.
Sentiment Analysis | The basic task in sentiment analysis[10] is classifying the polarity of a given text at the document, sentence, or feature/aspect level, i.e. whether the opinion expressed in a document, a sentence or an entity feature/aspect is positive, negative or neutral.
Time Expression | The goal is to identify the temporal structure of the text by (i) identification of events, (ii) identification of time expressions and (iii) identification of temporal relations.
Textual Entailment | Entailment is the relationship between two sentences where the truth of one (A) requires the truth of the other (B). The aim is to train and evaluate semantic parsers using textual entailments. "Correct parse decisions are captured by textual entailments; thus systems are to decide which entailments are implied based on the parser output only, i.e. there will be no need for lexical semantics, anaphora resolution etc."[11]
Word Sense Disambiguation | A WSD process strictly requires two inputs: a dictionary to specify the senses that are to be disambiguated, and a corpus of language data to be disambiguated (in some methods, a training corpus of language examples is also required). The goal is to develop computational algorithms that replicate the human ability to identify the correct meaning (sense) of a word in a given context.

Senseval-1

The Senseval-1 evaluation exercise was the first attempt to run an ARPA-like competition between WSD systems, under the auspices of ACL-SIGLEX, EURALEX (European Association for Lexicography), ELSNET, ECRAN (Extraction of Content Research At Near market) and SPARKLE (Shallow Parsing and Knowledge extraction for Language Engineering). There were two variants of computational WSD tasks, viz. "all-words" and "lexical-sample". In the all-words variant, participating systems have to disambiguate all words (or all open-class words) in a set of texts. In the lexical-sample variant, a sample of words is selected first, and then a number of corpus instances is selected for each sample word. Participating systems then have to disambiguate just the sample-word instances.
For Senseval-1, the lexical-sample variant was chosen for the following reasons[12]:

  1. Cost-effectiveness of "gold-standards" (human annotation of sense tags)
  2. Unavailability of a full dictionary for low or no cost
  3. Many systems interested in participating were not ready for an all-words task.
  4. The lexical-sample task would be more informative about the strengths and failings of WSD research at that point in time. (The all-words task would provide too little data about the problems presented by any particular word.)
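
The difference between the two task variants can be sketched in code. The example below is illustrative only: the tagged sentence, the set of open-class part-of-speech tags and the sample word list are assumptions made for this sketch, not Senseval data.

```python
# Part-of-speech tags treated as open-class (content) words; an assumed tag set.
OPEN_CLASS = {"NOUN", "VERB", "ADJ", "ADV"}


def all_words_targets(tagged_text):
    """All-words variant: every open-class token must be disambiguated."""
    return [(i, word) for i, (word, pos) in enumerate(tagged_text)
            if pos in OPEN_CLASS]


def lexical_sample_targets(tagged_text, sample_words):
    """Lexical-sample variant: only instances of the pre-selected words count."""
    return [(i, word) for i, (word, _pos) in enumerate(tagged_text)
            if word.lower() in sample_words]


# Hypothetical tagged sentence as (token, part-of-speech) pairs.
text = [("The", "DET"), ("bank", "NOUN"), ("raised", "VERB"),
        ("interest", "NOUN"), ("rates", "NOUN"), ("sharply", "ADV")]

print(all_words_targets(text))
print(lexical_sample_targets(text, {"bank", "interest"}))
```

The all-words selection yields many target words with few instances each, whereas the lexical-sample selection concentrates the annotation effort on a handful of words, which is the trade-off behind reasons 1 and 4 above.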

Senseval-1 Tasks

Task no. | Senseval-1 Tasks | Description | Languages
01 - 03 | Lexical Sample | The lexicon was first sampled, then instances in context of the sample words were found and the evaluation was on those instances only. | English, French, Italian


Senseval-2

Senseval-2 evaluated WSD systems on three types of task over 12 languages. In the "all-words" task, the evaluation was on almost all of the content words in a sample of texts. In the "lexical sample" task, a sample of the lexicon was selected first, then corpus instances of the sample words were selected, and WSD systems competed to disambiguate the sense in these instances. In the "translation" task (Japanese only), senses corresponded to distinct translations of a word into another language.

Senseval-2 Tasks

Task no. | Senseval-2 Tasks | Description | Languages
01 - 04 | All-words | The evaluation of word sense disambiguation was on almost all of the content words in a sample of texts. | Czech, Dutch, English, Estonian
05 - 11 | Lexical sample | The lexicon was first sampled, then instances in context of the sample words were found and the evaluation was on those instances only. | Basque, Chinese, Danish, English, Italian, Japanese, Korean, Spanish, Swedish
12 | Translation | In the translation task, the senses corresponded to distinct translations of a word into another language, as opposed to corpus instances of the words as in the "all-words" and "lexical sample" tasks. | Japanese


Senseval-3

Senseval-3 was a follow-up to Senseval-1 and Senseval-2. Senseval-3 included 14 different tasks for core word sense disambiguation, as well as identification of semantic roles, multilingual annotations, logic forms and subcategorization acquisition.

Senseval-3 Tasks

Task no. | Senseval-3 Tasks | Description | Languages
01 - 02 | All words | The evaluation of word sense disambiguation was on almost all of the content words in a sample of texts. | English, Italian
03 - 09, 15 (cancelled) | Lexical Sample | The lexicon was first sampled, then instances in context of the sample words were found and the evaluation was on those instances only. | Basque, Catalan, Chinese, English, Italian, Romanian, Spanish, Swedish (cancelled)
10 | Automatic subcategorization acquisition | This task involved evaluating word sense disambiguation (WSD) systems in the context of automatic subcategorization acquisition. | English
11 | Multilingual lexical sample | The task was very similar to the lexical sample task, except that, rather than using the sense inventory from a dictionary, it used the translations of the target words into a second language as the "inventory". | English-French, English-Hindi
12 | WSD of WordNet glosses | In this task, systems tagged WordNet glosses automatically; the hand-tagged glosses from eXtended WordNet served as the test set, with the hand-tagging also serving as the gold standard for evaluation. The task was performed as an "all-words" task, except that no context was provided. | English
13 | Semantic Roles | This task called for the development of systems for the "Automatic Labeling of Semantic Roles"[13]. | English
14 | Logic Forms | This task was complementary to the mainstream task in Senseval. The goal was to transform English sentences into a first-order logic notation. | English
16 | Semantic Role Identification | (cancelled task) | Swedish


SemEval-1

Beginning with the 4th workshop, SemEval-2007 (SemEval-1), the nature of the tasks evolved to include semantic analysis tasks outside of word sense disambiguation. SemEval-1 included 18 different tasks targeting the evaluation of systems for the semantic analysis of text. The tasks were more elaborate than those of Senseval, as they cut across different areas of study in NLP.

SemEval-1 Tasks

Task no. | SemEval-1 Tasks | Area of Study | Description | Languages
01 | Evaluating WSD on Cross Language Information Retrieval | Cross-lingual, Information Retrieval, WSD | This was an application-driven task, where the application was a fixed cross-lingual information retrieval system. | English
02 | Evaluating Word Sense Induction and Discrimination Systems | Word Sense Induction | The goal of this task was to allow for comparison across sense-induction and discrimination systems, and also to compare these systems to other supervised and knowledge-based systems. | English
03 | Pronominal Anaphora Resolution in the Prague Dependency Treebank 2.0 (cancelled task) | Anaphora | (cancelled task) | Czech (cancelled)
04 | Classification of Semantic Relations between Nominals | Semantic Relations | The goal of this task was the classification of semantic relations between simple nominals (nouns or base noun phrases) other than named entities; "honey bee", for example, is an instance of the Product-Producer relation. | English
05 | Multilingual Chinese-English Lexical Sample Task | Cross-lingual, WSD-Lexical Sample | The goal of this task was to create a framework for the evaluation of word sense disambiguation in Chinese-English machine translation systems. | Chinese, English
06 | Word-Sense Disambiguation of Prepositions | WSD | The task was carried out in the same manner as previous Senseval lexical sample tasks, following the same evaluation methodology (including the use of the same evaluation scripts, with sense tagging available for both fine-grained and coarse-grained disambiguation). | English
07 | Coarse-grained English all-words | WSD-Coarse-grained | This was a coarse-grained English all-words WSD task. One of the major obstacles to effective WSD is the fine granularity of the adopted computational lexicon; the lexicon often encodes sense distinctions that are too subtle even for human annotators[14]. | English
08 | Metonymy Resolution at SemEval-2007 | Metonymy | The task was a lexical sample task for English. Participants had to automatically classify preselected expressions of a particular semantic class (such as country names) as having a literal or a metonymic reading, given a four-sentence context. | English
09 | Multilevel Semantic Annotation of Catalan and Spanish | Semantic Annotation, Cross-lingual | In this task, the aim was to evaluate and compare automatic systems for semantic annotation at several levels for the Catalan and Spanish languages. | Catalan, Spanish
10 | English Lexical Substitution Task for SemEval-2007 | Lexical Substitution | A substitution task in which both annotators and systems had to find a substitute for the target word in the test sentence. | English
11 | English Lexical Sample Task via English-Chinese Parallel Text | WSD-Lexical Sample, Cross-lingual | This was an English lexical sample task for word sense disambiguation (WSD), where the sense-annotated examples were (semi-)automatically gathered from word-aligned English-Chinese parallel texts. | English, Chinese
12 | Turkish Lexical Sample Task | WSD-Lexical Sample | This was a Turkish WSD lexical sample task. The lexicon was first sampled, then instances in context of the sample words were found and the evaluation was on those instances only. | Turkish
13 | Web People Search | WSD-Named Entity Recognition | This task focused on the disambiguation of person names in a Web search scenario. | English
14 | Affective Text | WSD, Sentiment Analysis | The goal of this task was to explore the connection between emotions and lexical semantics. Given a short text (news headlines), the objective was to annotate the text for emotions using a predefined list of emotions (e.g. joy, fear, surprise), and/or for polarity orientation (positive/negative). | English
15 | TempEval: A proposal for Evaluating Time-Event Temporal Relation Identification | Time Expression | Text comprehension involves the capability to identify the events described in a text and to locate them in time. This task was to identify event-time and event-event temporal relations in texts. | English
16 | Evaluation of wide coverage knowledge resources | WSD | The goal of this task was to measure the relative quality of the knowledge resources submitted for the task by performing an indirect evaluation using all the resources delivered as Topic Signatures (TS). | English
17 | English Lexical Sample, English SRL and English All-Words Tasks | WSD-Lexical Sample, WSD-All Words | This task consisted of lexical-sample-style training and testing data for 35 nouns and 65 verbs in the WSJ Penn Treebank II as well as the Brown corpus. | English
18 | Arabic Semantic Labeling | Semantic Role Labelling | The tasks spanned both the WSD and semantic role labeling processes. Both sets of tasks were evaluated on data derived from the same data set, the test set. | Arabic
19 | Frame Semantic Structure Extraction | Semantic Relations | This task consisted of recognizing words and phrases that evoke semantic frames of the sort defined in the FrameNet project (http://framenet.icsi.berkeley.edu), and their semantic dependents, which were usually, but not always, their syntactic dependents (including subjects). | English


SemEval-2

SemEval-2010 (SemEval-2) was the 5th workshop on semantic evaluation. SemEval-2 added tasks from new areas of study in computational semantics, viz. Coreference, Ellipsis, Keyphrase Extraction, Noun Compounds and Textual Entailment. The first three workshops, Senseval-1 through Senseval-3, were focused on word sense disambiguation, each time growing in the number of languages offered in the tasks and in the number of participating teams. In the 4th workshop, SemEval-2007, the nature of the tasks evolved to include semantic analysis tasks outside of word sense disambiguation.

SemEval-2 Tasks

Task no. | SemEval-2 Tasks | Area of Study | Description | Languages
01 | Coreference Resolution in Multiple Languages | Coreference | This task was concerned with intra-document coreference resolution for six different languages. The complete task was divided into two subtasks for each of the languages: (1) detection of full coreference chains, composed of named entities, pronouns and full noun phrases; (2) pronominal resolution, i.e. finding the antecedents of the pronouns in the text. | Catalan, Dutch, English, German, Italian, Spanish
02 | Cross-Lingual Lexical Substitution | Cross-lingual, Lexical Substitution | The goal of this task was to provide a framework for the evaluation of systems for cross-lingual lexical substitution. Given a paragraph and a target word, the goal was to provide several correct translations for that word in a given language, with the constraint that the translations fit the given context in the source language. | English, Spanish
03 | Cross-Lingual Word Sense Disambiguation | Cross-lingual, WSD | This task was an unsupervised word sense disambiguation task for English nouns by means of parallel corpora. The sense label was composed of translations in the different languages, and the sense inventory was built by three annotators on the basis of the Europarl parallel corpus by means of a concordance tool. | Dutch, French, German, Italian, Spanish
04 | VP Ellipsis - Detection and Resolution | Ellipsis | Verb Phrase Ellipsis (VPE) occurs in the English language when an auxiliary or modal verb abbreviates an entire verb phrase recoverable from the linguistic context (e.g. "He spends his days [sketching passers-by] (antecedent), or trying to (VPE)"). The proposed shared task consisted of two subtasks: (1) automatically detecting VPE in free text; and (2) selecting the textual antecedent of each found VPE. | English
05 | Automatic Keyphrase Extraction from Scientific Articles | Information Extraction | Keyphrases are words that capture the main topic of a document. Participating systems were provided with a set of scientific articles and produced the keyphrases for each article. | English
06 | Classification of Semantic Relations between MeSH Entities in Swedish Medical Texts (cancelled task) | Information Extraction | (cancelled) | English
07 | Argument Selection and Coercion | Metonymy | This task involved identifying the compositional operations involved in argument selection. The task was defined as follows: for each argument of a predicate, identify whether the entity in that argument position satisfies the type expected by the predicate. | English
08 | Multi-Way Classification of Semantic Relations Between Pairs of Nominals | Semantic Relations, Information Extraction | This was a deep semantic analysis task to automatically recognize semantic relations between pairs of words. The task was designed to compare different approaches to the problem and to provide a standard testbed for future research, which can benefit many applications in Natural Language Processing[15]. | English
09 | Noun Compound Interpretation Using Paraphrasing Verbs | Noun Compounds | Each noun compound is interpreted through a set of paraphrasing verbs and prepositions. Given the compound and the set of paraphrasing verbs and prepositions, the participants had to provide a ranking that was as close as possible to the one proposed by human raters. | English
10 | Linking Events and their Participants in Discourse | Semantic Role Labelling, Information Extraction | The task involved two subtasks, which were evaluated independently (participants could choose to enter either or both). For the Full Task, the target predicates in the (test) data set were annotated with gold standard word senses (frames). For the NIs-only task, participants were supplied with a test set that was already annotated with gold standard local semantic argument structure; only the referents for null instantiations had to be found. | English
11 | Event Detection in Chinese News Sentences | Semantic Role Labelling, WSD | The goal of the task was to detect and analyze some basic event contents in real-world Chinese news texts. It consisted of finding key verbs or verb phrases to describe these events in the Chinese sentences after word segmentation and part-of-speech tagging, selecting a suitable situation description formula for them, and anchoring different situation arguments with suitable syntactic chunks in the sentence. | Chinese
12 | Parser Training and Evaluation using Textual Entailment | Textual Entailment | This was a targeted textual entailment task designed to train and evaluate parsers. The proposed task was desirable for several reasons: (1) entailments focus on the semantically meaningful parser decisions; (2) no formal system training was required. | English
13 | TempEval 2 | Time Expression | Text comprehension requires the capability to identify the events described in a text and to locate them in time. The three subtasks of TempEval were relevant to understanding the temporal structure of a text: (i) identification of events, (ii) identification of time expressions and (iii) identification of temporal relations. | English
14 | Word Sense Induction | Word Sense Induction | Word Sense Induction (WSI) is defined as the process of identifying the different senses (or uses) of a target word in a given text in an automatic and fully unsupervised manner. The goal of this task was to allow comparison of unsupervised sense induction and disambiguation systems. A secondary outcome of this task was to provide a comparison with current supervised and knowledge-based methods for sense disambiguation. This task was a continuation of the WSI task in SemEval-1 with some significant changes to the evaluation setting. | English
15 | Infrequent Sense Identification for Mandarin Text to Speech Systems | WSD | This task was a little different from traditional WSD. The WSD methodology was applied to solve homograph ambiguity in grapheme-to-phoneme (GTP) conversion in text-to-speech (TTS) systems. In this task, two or more senses may correspond to one pronunciation; that is, the sense granularity was coarser than in WSD. | Chinese (Mandarin)
16 | Japanese WSD | WSD | This task can be considered an extension of the Senseval-2 Japanese Lexical Sample (monolingual dictionary-based) task. Word senses were defined according to the Iwanami Kokugo Jiten, a Japanese dictionary published by Iwanami Shoten. | Japanese
17 | All-words Word Sense Disambiguation on a Specific Domain (WSD-domain) | WSD | WSD systems trained on general corpora were known to perform worse when moved to specific domains. This task offered a testbed for domain-specific WSD systems and allowed domain portability issues to be tested. | English, Chinese, Dutch, Italian
18 | Disambiguating Sentiment Ambiguous Adjectives | WSD, Sentiment Analysis | Some adjectives are neutral in sentiment polarity out of context, but show positive, neutral or negative meaning within a specific context. Such words can be called dynamic sentiment ambiguous adjectives. This task aimed to create a benchmark dataset for disambiguating dynamic sentiment ambiguous adjectives. | Chinese

See also

External links

References

  1. Agirre, E., Màrquez, L., & Wicentowski, R. (2009), Computational semantic analysis of language: SemEval-2007 and beyond. Language Resources and Evaluation 43(2):97–104.
  2. Kilgarriff, A. (1998), SENSEVAL: An Exercise in Evaluating Word Sense Disambiguation Programs. In Proc. LREC, Granada, May 1998, pp. 581–588.
  3. http://www.grsampson.net/RLeafAnc.html
  4. Palmer, M., Ng, H.T., & Dang, H.T. (2006), Evaluation of WSD systems, in Eneko Agirre & Phil Edmonds (eds.), Word Sense Disambiguation: Algorithms and Applications, Text, Speech and Language Technology, vol. 33. Amsterdam: Springer, 75–106.
  5. Resnik, P. (2006), WSD in NLP applications, in Eneko Agirre & Phil Edmonds (eds.), Word Sense Disambiguation: Algorithms and Applications. Dordrecht: Springer, 299–338.
  6. Yarowsky, D. (1992), Word-sense disambiguation using statistical models of Roget’s categories trained on large corpora. Proceedings of the 14th Conference on Computational Linguistics, 454–60. http://dx.doi.org/10.3115/992133.992140
  7. Palmer, M., & Light, M. (1999), ACL SIGLEX workshop on tagging text with lexical semantics: what, why, and how? Natural Language Engineering 5(2):i–iv.
  8. Ng, H.T. (1997), Getting serious about word sense disambiguation. Proceedings of the ACL SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What, and How? 1–7.
  9. Resnik, P., & Lin, J. (2010), Evaluation of NLP Systems, in Alexander Clark, Chris Fox & Shalom Lappin (eds.), The Handbook of Computational Linguistics and Natural Language Processing. Wiley-Blackwell. 11:271
  10. de Haaff, M. (2010), Sentiment Analysis, Hard But Worth It!, CustomerThink, retrieved 2010-03-12.
  11. http://semeval2.fbk.eu/semeval2.php?location=tasks
  12. Kilgarriff, A., & Rosenzweig, J. (2000), Framework and results for English SENSEVAL. Computers and the Humanities 34(1–2):15–48.
  13. Gildea, D., & Jurafsky, D. (2002), Automatic Labeling of Semantic Roles. Computational Linguistics 28(3):245–288.
  14. Edmonds, P., & Kilgarriff, A. (2002), Introduction to the Special Issue on Evaluating Word Sense Disambiguation Systems. Journal of Natural Language Engineering 8(4).
  15. Hendrickx, I., Kim, S.N., Kozareva, Z., Nakov, P., Ó Séaghdha, D., Padó, S., Pennacchiotti, M., Romano, L., & Szpakowicz, S. (2010), SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals. 5th SIGLEX Workshop.