illinois-ml-nlp-users AT lists.siebelschool.illinois.edu

Subject: Support for users of CCG software closed 7-27-20


Re: [Illinois-ml-nlp-users] How to reproduce CoNLL NER results?

From: Veljko Miljanic <veljko AT uw.edu>
To: "Sammons, Mark" <mssammon AT illinois.edu>
Cc: "illinois-ml-nlp-users AT lists.cs.illinois.edu" <illinois-ml-nlp-users AT lists.cs.illinois.edu>
Subject: Re: [Illinois-ml-nlp-users] How to reproduce CoNLL NER results?
Date: Wed, 2 Nov 2016 20:11:28 -0700

This evening I tried both the GitHub version and the version shared on the web site. Retraining either version on the CoNLL 2003 data set gave an F1 of about 70.

19:56:56 INFO LearningCurveMultiDataset:219 - Level1: bestround=15 F1=0.7049180327868853 Level2: bestround=13 F1=0.7012475377544319

For training, I merged the training and dev test sets and converted them to the tagger's column format:

tag 0 token_index chunk pos word x x 0
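
A minimal sketch of the kind of conversion described above, assuming the input is the standard four-column CoNLL 2003 layout (word, POS, chunk, NER tag). The function name conll_to_column is hypothetical. The tab separator, the per-sentence token index starting at 0, and the skipping of -DOCSTART- lines are all assumptions, not confirmed behaviour of the tagger:

```python
from typing import Iterable, Iterator


def conll_to_column(lines: Iterable[str], sep: str = "\t") -> Iterator[str]:
    """Convert CoNLL 2003 lines (word pos chunk tag) into the 9-column layout
    'tag 0 token_index chunk pos word x x 0' quoted in this message.

    Assumptions (not confirmed by the thread): columns are joined with `sep`,
    token_index restarts at 0 for each sentence, and -DOCSTART- lines are dropped.
    """
    token_index = 0
    for raw in lines:
        line = raw.strip()
        if line.startswith("-DOCSTART-"):
            # Document marker: skipped here (assumption).
            continue
        if not line:
            # Blank line marks a sentence boundary: keep it and reset the index.
            token_index = 0
            yield ""
            continue
        word, pos, chunk, tag = line.split()
        yield sep.join([tag, "0", str(token_index), chunk, pos, word, "x", "x", "0"])
        token_index += 1


if __name__ == "__main__":
    sample = ["EU NNP I-NP I-ORG", "rejects VBZ I-VP O", ""]
    for row in conll_to_column(sample, sep=" "):
        print(row)
```

Merging the training and dev files would then amount to running the lines of both files through this function in sequence and writing the result to a single output file such as eng.train_plus_testa.column.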

I used the following command to run training:

java -Xmx4g -cp dist/*;lib/*;models/* edu.illinois.cs.cogcomp.ner.NerTagger -train eng.train_plus_testa.column eng.testb.column -r true -c config\ner.properties

On Wed, Nov 2, 2016 at 3:45 PM, Veljko Miljanic <veljko AT uw.edu> wrote:

Thanks for responding, Mark.
I am using NER from GitHub repository: https://github.com/IllinoisCogComp (master)

I really appreciate your help.
Veljko

On Wed, Nov 2, 2016 at 1:46 PM, Sammons, Mark <mssammon AT illinois.edu> wrote:
Hi, Veljko.

Which version of the CoNLL data set are you using (i.e. from what URL/source)? -- I suspect the version we have used is a perturbation of the original, but that it does not contain additional information.

Thanks,

Mark

-----Original Message-----
From: Veljko Miljanic [mailto:veljko AT uw.edu]
Sent: Monday, October 31, 2016 12:13 PM
To: illinois-ml-nlp-users AT lists.cs.illinois.edu
Subject: [Illinois-ml-nlp-users] How to reproduce CoNLL NER results?

Hi,

For my project I would like to retrain the NER system and reproduce the results on the CoNLL 2003 data set that were reported in:
L. Ratinov and D. Roth, Design Challenges and Misconceptions in Named Entity Recognition. CoNLL (2009).

Is there anything besides the CoNLL data set that I would need in order to reproduce the results from this paper?

Also, I've noticed that the CoNLL data set has 4 columns, while the NER training tool requires 9 columns. What do the other 5 columns represent, and how can I generate them?

Thanks,
Veljko
