illinois-ml-nlp-users - Re: [[Illinois-ml-nlp-users] ] How to reproduce CoNLL NER results?

illinois-ml-nlp-users AT lists.siebelschool.illinois.edu

Subject: Support for users of CCG software closed 7-27-20

  • From: Veljko Miljanic <veljko AT uw.edu>
  • To: "Sammons, Mark" <mssammon AT illinois.edu>
  • Cc: "illinois-ml-nlp-users AT lists.cs.illinois.edu" <illinois-ml-nlp-users AT lists.cs.illinois.edu>
  • Subject: Re: [[Illinois-ml-nlp-users] ] How to reproduce CoNLL NER results?
  • Date: Wed, 2 Nov 2016 20:11:28 -0700

This evening I tried both the GitHub version and the version shared on the web site. Retraining either version on the CoNLL 2003 data set gave an F1 of about 0.70.

19:56:56 INFO  LearningCurveMultiDataset:219 - Level1: bestround=15      F1=0.7049180327868853   Level2: bestround=13    F1=0.7012475377544319

For training, I merged the training and development (testa) sets and converted them to the tagger's column format:
tag 0 token_index chunk pos word x x 0
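
For readers who want to try the same conversion, here is a minimal sketch, not the script used in this thread. It reads standard CoNLL 2003 lines (word, POS, chunk, NER tag) and emits the nine-column layout quoted above. The function name `conll_to_column` is hypothetical. The tab separator, the token index restarting at 0 for each sentence, the dropping of `-DOCSTART-` lines, and the literal `x` placeholders are all assumptions, so check them against the sample data shipped with the tagger before relying on the output.

```python
def conll_to_column(lines, sep="\t"):
    """Convert CoNLL 2003 lines ('word pos chunk tag') to the layout
    'tag 0 token_index chunk pos word x x 0' quoted in the message."""
    out = []
    index = 0  # assumption: token index restarts at each sentence
    for line in lines:
        fields = line.split()
        if not fields:
            # Blank line = sentence boundary; avoid leading/duplicate blanks.
            if out and out[-1] != "":
                out.append("")
            index = 0
            continue
        if fields[0] == "-DOCSTART-":
            # Assumption: document markers are not needed by the tagger.
            continue
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got {len(fields)}: {line!r}")
        word, pos, chunk, tag = fields
        # Assumption: 'x' and '0' are literal placeholder values.
        out.append(sep.join([tag, "0", str(index), chunk, pos, word, "x", "x", "0"]))
        index += 1
    return out


if __name__ == "__main__":
    sample = [
        "-DOCSTART- -X- O O",
        "",
        "EU NNP I-NP I-ORG",
        "rejects VBZ I-VP O",
    ]
    print("\n".join(conll_to_column(sample)))
```

Merging the training and development files then amounts to passing the lines of both files, separated by a blank line, through the same function.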

I used the following command to run training:
java -Xmx4g -cp dist/*;lib/*;models/* edu.illinois.cs.cogcomp.ner.NerTagger -train eng.train_plus_testa.column eng.testb.column -r true -c config\ner.properties 




On Wed, Nov 2, 2016 at 3:45 PM, Veljko Miljanic <veljko AT uw.edu> wrote:
Thanks for responding, Mark.
I am using the NER from the GitHub repository: https://github.com/IllinoisCogComp (master)

I really appreciate your help.
Veljko




On Wed, Nov 2, 2016 at 1:46 PM, Sammons, Mark <mssammon AT illinois.edu> wrote:
Hi, Veljko.

Which version of the CoNLL data set are you using (i.e. from what URL/source)?  -- I suspect the version we have used is a perturbation of the original, but that it does not contain additional information.

Thanks,

Mark

-----Original Message-----
From: Veljko Miljanic [mailto:veljko AT uw.edu]
Sent: Monday, October 31, 2016 12:13 PM
To: illinois-ml-nlp-users AT lists.cs.illinois.edu
Subject: [[Illinois-ml-nlp-users] ] How to reproduce CoNLL NER results?

Hi,

For my project, I would like to retrain the NER system and reproduce the results on the CoNLL 2003 data set that were reported in:
L. Ratinov and D. Roth, Design Challenges and Misconceptions in Named Entity Recognition. CoNLL (2009).

Is there anything besides the CoNLL dataset that I would need to reproduce the results from this paper?

Also, I've noticed that the CoNLL dataset has 4 columns, while the NER training tool requires 9 columns. What do the other 5 columns represent, and how can I generate them?

Thanks,
Veljko




