illinois-ml-nlp-users AT lists.siebelschool.illinois.edu
Subject: Support for users of CCG software closed 7-27-20
- From: Nikos Papasarantopoulos <npapasa AT ilsp.gr>
- To: illinois-ml-nlp-users AT cs.uiuc.edu
- Subject: [Illinois-ml-nlp-users] Illinois NE Tagger - output format
- Date: Wed, 7 May 2014 14:53:44 +0300
- List-archive: <http://lists.cs.uiuc.edu/pipermail/illinois-ml-nlp-users/>
- List-id: Support for users of CCG software <illinois-ml-nlp-users.cs.uiuc.edu>
Hi all,
I would like to ask whether there is any option or configuration that makes the NE server leave the text intact in its output.
I have some files that I want to run NER on, but they contain a lot of whitespace and punctuation. After running NER, the whitespace has been eliminated from the output and spaces have been added between the punctuation marks.
After NER, I would like to know the offset of every entity (the number of characters from the beginning of the original document at which it was recognized); that is why I need the document layout to remain unchanged.
For example, in the online demo, the returned text has the same layout as the original.
Is there any configuration with which I could achieve this?
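To make the goal concrete, here is a minimal Python sketch of recovering entity offsets by aligning the tagged output back to the original text. It is only an illustration under stated assumptions: the tagged string is assumed to use a bracketed form such as "[PER John Smith ] ..." with space-separated tokens, the tokens are assumed to appear unchanged and in order in the original, and the name entity_offsets is hypothetical (it is not part of the NE Tagger).

```python
import re

# Assumption: an entity opens with a token like "[PER" and closes with "]".
TAG_OPEN = re.compile(r"^\[([A-Z]+)$")


def entity_offsets(original, tagged):
    """Return (label, start, end, surface) tuples with offsets into `original`."""
    entities = []
    cursor = 0          # position in `original` up to which tokens are matched
    label = None        # label of the entity currently open, if any
    start = end = None  # character span of the open entity
    for tok in tagged.split():
        m = TAG_OPEN.match(tok)
        if m and label is None:
            # Opening bracket: remember the label, span starts at the next token.
            label, start, end = m.group(1), None, None
            continue
        if tok == "]" and label is not None:
            # Closing bracket: emit the entity with its span in the original.
            if start is not None:
                entities.append((label, start, end, original[start:end]))
            label = None
            continue
        # Ordinary token: locate it in the original, skipping any whitespace.
        pos = original.find(tok, cursor)
        if pos < 0:
            raise ValueError(f"token {tok!r} not found after offset {cursor}")
        cursor = pos + len(tok)
        if label is not None:
            if start is None:
                start = pos
            end = cursor
    return entities


if __name__ == "__main__":
    original = "Nikos   Papasarantopoulos ,,  works\n\n at ILSP."
    tagged = "[PER Nikos Papasarantopoulos ] , , works at [ORG ILSP ] ."
    for label, start, end, surface in entity_offsets(original, tagged):
        print(label, start, end, repr(surface))
```

The alignment is greedy, so it would fail (with a ValueError) or misalign if the tagger rewrites characters, for example by normalizing quotes, or if the text itself contains tokens that look like the bracket markers.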
Thank you in advance.
N.P.