kermitt2 / delft

a Deep Learning Framework for Text https://delft.readthedocs.io/

add parameter to optionally output raw results in the evaluation

lfoppiano opened this issue

When we perform n-fold cross-validation or a holdout evaluation, we would like to be able to output the raw results (to a separate file), as we do in grobid. This way we can compare what is expected and what is predicted for each evaluation task.

Components to be updated:

  • SequenceLabelling
  • Classification
  • applications (to add option)

There is already a tag command that should provide the output.
In sciencebeam-trainer-delft I also extended that further to be able to provide a diff and output in various formats, e.g. in the data format.
There may not be a huge benefit having it as part of the evaluation if there is already a separate command to get the output?

@de-code I will check sciencebeam-trainer-delft.

Ideally, the idea is to have the evaluation results and all the raw data used for calculating them. IMHO it's useful, for example, during the n-fold cross-validation when the data might be partitioned differently at each run because it's shuffled.

I'm trying to find a way to apply the change without altering the current design.
The evaluation is done within the class Scorer in trainer.py, in particular in the method on_epoch_end.
I found two options:

  1. Get the original sequence from within that method.
    In this case I would need something like
    def inverse_transform(self, y):
        """
        send back original label string
        """
        indice_tag = {i: t for t, i in self.vocab_tag.items()}
        return [indice_tag[y_] for y_ in y]
    ...to get back the initial X sequence from the matrix?
    The disadvantage is that there is no guarantee that the X sequence is not truncated or padded.
  2. It seems the easier way would be to somehow get the predicted Y from the evaluation in the method eval_single or eval_nfold in wrapper.py, but how to do it properly?

Ideally, the idea is to have the evaluation results and all the raw data used for calculating them. IMHO it's useful, for example, during the n-fold cross-validation when the data might be partitioned differently at each run because it's shuffled.

Apologies, I missed the n-fold use case. That definitely seems valuable. In any case, the tag command should provide an example of how to get the values out.

You may also want to be mindful of memory usage. I had issues when working with "larger" datasets (~2000 documents, still small in the DL world). I think I rewrote some parts to be more generator-based.

  1. get the original sequence from within such method
    In this case I would have to get something like

I am not sure whether you meant that just as an example. But the inverse functions are generally just mapping from say the integer label to the string label.

I think to get the x, you will need to collect what you are passing to the model, or the data generator.
I believe the tagging works more or less like this:

  • ask data generator to generate next batch
  • use model to predict batch
  • combine tokens from that batch with predicted labels

In my project I have the added complexity of supporting sliding windows for individual documents (i.e. the same document may be split across batches, potentially on separate positions). Without sliding windows it should be fairly simple. The padding would be easy to filter out. You could also use the original document length as we only pad at the end.
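The three steps above, without sliding windows, can be sketched roughly as follows. This is only an illustration and not DeLFT's actual code: `tag_batches`, the shape of `batches`, and the `predict` callable are hypothetical stand-ins for the data generator and the model, and it assumes padding is only added at the end of each sequence.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Sequence, Tuple

Batch = Tuple[List[List[str]], List[List[int]]]  # (tokens per sequence, padded inputs)


def inverse_transform(label_ids: Sequence[int], vocab_tag: Dict[str, int]) -> List[str]:
    """Map integer label ids back to their string labels."""
    index_to_tag = {i: t for t, i in vocab_tag.items()}
    return [index_to_tag[i] for i in label_ids]


def tag_batches(
    batches: Iterable[Batch],
    predict: Callable[[List[List[int]]], List[List[int]]],
    vocab_tag: Dict[str, int],
) -> Iterator[List[Tuple[str, str]]]:
    """Yield one list of (token, predicted label) pairs per sequence.

    Hypothetical helper: `predict` stands in for the model and returns one
    padded list of label ids per input sequence.
    """
    for tokens_batch, x_batch in batches:      # 1. ask the generator for the next batch
        y_batch = predict(x_batch)             # 2. use the model to predict the batch
        for tokens, label_ids in zip(tokens_batch, y_batch):
            # 3. combine tokens with labels; the original length drops the
            #    padding, because padding is only appended at the end
            labels = inverse_transform(label_ids[: len(tokens)], vocab_tag)
            yield list(zip(tokens, labels))
```

Because the function is a generator, only one batch of predictions is held in memory at a time, which may help with the memory concerns mentioned earlier.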

I've pushed a first implementation, which returns the raw information between each evaluation

Example output evaluation
[...]
Ag ag A Ag Ag Ag g Ag Ag Ag INITCAP NODIGIT 0 NOPUNCT Ag Xx Xx B-<formula> B-<formula>
0 0 0 0 0 0 0 0 0 0 NOCAPS ALLDIGIT 1 NOPUNCT X d d I-<formula> I-<formula>
. . . . . . . . . . ALLCAPS NODIGIT 1 DOT . . . I-<formula> I-<formula>
93 93 9 93 93 93 3 93 93 93 NOCAPS ALLDIGIT 0 NOPUNCT XX dd d I-<formula> I-<formula>
As as A As As As s As As As INITCAP NODIGIT 0 NOPUNCT As Xx Xx I-<formula> I-<formula>
0 0 0 0 0 0 0 0 0 0 NOCAPS ALLDIGIT 1 NOPUNCT X d d I-<formula> I-<formula>
. . . . . . . . . . ALLCAPS NODIGIT 1 DOT . . . I-<formula> I-<formula>
07 07 0 07 07 07 7 07 07 07 NOCAPS ALLDIGIT 0 NOPUNCT XX dd d I-<formula> I-<formula>
) ) ) ) ) ) ) ) ) ) ALLCAPS NODIGIT 1 ENDBRACKET ) ) ) O O

Ag ag A Ag Ag Ag g Ag Ag Ag INITCAP NODIGIT 0 NOPUNCT Ag Xx Xx B-<formula> B-<formula>
2 2 2 2 2 2 2 2 2 2 NOCAPS ALLDIGIT 1 NOPUNCT X d d I-<formula> I-<formula>
S s S S S S S S S S ALLCAPS NODIGIT 1 NOPUNCT S X X I-<formula> I-<formula>

Ag ag A Ag Ag Ag g Ag Ag Ag INITCAP NODIGIT 0 NOPUNCT Ag Xx Xx B-<formula> B-<formula>
2 2 2 2 2 2 2 2 2 2 NOCAPS ALLDIGIT 1 NOPUNCT X d d I-<formula> I-<formula>
Se se S Se Se Se e Se Se Se INITCAP NODIGIT 0 NOPUNCT Se Xx Xx I-<formula> I-<formula>

Ag ag A Ag Ag Ag g Ag Ag Ag INITCAP NODIGIT 0 NOPUNCT Ag Xx Xx B-<formula> B-<formula>
2 2 2 2 2 2 2 2 2 2 NOCAPS ALLDIGIT 1 NOPUNCT X d d I-<formula> I-<formula>
Se se S Se Se Se e Se Se Se INITCAP NODIGIT 0 NOPUNCT Se Xx Xx I-<formula> I-<formula>
crystals crystals c cr cry crys s ls als tals NOCAPS NODIGIT 0 NOPUNCT crystals xxxx x B-<shape> B-<shape>

Ag ag A Ag Ag Ag g Ag Ag Ag INITCAP NODIGIT 0 NOPUNCT Ag Xx Xx B-<formula> B-<formula>
2 2 2 2 2 2 2 2 2 2 NOCAPS ALLDIGIT 1 NOPUNCT X d d I-<formula> I-<formula>
Te te T Te Te Te e Te Te Te INITCAP NODIGIT 0 NOPUNCT Te Xx Xx I-<formula> I-<formula>

Ag ag A Ag Ag Ag g Ag Ag Ag INITCAP NODIGIT 0 NOPUNCT Ag Xx Xx B-<formula> B-<formula>
2 2 2 2 2 2 2 2 2 2 NOCAPS ALLDIGIT 1 NOPUNCT X d d I-<formula> I-<formula>
+ + + + + + + + + + ALLCAPS NODIGIT 1 NOPUNCT + + + I-<formula> I-<formula>
δ δ δ δ δ δ δ δ δ δ NOCAPS NODIGIT 1 NOPUNCT δ x x I-<formula> I-<formula>

Ag ag A Ag Ag Ag g Ag Ag Ag INITCAP NODIGIT 0 NOPUNCT Ag Xx Xx B-<doping> B-<formula>
5 5 5 5 5 5 5 5 5 5 NOCAPS ALLDIGIT 1 NOPUNCT X d d I-<doping> I-<formula>
% % % % % % % % % % ALLCAPS NODIGIT 1 NOPUNCT % % % I-<doping> I-<formula>

Ag ag A Ag Ag Ag g Ag Ag Ag INITCAP NODIGIT 0 NOPUNCT Ag Xx Xx B-<doping> B-<doping>
doped doped d do dop dope d ed ped oped NOCAPS NODIGIT 0 NOPUNCT doped xxxx x O O
SmBCO smbco S Sm SmB SmBC O CO BCO mBCO INITCAP NODIGIT 0 NOPUNCT SmBCO XxXXX XxX B-<name> B-<name>

	f1 (micro): 82.77
                  precision    recall  f1-score   support

        <doping>     0.6032    0.5623    0.5820       265
   <fabrication>     0.3333    0.1136    0.1695        44
       <formula>     0.8235    0.8392    0.8313      2569
          <name>     0.7606    0.7966    0.7782       949
         <shape>     0.9083    0.9536    0.9304       841
     <substrate>     0.5325    0.2770    0.3644       148
         <value>     0.8631    0.8985    0.8804       463
      <variable>     0.9499    0.9732    0.9614       448

all (micro avg.)     0.8244    0.8313    0.8279      5727

Evaluation runtime: 19.633 seconds 
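To compare expected and predicted labels from a dump like the one above, a small script along these lines may be enough. It is a sketch under assumptions: `find_mismatches` is a hypothetical helper, the input is expected to contain only the token lines (not the score table), and it assumes the second-to-last column is the expected label and the last column is the predicted one.

```python
from typing import List, Tuple


def find_mismatches(raw_output: str) -> List[Tuple[str, str, str]]:
    """Return (token, expected, predicted) for every line where the labels differ.

    Assumes whitespace-separated columns: the token first, then features,
    then the expected label and the predicted label as the last two columns.
    Blank lines (sequence boundaries) are skipped.
    """
    mismatches = []
    for line in raw_output.splitlines():
        columns = line.split()
        if len(columns) < 3:
            continue  # blank separator or a line too short to hold both labels
        token, expected, predicted = columns[0], columns[-2], columns[-1]
        if expected != predicted:
            mismatches.append((token, expected, predicted))
    return mismatches
```

Run on the token lines of a single fold, this lists only the disagreements, for example tokens whose predicted label is `<PAD>`.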

Example output n-fold
14/26 [===============>..............] - ETA: 2s - loss: 0.2223
15/26 [================>.............] - ETA: 2s - loss: 0.2130
16/26 [=================>............] - ETA: 2s - loss: 0.2115
17/26 [==================>...........] - ETA: 1s - loss: 0.2024
18/26 [===================>..........] - ETA: 1s - loss: 0.2264
19/26 [====================>.........] - ETA: 1s - loss: 0.2305
20/26 [======================>.......] - ETA: 1s - loss: 0.2558
21/26 [=======================>......] - ETA: 1s - loss: 0.2652
22/26 [========================>.....] - ETA: 0s - loss: 0.2616
23/26 [=========================>....] - ETA: 0s - loss: 0.2607
24/26 [==========================>...] - ETA: 0s - loss: 0.2564
25/26 [===========================>..] - ETA: 0s - loss: 0.2568
26/26 [==============================] - 6s 215ms/step - loss: 0.2636
	f1 (micro): 99.67
training runtime: 171.763 seconds 

Evaluation:

------------------------ fold 0 --------------------------------------
27 27 2 27 27 27 7 27 27 27 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
4 4 4 4 4 4 4 4 4 4 LINEIN NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
1981 1981 1 19 198 1981 1 81 981 1981 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

16 16 1 16 16 16 6 16 16 16 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
April april A Ap Apr Apri l il ril pril LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

November november N No Nov Nove r er ber mber LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1993 1993 1 19 199 1993 3 93 993 1993 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

@1994 @1994 @ @1 @19 @199 4 94 994 1994 LINESTART ALLCAP CONTAINSDIGITS 0 1 0 NOPUNCT B-<year> <PAD>

Publication publication P Pu Pub Publ n on ion tion LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Date date D Da Dat Date e te ate Date LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2 2 2 2 2 2 2 2 2 2 LINEIN NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2007 2007 2 20 200 2007 7 07 007 2007 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Published published P Pu Pub Publ d ed hed shed LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
24 24 2 24 24 24 4 24 24 24 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

( ( ( ( ( ( ( ( ( ( LINESTART ALLCAP NODIGIT 1 0 0 OPENBRACKET O O
2007 2007 2 20 200 2007 7 07 007 2007 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
12 12 1 12 12 12 2 12 12 12 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
) ) ) ) ) ) ) ) ) ) LINEEND ALLCAP NODIGIT 1 0 0 ENDBRACKET O O

2008 2008 2 20 200 2008 8 08 008 2008 LINESTART NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

16 16 1 16 16 16 6 16 16 16 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
08 08 0 08 08 08 8 08 08 08 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Revision revision R Re Rev Revi n on ion sion LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Date date D Da Dat Date e te ate Date LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
03 03 0 03 03 03 3 03 03 03 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
/ / / / / / / / / / LINEIN ALLCAP NODIGIT 1 0 0 NOPUNCT O O
05 05 0 05 05 05 5 05 05 05 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
/ / / / / / / / / / LINEIN ALLCAP NODIGIT 1 0 0 NOPUNCT O O
2008 2008 2 20 200 2008 8 08 008 2008 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

June june J Ju Jun June e ne une June LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
30th 30th 3 30 30t 30th h th 0th 30th LINEIN NOCAPS CONTAINSDIGITS 0 0 0 NOPUNCT B-<day> B-<day>
2009 2009 2 20 200 2009 9 09 009 2009 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

October october O Oc Oct Octo r er ber ober LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
23 23 2 23 23 23 3 23 23 23 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

December december D De Dec Dece r er ber mber LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1996 1996 1 19 199 1996 6 96 996 1996 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
23 23 2 23 23 23 3 23 23 23 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Date date D Da Dat Date e te ate Date LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Submitted submitted S Su Sub Subm d ed ted tted LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
2004 2004 2 20 200 2004 4 04 004 2004 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
11 11 1 11 11 11 1 11 11 11 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
04 04 0 04 04 04 4 04 04 04 LINEEND NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>

August august A Au Aug Augu t st ust gust LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
19 19 1 19 19 19 9 19 19 19 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
2009 2009 2 20 200 2009 9 09 009 2009 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

June june J Ju Jun June e ne une June LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
27 27 2 27 27 27 7 27 27 27 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2008 2008 2 20 200 2008 8 08 008 2008 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

29 29 2 29 29 29 9 29 29 29 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
March march M Ma Mar Marc h ch rch arch LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Nov nov N No Nov Nov v ov Nov Nov LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
. . . . . . . . . . LINEIN ALLCAP NODIGIT 1 0 0 DOT O O
12th 12th 1 12 12t 12th h th 2th 12th LINEIN NOCAPS CONTAINSDIGITS 0 0 0 NOPUNCT B-<day> B-<month>
. . . . . . . . . . LINEIN ALLCAP NODIGIT 1 0 0 DOT O O
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

August august A Au Aug Augu t st ust gust LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
31 31 3 31 31 31 1 31 31 31 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1998 1998 1 19 199 1998 8 98 998 1998 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
18 18 1 18 18 18 8 18 18 18 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2000 2000 2 20 200 2000 0 00 000 2000 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

15 15 1 15 15 15 5 15 15 15 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
09 09 0 09 09 09 9 09 09 09 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Aug aug A Au Aug Aug g ug Aug Aug LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
12 12 1 12 12 12 2 12 12 12 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2008 2008 2 20 200 2008 8 08 008 2008 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
Last last L La Las Last t st ast Last LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
update update u up upd upda e te ate date LINEIN NOCAPS NODIGIT 0 0 0 NOPUNCT O O
date date d da dat date e te ate date LINEEND NOCAPS NODIGIT 0 0 0 NOPUNCT O <PAD>

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
21 21 2 21 21 21 1 21 21 21 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2003 2003 2 20 200 2003 3 03 003 2003 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

1 1 1 1 1 1 1 1 1 1 LINESTART NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
9 9 9 9 9 9 9 9 9 9 LINEIN NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
1988 1988 1 19 198 1988 8 88 988 1988 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

1987 1987 1 19 198 1987 7 87 987 1987 LINESTART NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
. . . . . . . . . . LINEEND ALLCAP NODIGIT 1 0 0 DOT O O

21 21 2 21 21 21 1 21 21 21 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

1998 1998 1 19 199 1998 8 98 998 1998 LINESTART NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1998 1998 1 19 199 1998 8 98 998 1998 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

October october O Oc Oct Octo r er ber ober LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
11 11 1 11 11 11 1 11 11 11 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2000 2000 2 20 200 2000 0 00 000 2000 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1996 1996 1 19 199 1996 6 96 996 1996 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Published published P Pu Pub Publ d ed hed shed LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
25 25 2 25 25 25 5 25 25 25 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

( ( ( ( ( ( ( ( ( ( LINESTART ALLCAP NODIGIT 1 0 0 OPENBRACKET O O
Received received R Re Rec Rece d ed ved ived LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
12 12 1 12 12 12 2 12 12 12 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
October october O Oc Oct Octo r er ber ober LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2000 2000 2 20 200 2000 0 00 000 2000 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Date date D Da Dat Date e te ate Date LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Received received R Re Rec Rece d ed ved ived LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2009 2009 2 20 200 2009 9 09 009 2009 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

December december D De Dec Dece r er ber mber LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
15 15 1 15 15 15 5 15 15 15 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

June june J Ju Jun June e ne une June LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
26 26 2 26 26 26 6 26 26 26 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1995 1995 1 19 199 1995 5 95 995 1995 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

17 17 1 17 17 17 7 17 17 17 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2010 2010 2 20 201 2010 0 10 010 2010 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
Monday monday M Mo Mon Mond y ay day nday LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
March march M Ma Mar Marc h ch rch arch LINEEND INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
17 17 1 17 17 17 7 17 17 17 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1993 1993 1 19 199 1993 3 93 993 1993 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

( ( ( ( ( ( ( ( ( ( LINESTART ALLCAP NODIGIT 1 0 0 OPENBRACKET O O
Dated dated D Da Dat Date d ed ted ated LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
10 10 1 10 10 10 0 10 10 10 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
Feb feb F Fe Feb Feb b eb Feb Feb LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
) ) ) ) ) ) ) ) ) ) LINEEND ALLCAP NODIGIT 1 0 0 ENDBRACKET O O

4 4 4 4 4 4 4 4 4 4 LINESTART NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<month> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
21 21 2 21 21 21 1 21 21 21 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Published published P Pu Pub Publ d ed hed shed LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
20 20 2 20 20 20 0 20 20 20 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
April april A Ap Apr Apri l il ril pril LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

June june J Ju Jun June e ne une June LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2005 2005 2 20 200 2005 5 05 005 2005 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Mon mon M Mo Mon Mon n on Mon Mon LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Jan jan J Ja Jan Jan n an Jan Jan LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
28 28 2 28 28 28 8 28 28 28 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
13 13 1 13 13 13 3 13 13 13 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT O B-<day>
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
26 26 2 26 26 26 6 26 26 26 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT O B-<day>
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
14 14 1 14 14 14 4 14 14 14 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT O B-<month>
2008 2008 2 20 200 2008 8 08 008 2008 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

October october O Oc Oct Octo r er ber ober LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
30 30 3 30 30 30 0 30 30 30 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1996 1996 1 19 199 1996 6 96 996 1996 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
10 10 1 10 10 10 0 10 10 10 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1993 1993 1 19 199 1993 3 93 993 1993 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

29 29 2 29 29 29 9 29 29 29 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
March march M Ma Mar Marc h ch rch arch LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
19 19 1 19 19 19 9 19 19 19 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1995 1995 1 19 199 1995 5 95 995 1995 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2005 2005 2 20 200 2005 5 05 005 2005 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

( ( ( ( ( ( ( ( ( ( LINESTART ALLCAP NODIGIT 1 0 0 OPENBRACKET O O
revised revised r re rev revi d ed sed ised LINEIN NOCAPS NODIGIT 0 0 0 NOPUNCT O O
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1996 1996 1 19 199 1996 6 96 996 1996 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
) ) ) ) ) ) ) ) ) ) LINEEND ALLCAP NODIGIT 1 0 0 ENDBRACKET O O

22 22 2 22 22 22 2 22 22 22 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
09 09 0 09 09 09 9 09 09 09 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2004 2004 2 20 200 2004 4 04 004 2004 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

August august A Au Aug Augu t st ust gust LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

November november N No Nov Nove r er ber mber LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
18 18 1 18 18 18 8 18 18 18 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1994 1994 1 19 199 1994 4 94 994 1994 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Codesmith codesmith C Co Cod Code h th ith mith LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
10 10 1 10 10 10 0 10 10 10 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
05 05 0 05 05 05 5 05 05 05 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2006 2006 2 20 200 2006 6 06 006 2006 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

March march M Ma Mar Marc h ch rch arch LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2002 2002 2 20 200 2002 2 02 002 2002 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Issue issue I Is Iss Issu e ue sue ssue LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
01 01 0 01 01 01 1 01 01 01 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
. . . . . . . . . . LINEIN ALLCAP NODIGIT 1 0 0 DOT O O
09 09 0 09 09 09 9 09 09 09 LINEEND NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<year> B-<year>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
26 26 2 26 26 26 6 26 26 26 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1994 1994 1 19 199 1994 4 94 994 1994 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Advance advance A Ad Adv Adva e ce nce ance LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Access access A Ac Acc Acce s ss ess cess LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
publication publication p pu pub publ n on ion tion LINEIN NOCAPS NODIGIT 0 0 0 NOPUNCT O O
on on o on on on n on on on LINEIN NOCAPS NODIGIT 0 0 0 NOPUNCT O O
December december D De Dec Dece r er ber mber LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
3 3 3 3 3 3 3 3 3 3 LINEIN NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2007 2007 2 20 200 2007 7 07 007 2007 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2007 2007 2 20 200 2007 7 07 007 2007 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
26 26 2 26 26 26 6 26 26 26 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1998 1998 1 19 199 1998 8 98 998 1998 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1994 1994 1 19 199 1994 4 94 994 1994 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

2006 2006 2 20 200 2006 6 06 006 2006 LINESTART NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

	f1 (micro): 96.99
                  precision    recall  f1-score   support

           <day>     0.9302    0.9524    0.9412        42
         <month>     0.9508    0.9831    0.9667        59
          <year>     1.0000    0.9844    0.9921        64

all (micro avg.)     0.9641    0.9758    0.9699       165


------------------------ fold 1 --------------------------------------
27 27 2 27 27 27 7 27 27 27 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
4 4 4 4 4 4 4 4 4 4 LINEIN NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
1981 1981 1 19 198 1981 1 81 981 1981 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

16 16 1 16 16 16 6 16 16 16 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
April april A Ap Apr Apri l il ril pril LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

November november N No Nov Nove r er ber mber LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1993 1993 1 19 199 1993 3 93 993 1993 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

@1994 @1994 @ @1 @19 @199 4 94 994 1994 LINESTART ALLCAP CONTAINSDIGITS 0 1 0 NOPUNCT B-<year> <PAD>

Publication publication P Pu Pub Publ n on ion tion LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Date date D Da Dat Date e te ate Date LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2 2 2 2 2 2 2 2 2 2 LINEIN NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2007 2007 2 20 200 2007 7 07 007 2007 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Published published P Pu Pub Publ d ed hed shed LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
24 24 2 24 24 24 4 24 24 24 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

( ( ( ( ( ( ( ( ( ( LINESTART ALLCAP NODIGIT 1 0 0 OPENBRACKET O O
2007 2007 2 20 200 2007 7 07 007 2007 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
12 12 1 12 12 12 2 12 12 12 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
) ) ) ) ) ) ) ) ) ) LINEEND ALLCAP NODIGIT 1 0 0 ENDBRACKET O O

2008 2008 2 20 200 2008 8 08 008 2008 LINESTART NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

16 16 1 16 16 16 6 16 16 16 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
08 08 0 08 08 08 8 08 08 08 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Revision revision R Re Rev Revi n on ion sion LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Date date D Da Dat Date e te ate Date LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
03 03 0 03 03 03 3 03 03 03 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
/ / / / / / / / / / LINEIN ALLCAP NODIGIT 1 0 0 NOPUNCT O O
05 05 0 05 05 05 5 05 05 05 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
/ / / / / / / / / / LINEIN ALLCAP NODIGIT 1 0 0 NOPUNCT O O
2008 2008 2 20 200 2008 8 08 008 2008 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

June june J Ju Jun June e ne une June LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
30th 30th 3 30 30t 30th h th 0th 30th LINEIN NOCAPS CONTAINSDIGITS 0 0 0 NOPUNCT B-<day> B-<day>
2009 2009 2 20 200 2009 9 09 009 2009 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

October october O Oc Oct Octo r er ber ober LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
23 23 2 23 23 23 3 23 23 23 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

December december D De Dec Dece r er ber mber LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1996 1996 1 19 199 1996 6 96 996 1996 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
23 23 2 23 23 23 3 23 23 23 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Date date D Da Dat Date e te ate Date LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Submitted submitted S Su Sub Subm d ed ted tted LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
2004 2004 2 20 200 2004 4 04 004 2004 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
11 11 1 11 11 11 1 11 11 11 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
04 04 0 04 04 04 4 04 04 04 LINEEND NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>

August august A Au Aug Augu t st ust gust LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
19 19 1 19 19 19 9 19 19 19 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
2009 2009 2 20 200 2009 9 09 009 2009 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

June june J Ju Jun June e ne une June LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
27 27 2 27 27 27 7 27 27 27 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2008 2008 2 20 200 2008 8 08 008 2008 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

29 29 2 29 29 29 9 29 29 29 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
March march M Ma Mar Marc h ch rch arch LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Nov nov N No Nov Nov v ov Nov Nov LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
. . . . . . . . . . LINEIN ALLCAP NODIGIT 1 0 0 DOT O O
12th 12th 1 12 12t 12th h th 2th 12th LINEIN NOCAPS CONTAINSDIGITS 0 0 0 NOPUNCT B-<day> B-<day>
. . . . . . . . . . LINEIN ALLCAP NODIGIT 1 0 0 DOT O O
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

August august A Au Aug Augu t st ust gust LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
31 31 3 31 31 31 1 31 31 31 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1998 1998 1 19 199 1998 8 98 998 1998 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
18 18 1 18 18 18 8 18 18 18 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2000 2000 2 20 200 2000 0 00 000 2000 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

15 15 1 15 15 15 5 15 15 15 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
09 09 0 09 09 09 9 09 09 09 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Aug aug A Au Aug Aug g ug Aug Aug LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
12 12 1 12 12 12 2 12 12 12 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2008 2008 2 20 200 2008 8 08 008 2008 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
Last last L La Las Last t st ast Last LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
update update u up upd upda e te ate date LINEIN NOCAPS NODIGIT 0 0 0 NOPUNCT O O
date date d da dat date e te ate date LINEEND NOCAPS NODIGIT 0 0 0 NOPUNCT O O

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
21 21 2 21 21 21 1 21 21 21 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2003 2003 2 20 200 2003 3 03 003 2003 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

1 1 1 1 1 1 1 1 1 1 LINESTART NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
9 9 9 9 9 9 9 9 9 9 LINEIN NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
1988 1988 1 19 198 1988 8 88 988 1988 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

1987 1987 1 19 198 1987 7 87 987 1987 LINESTART NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
. . . . . . . . . . LINEEND ALLCAP NODIGIT 1 0 0 DOT O O

21 21 2 21 21 21 1 21 21 21 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

1998 1998 1 19 199 1998 8 98 998 1998 LINESTART NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1998 1998 1 19 199 1998 8 98 998 1998 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

October october O Oc Oct Octo r er ber ober LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
11 11 1 11 11 11 1 11 11 11 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2000 2000 2 20 200 2000 0 00 000 2000 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1996 1996 1 19 199 1996 6 96 996 1996 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Published published P Pu Pub Publ d ed hed shed LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
25 25 2 25 25 25 5 25 25 25 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

( ( ( ( ( ( ( ( ( ( LINESTART ALLCAP NODIGIT 1 0 0 OPENBRACKET O O
Received received R Re Rec Rece d ed ved ived LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
12 12 1 12 12 12 2 12 12 12 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
October october O Oc Oct Octo r er ber ober LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2000 2000 2 20 200 2000 0 00 000 2000 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Date date D Da Dat Date e te ate Date LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Received received R Re Rec Rece d ed ved ived LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2009 2009 2 20 200 2009 9 09 009 2009 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

December december D De Dec Dece r er ber mber LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
15 15 1 15 15 15 5 15 15 15 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

June june J Ju Jun June e ne une June LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
26 26 2 26 26 26 6 26 26 26 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1995 1995 1 19 199 1995 5 95 995 1995 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

17 17 1 17 17 17 7 17 17 17 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2010 2010 2 20 201 2010 0 10 010 2010 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
Monday monday M Mo Mon Mond y ay day nday LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
March march M Ma Mar Marc h ch rch arch LINEEND INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
17 17 1 17 17 17 7 17 17 17 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1993 1993 1 19 199 1993 3 93 993 1993 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

( ( ( ( ( ( ( ( ( ( LINESTART ALLCAP NODIGIT 1 0 0 OPENBRACKET O O
Dated dated D Da Dat Date d ed ted ated LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
10 10 1 10 10 10 0 10 10 10 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
Feb feb F Fe Feb Feb b eb Feb Feb LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
) ) ) ) ) ) ) ) ) ) LINEEND ALLCAP NODIGIT 1 0 0 ENDBRACKET O O

4 4 4 4 4 4 4 4 4 4 LINESTART NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<month> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
21 21 2 21 21 21 1 21 21 21 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Published published P Pu Pub Publ d ed hed shed LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
20 20 2 20 20 20 0 20 20 20 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
April april A Ap Apr Apri l il ril pril LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

June june J Ju Jun June e ne une June LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2005 2005 2 20 200 2005 5 05 005 2005 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Mon mon M Mo Mon Mon n on Mon Mon LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Jan jan J Ja Jan Jan n an Jan Jan LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
28 28 2 28 28 28 8 28 28 28 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
13 13 1 13 13 13 3 13 13 13 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT O B-<day>
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
26 26 2 26 26 26 6 26 26 26 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT O B-<day>
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
14 14 1 14 14 14 4 14 14 14 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT O B-<month>
2008 2008 2 20 200 2008 8 08 008 2008 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

October october O Oc Oct Octo r er ber ober LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
30 30 3 30 30 30 0 30 30 30 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1996 1996 1 19 199 1996 6 96 996 1996 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

May may M Ma May May y ay May May LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
10 10 1 10 10 10 0 10 10 10 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1993 1993 1 19 199 1993 3 93 993 1993 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

29 29 2 29 29 29 9 29 29 29 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
March march M Ma Mar Marc h ch rch arch LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2010 2010 2 20 201 2010 0 10 010 2010 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
19 19 1 19 19 19 9 19 19 19 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1995 1995 1 19 199 1995 5 95 995 1995 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2005 2005 2 20 200 2005 5 05 005 2005 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

( ( ( ( ( ( ( ( ( ( LINESTART ALLCAP NODIGIT 1 0 0 OPENBRACKET O O
revised revised r re rev revi d ed sed ised LINEIN NOCAPS NODIGIT 0 0 0 NOPUNCT O O
May may M Ma May May y ay May May LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1996 1996 1 19 199 1996 6 96 996 1996 LINEIN NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>
) ) ) ) ) ) ) ) ) ) LINEEND ALLCAP NODIGIT 1 0 0 ENDBRACKET O O

22 22 2 22 22 22 2 22 22 22 LINESTART NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
09 09 0 09 09 09 9 09 09 09 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2011 2011 2 20 201 2011 1 11 011 2011 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2004 2004 2 20 200 2004 4 04 004 2004 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

August august A Au Aug Augu t st ust gust LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1997 1997 1 19 199 1997 7 97 997 1997 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

November november N No Nov Nove r er ber mber LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
18 18 1 18 18 18 8 18 18 18 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1994 1994 1 19 199 1994 4 94 994 1994 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Codesmith codesmith C Co Cod Code h th ith mith LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
10 10 1 10 10 10 0 10 10 10 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
05 05 0 05 05 05 5 05 05 05 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<month>
- - - - - - - - - - LINEIN ALLCAP NODIGIT 1 0 0 HYPHEN O O
2006 2006 2 20 200 2006 6 06 006 2006 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

March march M Ma Mar Marc h ch rch arch LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2002 2002 2 20 200 2002 2 02 002 2002 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Issue issue I Is Iss Issu e ue sue ssue LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
: : : : : : : : : : LINEIN ALLCAP NODIGIT 1 0 0 PUNCT O O
01 01 0 01 01 01 1 01 01 01 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<month> B-<day>
. . . . . . . . . . LINEIN ALLCAP NODIGIT 1 0 0 DOT O O
09 09 0 09 09 09 9 09 09 09 LINEEND NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<year> B-<day>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
26 26 2 26 26 26 6 26 26 26 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1994 1994 1 19 199 1994 4 94 994 1994 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

Advance advance A Ad Adv Adva e ce nce ance LINESTART INITCAP NODIGIT 0 0 0 NOPUNCT O O
Access access A Ac Acc Acce s ss ess cess LINEIN INITCAP NODIGIT 0 0 0 NOPUNCT O O
publication publication p pu pub publ n on ion tion LINEIN NOCAPS NODIGIT 0 0 0 NOPUNCT O O
on on o on on on n on on on LINEIN NOCAPS NODIGIT 0 0 0 NOPUNCT O O
December december D De Dec Dece r er ber mber LINEIN INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
3 3 3 3 3 3 3 3 3 3 LINEIN NOCAPS ALLDIGIT 1 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
2007 2007 2 20 200 2007 7 07 007 2007 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
2007 2007 2 20 200 2007 7 07 007 2007 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

January january J Ja Jan Janu y ry ary uary LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
26 26 2 26 26 26 6 26 26 26 LINEIN NOCAPS ALLDIGIT 0 0 0 NOPUNCT B-<day> B-<day>
, , , , , , , , , , LINEIN ALLCAP NODIGIT 1 0 0 COMMA O O
1998 1998 1 19 199 1998 8 98 998 1998 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

July july J Ju Jul July y ly uly July LINESTART INITCAP NODIGIT 0 0 1 NOPUNCT B-<month> B-<month>
1994 1994 1 19 199 1994 4 94 994 1994 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

2006 2006 2 20 200 2006 6 06 006 2006 LINESTART NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>

	f1 (micro): 96.39
                  precision    recall  f1-score   support

           <day>     0.8913    0.9762    0.9318        42
         <month>     0.9661    0.9661    0.9661        59
          <year>     1.0000    0.9688    0.9841        64

all (micro avg.)     0.9581    0.9697    0.9639       165

----------------------------------------------------------------------

** Worst ** model scores - run 1
                  precision    recall  f1-score   support

           <day>     0.8913    0.9762    0.9318        42
         <month>     0.9661    0.9661    0.9661        59
          <year>     1.0000    0.9688    0.9841        64

all (micro avg.)     0.9581    0.9697    0.9639       165


** Best ** model scores - run 0
                  precision    recall  f1-score   support

           <day>     0.9302    0.9524    0.9412        42
         <month>     0.9508    0.9831    0.9667        59
          <year>     1.0000    0.9844    0.9921        64

all (micro avg.)     0.9641    0.9758    0.9699       165

----------------------------------------------------------------------

Average over 2 folds
                  precision    recall  f1-score   support

           <day>     0.9108    0.9643    0.9365        42
         <month>     0.9585    0.9746    0.9664        59
          <year>     1.0000    0.9766    0.9881        64

all (micro avg.)     0.9611    0.9727    0.9669          

model config file saved
preprocessor saved
model saved
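
To spot the errors in raw output like the above without scanning it by eye, the rows can be filtered with a few lines of Python. This is only a sketch: it assumes, as in grobid, that the second-to-last column is the expected label and the last column is the predicted one, and `find_mismatches` is a hypothetical helper, not part of DeLFT.

```python
from typing import Iterable, List, Tuple


def find_mismatches(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (token, expected, predicted) for each row whose two labels differ.

    Assumption: columns are whitespace-separated, the first one is the token,
    the second-to-last is the expected label and the last is the predicted label.
    """
    mismatches = []
    for line in lines:
        columns = line.split()
        if len(columns) < 3:
            continue  # blank lines separate sequences
        token, expected, predicted = columns[0], columns[-2], columns[-1]
        if expected != predicted:
            mismatches.append((token, expected, predicted))
    return mismatches


# Two rows copied from the output above, separated by a blank line.
sample = [
    "1993 1993 1 19 199 1993 3 93 993 1993 LINEEND NOCAPS ALLDIGIT 0 1 0 NOPUNCT B-<year> B-<year>",
    "",
    "@1994 @1994 @ @1 @19 @199 4 94 994 1994 LINESTART ALLCAP CONTAINSDIGITS 0 1 0 NOPUNCT B-<year> <PAD>",
]
print(find_mismatches(sample))  # [('@1994', 'B-<year>', '<PAD>')]
```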

A few notes:

  • There is a small problem with the labels: for grobid data they are converted to IOB earlier on, so converting them back would depend on the type of data that was initially supplied.
  • We keep the whole dataset in memory, so memory usage might blow up for larger datasets.
  • The layout feature information will not appear in the output if the model does not use it (as it will be ignored when the data is loaded).
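
On the first note, the inverse conversion could look roughly like the sketch below. It assumes the usual grobid convention (`I-<label>` starts a field, `<label>` continues it, `<other>` is the outside label) and that only grobid data was converted in the first place. Both `iob_to_grobid` and the `source_format` argument are hypothetical names for illustration, not existing DeLFT code.

```python
from typing import List


def iob_to_grobid(label: str) -> str:
    """Map one IOB label back to the assumed grobid convention."""
    if label == "O":
        return "<other>"
    if label.startswith("B-"):
        return "I-" + label[2:]  # beginning of a field
    if label.startswith("I-"):
        return label[2:]  # continuation of a field
    return label  # anything else (e.g. <PAD>) is left untouched


def restore_labels(labels: List[str], source_format: str) -> List[str]:
    """Convert labels back only when the input data was in grobid format."""
    if source_format == "grobid":
        return [iob_to_grobid(label) for label in labels]
    return list(labels)


print(restore_labels(["B-<month>", "O", "B-<year>"], "grobid"))
# ['I-<month>', '<other>', 'I-<year>']
```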

When I use my fulltext data, I believe the raw data would amount to more than a GB. Outputting that to stdout may make it difficult to use. Perhaps it would make sense to output it to separate files?
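
As a rough illustration of the separate-files idea, each fold could be written to its own file, one row at a time, instead of being printed to stdout. The file naming, the tab-separated layout and the helper names `raw_results_path` and `write_raw_results` are assumptions made for this sketch only.

```python
import os
import tempfile
from typing import Sequence


def raw_results_path(output_dir: str, fold: int) -> str:
    """Hypothetical naming scheme: one file per fold."""
    return os.path.join(output_dir, f"raw_results_fold{fold}.tsv")


def write_raw_results(
    path: str,
    tokens: Sequence[Sequence[str]],
    expected: Sequence[Sequence[str]],
    predicted: Sequence[Sequence[str]],
) -> None:
    """Write token, expected and predicted labels row by row.

    Sequences are separated by a blank line, as in the output shown in this thread.
    """
    with open(path, "w", encoding="utf-8") as out:
        for seq_tokens, seq_expected, seq_predicted in zip(tokens, expected, predicted):
            for token, exp, pred in zip(seq_tokens, seq_expected, seq_predicted):
                out.write(f"{token}\t{exp}\t{pred}\n")
            out.write("\n")


# Toy data standing in for two folds of an n-fold evaluation.
folds = [
    ([["May", "2010"]], [["B-<month>", "B-<year>"]], [["B-<month>", "B-<year>"]]),
    ([["@1994"]], [["B-<year>"]], [["<PAD>"]]),
]

with tempfile.TemporaryDirectory() as output_dir:
    for fold, (tokens, expected, predicted) in enumerate(folds):
        write_raw_results(raw_results_path(output_dir, fold), tokens, expected, predicted)
    written = sorted(os.listdir(output_dir))
    with open(raw_results_path(output_dir, 0), encoding="utf-8") as f:
        first_fold = f.read()

print(written)
```

Writing row by row keeps the output itself from being built up as one large string, although the predictions would still need to be available per fold.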

Here are some examples of the output:

I've updated the initial comment to list all the components for which this feature should be implemented. I would implement it only for grobid, ner and classification.