s-lilo / timexy

A spaCy custom component that extracts and normalizes temporal expressions

Timexy 🕙 📅

A spaCy custom component that extracts and normalizes dates and other temporal expressions.

Features

  • 💥 Extract dates and durations in several languages; see Supported Languages for the current list
  • 💥 Normalize dates to timestamps, or normalize dates and durations to the TimeML TIMEX3 standard
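
To make the TIMEX3 normalization target concrete, here is a minimal stdlib-only sketch of what normalizing a date such as "10.10.2010" and a duration such as "six years" can mean. It is not timexy's implementation: the helper names `normalize_date` and `normalize_duration` and the small `NUMBER_WORDS` table are hypothetical, and only the `DD.MM.YYYY` pattern and whole-year durations are handled.

```python
from datetime import datetime

# Hypothetical lookup table; a real extractor would cover far more number words.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6}


def normalize_date(text: str) -> str:
    """Turn a DD.MM.YYYY string into an ISO 8601 date-time string."""
    return datetime.strptime(text, "%d.%m.%Y").isoformat()


def normalize_duration(text: str) -> str:
    """Turn '<number word> year(s)' into an ISO 8601 duration such as P6Y."""
    number_word, unit = text.lower().split()
    if unit not in ("year", "years"):
        raise ValueError(f"unsupported unit in this sketch: {unit!r}")
    return f"P{NUMBER_WORDS[number_word]}Y"


print(normalize_date("10.10.2010"))     # 2010-10-10T00:00:00
print(normalize_duration("six years"))  # P6Y
```

The ISO 8601 strings produced here are the kind of values that TIMEX3 carries in its `value` attribute.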

Supported Languages

  • 🇩🇪 German
  • 🇬🇧 English
  • 🇫🇷 French

Installation

pip install timexy

Usage

After installation, add the timexy component to any spaCy pipeline to extract and normalize dates and other temporal expressions:

import spacy
from timexy import Timexy

nlp = spacy.load("en_core_web_sm")

# Optionally add config if varying from default values
config = {
    "kb_id_type": "timex3",  # possible values: 'timex3'(default), 'timestamp'
    "label": "timexy",  # default: 'time'
    "overwrite": False  # default: False
}
nlp.add_pipe("timexy", config=config)

doc = nlp("Today is the 10.10.2010. I was in Paris for six years.")
for e in doc.ents:
    print(f"{e.text}\t{e.label_}\t{e.kb_id_}")
# Output:
# 10.10.2010    timexy    TIMEX3 type="DATE" value="2010-10-10T00:00:00"
# six years     timexy    TIMEX3 type="DURATION" value="P6Y"
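
Downstream code often needs the type and value as separate fields. Assuming `kb_id_` strings have exactly the shape printed in the example output above (`TIMEX3 type="..." value="..."`), a small stdlib parser could look like the sketch below. `parse_timex3` is a hypothetical helper, not part of timexy, and the input strings are copied from that output.

```python
import re
from datetime import datetime

# Matches attribute pairs such as type="DATE" or value="P6Y".
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_timex3(kb_id: str) -> dict:
    """Split a TIMEX3-style kb_id string into a dict of its attributes."""
    if not kb_id.startswith("TIMEX3"):
        raise ValueError(f"not a TIMEX3 string: {kb_id!r}")
    return dict(ATTR_RE.findall(kb_id))


date_attrs = parse_timex3('TIMEX3 type="DATE" value="2010-10-10T00:00:00"')
duration_attrs = parse_timex3('TIMEX3 type="DURATION" value="P6Y"')

# A DATE value in this shape is ISO 8601, so the stdlib can parse it directly.
parsed_date = datetime.fromisoformat(date_attrs["value"])
print(date_attrs["type"], parsed_date.year)
print(duration_attrs["type"], duration_attrs["value"])
```

In a real pipeline the same function would be applied to `e.kb_id_` for each entity in `doc.ents`. Duration values such as `P6Y` are left as strings here because the standard library has no ISO 8601 duration parser.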

Contributing

Please refer to the contributing guidelines in the s-lilo/timexy repository.

License: MIT License

