crgz / abbreviated_dates

Clarify the meaning of abbreviated, ambiguous, and incomplete dates according to cultural context.

Parser for Abbreviated Dates

Have you ever tried to understand a date like 11-09, št? Is št an abbreviation of a month or of a weekday? Which of the two numbers is the month and which is the day? This library uses Good Old-Fashioned AI to parse abbreviated, ambiguous, and incomplete dates in multiple languages.

How to support this work

Please give this GitHub repository a star.

Key Features

  • Python support through a Python Bridge
  • Language auto-detection
  • Easily expandable into new languages (30 languages are currently supported)
  • Support for multiple date formats
  • Support for abbreviated weekdays
  • Support for abbreviated months
  • Support for ambiguous month/day numbers

How To Use

The most straightforward way to parse dates is to use the abbreviated_dates:parse() predicate, which wraps most of the functionality of the module. This example shows basic usage of the library to parse the date "11-09, št":

?- ['./prolog/demo.pl'], solutions('11-09, št').
╔═══════════════════════╤════════════╤════════════════╗
║         Date          │  Language  │    Country     ║
╟───────────────────────┼────────────┼────────────────╢
║ Saturday, 09 Nov 2024 │ Lithuanian │   Lithuania    ║
║ Saturday, 11 Sep 2027 │ Lithuanian │     Latvia     ║
║ Thursday, 11 Sep 2025 │   Slovak   │ Czech Republic ║
║ Thursday, 11 Sep 2025 │   Slovak   │    Slovakia    ║
║ Thursday, 11 Sep 2025 │   Slovak   │ Czech Republic ║
║ Thursday, 11 Sep 2025 │   Slovak   │    Slovakia    ║
╚═══════════════════════╧════════════╧════════════════╝
true.
The demo code:
:- use_module(library(abbreviated_dates)).
:- use_module(library(cli_table)).

% Print every interpretation of Text as a table.
solutions(Text):- % E.g. solutions('11-09, št').
  Starting = date(2022,09,9), % starting date passed to parse/6
  findall([Date,Language,Country],format(Starting,Text,Date,Language,Country),Rows),
  cli_table(Rows,[head(['Date','Language','Country'])]).

% Parse Text and render each date as text, e.g. "Saturday, 09 Nov 2024".
% Note: format/5 is defined here and is distinct from the built-in format/1-3.
format(Starting, Text, DateText, Language, Country):-
  parse(Starting, Text, [Date], _, Language, Country),
  format_time(string(DateText), "%A, %d %b %Y", Date).

The demo code uses the cli_table pack. To try the demo, first install the pack by running this query in your SWI-Prolog shell:

pack_install(cli_table).

How it works

The abbreviation "št" could stand for:

  • Šeštadienis, which means Saturday in Lithuanian
  • Štvrtok, which means Thursday in Slovak

Lithuanian is spoken in Lithuania and in Latvia. Slovak is spoken in Slovakia and also by a minority in the Czech Republic. These countries use different date representations: the Czech Republic, Latvia, and Slovakia write the day first, following the "little" date endianness that is standard in those countries. Lithuania, on the other hand, uses the "big" date endianness, which means that the month is written before the day. The system factors in all of these facts and comes up with the right answers:

In the case of interpreting the abbreviation as a Saturday:

  • 9 of November 2024
  • 11 of September 2027

In the case of interpreting the abbreviation as a Thursday:

  • 11 of September 2025

For further details, have a look at the implementation. In addition, the unit tests give an impression of how to use this library.
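The reasoning above can be sketched in a few lines of plain SWI-Prolog. This is only an illustration and not the library's implementation: the predicates reading/4, next_matching_date/5 and candidate/5 are hypothetical names, and the 28-year search window is an assumption. The sketch takes the two numbers, picks the month and day according to the date endianness, and then looks for the first date on or after the starting date that falls on the requested weekday.

```prolog
:- use_module(library(date)).   % day_of_the_week/2

% Hypothetical helpers for illustration; not part of abbreviated_dates.

% reading(+Order, +A-B, -Month, -Day): interpret the two numbers "A-B".
reading(month_first, A-B, A, B).   % "big" endianness, e.g. Lithuania
reading(day_first,   A-B, B, A).   % "little" endianness, e.g. Latvia, Slovakia

% next_matching_date(+Start, +Month, +Day, +Weekday, -Date)
% Weekday follows day_of_the_week/2: 1 = Monday ... 7 = Sunday.
next_matching_date(Start, Month, Day, Weekday, Date) :-
    Start = date(Y0,_,_),
    Limit is Y0 + 28,              % assumed search window, in years
    between(Y0, Limit, Year),
    Date = date(Year,Month,Day),
    Date @>= Start,                % standard order compares year, month, day
    day_of_the_week(Date, Weekday),
    !.                             % keep only the earliest match

% candidate(+Start, +Order, +Numbers, +Weekday, -Date)
candidate(Start, Order, Numbers, Weekday, Date) :-
    reading(Order, Numbers, Month, Day),
    next_matching_date(Start, Month, Day, Weekday, Date).
```

With the same starting date as the demo, the three readings of "11-09, št" give the dates listed above (6 = Saturday, 4 = Thursday):

```prolog
?- candidate(date(2022,9,9), month_first, 11-9, 6, D).
D = date(2024, 11, 9).

?- candidate(date(2022,9,9), day_first, 11-9, 6, D).
D = date(2027, 9, 11).

?- candidate(date(2022,9,9), day_first, 11-9, 4, D).
D = date(2025, 9, 11).
```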

Common use cases

Consuming data from different sources:

  • Scraping: extract dates from different places with several formats and languages
  • IoT: consuming data coming from different sources with different date formats
  • Tooling: consuming dates from different logs / sources
  • Format transformations: when transforming dates coming from different files (PDF, CSV, etc.) to other formats (database, etc).

Operations

We use GNU Make to automate frequent actions. The following command shows the recipes available for operating the local development environment:

make help
Available recipes:

Command        Description
help           Print this help
synchronize    Synchronize the local repository: switch to the main branch, fetch changes, and delete merged branches
test           Run the test suite
bump           Increase the version number
release        Release recipe to be used from GitHub Actions
install        Install the latest library release or the one in the VERSION variable (e.g. make install VERSION=v.0.0.207)
requirements   Install the packs required for the development environment
publish        Publish the diagrams
workflow       Create the diagrams
clean          Remove debris

Roadmap

  • Multi-language Support
  • Integrate with the Python Bridge
  • Integrate with the Julian package
  • Implement the Diagram Workflow in GitHub Actions
  • Identify the Scope of formats being covered
  • Add Coverage Metrics
  • Add support for words as date separator (and, bis, hasta...)
  • Search dates embedded in longer texts.

See the open issues for a full list of proposed features (and known issues).

Review

The package can be reviewed in the Distribution Server

License

Distributed under the MIT License. See LICENSE file for more information.
