Explanation of sum_format and date_format?
sagarhukkire opened this issue
Hi
Thanks for your tutorial; it's indeed a nice heads-up. I was reading config.yml and could not understand how sum_format and date_format work. Could you explain a little? Based on that, I will add some more fields to the parser.
Thanks in advance
Sagar
date_format matches dates like 19.08.15 and 19. 08. 2015; it's used to parse each date in a receipt.
sum_format is analogous.
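To illustrate the idea, here is a minimal sketch of how such a date pattern can be applied line by line. The pattern below is a simplified stand-in I wrote for illustration, not the one from config.yml, and `find_date` is a hypothetical helper, not a function from the parser.

```python
import re

# Simplified, illustrative pattern (an assumption, NOT the one in config.yml):
# two digits, a dot with an optional space, two digits, a dot with an
# optional space, then a two- or four-digit year.
DATE_PATTERN = re.compile(r"(?P<date>\d{2}\.\s?\d{2}\.\s?(?:20)?\d{2})")


def find_date(lines):
    """Return the first date-like substring found in the lines, or None."""
    for line in lines:
        match = DATE_PATTERN.search(line)
        if match:
            # The named group holds only the date, without surrounding text.
            return match.group("date")
    return None


receipt = ["SUPERMARKET XY", "Date: 19. 08. 2015 12:30", "Total 12,99"]
print(find_date(receipt))
```

The named group is what lets the caller pull out just the date, no matter what else is on the line.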
I meant to ask how this line works. If I want to add support for 19-Aug-2015 or 08/19/2015, how should I change the following line of code? I hope it's clear now.
date_format: '.*?(?P(\d{2,4}(.\s?|[^a-zA-Z\d])\d{2}(.\s?|[^a-zA-Z\d])(20)?1[3-6]))\s+'
Working with regular expressions is always a very... delicate endeavor.
Usually I use interactive tools like the awesome regex101 to come up with somewhat working expressions.
So here's a start, which matches your formats (including the existing ones):
(\d+)(.|-|\/)\s?([A-Za-z0-9]+)(.|-|\/)\s?(20)?(\d+)
Demo: https://regex101.com/r/HKXAbS/1
You might still need the .*? at the beginning and the \s+ at the end to make it work.
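If you want to check this outside of regex101, something like the following sketch should do. It wraps the suggested expression with the leading .*? and trailing \s+ and tries it on a few sample lines. The group name `date` and the helper `extract_date` are assumptions made for this example, and the sample lines are made up.

```python
import re

# The suggested expression, copied as posted. Note that the unescaped "."
# matches any character; use "\." if only a literal dot should be accepted.
CORE = r"(\d+)(.|-|\/)\s?([A-Za-z0-9]+)(.|-|\/)\s?(20)?(\d+)"

# Wrapped with the leading .*? and trailing \s+ from the existing config line.
# The group name "date" is an assumption for this sketch.
DATE_FORMAT = re.compile(r".*?(?P<date>" + CORE + r")\s+")


def extract_date(line):
    """Return the date part of a line, or None if the pattern does not match."""
    match = DATE_FORMAT.match(line)
    return match.group("date") if match else None


# Made-up sample lines; the trailing text matters because \s+ requires
# at least one whitespace character after the date.
samples = [
    "Date: 19-Aug-2015 12:30",
    "Date: 08/19/2015 12:30",
    "19.08.15 12:30",
    "19. 08. 2015 12:30",
]
for sample in samples:
    print(repr(extract_date(sample)))
```

Because of the trailing \s+, a date at the very end of a line with no whitespace after it will not match, so keep that in mind when testing with your own receipts.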
you are winner man...thanks @mre