collective / icalendar

icalendar parser library for Python

Home Page: https://icalendar.readthedocs.io/en/latest/

[BUG] TZID timezone is ignored when forward-slash is used

Cayllen opened this issue · comments

Describe the bug

When parsing an .ics file in which the timezone name is prefixed with a forward slash, e.g. /Europe/Stockholm, the TZID is ignored and my server's default timezone is used instead.

BEGIN:VCALENDAR
PRODID:-//twitch.tv//StreamSchedule//1.0
VERSION:2.0
CALSCALE:GREGORIAN
REFRESH-INTERVAL;VALUE=DURATION:PT6H
X-PUBLISHED-TTL:PT6H
NAME:ne0lines
X-WR-CALNAME:ne0lines
BEGIN:VEVENT
UID:0cab49a0-1167-40f0-bfed-ecb4d117047d
DTSTAMP:20221019T102950Z
DTSTART;TZID=/Europe/Stockholm:20221021T200000
DTEND;TZID=/Europe/Stockholm:20221021T210000
SUMMARY:Just chatting
DESCRIPTION:Just Chatting.
CATEGORIES:Just Chatting
RRULE:FREQ=WEEKLY;BYDAY=FR
END:VEVENT
BEGIN:VEVENT
UID:160b153a-cf14-40ce-9bdf-e38ae74bbc96
DTSTAMP:20221019T102938Z
DTSTART;TZID=/Europe/Stockholm:20221024T200000
DTEND;TZID=/Europe/Stockholm:20221024T210000
SUMMARY:Just chatting
DESCRIPTION:Just Chatting.
CATEGORIES:Just Chatting
RRULE:FREQ=WEEKLY;BYDAY=MO
END:VEVENT
END:VCALENDAR

When I remove the forward slashes, e.g. DTSTART;TZID=Europe/Stockholm:20221024T200000, the timezone is parsed correctly. This calendar is generated by Twitch; is this an issue with the package?

Python 3, newest package version is used

commented

The validator says that the ics is invalid, but section 3.8.3.1 of the RFC allows a / before timezone names. This is surely an issue with the package, thanks for reporting!

commented

Maybe section 3.2.19 better justifies that the / is allowed.

commented

@Cayllen I can't reproduce the error - when I parse your ics file and retrieve the tzid parameter, it is returned correctly. Are you sure you're using the newest version? The snippet I tried to reproduce the error with:

import icalendar
from icalendar import Calendar

ical = """BEGIN:VCALENDAR
PRODID:-//twitch.tv//StreamSchedule//1.0
VERSION:2.0
CALSCALE:GREGORIAN
REFRESH-INTERVAL;VALUE=DURATION:PT6H
X-PUBLISHED-TTL:PT6H
NAME:ne0lines
X-WR-CALNAME:ne0lines
BEGIN:VEVENT
UID:0cab49a0-1167-40f0-bfed-ecb4d117047d
DTSTAMP:20221019T102950Z
DTSTART;TZID=/Europe/Stockholm:20221021T200000
DTEND;TZID=/Europe/Stockholm:20221021T210000
SUMMARY:Just chatting
DESCRIPTION:Just Chatting.
CATEGORIES:Just Chatting
RRULE:FREQ=WEEKLY;BYDAY=FR
END:VEVENT
BEGIN:VEVENT
UID:160b153a-cf14-40ce-9bdf-e38ae74bbc96
DTSTAMP:20221019T102938Z
DTSTART;TZID=/Europe/Stockholm:20221024T200000
DTEND;TZID=/Europe/Stockholm:20221024T210000
SUMMARY:Just chatting
DESCRIPTION:Just Chatting.
CATEGORIES:Just Chatting
RRULE:FREQ=WEEKLY;BYDAY=MO
END:VEVENT
END:VCALENDAR"""

calendar = Calendar.from_ical(ical)

print(calendar.walk('VEVENT')[0]['dtstart'].params['tzid'])
print(calendar.to_ical().decode('utf-8'))
print(icalendar.__version__)

outputs...

/Europe/Stockholm
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//twitch.tv//StreamSchedule//1.0
CALSCALE:GREGORIAN
NAME:ne0lines
REFRESH-INTERVAL;VALUE=DURATION:PT6H
X-PUBLISHED-TTL:PT6H
X-WR-CALNAME:ne0lines
BEGIN:VEVENT
SUMMARY:Just chatting
DTSTART;TZID=/Europe/Stockholm:20221021T200000
DTEND;TZID=/Europe/Stockholm:20221021T210000
DTSTAMP:20221019T102950Z
UID:0cab49a0-1167-40f0-bfed-ecb4d117047d
RRULE:FREQ=WEEKLY;BYDAY=FR
CATEGORIES:Just Chatting
DESCRIPTION:Just Chatting.
END:VEVENT
BEGIN:VEVENT
SUMMARY:Just chatting
DTSTART;TZID=/Europe/Stockholm:20221024T200000
DTEND;TZID=/Europe/Stockholm:20221024T210000
DTSTAMP:20221019T102938Z
UID:160b153a-cf14-40ce-9bdf-e38ae74bbc96
RRULE:FREQ=WEEKLY;BYDAY=MO
CATEGORIES:Just Chatting
DESCRIPTION:Just Chatting.
END:VEVENT
END:VCALENDAR

5.0.0
which seems correct.

When I run your code but print the following instead
print(calendar.walk('VEVENT')[0]['dtstart'].dt.tzinfo)
I get the response
None

My version is 5.0.0 as well

Actually, it seems that this issue is caused by pytz not recognizing the timezone. When debugging, it jumps to the except branch and sets tzinfo = None.

Can you both also print the datetime object and its time in UTC so I can be sure that the timezone is ignored/does not work?



import icalendar
import pytz
from icalendar import Calendar

ical = """BEGIN:VCALENDAR
PRODID:-//twitch.tv//StreamSchedule//1.0
VERSION:2.0
CALSCALE:GREGORIAN
REFRESH-INTERVAL;VALUE=DURATION:PT6H
X-PUBLISHED-TTL:PT6H
NAME:ne0lines
X-WR-CALNAME:ne0lines
BEGIN:VEVENT
UID:0cab49a0-1167-40f0-bfed-ecb4d117047d
DTSTAMP:20221019T102950Z
DTSTART;TZID=/Europe/Stockholm:20221021T200000
DTEND;TZID=/Europe/Stockholm:20221021T210000
SUMMARY:Just chatting
DESCRIPTION:Just Chatting.
CATEGORIES:Just Chatting
RRULE:FREQ=WEEKLY;BYDAY=FR
END:VEVENT
BEGIN:VEVENT
UID:160b153a-cf14-40ce-9bdf-e38ae74bbc96
DTSTAMP:20221019T102938Z
DTSTART;TZID=/Europe/Stockholm:20221024T200000
DTEND;TZID=/Europe/Stockholm:20221024T210000
SUMMARY:Just chatting
DESCRIPTION:Just Chatting.
CATEGORIES:Just Chatting
RRULE:FREQ=WEEKLY;BYDAY=MO
END:VEVENT
END:VCALENDAR"""

calendar = Calendar.from_ical(ical)

print(calendar.walk('VEVENT')[0]['dtstart'].dt.astimezone(tz=pytz.timezone('UTC')))
print(calendar.walk('VEVENT')[0]['dtstart'].dt)
print(icalendar.__version__)

outputs...

2022-10-21 18:00:00+00:00
2022-10-21 20:00:00
5.0.0

This is correct per se, but only because the calendar's timezone happens to match the timezone of my laptop. When I change the base timezone of the Python environment with os.environ['TZ'] = "UTC", the wrong UTC time is calculated, although the timezone of the ics is known (but ignored).
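The effect described here can be reproduced with the standard library alone. The sketch below is not icalendar code; it assumes a fixed UTC+2 offset as a stand-in for Europe/Stockholm on 2022-10-21 (CEST), so it runs without pytz or a timezone database.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for Europe/Stockholm on 2022-10-21 (CEST, UTC+2); an assumption
# made so the example does not depend on pytz or a tz database.
STOCKHOLM_CEST = timezone(timedelta(hours=2), "CEST")

# What the parser hands back when the TZID is dropped: a naive datetime.
naive = datetime(2022, 10, 21, 20, 0)

# astimezone() on a naive datetime assumes the *system* local timezone,
# so this value changes with the TZ of the machine running the code.
depends_on_machine = naive.astimezone(timezone.utc)

# With the timezone attached, the conversion is the same everywhere.
aware = naive.replace(tzinfo=STOCKHOLM_CEST)
always_the_same = aware.astimezone(timezone.utc)

print(always_the_same)  # 2022-10-21 18:00:00+00:00
```

On a machine whose local timezone is Europe/Stockholm both conversions agree, which is why the output looked right on the laptop.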


commented

The RFC specifies that a timezone with a '/' is supposed to be treated as a 'unique ID in a globally defined time zone', which probably means your ics is missing a VTIMEZONE component?

Assuming Twitch doesn't have a bug in their code, I'd suggest trimming the slash before passing the name to pytz.timezone (e.g. pytz.timezone(timezone) becomes pytz.timezone(timezone.strip('/')), same in line 420) and leaving line 422 as is, so that if the TZID is unknown, icalendar falls back to the cache.
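A minimal sketch of that suggestion, written as a standalone function rather than as a patch to icalendar. The names `resolve_tzid`, `lookup`, `cache` and `KNOWN` are hypothetical; in real use `lookup` would be `pytz.timezone`, whose `UnknownTimeZoneError` is a `KeyError` subclass. The sketch uses `lstrip` so that only the leading slash is removed.

```python
from datetime import timedelta, timezone


def resolve_tzid(tzid, lookup, cache):
    """Resolve a TZID, ignoring a leading '/', else fall back to the cache."""
    try:
        return lookup(tzid.lstrip("/"))
    except KeyError:
        # Unknown name: keep the existing fallback, keyed by the raw TZID.
        return cache.get(tzid)


# Toy registry standing in for pytz; the fixed offset is illustrative only.
KNOWN = {"Europe/Stockholm": timezone(timedelta(hours=2), "CEST")}

tz = resolve_tzid("/Europe/Stockholm", KNOWN.__getitem__, cache={})
missing = resolve_tzid("/Not/AZone", KNOWN.__getitem__, cache={})
print(tz, missing)
```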

commented

According to this, the RFC doesn't specify how parsers should handle TZID with a '/' which explains why I couldn't find anything about them in it.

Sometimes, it is also worth asking the source about it, in this case Twitch. They might have a reason why they chose this format. Maybe it is also a bug in their code; that happens.

I'd suggest trimming the slash before passing the name to pytz.timezone (e.g. pytz.timezone(timezone) becomes pytz.timezone(timezone.strip('/')), same in line 420) and leaving line 422 as is, so that if the TZID is unknown, icalendar falls back to the cache.

What I understand:

  • When the timezone has a /, it is supposed to be globally unique, i.e. the TZ is defined in this calendar and is not to be used anywhere else. That is clearly not what Twitch means here: we are talking about a known timezone. They should be informed that they are doing something unintended. @Cayllen, can you do this?
  • If we are to handle this case:
    • A quick fix is to make sure that this kind of timezone is mapped to the one without "/", and that one is used.
    • A proper solution would be to make sure that this does not happen if a TZ with the given name is defined in the calendar.
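The difference between the two options can be sketched as a lookup order. This is an illustration only, not the code in fix_466: `pick_timezone`, `calendar_zones` and `known_zones` are hypothetical names, and plain dictionaries stand in for the calendar's own timezone definitions and for pytz.

```python
def pick_timezone(tzid, calendar_zones, known_zones):
    """Prefer a zone the calendar defines itself, then the slash-less name."""
    if tzid in calendar_zones:
        # The calendar ships its own definition under this exact TZID.
        return calendar_zones[tzid]
    # Quick-fix path: treat "/Europe/Stockholm" like "Europe/Stockholm".
    return known_zones.get(tzid.lstrip("/"))


known = {"Europe/Stockholm": "tz database entry"}
own = {"/Europe/Stockholm": "definition from the calendar"}

print(pick_timezone("/Europe/Stockholm", {}, known))
print(pick_timezone("/Europe/Stockholm", own, known))
```

With an empty `calendar_zones` the slash is dropped and the known zone is used; once the calendar defines the name itself, that definition wins.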

From my side, the discussion can be concluded and a PR made. If you like, @Cayllen, we can support you. It is relatively easy to copy the calendar into a .ics file, read it and make sure that the correct timezone is in the correct element (step 1: test). Then, one can add an if-clause at the place that you identified in the debugger. Would that work for you?

Are there any concerns or other considerations to take into account? @jacadzaca, thanks for engaging in the conversation and thanks to you @Cayllen for the clear issue!

commented

I created fix_466 with a workaround for the issue.

Let's see what the tests say in #467 - I hope they break! This case should have been considered before...
... Well, they do not break -.-

Awesome thanks! Yes will contact Twitch