What is the correct POSIX-style TZ format: <+04>-4 vs UNK-4

Currently I am writing a component for an ESP32 microcontroller. Therefore I need a list of all timezones.

I found many sources for a timezone file, but they use different formats for the timezone values. One example is Europe/Ulyanovsk: some files use <+04>-4 and others use UNK-4.

The ESP32 can handle UNK-4 correctly, but fails with a value like <+04>-4.

Can someone explain to me what the format with "<" and ">" means? Why don't they use an abbreviation? Unfortunately, that format is also used by the IANA Time Zone Database.

The glibc manual at https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html says: "The std string specifies the name of the time zone. It must be three or more characters long and must not contain a leading colon, embedded digits, commas, nor plus and minus signs."

So how can this be a valid TZ value?

The tzfile man page at https://man7.org/linux/man-pages/man5/tzfile.5.html says: "Some readers mishandle POSIX-style TZ strings that contain “<” or “>”. As a partial workaround, a writer can avoid using “<” or “>” for time zone abbreviations containing only alphabetic characters."

Obviously the ESP32 is affected by this problem. So is there another official source of a timezone database that uses a format without "<" and ">"?

Thanks for any help!

Cheers Harald


Solution 1:

Here is the POSIX specification for time zone names: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03

For "UNK-4", "UNK" is the time zone abbreviation (the "designation" in POSIX-speak) and -4 is the offset, meaning 4h east of the prime meridian. Note that the sign of the offset is the opposite of what everyone else uses, including other parts of POSIX.
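To make the sign flip concrete, here is a minimal C sketch that converts the offset field of a POSIX TZ string into the usual "seconds east of UTC" value. The function name posix_offset_to_utc_seconds is made up for illustration and is not part of any library. It only handles the [+|-]hh[:mm[:ss]] offset form and assumes the caller has already split the offset from the abbreviation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: parse a POSIX TZ offset such as "-4", "+5:30" or
 * "3:30:15" and store the conventional UTC offset in seconds
 * (positive = east of the prime meridian) in *out.
 * Returns 0 on success, -1 if the field is malformed. */
int posix_offset_to_utc_seconds(const char *s, long *out)
{
    assert(s != NULL && out != NULL);

    int sign = 1;
    if (*s == '+' || *s == '-') {
        if (*s == '-')
            sign = -1;
        s++;
    }

    long parts[3] = {0, 0, 0};          /* hours, minutes, seconds */
    for (int i = 0; i < 3; i++) {
        if (!isdigit((unsigned char)*s))
            return -1;                  /* every component needs a digit */
        char *end;
        parts[i] = strtol(s, &end, 10);
        s = end;
        if (*s != ':' || i == 2)
            break;                      /* no further component follows */
        s++;                            /* skip the ':' separator */
    }
    if (*s != '\0')
        return -1;                      /* trailing garbage */
    if (parts[0] > 24 || parts[1] > 59 || parts[2] > 59)
        return -1;                      /* outside the ranges POSIX allows */

    long posix = sign * (parts[0] * 3600 + parts[1] * 60 + parts[2]);

    /* POSIX counts positive offsets WEST of the prime meridian,
     * so negate to get the usual "UTC+x means east" value. */
    *out = -posix;
    return 0;
}
```

With this, the "-4" from "UNK-4" comes out as 14400 seconds, i.e. UTC+4, while a POSIX offset of "+5:30" comes out as -19800 seconds, i.e. UTC-5:30.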

So that a parser can tell where the abbreviation ends and the offset begins, the abbreviation in this unquoted form must consist only of alphabetic characters.

However, if you would like the abbreviation to contain non-alphabetic characters, the POSIX spec says that you can quote the abbreviation with a leading '<' and a trailing '>':

  • In the quoted form, the first character shall be the <less-than-sign> ( '<' ) character and the last character shall be the <greater-than-sign> ( '>' ) character. All characters between these quoting characters shall be alphanumeric characters from the portable character set in the current locale, the <plus-sign> ( '+' ) character, or the <hyphen-minus> ( '-' ) character. The std and dst fields in this case shall not include the quoting characters.

So with "<+04>-4", the abbreviation is "+04", while the offset remains 4h east.
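As a rough illustration of the two forms, here is a small C sketch that extracts the abbreviation from the start of a TZ string, accepting both the quoted and the unquoted variant. The function parse_tz_std and its parameters are hypothetical names chosen for this example. It is not how the ESP32's C library or any other implementation actually does it, and it only enforces the three-character minimum, not an upper length limit.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy the std abbreviation at the start of tz into
 * abbr (without the quoting characters) and point *rest at the offset
 * that follows. Returns 0 on success, -1 on a malformed or oversized name. */
int parse_tz_std(const char *tz, char *abbr, size_t abbr_size,
                 const char **rest)
{
    assert(tz != NULL && abbr != NULL && rest != NULL && abbr_size > 0);

    const char *p = tz;
    size_t n = 0;

    if (*p == '<') {
        /* Quoted form: alphanumerics, '+' and '-' up to the closing '>'. */
        p++;
        while (isalnum((unsigned char)*p) || *p == '+' || *p == '-') {
            if (n + 1 >= abbr_size)
                return -1;              /* no room for char + terminator */
            abbr[n++] = *p++;
        }
        if (*p != '>')
            return -1;                  /* missing closing quote */
        p++;
    } else {
        /* Unquoted form: alphabetic characters only, so the first digit
         * or sign marks the start of the offset. */
        while (isalpha((unsigned char)*p)) {
            if (n + 1 >= abbr_size)
                return -1;
            abbr[n++] = *p++;
        }
    }

    if (n < 3)
        return -1;                      /* spec requires at least 3 chars */

    abbr[n] = '\0';
    *rest = p;
    return 0;
}
```

For "<+04>-4" this yields the abbreviation "+04" with "-4" left over as the offset, and for "UNK-4" it yields "UNK" with the same "-4". An unquoted "+04-4" is rejected, which is why the quoting is needed for numeric abbreviations.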