'udate' date format (underscore date)
Essentially: udate is ISO 8601 with a udate_ prefix and _ instead of -. It has been extended to support timezones: a double underscore (__), followed by p for a positive or n for a negative UTC offset.
Purpose:
The udate format is a custom date format that uses underscores (_) as delimiters instead of dashes and adds the prefix udate_. It is particularly useful with tools like jq, where a JSON key that contains dashes (-) or starts with a digit cannot be used in the short .key path syntax and has to be quoted instead.
Key Idea:
The udate format follows a structure similar to ISO 8601 but replaces dashes with underscores and always begins with the prefix udate_, making it friendlier to parsers and tools that might otherwise interpret dashes or numeric strings in unexpected ways. To convert back into ISO 8601, first remove the udate_ prefix, then replace the underscores with the proper ISO delimiters (dashes in dates and colons in times for the extended format, or no delimiters for the basic format). Timezones are added at the end with a double underscore (__) followed by p (for positive offsets) or n (for negative offsets) and the four-digit UTC offset (e.g., 0700 for UTC+07:00).
Note: n (negative) is used instead of m to avoid confusion with months or minutes.
Example:
- Date (Year-Month-Day):
  udate_2024_01_01
  This corresponds to January 1st, 2024.
- Date with Time and Positive Timezone (Year-Month-DayTHour:Minute__pTimezone):
  udate_2020_11_02_20_00__p0700
  This corresponds to November 2nd, 2020, at 20:00, with a timezone offset of UTC+07:00.
- Date with Time and Negative Timezone (Year-Month-DayTHour:Minute__nTimezone):
  udate_2020_11_02_20_00__n0800
  This corresponds to November 2nd, 2020, at 20:00, with a timezone offset of UTC-08:00 (Pacific Standard Time, PST).
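Going the other way, from ISO 8601 to udate, could be sketched as below. The function name iso_to_udate is hypothetical and not part of any existing tool. The sketch assumes extended-format input with a numeric offset (it does not handle a trailing Z), and it keeps the T between date and time; the examples above write that separator as an underscore, so treat the choice as an assumption.

```bash
# iso_to_udate: hypothetical helper that encodes an extended ISO 8601 string as udate.
# Order matters: the UTC offset is tagged with __p / __n first, so the later
# blanket replacement of "-" and ":" cannot mistake its sign for a date delimiter.
iso_to_udate() {
  printf '%s\n' "$1" | sed -E '
    s/(T[0-9:]+)[+]([0-9]{2}):?([0-9]{2})$/\1__p\2\3/
    s/(T[0-9:]+)-([0-9]{2}):?([0-9]{2})$/\1__n\2\3/
    s/[-:]/_/g
    s/^/udate_/
  '
}
```

For instance, iso_to_udate '2020-11-02T20:00-08:00' is expected to print udate_2020_11_02T20_00__n0800.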
Motivation:
The udate format grew out of limitations in tools such as jq, which struggles with keys that contain dashes or start with a digit. Using underscores as delimiters and prefixing with udate_ gives:
- Tool Compatibility:
  Many JSON-based tools, including jq, handle underscores better than dashes and have issues with numeric keys. With the udate_ prefix, the format becomes a convenient way to avoid these pitfalls when parsing or processing JSON with dates as keys.
- Preprocessing Flexibility:
  The format can be easily preprocessed to convert the underscores into ISO 8601-compliant dashes, allowing for full compatibility with existing standards and systems that expect ISO 8601 dates.
- Clarity:
  A clear prefix like udate_ makes the string recognizable as a special format. This helps avoid confusion or misinterpretation in multi-step data workflows that include intermediate non-ISO date encodings.
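The jq point can be illustrated with a small sketch. It assumes jq is installed; the JSON document, its values, and the two function names are invented for illustration.

```bash
# Made-up JSON document holding the same value under a udate key and an ISO 8601 key.
json='{"udate_2024_01_01": 3, "2024-01-01": 3}'

# A udate key is identifier-like (letters, digits, underscores, no leading digit),
# so jq's short .key path syntax works directly.
lookup_udate_key() {
  printf '%s' "$json" | jq '.udate_2024_01_01'
}

# A raw ISO 8601 key starts with a digit and contains dashes,
# so it has to be written as a quoted, bracketed lookup.
lookup_iso_key() {
  printf '%s' "$json" | jq '.["2024-01-01"]'
}
```

Both lookups return the same value; the difference is only in how much quoting the key needs inside a jq filter.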
Preprocessing Strategy:
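To convert a udate back into ISO 8601 format, follow these steps in order:
- Remove the udate_ prefix.
- For formats that include a timezone (indicated by __p for positive or __n for negative), convert the timezone to the ISO 8601 ±hh:mm format. Do this before touching the other underscores, otherwise the __ marker is lost.
- In the time part, replace underscores with colons; if date and time are separated by an underscore, replace that underscore with T.
- Replace the remaining underscores with dashes (or remove them for ISO 8601's basic format).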
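Example in bash (handles a date, optionally followed by HH_MM and a timezone):
processed_date=$(echo "$udate" | sed -E 's/^udate_//; s/__p([0-9]{2})([0-9]{2})$/+\1:\2/; s/__n([0-9]{2})([0-9]{2})$/-\1:\2/; s/^([0-9]{4})_([0-9]{2})_([0-9]{2})[_T]([0-9]{2})_([0-9]{2})/\1-\2-\3T\4:\5/; s/_/-/g')
This converts udate_2020_11_02_20_00__n0800 to 2020-11-02T20:00-08:00. The timezone is rewritten before the blanket underscore replacement, because replacing every underscore first would destroy the __ marker.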
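The preprocessing steps can also be wrapped in a reusable function that covers seconds and the week, ordinal, and time-only forms. This is a minimal sketch: the name udate_to_iso is hypothetical, and accepting either T or an underscore between date and time is an assumption made because this document uses both spellings.

```bash
# udate_to_iso: hypothetical helper that decodes a udate string to extended ISO 8601.
# The sed commands run in this order:
#   1. strip the udate_ prefix
#   2-3. turn __pHHMM / __nHHMM into +HH:MM / -HH:MM
#   4. if date and time are joined by "_", make that separator "T"
#   5-6. put colons between hours, minutes, and optional seconds
#   7. turn every remaining underscore into a dash
udate_to_iso() {
  printf '%s\n' "$1" | sed -E '
    s/^udate_//
    s/__p([0-9]{2})([0-9]{2})$/+\1:\2/
    s/__n([0-9]{2})([0-9]{2})$/-\1:\2/
    s/^([0-9]{4}_[0-9]{2}_[0-9]{2})_([0-9]{2}_[0-9]{2})/\1T\2/
    s/T([0-9]{2})_([0-9]{2})/T\1:\2/
    s/(T[0-9]{2}:[0-9]{2})_([0-9]{2})/\1:\2/
    s/_/-/g
  '
}
```

The function only rewrites delimiters; it does not check that the input is a well-formed udate or that the resulting date exists.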
Formalizing the udate Format
Prefix:
The udate format always begins with the prefix udate_ to clearly indicate that the following string represents a date or time format. This prefix distinguishes it from other types of data and makes it clear that the format is non-ISO compliant but convertible.
Delimiter:
In the udate format, underscores (_) replace the dashes (-) used in the extended ISO 8601 format. This substitution ensures compatibility with tools that might misinterpret or mishandle dashes or strings starting with a number, such as when using the format in JSON keys or certain command-line tools.
Timezone Delimiter:
A double underscore (__) separates the date or time from the timezone. The timezone is expressed as a four-digit offset from UTC (hhmm), prefixed by p for positive or n for negative; n is used rather than m to avoid confusion with months or minutes.
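As an illustration of how the timezone suffix might be read programmatically, the sketch below turns it into a signed number of minutes. The helper name udate_offset_minutes is hypothetical, and returning a non-zero status when no suffix is present is a design assumption, not something the format prescribes.

```bash
# udate_offset_minutes: hypothetical helper that prints the UTC offset of a
# udate string in signed minutes, or returns 1 if there is no __p/__n suffix.
udate_offset_minutes() {
  local suffix sign digits hh mm
  suffix=${1##*__}                  # text after the last "__", e.g. n0800
  case $suffix in
    [pn][0-9][0-9][0-9][0-9]) ;;    # p or n followed by exactly four digits
    *) return 1 ;;                  # no timezone suffix present
  esac
  sign=1
  if [ "${suffix%????}" = n ]; then
    sign=-1
  fi
  digits=${suffix#?}                # hhmm
  hh=${digits%??}
  mm=${digits#??}
  # Drop one leading zero so 08 and 09 are not parsed as invalid octal numbers.
  echo $(( $sign * (${hh#0} * 60 + ${mm#0}) ))
}
```

With the examples from this document, __n0800 corresponds to -480 minutes and __p0700 to 420 minutes.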
udate Format Structure:
- Date (Year-Month-Day):
  Format: udate_YYYY_MM_DD
  - Example: udate_2024_01_01 for January 1st, 2024.
- Date with Time (Year-Month-DayTHour:Minute:Second):
  Format: udate_YYYY_MM_DDTHH_MM_SS
  - Example: udate_2024_01_01T12_30_45 for January 1st, 2024, at 12:30:45 PM.
- Date with Time and Timezone:
  Format: udate_YYYY_MM_DDTHH_MM__pTimezone or udate_YYYY_MM_DDTHH_MM__nTimezone
  - Example: udate_2020_11_02_20_00__p0700 for November 2nd, 2020, 20:00, with UTC+07:00.
  - Example: udate_2020_11_02_20_00__n0800 for November 2nd, 2020, 20:00, with UTC-08:00.
- Week (Year-Week-Day):
  Format: udate_YYYY_Www_D
  - Example: udate_2024_W01_1 for Monday of the first week of 2024.
- Day of the Year (Year-DayOfYear):
  Format: udate_YYYY_DDD
  - Example: udate_2024_032 for February 1st, 2024 (the 32nd day of the year).
- Time (Hour:Minute:Second):
  Format: udate_THH_MM_SS
  - Example: udate_T12_30_45 for 12:30:45 PM.
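The structures above can be checked with a single extended regular expression. This is a rough sketch under a few assumptions: the function name is_udate is hypothetical, seconds are treated as optional, the date-time separator may be T or an underscore (both appear in this document), and only the shape is checked, not value ranges such as month 13.

```bash
# is_udate: hypothetical shape check for the udate forms listed above.
# Succeeds (status 0) if the argument looks like a udate string.
is_udate() {
  local date_re='[0-9]{4}_[0-9]{2}_[0-9]{2}'      # YYYY_MM_DD
  local time_re='[0-9]{2}_[0-9]{2}(_[0-9]{2})?'   # HH_MM with optional _SS
  local tz_re='__[pn][0-9]{4}'                    # __p0700 or __n0800
  local week_re='[0-9]{4}_W[0-9]{2}_[1-7]'        # YYYY_Www_D
  local ordinal_re='[0-9]{4}_[0-9]{3}'            # YYYY_DDD
  printf '%s\n' "$1" | grep -Eq \
    "^udate_(${date_re}([T_]${time_re}(${tz_re})?)?|${week_re}|${ordinal_re}|T${time_re})\$"
}
```

A caller could use it as a guard, for example: is_udate "$key" || echo "not a udate: $key".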
Use Case:
- When working in environments that require compatibility with JSON-based tools like jq.
- When storing dates in JSON files where dashes might cause issues with key lookups or parsing.
- When needing to ensure date strings are jq-friendly while preserving flexibility to be ISO-compliant.
- When handling timezone-sensitive data, ensuring the correct interpretation of time offsets, while avoiding confusion with other time-related fields.
Summary of Formats:
- udate_YYYY_MM_DD → Standard date.
- udate_YYYY_Www_D → Week-based date.
- udate_THH_MM_SS → Time only.
- udate_YYYY_DDD → Day of the year.
- udate_YYYY_MM_DDTHH_MM_SS → Date with time.
- udate_YYYY_MM_DDTHH_MM__pTimezone → Full date-time with positive timezone.
- udate_YYYY_MM_DDTHH_MM__nTimezone → Full date-time with negative timezone.