'udate' date format (underscore date)
Essentially, udate is ISO 8601 with a udate_ prefix and _ instead of -. It is now extended to support timezones with a double underscore (__), followed by p for positive or n for negative UTC offsets.
Purpose:
The udate format is a custom date format that uses underscores (_) as delimiters instead of dashes and adds the prefix udate_. This format is particularly useful for tools like jq, which may not handle dashes (-) or strings starting with a number well in certain contexts, such as when used as JSON keys.
Key Idea:
The udate format follows a structure similar to ISO 8601 but replaces dashes with underscores and always begins with the prefix udate_, making it friendlier to parsers and tools that might otherwise interpret dashes or numeric strings in unexpected ways. To convert back into ISO 8601 format, first remove the udate_ prefix, then replace the underscores with the proper ISO-compliant delimiter (dashes in dates and colons in times, or no delimiters in the basic format). Timezones are added at the end with a double underscore (__) followed by p (for positive offsets) or n (for negative offsets) and the four-digit UTC offset (e.g., 0700 for UTC+07:00).
Note: n (negative) is used instead of m to avoid confusion with months or minutes.
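Going the other way, from ISO 8601 to udate, follows from the description above. The function below is a minimal sketch, not part of the format definition: the name iso_to_udate is hypothetical, and it assumes extended-format input with a numeric +hh:mm or -hh:mm offset (a trailing Z is not handled).

```bash
# Hypothetical helper: convert an extended-format ISO 8601 string to udate.
# Assumption: offsets are numeric (+hh:mm / -hh:mm); "Z" is not handled.
iso_to_udate() {
  printf '%s\n' "$1" | sed -E \
    -e 's/\+([0-9]{2}):([0-9]{2})$/__p\1\2/' \
    -e 's/-([0-9]{2}):([0-9]{2})$/__n\1\2/' \
    -e 's/[-:]/_/g' \
    -e 's/^/udate_/'
}

iso_to_udate '2020-11-02T20:00+07:00'   # udate_2020_11_02T20_00__p0700
iso_to_udate '2024-01-01'               # udate_2024_01_01
```

The offset is rewritten first, while its sign and colon are still present; only then are the remaining dashes and colons turned into underscores.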
Example:
- Date (Year-Month-Day): udate_2024_01_01
  This corresponds to January 1st, 2024.
- Date with Time and Positive Timezone (Year-Month-DayTHour:Minute__pTimezone): udate_2020_11_02_20_00__p0700
  This corresponds to November 2nd, 2020, at 20:00, with a timezone offset of UTC+07:00.
- Date with Time and Negative Timezone (Year-Month-DayTHour:Minute__nTimezone): udate_2020_11_02_20_00__n0800
  This corresponds to November 2nd, 2020, at 20:00, with a timezone offset of UTC-08:00 (Pacific Standard Time, PST).
Motivation:
The udate format grew out of limitations in tools such as jq, whose .key shorthand only works for identifier-like keys: a key that contains dashes or starts with a digit has to be quoted, for example .["2020-11-02"]. Using underscores as delimiters and prefixing with udate_ offers the following benefits:
- Tool Compatibility: Many JSON-based tools, including jq, handle underscores better than dashes and have issues with numeric keys. By using the udate_ prefix, the format becomes a convenient way to avoid these pitfalls when parsing or processing JSON with dates as keys.
- Preprocessing Flexibility: The format can be easily preprocessed to convert the underscores into ISO 8601-compliant dashes, allowing for full compatibility with existing standards and systems that expect ISO 8601 dates.
- Clarity: Using a clear prefix like udate_ ensures that it is recognizable as a special format. This helps avoid confusion or misinterpretation when working in multi-step data workflows that include intermediate non-ISO date encodings.
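The difference in key handling can be seen with a small lookup. This is only an illustrative sketch: it assumes jq is installed, and the JSON data and the function names lookup_udate and lookup_iso are made up for the example.

```bash
# Hypothetical data: one count per day, keyed by udate.
json='{"udate_2024_01_01": 3, "udate_2024_01_02": 5}'

# A udate key is a valid jq identifier, so the short .key form works.
lookup_udate() {
  printf '%s\n' "$json" | jq '.udate_2024_01_01'
}

# The same data keyed by an ISO 8601 date needs the quoted form,
# because the key starts with a digit and contains dashes.
iso_json='{"2024-01-01": 3}'
lookup_iso() {
  printf '%s\n' "$iso_json" | jq '.["2024-01-01"]'
}
```

Both functions return the same value; the udate version simply avoids the quoting, which matters most when the filter is itself embedded in a shell string.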
Preprocessing Strategy:
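To convert a udate back into ISO 8601 format, follow these steps:
- Remove the udate_ prefix.
- For formats that include a timezone (indicated by __p for positive or __n for negative), convert the timezone to the ISO 8601 ±hh:mm format. Do this before touching the single underscores, while the double underscore is still recognizable.
- If the time is separated from the date by an underscore rather than T, as in the timezone examples, insert the T.
- Replace the underscores in the time with colons and the remaining underscores with dashes (or remove them for ISO 8601's basic format).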
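Example in bash:
# Order matters: convert the __p/__n offset first, then the time, then the remaining date underscores.
processed_date=$(echo "$udate" | sed -E 's/^udate_//; s/__p([0-9]{2})([0-9]{2})$/+\1:\2/; s/__n([0-9]{2})([0-9]{2})$/-\1:\2/; s/^([0-9]{4}_[0-9]{2}_[0-9]{2})_/\1T/; s/(T[0-9]{2})_([0-9]{2})/\1:\2/; s/(:[0-9]{2})_([0-9]{2})/\1:\2/; s/_/-/g')
This will convert udate_2020_11_02_20_00__n0800 to 2020-11-02T20:00-08:00.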
Formalizing the udate Format
Prefix:
The udate format always begins with the prefix udate_ to clearly indicate that the following string represents a date or time format. This prefix distinguishes it from other types of data and makes it clear that the format is non-ISO compliant but convertible.
Delimiter:
In the udate format, underscores (_) replace the dashes (-) used in dates and the colons (:) used in times in the extended ISO 8601 format. This substitution ensures compatibility with tools that might misinterpret or mishandle dashes or strings starting with a number, such as when using the format in JSON keys or certain command-line tools.
Timezone Delimiter:
A double underscore (__) is used to separate the date or time from the timezone. The timezone is expressed as a four-digit offset from UTC, prefixed by p for positive or n for negative (n rather than m, to avoid confusion with months or minutes).
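Because the double underscore appears only before the timezone, the offset can be split off with plain shell parameter expansion. The sketch below is illustrative only; the function name split_udate and its output layout (local part, a space, then the ±hh:mm offset) are assumptions made for this example.

```bash
# Hypothetical helper: print the local part of a udate and its offset.
# Prints the input unchanged when there is no __p/__n timezone suffix.
split_udate() {
  case "$1" in
    *__p[0-9][0-9][0-9][0-9]) sign='+' ;;
    *__n[0-9][0-9][0-9][0-9]) sign='-' ;;
    *) printf '%s\n' "$1"; return 0 ;;
  esac
  local_part=${1%__*}      # everything before the double underscore
  digits=${1##*__[pn]}     # the four offset digits
  hh=${digits%??}          # first two digits: hours
  mm=${digits#??}          # last two digits: minutes
  printf '%s %s%s:%s\n' "$local_part" "$sign" "$hh" "$mm"
}

split_udate udate_2020_11_02_20_00__p0700   # udate_2020_11_02_20_00 +07:00
```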
udate Format Structure:
- Date (Year-Month-Day):
  Format: udate_YYYY_MM_DD
  - Example: udate_2024_01_01 for January 1st, 2024.
- Date with Time (Year-Month-DayTHour:Minute:Second):
  Format: udate_YYYY_MM_DDTHH_MM_SS
  - Example: udate_2024_01_01T12_30_45 for January 1st, 2024, at 12:30:45 PM.
- Date with Time and Timezone:
  Format: udate_YYYY_MM_DDTHH_MM__pTimezone or udate_YYYY_MM_DDTHH_MM__nTimezone
  - Example: udate_2020_11_02_20_00__p0700 for November 2nd, 2020, 20:00, with UTC+07:00.
  - Example: udate_2020_11_02_20_00__n0800 for November 2nd, 2020, 20:00, with UTC-08:00.
- Week (Year-Week-Day):
  Format: udate_YYYY_Www_D
  - Example: udate_2024_W01_1 for Monday of the first week of 2024.
- Day of the Year (Year-DayOfYear):
  Format: udate_YYYY_DDD
  - Example: udate_2024_032 for February 1st, 2024 (the 32nd day of the year).
- Time (Hour:Minute:Second):
  Format: udate_THH_MM_SS
  - Example: udate_T12_30_45 for 12:30:45 PM.
Use Case:
- When working in environments that require compatibility with JSON-based tools like jq.
- When storing dates in JSON files where dashes might cause issues with key lookups or parsing.
- When needing to ensure date strings are jq-friendly while preserving flexibility to be ISO-compliant.
- When handling timezone-sensitive data, ensuring the correct interpretation of time offsets, while avoiding confusion with other time-related fields.
Summary of Formats:
- udate_YYYY_MM_DD → Standard date.
- udate_YYYY_Www_D → Week-based date.
- udate_THH_MM_SS → Time only.
- udate_YYYY_DDD → Day of the year.
- udate_YYYY_MM_DDTHH_MM_SS → Full date and time.
- udate_YYYY_MM_DDTHH_MM__pTimezone → Full date-time with positive timezone.
- udate_YYYY_MM_DDTHH_MM__nTimezone → Full date-time with negative timezone.
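The formats listed above can be checked with a single grep call. This is a rough sketch rather than a reference validator: the names is_udate, udate_hms and udate_tz are hypothetical, only the shape of the string is checked (not value ranges such as month 13), the date and time may be joined by either T or an underscore since both appear in the examples, and a timezone is accepted only after a date with a time.

```bash
# Hypothetical building blocks (not part of the format description).
udate_hms='[0-9]{2}_[0-9]{2}(_[0-9]{2})?'   # HH_MM with optional _SS
udate_tz='(__[pn][0-9]{4})?'                # optional __pHHMM / __nHHMM

# Succeeds (exit status 0) when the argument has the shape of a udate.
is_udate() {
  printf '%s\n' "$1" | grep -Eq \
    -e "^udate_[0-9]{4}_[0-9]{2}_[0-9]{2}([T_]${udate_hms}${udate_tz})?\$" \
    -e '^udate_[0-9]{4}_W[0-9]{2}_[1-7]$' \
    -e '^udate_[0-9]{4}_[0-9]{3}$' \
    -e "^udate_T${udate_hms}\$"
}

is_udate udate_2020_11_02_20_00__n0800 && echo "valid"
is_udate 2020-11-02 || echo "not a udate"
```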