Generic support for ISO 8601 format in Java 6
Java 7 introduced support in the SimpleDateFormat class for the ISO 8601 time zone format, via the pattern letter X (instead of lower or upper case Z). Supporting such formats in Java 6 requires preprocessing, so the question is what the best approach is.
This new format is a superset of what Z (uppercase Z) accepts, with 2 additional variations:
- The "minutes" field is optional (i.e., 2-digit instead of 4-digit timezones are valid)
- A colon character (':') can be used for separating the 2-digit "hours" field from the 2-digit "minutes" field.
So, as one can observe from the Java 7 documentation of SimpleDateFormat, the following 3 formats are now valid (instead of only the second one covered by Z in Java 6) and, of course, equivalent:
- -08
- -0800
- -08:00
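For readers who can run Java 7 or later, the equivalence of the three forms can be checked with a small sketch. It assumes, following the Java 7 SimpleDateFormat Javadoc, that the number of pattern letters (X, XX, XXX) selects which form is expected; the class and helper names are made up for this illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical demo class; needs Java 7+ because of the X pattern letter.
class IsoOffsetForms {

    // Parses text with the given pattern, wrapping the checked exception for brevity.
    static Date parse(String pattern, String text) {
        try {
            return new SimpleDateFormat(pattern).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        Date hoursOnly = parse("yyyy-MM-dd'T'HH:mm:ssX", "2012-10-01T19:30:00-08");
        Date noColon = parse("yyyy-MM-dd'T'HH:mm:ssXX", "2012-10-01T19:30:00-0800");
        Date withColon = parse("yyyy-MM-dd'T'HH:mm:ssXXX", "2012-10-01T19:30:00-08:00");
        // The three offsets differ only in notation, so the instants should be equal.
        System.out.println(hoursOnly.equals(noColon) && noColon.equals(withColon));
    }
}
```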
As discussed in an earlier question about a special case of supporting such an "expanded" timezone format, always with ':' as a separator, the best approach for backporting the Java 7 functionality into Java 6 is to subclass the SimpleDateFormat class and override its parse() method, i.e.:
public Date parse(String date, ParsePosition pos)
{
    String iso = ... // Replace the X with a Z timezone string, using a regex
    if (iso.length() == date.length())
    {
        return null; // Not an ISO 8601 date
    }
    Date parsed = super.parse(iso, pos);
    if (parsed != null)
    {
        pos.setIndex(pos.getIndex()+1); // Adjust for ':'
    }
    return parsed;
}
Note that the subclassed SimpleDateFormat objects above must be initialized with the corresponding Z-based pattern, i.e. if the subclass is ExtendedSimpleDateFormat and you want to parse dates complying with the pattern yyyy-MM-dd'T'HH:mm:ssX, then you should use objects instantiated as
new ExtendedSimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");
In the aforementioned earlier question the regex :(?=[0-9]{2}$)
has been suggested for getting rid of the ':' and in a similar question the regex (?<=[+-]\d{2})$
has been suggested for appending the "minute" field as 00
, if needed.
Obviously, running the 2 replacements successively achieves full functionality. So, the iso local variable in the overridden parse() method would be set as
iso = date.replaceFirst(":(?=[0-9]{2}$)","");
or
iso = iso.replaceFirst("(?<=[+-]\\d{2})$", "00");
with an if
check in between to make sure that the pos
value is also set properly later on and also for the length()
comparison earlier.
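To make the two-pass idea concrete, here is a minimal sketch of the subclass using the two regexes quoted above. The body is an assumption about how the pieces fit together, not code from the earlier questions; in particular, the index correction is derived from the length difference instead of a fixed +1, so that it also covers the appended "00".

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch only: construct it with a Z-based pattern such as "yyyy-MM-dd'T'HH:mm:ssZ".
class ExtendedSimpleDateFormat extends SimpleDateFormat {

    ExtendedSimpleDateFormat(String pattern) {
        super(pattern);
    }

    @Override
    public Date parse(String date, ParsePosition pos) {
        // Pass 1: drop the ':' inside a trailing "+hh:mm" offset.
        String iso = date.replaceFirst(":(?=[0-9]{2}$)", "");
        if (iso.length() == date.length()) {
            // Pass 2: nothing was removed, so append "00" to a trailing "+hh" offset.
            iso = date.replaceFirst("(?<=[+-]\\d{2})$", "00");
        }
        // Strings that neither pass changed are handed to the superclass as they are.
        Date parsed = super.parse(iso, pos);
        if (parsed != null) {
            // Map the index back onto the original string: +1 for a removed ':',
            // -2 for appended minutes, 0 when the offset was already "+hhmm".
            pos.setIndex(pos.getIndex() + date.length() - iso.length());
        }
        return parsed;
    }
}
```

Non-date strings are left unchanged by both passes and are rejected by the superclass, which returns null.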
The question is: can we use a single regular expression to achieve the same effect, including the information needed for not unnecessarily checking the length and for correctly setting pos
a few lines later?
The implementation is intended for code that reads very large numbers of string fields that can be in any format (even totally non-date), selects only those which comply with the format and returns the parsed Java Date object.
So, both accuracy and speed are of paramount importance (i.e., if using the 2 passes is faster, this approach is preferable).
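As one possible take on the single-regex idea, the sketch below matches a trailing offset once and rebuilds it as +hhmm. The pattern and the names OffsetNormalizer and normalize are assumptions made for illustration; whether this is faster than the two replaceFirst calls would have to be measured on real data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the pattern is precompiled once and reused for every field.
class OffsetNormalizer {

    // Group 1: sign and hours. Group 2: minutes, optionally preceded by ':'.
    private static final Pattern OFFSET =
            Pattern.compile("([+-]\\d{2})(?::?(\\d{2}))?$");

    // Returns the input with its trailing offset rewritten as "+hhmm",
    // or null when the string does not end in a recognisable offset.
    static String normalize(String date) {
        Matcher m = OFFSET.matcher(date);
        if (!m.find()) {
            return null;
        }
        String minutes = (m.group(2) != null) ? m.group(2) : "00";
        return date.substring(0, m.start()) + m.group(1) + minutes;
    }

    public static void main(String[] args) {
        System.out.println(normalize("2012-10-01T19:30:00-08:00"));
        System.out.println(normalize("2012-10-01T19:30:00-08"));
    }
}
```

A null result lets the caller skip the field without a length comparison, and date.length() minus the normalized length gives the correction to apply to pos afterwards. The regex only rewrites the tail, so a string such as 2012-10-01 still matches and must be rejected by the subsequent SimpleDateFormat parse.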
Solution 1:
Seems that you can use this:
import java.util.Calendar;
import java.util.Date;
import javax.xml.bind.DatatypeConverter;

public class TestISO8601 {
    public static void main(String[] args) {
        parse("2012-10-01T19:30:00+02:00"); // UTC+2
        parse("2012-10-01T19:30:00Z"); // UTC
        parse("2012-10-01T19:30:00"); // Local
    }

    // Parses the string as an XML Schema dateTime and prints its epoch seconds.
    public static Date parse(final String str) {
        Calendar c = DatatypeConverter.parseDateTime(str);
        System.out.println(str + "\t" + (c.getTime().getTime() / 1000));
        return c.getTime();
    }
}
Solution 2:
You can use java.time, the modern Java date and time API, in Java 6. To me this seems the nice and also future-proof solution. It has good support for ISO 8601.
import org.threeten.bp.OffsetDateTime;
import org.threeten.bp.format.DateTimeFormatter;

public class DemoIso8601Offsets {
    public static void main(String[] args) {
        System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+0200",
                DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssXX")));
        System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+02",
                DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssX")));
        System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+02:00"));
        System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00Z"));
    }
}
Output from this program is:
2012-10-01T19:30+02:00
2012-10-01T19:30+02:00
2012-10-01T19:30+02:00
2012-10-01T19:30Z
It requires that you add the ThreeTen Backport library to your project setup.
- In Java 8 and later and on newer Android devices (from API level 26) the modern API comes built-in.
- In Java 6 and 7 get the ThreeTen Backport, the backport of the new classes (ThreeTen for JSR 310; see the links at the bottom).
- On (older) Android use the Android edition of ThreeTen Backport. It’s called ThreeTenABP. And make sure you import the date and time classes from org.threeten.bp with subpackages.
As you can see from the code, +02 and +0200 require a formatter where you specify the format of the offset, while +02:00 (and Z too) conforms with the default format and doesn’t need to be specified.
Can we parse all the offset formats using the same formatter?
When reading mixed data, you don’t want to handle each offset format specially. It’s better to use optional parts in the format pattern string:
DateTimeFormatter allInOne
= DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss[XXX][XX][X]");
System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+0200", allInOne));
System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+02", allInOne));
System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+02:00", allInOne));
System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00Z", allInOne));
Output is the same as above. The square brackets in [XXX][XX][X] mean that either format +02:00, +0200 or +02 may be present.
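Since the question is about filtering fields that may not be dates at all, the all-in-one formatter can be wrapped in a helper that returns null for anything that doesn't parse. This is only a sketch: the class and method names are invented, and the imports are the built-in java.time ones from Java 8 and later. With ThreeTen Backport on Java 6 or 7 the same classes would come from org.threeten.bp instead.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper around the all-in-one formatter shown above.
class LenientOffsetParser {

    // The formatter is immutable and thread-safe, so one shared instance is enough.
    private static final DateTimeFormatter ALL_IN_ONE =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss[XXX][XX][X]");

    // Returns the parsed value, or null when the text is not in the expected format.
    static OffsetDateTime parseOrNull(String text) {
        try {
            return OffsetDateTime.parse(text, ALL_IN_ONE);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("2012-10-01T19:30:00+02"));
        System.out.println(parseOrNull("some other field"));
    }
}
```

Relying on an exception for rejected fields may matter for speed when most of the input is not a date, so it is worth measuring against the regex-based approach.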
Links
- Oracle tutorial: Date Time explaining how to use java.time.
- Java Specification Request (JSR) 310, where java.time was first described.
- ThreeTen Backport project, the backport of java.time to Java 6 and 7 (ThreeTen for JSR-310).
- ThreeTenABP, Android edition of ThreeTen Backport
- Question: How to use ThreeTenABP in Android Project, with a very thorough explanation.