BigQuery comparing DATE and TIMESTAMP
The query below is for BigQuery Standard SQL:
#standardSQL
SELECT
  IFNULL(OnSite.worksite_id, Documents.worksite_id) AS `Worksite`,
  IFNULL(OnSite.timestamp, DATE(Documents.timestamp)) AS `DATE`,
  COUNT(Documents.worksite_id) AS `Users_on_Site`,
  COUNT(DISTINCT OnSite.uid) AS `Completed`
FROM `project.dataset.OnSite` OnSite
LEFT JOIN `project.dataset.Documents` Documents
  ON OnSite.worksite_id = Documents.worksite_id
  AND OnSite.timestamp = DATE(Documents.timestamp)
GROUP BY `DATE`, `Worksite`
To test it against the sample data from your question, put the following WITH clause in front of the query above:
WITH `project.dataset.OnSite` AS (
SELECT "u12345" uid, "worksite_1" worksite_id, DATE '2019-01-01' `TIMESTAMP` UNION ALL
SELECT "u12345", "worksite_1", '2019-01-02' UNION ALL
SELECT "u12345", "worksite_1", '2019-01-03' UNION ALL
SELECT "u12345", "worksite_1", '2019-01-04' UNION ALL
SELECT "u12345", "worksite_1", '2019-01-05' UNION ALL
SELECT "u12345", "worksite_1", '2019-01-06' UNION ALL
SELECT "u1", "worksite_1", '2019-01-01' UNION ALL
SELECT "u1", "worksite_1", '2019-01-02' UNION ALL
SELECT "u1", "worksite_1", '2019-01-05' UNION ALL
SELECT "u1", "worksite_1", '2019-01-06'
), `project.dataset.Documents` AS (
SELECT "1" document_id, "u12345" uid, "worksite_1" worksite_id, 'work_permit' type, TIMESTAMP '2019-01-01 00:00:00' `TIMESTAMP` UNION ALL
SELECT "2", "u12345", "worksite_2", 'job', '2019-01-02 00:00:00' UNION ALL
SELECT "3", "u12345", "worksite_1", 'work_permit', '2019-01-03 00:00:00' UNION ALL
SELECT "4", "u12345", "worksite_2", 'job', '2019-01-04 00:00:00' UNION ALL
SELECT "5", "u12345", "worksite_1", 'work_permit', '2019-01-05 00:00:00' UNION ALL
SELECT "6", "u12345", "worksite_2", 'job', '2019-01-06 00:00:00' UNION ALL
SELECT "7", "u12345", "worksite_1", 'work_permit', '2019-01-07 00:00:00' UNION ALL
SELECT "8", "u12345", "worksite_2", 'work_permit', '2019-01-09 00:00:00' UNION ALL
SELECT "9", "u12345", "worksite_1", 'job', '2019-01-09 00:00:00' UNION ALL
SELECT "10", "u12345", "worksite_2", 'work_permit', '2019-01-09 00:00:00' UNION ALL
SELECT "11", "u12345", "worksite_1", 'work_permit', '2019-01-09 00:00:00' UNION ALL
SELECT "12", "u12345", "worksite_2", 'work_permit', '2019-01-09 00:00:00' UNION ALL
SELECT "13", "u12345", "worksite_1", 'job', '2019-01-09 00:00:00' UNION ALL
SELECT "14", "u12345", "worksite_2", 'work_permit', '2019-01-09 00:00:00' UNION ALL
SELECT "15", "u12345", "worksite_1", 'work_permit', '2019-01-09 00:00:00'
)
The result is as expected:
Row  Worksite    Date        Users_on_Site  Completed
1    worksite_1  2019-01-01  2              2
2    worksite_1  2019-01-02  0              2
3    worksite_1  2019-01-03  1              1
4    worksite_1  2019-01-04  0              1
5    worksite_1  2019-01-05  2              2
6    worksite_1  2019-01-06  0              2
According to the BigQuery documentation, the DATE function accepts the following inputs:
DATE(year, month, day)
: Constructs a DATE from INT64 values representing the year, month, and day.
DATE(timestamp_expression[, timezone])
: Converts a timestamp_expression to a DATE data type. It supports an optional parameter to specify a timezone. If no timezone is specified, the default timezone, UTC, is used.
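As a quick illustration of both signatures, here is a minimal sketch. The literal values and the `America/Los_Angeles` time zone are made up for the example and do not come from the question's data.

```sql
#standardSQL
SELECT
  -- Build a DATE from year, month, and day parts.
  DATE(2019, 1, 2) AS from_parts,
  -- Convert a TIMESTAMP to a DATE; with no time zone argument, UTC is used.
  DATE(TIMESTAMP '2019-01-02 03:00:00+00') AS utc_date,
  -- The same instant falls on the previous calendar day in Los Angeles.
  DATE(TIMESTAMP '2019-01-02 03:00:00+00', 'America/Los_Angeles') AS la_date
```

The time zone matters when a DATE column is compared with a TIMESTAMP column: rows close to midnight can land on different days depending on which zone is used for the conversion.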
In your use case, it seems that the value you are passing to DATE is already a datetime. In that case, you could use DATETIME_TRUNC instead, for example:
DATETIME_TRUNC(IFNULL(OnSite.timestamp, Documents.timestamp), DAY)
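A minimal sketch of what DATETIME_TRUNC does, assuming the value really is of type DATETIME. The literal below is illustrative only and is not taken from the question.

```sql
#standardSQL
SELECT
  -- Truncate to the start of the day; the result is still a DATETIME.
  DATETIME_TRUNC(DATETIME '2019-01-05 13:45:00', DAY) AS day_start,
  -- Convert to DATE when a DATE value is needed for the comparison.
  DATE(DATETIME '2019-01-05 13:45:00') AS day_only
```

Note that truncating keeps the DATETIME type, so if the other side of the comparison is a DATE column, converting with DATE(...) is probably the more direct option.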