How to iteratively SELECT by hour for averages by user_id?
I'm trying to average some metrics by user id and by hour in a single SELECT statement.
I'd love some help understanding the best approach to doing this without passing each hour's range as parameters from a Python script :P
This is the current query:
SELECT
user_id,
AVG(sentiment) as sentiment,
AVG(magnitude) as magnitude,
SUM(sentiment) as total_sentiment,
SUM(magnitude) as total_magnitude,
MAX(sentiment) as max_sentiment,
MIN(sentiment) as min_sentiment,
COUNT(user_id) as count
FROM
sentiments
WHERE
created
BETWEEN %s and %s
GROUP BY
user_id;
This is Postgres, by the way, and I'm sending the query with pandas from Python as a test :)
Update: trying the answer,
sql = """
SELECT
user_id,
AVG(sentiment) as sentiment,
AVG(magnitude) as magnitude,
SUM(sentiment) as total_sentiment,
SUM(magnitude) as total_magnitude,
MAX(sentiment) as max_sentiment,
MIN(sentiment) as min_sentiment,
COUNT(user_id) as count
FROM
sentiments
GROUP BY
user_id,
date_part('hour', created);
"""
conn = db_conn.main()
cur = conn.cursor()
df = pd.read_sql(sql, con=conn)
cur.close()
df
This returns the error:
': function date_part(unknown, integer) does not exist
LINE 15: date_part('hour', created);
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Thanks in advance,
Solution 1:
The easiest way to truncate a timestamp is the DATE_TRUNC function.
SELECT DATE_TRUNC('hour', created), user_id,
...
GROUP BY DATE_TRUNC('hour', created), user_id;
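But if created contains a number that's a Unix timestamp (the date_part(unknown, integer) error in the question suggests it is stored as an integer), then you first need to convert it to a timestamp with TO_TIMESTAMP.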
SELECT DATE_TRUNC('hour', TO_TIMESTAMP(created)), user_id,
...
GROUP BY DATE_TRUNC('hour', TO_TIMESTAMP(created)), user_id;
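Putting this together with the query from the question, a minimal sketch might look like the following. It assumes created holds Unix epoch seconds as an integer, which is what the error message hints at but the question does not confirm. HOURLY_SQL, run_hourly and the hour_start alias are hypothetical names used only for illustration; any DB-API connection, such as the one returned by db_conn.main() in the question, should work.

```python
# Sketch only: assumes `created` stores Unix epoch seconds as an integer.
# HOURLY_SQL, run_hourly and the hour_start alias are illustrative names.
HOURLY_SQL = """
SELECT
    user_id,
    DATE_TRUNC('hour', TO_TIMESTAMP(created)) AS hour_start,
    AVG(sentiment) AS sentiment,
    AVG(magnitude) AS magnitude,
    SUM(sentiment) AS total_sentiment,
    SUM(magnitude) AS total_magnitude,
    MAX(sentiment) AS max_sentiment,
    MIN(sentiment) AS min_sentiment,
    COUNT(user_id) AS count
FROM sentiments
GROUP BY user_id, DATE_TRUNC('hour', TO_TIMESTAMP(created))
ORDER BY user_id, hour_start;
"""


def run_hourly(conn):
    """Run the hourly query on a DB-API connection and return all rows."""
    cur = conn.cursor()
    try:
        cur.execute(HOURLY_SQL)
        return cur.fetchall()
    finally:
        # Close the cursor even if the query raises.
        cur.close()
```

Each returned row is one (user_id, hour) pair, so no loop over hours is needed on the Python side. The same string could also be passed to pd.read_sql(HOURLY_SQL, con=conn) as in the question.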
Solution 2:
In PostgreSQL you can group by year, month, day and hour using:
group by
date_part('year', created),
date_part('month', created),
date_part('day', created),
date_part('hour', created)
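To see why all four parts are grouped rather than the hour alone, here is a small Python sketch of the same idea. It assumes created is already a timestamp; hour_only, full_hour_key and the sample dates are made up for illustration.

```python
from datetime import datetime, timezone


def hour_only(ts):
    """Analogue of date_part('hour', ts): just the hour of day, 0-23."""
    return ts.hour


def full_hour_key(ts):
    """Analogue of grouping by year, month, day and hour together."""
    return (ts.year, ts.month, ts.day, ts.hour)


# Two made-up timestamps on different days but in the same hour of day.
monday = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)
tuesday = datetime(2024, 1, 2, 9, 45, tzinfo=timezone.utc)

# Hour alone merges both days into one group; the full key keeps them apart.
same_hour_of_day = hour_only(monday) == hour_only(tuesday)
same_calendar_hour = full_hour_key(monday) == full_hour_key(tuesday)
```

Here same_hour_of_day is True while same_calendar_hour is False, which is the difference between grouping by date_part('hour', created) alone and grouping by all four parts.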
Or do some math with the epoch:
group by floor(date_part('epoch', created)/3600)
Alternatively, you could use extract, e.g. extract(epoch from created)
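The epoch arithmetic can be checked outside the database. The sketch below mirrors floor(epoch / 3600) in plain Python; hour_bucket, bucket_start and the sample values are hypothetical, and the timestamps are treated as UTC.

```python
from datetime import datetime, timezone

SECONDS_PER_HOUR = 3600


def hour_bucket(epoch_seconds):
    """Mirror floor(epoch / 3600): all seconds in one hour share an integer."""
    return epoch_seconds // SECONDS_PER_HOUR


def bucket_start(epoch_seconds):
    """Start of the containing hour as a UTC datetime."""
    start = hour_bucket(epoch_seconds) * SECONDS_PER_HOUR
    return datetime.fromtimestamp(start, tz=timezone.utc)


a = 1_700_000_000  # 2023-11-14 22:13:20 UTC
b = a + 600        # ten minutes later, same hour
c = a + 3600       # one hour later, next bucket
```

With these values, hour_bucket(a) equals hour_bucket(b), hour_bucket(c) is one larger, and bucket_start(a) is 2023-11-14 22:00:00 UTC. If created is already an integer of epoch seconds, the same bucketing would presumably be floor(created / 3600.0) without date_part at all.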