How to calculate Session and Session duration in Firebase Analytics raw data?

How to calculate Session Duration in Firebase analytics raw data which is linked to BigQuery?

I have used the following blog to calculate the users by using the flatten command for the events which are nested within each record, but I would like to know how to proceed with in calculating the Session and Session duration by country and time.

(I have many apps configured, but if you could help me with the SQL query for calculating the session duration and session, It would be of immense help)

Google Blog on using Firebase and big query


Solution 1:

First you need to define a session - in the following query I'm going to break a session whenever a user is inactive for more than 20 minutes.

Now, to find all sessions with SQL you can use a trick described at https://blog.modeanalytics.com/finding-user-sessions-sql/.

The following query finds all sessions and their lengths:

#standardSQL

SELECT app_instance_id, sess_id, MIN(min_time) sess_start, MAX(max_time) sess_end, COUNT(*) records, MAX(sess_id) OVER(PARTITION BY app_instance_id) total_sessions,
   (ROUND((MAX(max_time)-MIN(min_time))/(1000*1000),1)) sess_length_seconds
FROM (
  SELECT *, SUM(session_start) OVER(PARTITION BY app_instance_id ORDER BY min_time) sess_id
  FROM (
    SELECT *, IF(
                previous IS null 
                OR (min_time-previous)>(20*60*1000*1000),  # sessions broken by this inactivity 
                1, 0) session_start 
                #https://blog.modeanalytics.com/finding-user-sessions-sql/
    FROM (
      SELECT *, LAG(max_time, 1) OVER(PARTITION BY app_instance_id ORDER BY max_time) previous
      FROM (
        SELECT user_dim.app_info.app_instance_id
          , (SELECT MIN(timestamp_micros) FROM UNNEST(event_dim)) min_time
          , (SELECT MAX(timestamp_micros) FROM UNNEST(event_dim)) max_time
        FROM `firebase-analytics-sample-data.ios_dataset.app_events_20160601`
      )
    )
  )
)
GROUP BY 1, 2
ORDER BY 1, 2

enter image description here

Solution 2:

With the new schema of Firebase in BigQuery, I found that the answer by @Maziar did not work for me, but I am not sure why. Instead I have used the following to calculate it, where a session is defined as a user engaging with your app for a minimum of 10 seconds and where the session stops if a user doesn't engage with the app for 30 minutes. It provides total number of sessions and the session length in minutes, and it is based on this query: https://modeanalytics.com/modeanalytics/reports/5e7d902f82de/queries/2cf4af47dba4

SELECT COUNT(*) AS sessions,
       AVG(length) AS average_session_length
  FROM (
  
SELECT global_session_id,
       (MAX(event_timestamp) - MIN(event_timestamp))/(60 * 1000 * 1000) AS length
  FROM (
SELECT user_pseudo_id,
       event_timestamp,
       SUM(is_new_session) OVER (ORDER BY user_pseudo_id, event_timestamp) AS global_session_id,
       SUM(is_new_session) OVER (PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS user_session_id
  FROM (
       SELECT *,
              CASE WHEN event_timestamp - last_event >= (30*60*1000*1000) 
                     OR last_event IS NULL 
                   THEN 1 ELSE 0 END AS is_new_session
         FROM (
              SELECT user_pseudo_id,
                     event_timestamp,
                     LAG(event_timestamp,1) OVER (PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS last_event
                FROM `dataset.events_2019*`
              ) last
       ) final
       ) session
 GROUP BY 1
       
       ) agg
WHERE length >= (10/60)

Solution 3:

As you know, Google has changed the schema of BigQuery firebase databases: https://support.google.com/analytics/answer/7029846

Thanks to @Felipe answer, the new format will be changed as follow:

SELECT SUM(total_sessions) AS Total_Sessions, AVG(sess_length_seconds) AS Average_Session_Duration
FROM (
  SELECT user_pseudo_id, sess_id, MIN(min_time) sess_start, MAX(max_time) sess_end, COUNT(*) records, 
    MAX(sess_id) OVER(PARTITION BY user_pseudo_id) total_sessions,
    (ROUND((MAX(max_time)-MIN(min_time))/(1000*1000),1)) sess_length_seconds
  FROM (
    SELECT *, SUM(session_start) OVER(PARTITION BY user_pseudo_id ORDER BY min_time) sess_id
    FROM (
      SELECT *, IF(previous IS null OR (min_time-previous) > (20*60*1000*1000), 1, 0) session_start 
      FROM (
        SELECT *, LAG(max_time, 1) OVER(PARTITION BY user_pseudo_id ORDER BY max_time) previous
        FROM (SELECT user_pseudo_id, MIN(event_timestamp) AS min_time, MAX(event_timestamp) AS max_time
          FROM `dataset_name.table_name` GROUP BY user_pseudo_id)
      )
    )
  )
  GROUP BY 1, 2
  ORDER BY 1, 2
)

Note: change dataset_name and table_name based on your project info

Sample result: enter image description here