GA4 allows you to send raw event data to BigQuery, a cloud data warehouse. Having the raw data not only helps you overcome GA4 reporting limitations but also lets you power your marketing with rich data integrations. See "9 Reasons Why You Should Turn On GA4 BigQuery Integration" to understand how BigQuery can help your organization.
BigQuery event data export comes in two flavors:
- Daily - data is available the next day.
- Streaming - data is available within a few seconds.
Daily export creates events_YYYYMMDD tables, while streaming export creates events_intraday_YYYYMMDD tables. (Learn more at events_intraday and events tables in BigQuery for GA4 data.)
If the Streaming export option is enabled, a table named events_intraday_YYYYMMDD is created. This table is populated continuously as events are recorded throughout the day.
If the Daily export option is enabled, a table named events_YYYYMMDD is created after the day is over. The latest table contains the data from the previous day.
If both the Daily and Streaming options are selected, the events_intraday_YYYYMMDD table is deleted at the end of the day, once the events_YYYYMMDD table for that day is complete.
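The naming convention described above can be sketched in a few lines of Python. This is only an illustration: the helper name export_table_names is hypothetical, and it simply formats a date into the two table names mentioned in this article.

```python
from datetime import date


def export_table_names(day: date) -> dict:
    """Return the GA4 export table names for one calendar day.

    Hypothetical helper: it only applies the YYYYMMDD suffix convention
    described in the article and does not talk to BigQuery.
    """
    suffix = day.strftime("%Y%m%d")
    return {
        "daily": f"events_{suffix}",               # created after the day is over
        "streaming": f"events_intraday_{suffix}",  # filled continuously during the day
    }


names = export_table_names(date(2024, 3, 5))
print(names["daily"])      # events_20240305
print(names["streaming"])  # events_intraday_20240305
```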
Both tables have a similar schema, with a slight difference. Keep reading to learn about that difference.
Limitations of Daily Export
The daily export is limited to 1 million events per day. If you consistently exceed that limit, the export is disabled until you fix the issue. There are several ways to limit the data that's sent to BigQuery each day. The most common are:
- Limit the events that you send to BigQuery. This option might not work in all cases.
- Use multiple properties so that each can send 1 million events. This is more difficult to set up, but it is doable.
- Use streaming instead of daily export. Streaming does not have the same limit as the daily export and might be a better solution than the two options above. However, streaming has other limitations.
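Before choosing one of these options, it can help to know how often you actually exceed the limit. The following is a minimal sketch that flags the days whose event count is above 1 million. The function name, the input format (a dictionary of YYYYMMDD strings to event counts), and the sample numbers are all assumptions made for illustration; treating exactly 1 million events as within the limit is also an assumption.

```python
DAILY_EXPORT_LIMIT = 1_000_000  # daily export limit as stated in this article


def days_over_limit(daily_event_counts: dict) -> list:
    """Return the days (sorted) whose event count exceeds the daily limit.

    Assumption: a day with exactly DAILY_EXPORT_LIMIT events is still
    considered within the limit.
    """
    return sorted(
        day
        for day, count in daily_event_counts.items()
        if count > DAILY_EXPORT_LIMIT
    )


# Made-up sample counts, keyed by YYYYMMDD.
counts = {"20240301": 850_000, "20240302": 1_200_000, "20240303": 1_000_000}
print(days_over_limit(counts))  # ['20240302']
```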
Limitations of Streaming Export
Not all devices on which events are triggered send their data to GA4 on the same day the events are triggered. To account for this latency, daily tables (events_YYYYMMDD) are updated for up to three days after the event occurred. So if data comes in within the next 3 days, it is reflected in GA4 reports and also in the events_YYYYMMDD (daily) table. Events that arrive late (within 3 days of occurrence) are recorded with their correct event timestamp.
However, the same is not true for the streaming table. The streaming table is not updated with late-arriving events.
As a result, you might see some discrepancy between the data recorded in the daily table and the data recorded in the streaming table.
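One practical consequence is that a recent daily table may still change. The sketch below checks whether a daily table is still inside the three-day late-event window described above. The function name is hypothetical, and counting the third day itself as still inside the window is an assumption, not documented behavior.

```python
from datetime import date, timedelta

LATE_EVENT_WINDOW = timedelta(days=3)  # window described in this article


def daily_table_may_still_change(table_day: date, today: date) -> bool:
    """Return True if events_YYYYMMDD for table_day could still be updated.

    Assumption: the window is inclusive, so the table is treated as
    possibly changing through table_day + 3 days.
    """
    return today <= table_day + LATE_EVENT_WINDOW


print(daily_table_may_still_change(date(2024, 3, 1), date(2024, 3, 3)))   # True
print(daily_table_may_still_change(date(2024, 3, 1), date(2024, 3, 10)))  # False
```

If a report has to match GA4 closely, it may be safer to query only daily tables that are already outside this window.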
In addition to the above difference, there is another data point that's missing from the streaming table.
The traffic_source column in the daily table contains a record of the traffic source that first acquired the user. This record stores the following information:
- traffic_source.name (reporting dimension: User campaign)
- traffic_source.source (reporting dimension: User source)
- traffic_source.medium (reporting dimension: User medium)
However, this column is not populated in intraday tables.
The BigQuery streaming export does not include this user-attribution data for new users. User-attribution data for existing users is included in the streaming table, but that data requires about a day to fully process, so it is not recommended to rely on it from the streaming export. If this data is critical, you will need to use the daily export.
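As an illustration, here is a minimal sketch that builds a query reading the first-user attribution fields from a daily table rather than an intraday table. The function name and the project and dataset values are hypothetical placeholders; the traffic_source fields are the ones listed above, and user_pseudo_id is assumed to be the user identifier column in the export. The function only returns the SQL text and does not run it.

```python
def first_user_attribution_query(project: str, dataset: str, yyyymmdd: str) -> str:
    """Build a SQL string that counts users by first-user campaign, source, medium.

    Always targets the daily table (events_YYYYMMDD), because the article
    notes that traffic_source is not populated in intraday tables.
    """
    table = f"{project}.{dataset}.events_{yyyymmdd}"
    return (
        "SELECT\n"
        "  traffic_source.name AS user_campaign,\n"
        "  traffic_source.source AS user_source,\n"
        "  traffic_source.medium AS user_medium,\n"
        "  COUNT(DISTINCT user_pseudo_id) AS users\n"
        f"FROM `{table}`\n"
        "GROUP BY 1, 2, 3\n"
        "ORDER BY users DESC"
    )


# "my-project" and "analytics_123" are placeholder names, not real resources.
print(first_user_attribution_query("my-project", "analytics_123", "20240305"))
```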
Need help with GA4 and BigQuery?
We are here to help. Send an email to support@optizent.com or fill out the contact form.