7 Essential SQL Commands with GA4 and BigQuery

SQL is becoming an essential skill for digital marketers and marketing analysts. This article shares seven essential SQL commands or statements that you need to learn and master when dealing with marketing data. These SQL statements are also great for beginners to learn SQL throughout this article. Using GA4 data in BigQuery, if you are not familiar with how to send your GA4 data to BigQuery, check out the link in the description of this video. Now let’s look at the seven essential SQL commands or statements.

The first statement that you want to learn and master is the SELECT statement; it is the most fundamental statement in SQL you will be using quite extensively. It allows you to select records from one or multiple tables.

To use the SELECT statement, type SELECT in your query window, and then write the list of column names that need to be selected. Select all the columns from a given table. So star or asterisk means select all the columns, then write FROM, and then write the table’s name that needs to be selected from the data. All tables are listed under your projects. If I expand any of these projects, I can see all the data sets under that project. Within the data set, I have that list of tables Google Analytics 4 sends the data in the events table. So this is the one that I will be using each day of data stored in a separate table, and that’s why you see 218 tables. So let’s pick one of them and write it here. 

By enclosing the complete table name in single quotes, the complete table name starts with the name of the project followed by a dot and the name of the data set, then another dot and the table that I want to select the data. It is all you need once you construct this query. Go ahead and run it, and here are all the results from this table so you can see all the events with a timestamp which event is happening, etc. So that’s how the SELECT statement works. It lets you see the data from a given table or multiple tables. We’ll look at the multiple table example later.

1. If the site gets a lot of traffic and you only want to look at a sample of records, thencan use the next statement, which is the LIMIT statement. The LIMIT statement works in BigQuery to limit the number of returned records.

 2. If you look at this example, it shows you that there are 739 records; however, a lot of time, you just want to sample what’s in the database. Don’t waste your time looking at every single record in that scenario. The LIMIT statement allows you to limit the number of returned records in your result set. To write a LIMIT statement, you can use the same line or enter it on the next line. Write the number of records you want to return, so let’s say LIMIT 2 and run. It’s only going to return two records. So that’s how you use SELECT and LIMIT statements.

3. To limit and look at the records that match specific criteria, this is where a WHERE clause comes into play. How to write a WHERE clause? To see the records where the session_start, remove LIMIT. Without removing limits will only display two records. Instead of that, write WHERE and then give the column name. In this case, its event_name, and then within parenthesis the value to limit it. Select all the columns from this table where the event_name equals the session_start. That’s all you need. Go ahead and click on the run.

Now you can see all the records where the event_name equals the session_start. Other events are not showing up here. It is a great way to isolate a particular Google Analytics 4 event that you want to narrow down. Use this on any column. Use a lot of conditions in the WHERE clause where all the magic happens in SQL. Knowing and being able to use the WHERE clause dictates whether or not the info you are working with is really good and can be used for further insights.

4. The following SQL statement is COUNT, which allows you to count the number of records matching the criteria. To count how many records are there in the table where the event_name is session_start. It means how many sessions started on a given day. In that case, use the COUNT statement to look at the number of session_start events. Modify the statement that you already have. Instead of selecting all the records matching these criteria, use the COUNT statement here, and within parenthesis, give a list of columns to use for this query.

Leave asterisk since all we need is the count of all the records that meet this criterion. Once done, go ahead and click run; it will show the number of records that meet that criteria, which means 176 sessions started on this particular day. Keep in mind each table contains one day’s worth of data, and that’s what you’re looking at here. So that’s how you count the number of records matching your criteria. 

5. The next SQL statement that marketers and marketing analysts should focus on is SUM. SUM allows you to sum the values in a given column; when Google Analytics4 sends the data to BigQuery, it sends the e-commerce revenue data in one particular column. It also sends the information about each item in the transaction. It provides information about how many items were sold, what the total revenue for each item etc. To find out the total value of the product sold in a given time frame, apply SUM to any of the numeric columns to find the sum of the values contained in that column. Modify the query that has already been written and use the same BigQuery table that contains Google Analytics4 data.

First, change the COUNT to SUM; instead of an asterisk, add the column’s name to execute the SUM. In this case, the column is called e-commerce purchase revenue – Get acquainted with the schema of GA4 data in BigQuery to know this. Next, remove the WHERE clause because we don’t want to limit the event_name is the session_start. Go ahead and run it. It displays the total sum of the revenue. It takes all the columns that contain that value and returns the SUM. This query sums all the eCommerce purchase revenue column values and returns the value. It comes in handy as you calculate the transaction value, the lifetime value of the customers, etc. 

6. The next statement that you need to be familiar with is GROUP BY. It allows you to produce SUM, COUNT, etc., by a value in another column. Change to COUNT, add the asterisk, and add another column called event_name. Selecting the event_name and then also counting the records. It is where to add the statement Group By event_name. It is going to roll up the count by event_name.

As a result, it will display each and every event and the total number of times recorded that event. Let’s run the program. It displays the number of times each of the Google Analytics4 events captured where the user engagement, session_start, page view, etc., and the count of those events. It is how you can determine how many times a particular event is happening in a given time frame or match specific criteria. Used GROUP BY for count and can be used in sum as well. Figure out the column you want to sum up, add that instead of asterisk and run your query.

7. Finally, a SQL statement helps join multiple tables to get the results. This statement is called JOIN. The query has the event_name and the count of those events. However, the event_ name is in lowercase with an underscore. That is how Google Analytics 4 captures the data, so those same event_name’s are passed to BigQuery. 

There is an easier way to write queries within the interface. Go to the table, click on the three dots, and click on the query that opens up a new query window with everything written. Add the column names, add an asterisk, and click on the run. It will show the data that’s in this table. Join this table to this query, and instead of the event_name, we will use the description from the event_names table. Enter a new line and type in JOIN, then provide the name of the table that you want to join to, which is event_ name’s in this example.

Join two tables data that’s coming from GA with the names table. Next, specify the columns that use to connect these two tables. The event_ name column in both the tables is the one that contains a common value. Use that to join these two tables. To join the tables, you write ON and then specify the column from the first table followed by an equal sign and the column from the second table. Use the appropriate name of that column, which means to start with the project data set table name and then column name. That’s a lot of writing. A bonus command or statement that you can use while writing SQL is using aliases for tables and columns. 

Creating an alias is easy. Just add that in front of the table or the column name. In this case, we’ll assign an alias to the table, call it as ga4 table and then call this table as etable. Now specify ga4 table dot event_name is equal to etable dot event_ name rather than writing all the characters. Now use the event_description instead of the event_name.

We pull the event_description from the second table and then count the total number of records. Go ahead and click on the run. The result set displays session_start, first visit, etc. There are four events here because this code finds the matching records in both tables. 

In conclusion, we covered seven essential SQL command source statements that every digital marketer or marketing analyst should learn and master and a bonus SQL command. The SELECT is the most basic command which selects records from one or more tables. The LIMIT allows you to limit the number of returned records in the result. 

SET command will enable us to set criteria to select the tables’ records. COUNT statement counts the number of records in the result set. SUM calculates the total of all the values in a given column. GROUB BY is used with sum or count, which allows for rolling up the totals based on a value in another column. JOIN statement joins multiple tables to get the result set. The bonus was statement alias easily refer to the tables and the columns without writing the full table name or the column name. 

Need help with GA4 and or BigQuery?

Reach out to us at support@optizent.com or fill the form below and we will be in touch. We can not only help you get the most out of your Analytics tools but also training your team.

Other posts you might be interested in

https://www.optizent.com/blog/ga4-bigquery-export-why-and-how/

https://www.optizent.com/blog/how-to-connect-ga4-to-bigquery/

Leave a Reply

Your email address will not be published.