We’re so busy building datamodels, that we almost forget to blog about what you can actually do with all these nice tables and “useful columns”.
Hint: they’re there to be used: to help answer business questions.
Here’s an article on the ga4_transactions table, hopefully it will bring you some inspiration (don’t forget to share, comment, like, and send us flowers if it did).
The business question
Every store has a bestseller list. Fewer have an answer to this question:
of the customers who bought more than once, what did they buy first?
It matters, because not all first purchases are equal. A customer who starts with accessories might churn after one order. One who starts with a core category (outerwear, jeans, whatever it is for your brand) might come back three times. If you know which category acts as the gateway to loyalty, you can promote it more aggressively to new customers, build landing pages around it, and make sure it’s always well-stocked.
Now.. GA4 doesn’t give you this. But ga4_transactions does, because it tracks each purchase as the Nth transaction for that user, and every transaction has the items (and their categories) attached.

What you need
Two things from ga4_transactions:
running_totals.transactions = the purchase number for this user. Filter to = 1 to isolate first purchases. Use > 1 to identify repeat buyers.
items = an array of purchased items, each with item_category (and item_category2 through item_category5 if your taxonomy goes deeper). You unnest this to get one row per item.
The logic is:
- find all users who have made more than one purchase (repeat buyers)
- look at what categories are bought first
- count how many repeat buyers started in each category
The query
WITH repeat_buyers AS (
-- users who have made at least 2 purchases
SELECT DISTINCT user_pseudo_id
FROM `superform_outputs.ga4_transactions`
WHERE running_totals.transactions = 2 -- they reached a 2nd purchase
AND duplicate_count = 1 -- make sure no dupes are present, even if dedupe is false
),
first_purchases AS (
-- the first purchase of each repeat buyer, with items unnested
SELECT
t.user_pseudo_id,
repeat_buyers.user_pseudo_id IS NOT NULL AS is_repeat_buyer,
t.transaction_date,
t.transaction_id,
item.item_category AS first_category,
item.item_name as first_name,
item.item_revenue
FROM `superform_outputs.ga4_transactions` t
CROSS JOIN UNNEST(t.items) AS item
LEFT JOIN repeat_buyers USING(user_pseudo_id)
WHERE t.running_totals.transactions = 1 -- first purchase only
AND t.duplicate_count = 1
)
SELECT
first_category, -- choose aggregation level
--first_name,
COUNT(DISTINCT user_pseudo_id) as first_time_buyers,
COUNT(DISTINCT IF(is_repeat_buyer, user_pseudo_id, NULL)) AS leads_to_repeat_purchase,
ROUND(AVG(item_revenue), 2) AS avg_first_item_revenue,
ROUND(100 * SAFE_DIVIDE(
COUNT(DISTINCT IF(is_repeat_buyer, user_pseudo_id, NULL)),
COUNT(DISTINCT user_pseudo_id)
), 1) as pct_repeat_purchasers,
FROM first_purchases
WHERE first_category IS NOT NULL
GROUP BY ALL
ORDER BY leads_to_repeat_purchase DESC
Replace
superform_outputswith your actual dataset name
The CROSS JOIN UNNEST(t.items) unpacks the items array so you get one row per item per transaction. A single first purchase with 3 items will produce 3 rows — which is what you want here, since you’re interested in what category they bought, not just that they bought.
Reading the results

The top rows tell you which categories most repeat buyers started with. A few patterns worth looking for:
A category that’s disproportionately over-represented compared to its share of overall sales is your loyalty gateway. Push it to new customers.
A category that sells a lot but rarely leads to a second purchase, worth knowing too. Maybe it’s gift purchases, one-off items, or a category that attracts bargain hunters who never come back at full price.
Price point vs. loyalty might be interesting. The avg_first_item_revenue column lets you see if cheaper entry points lead to more loyalty, or if customers who start by spending more are already more committed.
A note on user_pseudo_ids (cookies)
This query uses user_pseudo_id – the default “user” definition every site has.
If you have a more accurate identifier, like user_id in the purchase hit, you can configure GA4Dataform to use that for the transactions table.
in /includes/custom/modules/ga4/config.js add the following:
TRANSACTION_TOTALS_UID: 'user_id', // or use a fallback: // 1. user_id // 2. if that is empty, user_pseudo_id, // 3. if hit is anonymous, use a random string // TRANSACTION_TOTALS_UID: 'COALESCE(user_id, user_pseudo_id, GENERATE_UUID())'
Then, in the query above, replace user_pseudo_id with the value you filled in in the config.
Your turn!
Now it’s your turn. What questions do you have that the ga4_transactions table can answer?
Whether you’re hunting for your next “gateway” category or auditing the lifetime value of a specific promotion (join to ga4_sessions), the data is already there waiting for you.
Write the query, run it against your dataset, and let us know what you find.



