Today we updated GA4Dataform to v11
!
If we’d name our versions, this would have been called Tasty Transactions – because it contains a new model, based on purchases and refunds.
A new ga4_transactions table
Full column descriptions are available as description in the output table schema. A brief summary:
This table will have 1 row per transactions (purchase events), with in it:
- transaction_id (the key)
- transaction_date (the partition column)
- session, user, and stream info
- the
time
struct with all timestamps - the
ecommerce
struct - the
items
of course - A column with
item_totals
– containing e.g. item revenue
If you have refunds configured properly, this table will also aggregate all refund events in the same row:
- A struct
net_item_totals
will be calculated based on item value minus refund value (and quantity) - A
refund_info
stuct, with aggregated refund info AND an array with all refund events
User totals
Based on a user identifier, we calculate running purchase totals, based on what we know since the start of data collection.
transactions
– this is the Nth purchase of this user- running sums of
purchase_revenue
and item quantity - time since last purchase (and for convenience: time to next purchase ) – if any
With this information, it’s super easy to build queries to generate useful queries, like a line items table, or more
Sample query: average time between purchase
Here’s a quick sample query to calculate repurchase rates and the avg times to the next purchase:
SELECT running_totals.transactions as Nth_purchase, count(*) as purchases, countif(running_totals.seconds_to_next_purchase is NULL) as no_next_purchase, round( 100 * countif(running_totals.seconds_to_next_purchase is not NULL) / count(*), 1 ) || '%' as pct_with_next_purchase, round( avg(running_totals.seconds_to_next_purchase / (3600 * 24)), 1 ) as days_to_next_purchase FROM `superform_outputs.ga4_transactions` WHERE true and uid is not null GROUP BY ALL order by Nth_purchase
How to install or update
As usual, just visit https://setup.ga4dataform.com/ and proceed as normal. You will get v11 by default, including a new config.js
file with the configuration options set.
If you already have an older version of GA4Dataform installed, the installer detects this and upgrades. On the next new run (usually the next day, or after a manual run), the ga4_transactions
table will be available.
Note that when you upgrade, the default configuration is used. If you want to override the defaults, add the configuration options to custom/config.js
Configuration options
There’s 2 configuration options:
- The
TRANSACTIONS_DEDUPE
option- true: deduplicate the output table based on transaction_id (keep only the first)
- false: keep all transactions in, even when they have the same transaction_id
- What user ID to use to calculate the running totals:
TRANSACTION_TOTALS_UID
(string)- default:
user_pseudo_id
– the cookie - if a column from the events table is present on the purchase hit that can be used, fill this in here. Eg.
user_id
- default:
The default block:
// by default, we leave duplicate transaction_ids alone // note: setting this to true will still keep NULL transaction_id TRANSACTIONS_DEDUPE: false, // we keep a running count for transactions, based on an identifier // if you have an other one, you can change it here // (e.g. "user_id" - make sure it's a valid column) TRANSACTION_TOTALS_UID: 'user_pseudo_id',
More information
We’re currently updating the official documentation, so when that is ready, you can read all about it there.
Meanwhile, if you have questions, feedback, bug reports, or neat queries to share: contact us or drop a comment below!