GA4Dataform v11 released

Written by
Jules Stuifbergen
Reading progress

Today we updated GA4Dataform to v11!

If we’d name our versions, this would have been called Tasty Transactions – because it contains a new model, based on purchases and refunds.

A new ga4_transactions table

Full column descriptions are available as description in the output table schema. A brief summary:

This table will have 1 row per transactions (purchase events), with in it:

  • transaction_id (the key)
  • transaction_date (the partition column)
  • session, user, and stream info
  • the time struct with all timestamps
  • the ecommerce struct
  • the items of course
  • A column with item_totals – containing e.g. item revenue

If you have refunds configured properly, this table will also aggregate all refund events in the same row:

  • A struct net_item_totals will be calculated based on item value minus refund value (and quantity)
  • A refund_info stuct, with aggregated refund info AND an array with all refund events

User totals

Based on a user identifier, we calculate running purchase totals, based on what we know since the start of data collection.

  • transactions – this is the Nth purchase of this user
  • running sums of purchase_revenue and item quantity
  • time since last purchase (and for convenience: time to next purchase ) – if any

With this information, it’s super easy to build queries to generate useful queries, like a line items table, or more

Sample query: average time between purchase

Here’s a quick sample query to calculate repurchase rates and the avg times to the next purchase:

SELECT
  running_totals.transactions as Nth_purchase,
  count(*) as purchases,
  countif(running_totals.seconds_to_next_purchase is NULL) as no_next_purchase,
  round(
    100 * countif(running_totals.seconds_to_next_purchase is not NULL) / count(*),
    1
  ) || '%' as pct_with_next_purchase,
  round(
    avg(running_totals.seconds_to_next_purchase / (3600 * 24)),
    1
  ) as days_to_next_purchase

FROM
  `superform_outputs.ga4_transactions`
WHERE true
and uid is not null
GROUP BY ALL
order by Nth_purchase

How to install or update

As usual, just visit https://setup.ga4dataform.com/ and proceed as normal. You will get v11 by default, including a new config.js file with the configuration options set.
If you already have an older version of GA4Dataform installed, the installer detects this and upgrades. On the next new run (usually the next day, or after a manual run), the ga4_transactions table will be available.

Note that when you upgrade, the default configuration is used. If you want to override the defaults, add the configuration options to custom/config.js

Configuration options

There’s 2 configuration options:

  1. The TRANSACTIONS_DEDUPE option
    • true: deduplicate the output table based on transaction_id (keep only the first)
    • false: keep all transactions in, even when they have the same transaction_id
  2. What user ID to use to calculate the running totals: TRANSACTION_TOTALS_UID (string)
    • default: user_pseudo_id – the cookie
    • if a column from the events table is present on the purchase hit that can be used, fill this in here. Eg. user_id

The default block:

// by default, we leave duplicate transaction_ids alone
// note: setting this to true will still keep NULL transaction_id
  TRANSACTIONS_DEDUPE: false,

// we keep a running count for transactions, based on an identifier
// if you have an other one, you can change it here
// (e.g. "user_id" - make sure it's a valid column)
TRANSACTION_TOTALS_UID: 'user_pseudo_id',

More information

We’re currently updating the official documentation, so when that is ready, you can read all about it there.

Meanwhile, if you have questions, feedback, bug reports, or neat queries to share: contact us or drop a comment below!

Jules Stuifbergen

Published at December 16, 2024

Leave a Reply

Your email address will not be published. Required fields are marked *

Continue Reading