By GALEN KASHTOCK

NOVEMBER 21, 2022

HeapTechnology

Heap: Query Heap data from your warehouse

By GALEN KASHTOCK

NOVEMBER 21, 2022

HeapTechnology

Query Heap data from your warehouse

Have you ever wanted to join website data with other data such as your customer relationship management (CRM) tool or order management system (OMS)? While Heap provides rich integrations to bring data into Heap, sometimes it’s helpful to bring data out of Heap for analysis. Enter Heap Connect, Heap’s solution for sending data to your warehouse.

In this post, we’ll cover:

  • Potential use cases for Heap Connect
  • Setting up and getting started with Heap Connect
  • How Brooklyn Data’s Heap data is used in the modern data stack
  • Determining if Heap Connect is right for you

Use cases

Let’s review a couple use cases for Heap Connect: visualizations and joining Heap to other data:

Visualization

The Heap user interface (UI) provides an excellent, always up-to-date medium for product analytics. However, there are a few use cases that are better handled by a business intelligence (BI) tool:

  • Enhanced visuals: Heap provides the following chart types in its user interface: line, bar, stacked, pie, and table. With Heap data in your BI tool of choice, you can create enhanced visualizations, including maps. In addition to maps, BI tools open other possibilities including scatter plots, box plots, histograms, and more.

Map Visualization

  • Robust filters: BI tools have increased filter flexibility beyond what is available in the Heap UI. Here’s an example of various filter options on date ranges in Looker:

Date Filters

  • Dynamic dashboards: Dashboards in the Heap UI have some interactive capabilities, but BI tools add additional functionality, including use cases such as:
    • The use of parameters to instantly change the grain of all trend charts from ‘daily’ to ‘monthly’
    • Linking to a specific order in your order management system (OMS) that needs action
  • Visualizing Heap alongside other platforms: Show Heap data alongside other KPIs in an executive overview report. For example, an order conversion funnel can be visualized alongside actual sales to paint a more complete picture of how user behavior impacts the bottom line.

Joining Heap to other data

A very powerful use case for Heap Connect is to join Heap data with other platforms, such as a CRM. By joining Heap data to a CRM, users can unlock insights by combining product analytics with customer data. In this example, Heap gives a sample query of how to join Heap data to Salesforce, a popular CRM.

Frame (34)

Let’s take this one step further, and pretend that we are a B2B company who is a featuring a new product on our website. If one of our current customers is showing interest in the new product on our website, we can use a reverse ETL tool such as Census or Hightouch to send these users to our various marketing platforms. A common use case for reverse ETL is to take data that is modeled and trusted from your database and send it to customer engagement platforms where action can be taken on the data. In the next section, we’ll review how to set up and get started with Heap Connect.

Set up Heap Connect

Connecting Heap Connect

Configuring Heap Connect is fast, and Heap provides a helpful guide for connecting to BigQuery, Snowflake, Redshift, and S3. Once connected, Heap automatically includes some tables such as users, sessions, pageviews, and all_events.

Syncing events

While no additional action is needed to sync data from users, sessions, and pageviews, there is one additional step needed to sync specific events to the all_events table. To sync an event to the warehouse, simply navigate to the event in the UI, and toggle the button in the 'Connect' section:

Frame (27)

Reviewing Schema

Here’s a (partial) view of the sessions table delivered by Heap. It includes various IDs and other information tied to the session such as device type, landing page, and referral information.

Frame (28)

To learn more about the fields Heap includes, take a look at their sessions schema. In addition to tables with product analytics data, Heap provides meta data such as this _sync_history table:

Frame (29)

Getting started with common queries

Once your data is connected, it can be daunting to know where to start. Thankfully, Heap provides a list of common queries to get you started. After getting comfortable running queries, you may find there are certain queries that you would like updated daily, and other queries that are really complex and require business logic that has been validated by a key stakeholder. Enter the modern data stack, which can take your usage of Heap Connect to the next level. In the next section, we’ll review how the Heap data from Brooklyn Data’s website fits in the modern data stack.

Frame (24)

How Heap Connect fits in Brooklyn Data’s stack

At Brooklyn Data, our data from Heap Connect in this example follows a common workflow seen in the modern data stack: Snowflake → dbt → Looker. In the ‘Reviewing the schema’ section, we shared an image of the sessions table as the data comes from Heap. Here’s an example of how we use the sessions data downstream.

Transforming data using dbt

In the image below, you’ll see the path of data involved in populating our final sessions model using dbt, which is visible in Looker. Here’s a little more information about what happens in each stage:

  1. The green boxes indicate the raw data in Snowflake.

  2. Next, the boxes with the source_ prefix perform basic transformations on the data, such as standardizing time zones and concatenating fields like user_id and session_id, to generate a unique session ID as session_id is not unique in the raw data.

  3. Robust transformations of the data occur in the boxes with the int_ prefix, which include calculating sequences, such as the session number of a user, or the number of pageviews in each session. Data from multiple tables are joined together.

  4. Finally, we have our heap_sessions model, which is what Looker connects to.

Frame (33)

Visualizing Data in Looker

After configuring a model in Looker, an Explore is used to visualize data. Here is a blank canvas for our sessions Explore:

Frame (31)

In the example below, we look at the landing pages that are driving the most pageviews per session, grouped by device type. In the next section, we’ll give some guidance on determining if Heap Connect is right for you.

Frame (32)

Is Heap Connect right for me?

Heap Connect isn’t for all users of Heap. Here are key questions to help determine if Heap Connect is right for you:

Can I already do everything I need to do in the UI?

If the Heap UI handles all of your use cases, then you probably don’t need Heap Connect.

What benefit do I get in bringing data out of Heap?

Are there strong business use cases to bringing data in to your warehouse? Think about the different ways Heap data can transform your business operations. For example, in the ‘Joining Heap to other data’ section, we discussed an end-to-end use case of combining Heap and Salesforce data to power marketing efforts.

Do I have the resources available to maintain, troubleshoot, and provide guidance on data?

While using the Heap UI doesn’t require a technical person, technical people are required to properly install and maintain Heap Connect. If you already have people in your organization that support a modern data stack, they would be ideal for supporting Heap Connect.

Want more Heap? Learn about autocaptured properties and capturing data from the past!