Learn
Learn

How to load data from Google analytics to Bigquery

How to load data from Google analytics to Bigquery

March 13, 2023
March 13, 2023
How to load data from Google analytics to Bigquery
Let’s begin our journey to learn & uncover detailed information about Google Analytics to BigQuery Integration from scratch.

It is no secret that Google has been at the frontier of cutting-edge tech-solutions for more than 20 years when it comes to data. Google has delivered one of the best website analytics platforms, Google Analytics is the one-stop platform to access all the marketing data. It enables users to meticulously monitor & track website data. To get the most out of this data, it is critical to create a comprehensive digital marketing ecosystem. 

How to get a 360-degree comprehensive view of Google Analytics historical data? Some users choose to export Google Analytics data as CSV files, which can be a cumbersome and ineffective choice. Others download and store data from Google Analytics to Google Sheets. 

As the marketing tech stack grows & becomes more complex, data warehouse solutions like Google BigQuery must be put to use. It acts as a central data repository for the crucial organization wide data. It enables the organization to unleash the possibility of performing insightful analysis and predictive modeling. Let’s begin our journey to learn & uncover detailed information about Google Analytics to BigQuery Integration from scratch.

[CTA_MODULE]

Introduction to Google analytics

With good cause, Google Analytics (GA) has long held the top spot among website analytics tools. Google Analytics offers a variety of interactive features that lets the organization evaluate and rapidly detect traffic trends. It enables the organizations to gauge how well any running marketing campaign is performing.

Google Analytics provides reliable & true information about site visitors and their interactions, creating opportunities to track conversion. It not only lets the website stakeholders know how many people visit their site but also allows them to learn about the visitors interacting & engaging with the website. 

The most recent version of Google Analytics, known as Google Analytics 4 (GA4), has a number of advantages over the earlier Universal Analytics version that can be quite beneficial for organizations. It enables cross-domain tracking without the need to develop any code modifications. Offers additional features to modify reports & simpler audience segmentation.

For organizations utilizing Google-based services for marketing, Google Analytics is crucial for monitoring how well those services are performing because it seamlessly interacts with other Google services. Organizations can only improve the ROI (return on investment) of the website by understanding & analyzing the underlying marketing campaigns data.

Data is a key component in every digital marketing strategy, and the ability to effectively leverage the data is the game changer for any successful  organization. A profit making organization should proactively respond to constantly changing demands & here, web data can assist organizations to learn what is happening on the website. 

Organizations work tremendously to gain a competitive edge and generate profits that will eventually fuel business expansion. A strong Business Intelligence technology serves as the catalyst for it to expand to its full potential. Utilizing Google Analytics before beginning any marketing campaigns ensures that you are getting the desired outcomes from it.

Introduction to Google BigQuery

Google BigQuery is a widely used data warehouse solution which uses cloud computing to store massive volumes of data. It is a serverless architecture which means it can be accessed from anywhere. Additionally, it doesn't need extensive efforts or hardware requirements to set up. This Google product can deliver with minimal to no upkeep, in contrast to certain Cloud-based Data Warehouses available in the market. 

BigQuery offers centralized control over your data and compute resources, and Identity and Access Management (IAM) aids in securing those resources using the Google Cloud access paradigm. Google Cloud security offers a reliable yet adaptable strategy that incorporates basic perimeter security & a more intricate and granular defense-in-depth approach.

Google BigQuery allows various applications of descriptive and prescriptive analysis like- BI (Business Intelligence), ML (Machine Learning), etc. You can execute queries on data using external tables or federated queries, such as Cloud Storage, or Google Sheets saved in Google Drive, or also you can query data stored in BigQuery.

Google BigQuery – a fully managed Cloud Data Warehouse for analytics from Google Cloud Platform (GCP), is one of the most popular Cloud-based analytics solutions. Google BigQuery's unique architecture and seamless integration with other services from GCP is winning hearts all around! Google BigQuery is a tool that businesses are learning to use today to improve their transactions. 

Today, there are many tools available in the market specially designed for organizations performing ELT. Data Warehouse solutions like Google BigQuery have become essential to Data Analytics in order to streamline, automate, and visualize the ELT process. Google BigQuery is the most preferred choice across the world due to its scalability, security, and powerful processing capabilities. 

Google Analytics to BigQuery Integration

Google Analytics to  BigQuery Integration is undoubtedly the most necessary step an organization must take now. BigQuery blends Google Analytics' well-known and adored features with a streaming, event-based analytics platform's strength and flexibility. 

Sampling is one of the biggest challenges Google Analytics faces when a complex or customized report is run, or when a wide range of dates are selected. As a result, Google extrapolates the remaining data points by only counting a portion of them. This leads to approximations in some cases.

However, there is no need to worry about sampling & all thanks to the Google Analytics to  BigQuery Integration. Each hit that is migrated to BigQuery is accessible for querying, and returns the results within a few seconds.

Three main objectives are served by the Google Analytics to BigQuery integration:

  • Query the raw obtained data from GA to draw meaningful conclusions
  • A centralized repository of relevant data combined across multiple platforms
  • A 360-degree comprehensive view of GA data available for visualization

Each capacity builds on the one before it, and each function broadens the scope of what Google Analytics to BigQuery Integration is capable of. Read along to dig deeper to learn more about Google Analytics to BigQuery Integration.

Note: Google has announced the dismissal of all GA3 (Universal Analytics) standard properties from July 1, 2023. After that, you'll have at least six months to view the already processed data in GA3 (Universal Analytics). 

Before Universal Analytics stops processing new hits, switch to Google Analytics 4 & build the necessary historical data.

Google analytics to Bigquery integration use- cases

Google Analytics interface works well when answering basic marketing queries. Google Analytics data is only shown as numbers aggregated session-wise. Goals and goal funnels in the Google Analytics interface are session-based, which means that it requires a person to visit numerous times to complete a job. 

For instance, counting the number of sessions that originated from a mobile device is simple in Google Analytics. It would be far more challenging to determine the total number of video play events by a specific user across numerous sessions.

The Google Analytics interface provides a number of capabilities that make it simple to do on-the-fly analysis and is comparatively simple to use. For complex & detailed problem statements, it is essential to integrate Google Analytics to BigQuery. The same underlying data can be extracted, stored, combined & visualized on a greater front. Google Analytics to BigQuery integration presents opportunities to:

  • Combine GA (Google Analytics) and CRM data to understand traffic sources associated with qualified leads.
  • Add spend data for various ad channels to understand the overall cost per user.
  • Import back-end commerce data for a clearer picture of product performance.
  • Obtain raw hit-level data into BigQuery for querying.

Benefits of Google analytics to BigQuery integration

  • Recognize your clientele: By merging data from all of your channels and user touchpoints, you can get a consolidated perspective of your customer journey. Make quick decisions based on actionable insights by easily visualizing each stage of your sales funnel.
  • Improved the conversion rates: Utilize impressions, website visits, and clicks data that is ready for analysis from several sources in one location. Recognize the content that converts best for you and focus more on it to boost sales.
  • Increasing Marketing ROI: Reallocates your spending to the most effective Ad strategy with the use of precise campaign reports that are available to you in almost real time.

Methods to load data from Google analytics to BigQuery

Is your organization also looking for ways to access all of the raw Google Analytics data? We’ve got you covered, discover Google Analytics to BigQuery Integration in two easy ways! If you and the organization you work for are currently utilizing Analytics data in various ways, you ought to strongly consider expanding your knowledge in this field. Going beyond GA4's reporting (and analysis) interface can reveal a wealth of new information. Let’s begin to make use of most of the recent methods and approaches to harness the power of Google Analytics and BigQuery.

This step-by-step guide will give you a birds-eye view of Google Analytics to  BigQuery integration in a seamless manner. Any organization can set up an integration between the brand-new Google Analytics 4 (GA4), and BigQuery in a matter of minutes. Read along to discover & learn about the ways available to integrate Google BigQuery for the greatest good of your organization!

Method 1: Google analytics to Bigquery using Fivetran

The majority of businesses will undoubtedly migrate Google Analytics data to a centralized data repository in upcoming times. Fivetran, one of the leading data integration tools, assists you in navigating this change as smoothly as possible. Fivetran works to create and install a metrics system to ensure reliable data tracking. Without having to comprehend the new data format and rewrite SQL queries, one can save the reports required and can further generate new ones. To integrate Google Analytics to BigQuery using Fivetran, adhere to below mentioned setup instructions.

Prerequisites

  • Google Analytics accounts
  • BigQuery account 
  • Fivetran account having the ability to create or manage destinations

Steps to Load data from Google Analytics to BigQuery using Fivetran 

  • Step 1: Set up Google Analytics as a Fivetran connector
  • Step 2: Start the initial sync.
  • Step 3: Permit Fivetran to access BigQuery cluster
  • Step 4: Complete the Fivetran configuration for BigQuery as the destination

Step 1: Set up Google Analytics as a Fivetran connector

  • In the connector setup form, enter the name of your preferred destination schema.
  • Sign into your Google Analytics account and then click Authorize with Google.
  • Choose how many months of historical reporting data you want to load in your initial sync.
  • Decide whether to sync All Accounts or Specific Accounts.
  • The accounts you want to sync should be chosen.
  • The properties that you want to sync should be chosen.
  • Click Add Report.
  • Enter the name of the destination table you have chosen. It must adhere to the nomenclature defined by Fivetran and should be distinct within this connector.

NOTE: Give the table a name that will make it simple for you to distinguish it from the report in the future. Once a report is configured, you may activate or disable the sync for that report by choosing its table in the connector's schema tab.

  • A choice is to be made between Pre-built Report and Custom Report under Configuration Type.
  • If you opt to use a pre-built report, choose the available desired pre-built report.
  • Select the dimensions and metrics you want to sync if you have chosen to use Custom Report.
  • (Optional) If you wish to alter reports according to the chosen parameters, turn on the Show Advanced Options option. If you don't wish to use the advanced options, just skip the next two bullet points.
  • Enter the precise name of the dimension you want to filter on in the Filter field name field.
  • Enter the dimension value in the Filter value field. By doing this, the report will be filtered based on the chosen dimension.
  • Click OK to save the report.
  • To add another report, repeat steps 7 through 13 in that case.
  • Finally, click save and test option.

Step 2: Start the initial sync.

Right after you've successfully created & configured the desired connector, it will show as paused. To commence the initial sync, you must follow the below mentioned steps:

  • Navigate to the Schema tab of your new connector in your Fivetran dashboard.
  • Examine the schema and set up the sync-ready tables.
  • Change the toggle from Paused to Enabled in the top right of the connector page.

Now, Fivetran will begin to sync the desired Google Analytics data.

Step 3: Permit Fivetran to access BigQuery cluster

In order for Fivetran to establish and manage your data's tables and recurrently load data into those tables, you must enable us access to your BigQuery cluster.

  • Get the Project ID link
  • Link to the Fivetran service account
  • Set up the service account link

Get the Project ID link

  • Navigate to the projects list in your Google Cloud Console.
  • Make a note of your Project ID when you locate it. It is necessary in order to configure Fivetran.
Select Project Google BigQuery

Link to the Fivetran service account

  • Go to the Fivetran destination setup form in another tab.
  • Choose BigQuery and I already have a destination.
  • Note the service account for Fivetran. It will require rights in BigQuery, which you must allow.
  • Go to the Fivetran destination setup form in another tab.
  • Choose BigQuery and mark ‘I already have a destination’.
  • Note the service account for Fivetran. It will require rights in BigQuery, which you must allow.
Service Account

Set up the service account link

  • To see the list of project principals, return to the IAM & admin tab.
  • Choose + Add.
Add IAM User
  • Enter the Fivetran service account you located in Step 2 or the service account you created in Step 3 in the New Principals area. The complete email address is the service account.
Add Fivetran Service Account
  • Click Select a role > BigQuery > BigQuery User.
Add Fivetran Service Account

Step 4: Complete the Fivetran configuration for BigQuery as the destination

  • Login & access your Fivetran account.
  • Click + Add Destination on the Destinations page.
  • Enter the name of your choice for the destination on the ‘Add destination to your account page’.
  • Select Add.
  • The destination type should be ‘BigQuery’.
  • Enter the Project ID you discovered in Step 1 into the destination setup form.
  • Enter the Data Location.
  • Select the site for data processing. You might also need to select an AWS region and cloud service provider, as detailed in Destinations documentation, depending on the plan you're on and the cloud service provider you've chosen.
  • Decide on a time zone.
  • Finally, click save and test option.

Note: BigQuery enables you to use conventional SQL queries to perform analytics over enormous volumes of data. Fivetran can sync with BigQuery as frequently as once per five minutes.

Big Congratulations, you have successfully migrated your Google Analytics data to BigQuery using Fivetran!!

Benefits of Using Fivetran 

More than 200 pre-built connectors to data sources such as applications, databases, event logs, files, and cloud services are available with Fivetran. Before transferring the data to your data warehouse, connectors organize it into standardized schemas. To maintain consistency in data structure, the schemas are standardized which enables you to use the same analytics code snippets as anyone else who makes use of the same connectors.

  • Automatic adaptation to API and schema changes
  • Fully-managed, maintenance-free data replication
  • Changes within the warehouse that employ unique logic and views
  • 200+ already constructed connectors
  • Ready-for-analysis schemas
  • Rapid access to data with incremental loads
  • Replication of data and schema in full
  • Support for dimensions that change gradually (Type 2 data)


Instead of using ETL, Fivetran offers a data pipeline solution that follows the extract-load-transform (ELT) technique. Each of the three key weaknesses of ETL is addressed by switching the order of the loading and transformation phases and by utilizing a standardized tool. The advantages of ELT include the following:

  • The pipeline is made simpler since analysts now can handle downstream data transformation rather than data engineers.
  • Since data engineers' upstream work is no longer impacted by changing analytics because data transformation takes place in the data warehouse, the pipeline is more robust and less dangerous.
  • Because the pipeline is easier to reach, maintaining it requires less labor. It is now possible for an outside party to develop and maintain a standardized solution to sell to various customers because the pipeline has been significantly simplified and is fundamentally more resilient. Organizations can effectively outsource and automate the pipeline's extract and load stages by buying a standardized tool.

Method 2: Google Analytics to BigQuery using BigQuery Data Transfer Service

There are three key steps for setting up the Google Analytics to BigQuery integration.

  • Create a project in GCP and enable BigQuery, if necessary.
  • The first step is to create a project (if not already created) and enable BigQuery because all operations in GCP must be associated with a project.

  • The service account should be added as a project member.
  • A non-human account used to conduct automated services is known as a service account.
  • The service account must have Editor access for the project for both GA4 and GA360.
  • For GA4, the service account is firebase-measurement@system.gserviceaccount.com
  • For GA360, the service account is analytics-processing-dev@system.gserviceaccount.com
  • Connect to a billing account
  • Set up a billing account, enable billing in GCP, and connect it to the project there.
  • The costs of data storage and processing connected with the GA export may be covered by the free usage tier in GCP, but in order to complete the integration, you must have a payment method on file in the cloud.
  • Set up the connection in the Google Analytics user interface.
  • For GA4, select Product Linking from the Property column on the Admin screen of the GA UI. Choose the GCP project, data source, export streams, and frequency (daily or continuous streaming). You're done, congrats!
  • For GA360, click Link BigQuery after selecting All Products from the Property column on the Admin tab of the GA UI. When exporting a view to BigQuery, enter the project ID or number (found on the GCP Home screen), pick the export frequency, and then choose the view (three times per day or continuous streaming). Voila- it's done!
  • XXXXXXXXX is the UA view ID, and XXXXXXXXX is the dataset label.
  • Tables: There will be a Google Analytics session YYYYMMDD table for each day of that dataset. The Google Analytics sessions intraday YYYYMMDD table, which includes information for the current day, is also displayed (exported three times per day).
  • Streaming tables: If you enabled streaming, instead of ga_sessions_intraday_YYYYMMDD, you’ll see ga_realtime_sessions_YYYYMMDD and ga_realtime_sessions_view_YYYYMMDD for the current day.
  • Data should begin entering BigQuery within 24 hours, i.e. The first Data Export.
  • Daily export: Each day, a new table containing the data from the previous day will be created (typically in the morning in the timezone set for reporting).

Note: Google Analytics will offer an export of historical data (Historical backfill) the first time you link a Google Analytics property to Google BigQuery. BigQuery will soon contain Google Analytics data dating back 13 months or the most recent 10 billion hits, whichever comes first.

FAQ's

The most frequent problems encountered when setting up the interface are listed below. The data transferred to Google BigQuery contains gaps that are permanent due to these inaccuracies.

  • Inadequate user rights when configuring the integration
  • The user configuring the integration has to have access to the Cloud project and Edit permissions within the GA UI.
  • The service account was given the wrong permissions.
  • There are numerous roles in GCP that users can be given. Instead of project-level Editor access, you might have assigned BQ Editor access to the service account. 
  • Billing issues
  • If a legitimate, current payment method is not stored on file with GCP, the data export will not take place. You can create a second credit or debit card as a safety net.
  • BigQuery API not enabled
  • BigQuery must be enabled in the GCP APIs & Services panel because it is not enabled by default.

Limitations using BigQuery data transfer service

Despite the fact that the BigQuery Data Transfer Service is a native data integration solution available, it has several downsides. The following are the limitations of using BigQuery Data Transfer Service to integrate Google Analytics to BigQuery:

  • When using Google's Data Streaming option, Google determines and sets the amount of updates to the Data Warehouse. This could be a hindrance if you're seeking for real-time data streaming.
  • Maximum data backfill requests supported by BigQuery Data Transfer Service are 180 days. This implies that any historical data would need to be manually transferred.
  • Using this strategy will inevitably need a corporation to spend in tech bandwidth to move data because the business teams that need it are not very tech-savvy. This can be a pricey venture.
  • It's important to keep in mind that BigQuery does not support connecting datasets that have been previously saved on servers located in various locations. So you must establish & build datasets in the same places, throughout your project. Because there is no way to change the position after setup, you must be careful from the beginning. 
  • The Google Analytics Service account must have EDIT access to the project at all times, and BQ must be enabled at all times. If this is modified for any reason, all subsequent exports will fail until authorization is restored.
  • The most significant drawback of utilizing the BigQuery Data Transfer Service is the potential of data loss. If any of the above stated error persists or any other error is encountered which can possibly stop the data export, the data would be permanently destroyed. The BigQuery Data Transfer Service does not provide any means of prevention or recovery of data loss that occur as a result of a failed export.
  • You can not change the timestamp in the data from UTC to PST. Such alterations are not supported by the BigQuery Data Transfer service.
  • Data entering BigQuery through the BigQuery Data Transfer service can only come from Google products. You'd need to utilize a different provider in the future if you wanted to import data from any other additional data sources.

Conclusion

Google Analytics 4 downplays the concept of a session in order to provide different attribution views, it's vital to remember that BigQuery has always provided us with the option to analyze at the user level. We can attribute actions to any duration, from one hit previous to conversion to as far back as our historical data allows.

Google Analytics to BigQuery can also have a significant influence on businesses focusing on e-commerce. Analysts that use the Enhanced Ecommerce reports inside the Google Analytics interface are aware that they are session-based. However, in real-world settings, a consumer may add an item to their cart in one visit and return later to finalize the transaction. BigQuery allows teams to see purchasing behavior from consumers who make a purchase in more than one session.

The best aspect is that these infinite lookback windows and aggregations are applicable to all fields and scopes, making it possible to create user-based segments without having to worry about the annoying 90-day limited lookback window that the GA interface imposes. 

Connect GA4 to your BigQuery data warehouse
Free Trial

Related posts

No items found.
No items found.
How to load data from salesforce to snowflake: definitive guide 
Blog

How to load data from salesforce to snowflake: definitive guide 

Read post
How to load data from Salesforce to BigQuery: The ultimate guide
Blog

How to load data from Salesforce to BigQuery: The ultimate guide

Read post
How to load data from Salesforce to Redshift: A definitive guide
Blog

How to load data from Salesforce to Redshift: A definitive guide

Read post

Start for free

Join the thousands of companies using Fivetran to centralize and transform their data.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.