Snowflake
Load your Amplitude event data into your Snowflake account. You can set up recurring syncs through the Amplitude UI, as well as manually start a sync of your historical data.
Considerations
Depending on your company's network policy, you may need to add these IP addresses to your allowlist so Amplitude's servers can access your Snowflake instance:
- Amplitude US IP addresses:
- 52.33.3.219
- 35.162.216.242
- 52.27.10.221
- Amplitude EU IP addresses:
- 3.124.22.25
- 18.157.59.125
- 18.192.47.195
- Amplitude US IP addresses:
Limits
- Maximum running time for a single Snowflake SQL query is 12 hours.
- Each project can have only one Snowflake Export destination for each data type. This means one destination for events and one destination for merged IDs per project.
Reuse existing Snowflake credentials
When you create a new Snowflake export or import, Amplitude lets you select previously saved credentials from your organization instead of entering them again. Reusable credentials include connections from other exports, imports, and connections made in other projects (if you have permission).
Shared credentials update together
Amplitude shares credentials at the underlying level. If you update the password or key pair on one connection, Amplitude updates all connections that share those credentials. Review which connections use the credentials before making changes.
To reuse existing credentials, select Use existing credentials at the credential entry step when creating a new connection. Amplitude lists all saved credentials from your organization. To reuse credentials from another project, you need permission to create Data Warehouse connections in both projects.
Set up the integration
Creating a recurring data export is a three-step process you can handle yourself. Each sync typically completes within five to ten minutes, though timing may vary depending on system load and data volume. This method also lets you watch jobs.
Use a dedicated warehouse for Amplitude data
Amplitude strongly recommends using a dedicated Snowflake warehouse instance for loading Amplitude data. Using a shared warehouse can cause load capacity issues and disrupt other Snowflake integrations. Configure a separate warehouse to ensure optimal performance and prevent conflicts with your other Snowflake operations.
To set up a recurring export of your Amplitude data to Snowflake, follow these steps:
Required user permissions
You need admin/manager privileges in Amplitude, and a role that allows you to enable resources in Snowflake.
In Amplitude Data, click Catalog and select the Destinations tab.
In the Warehouse Destinations section, click Snowflake.
Under Export Data to Snowflake, choose which data to include in this export: Export events ingested today and moving forward, Export all merged Amplitude IDs, or export both. For events, you can also specify filtering conditions to export only events that meet certain criteria.
Review the Event table and Merge IDs table schemas and click Next.
In the Snowflake Credentials For Amplitude section, enter the following information:
- Account Identifier: The account identifier on your Snowflake account, in the format
<org-name>-<account-name>. For more information, refer to Snowflake's account identifier documentation{:target="_blank" rel="noopener noreferrer"}. - Warehouse: The warehouse Amplitude uses to load the data. Use a dedicated warehouse for Amplitude data to prevent load capacity issues with other Snowflake integrations. Sharing a warehouse can disrupt your other Snowflake operations.
- Database: The database where Amplitude stores data. Dedicate this database specifically to Amplitude data.
- Role: The role that the connection uses to write this data. The default value of role is
AMPLITUDEonly. - Username: The username Amplitude uses to connect to the Snowflake account.
Amplitude offers password-based and key pair authentication for Snowflake.
- Account Identifier: The account identifier on your Snowflake account, in the format
Snowflake Password Authentication Deprecation
Starting in May 2026, Snowflake is removing support for single-factor password authentication. This affects data exports from Amplitude to Snowflake. Amplitude recommends migrating to key pair authentication for enhanced security and future compatibility with Snowflake. For migration guidance, review the Snowflake Password Authentication Deprecation FAQ.If you use password-based authentication, note that your password is case-sensitive.
To use password authentication, select the Password option and enter your password in the Password field. Key pair authentication (Recommended): To use key pair authentication, select the Key pair option and click Generate Key. To use key pair authentication, provide the organization and account names in the format ORGNAME-ACCOUNTNAME.
When using key pair authentication, you must use the format ORGNAME-ACCOUNTNAME. Without this specific format, Snowflake generates a JWT token is invalid error.
- Copy the autogenerated SQL query and run it in Snowflake to give Amplitude the proper permissions.
- Click Next. Amplitude attempts to upload test data using the credentials you entered. If the upload is successful, click Finish.
Amplitude automatically sends all future events to Snowflake.
From here, Amplitude generates micro-batch files and loads them to customer-owned Snowflake accounts on a best-effort basis. Exports typically run every 10 minutes, but may run less frequently depending on system load and data volume. You can typically see the data in your Snowflake accounts within 20 minutes after Amplitude receives the events, though timing may vary.
Export transformation and custom event definitions
Alongside your event data, Amplitude can export your transformation and custom event definitions to Snowflake. This keeps the logic you build in Amplitude in sync with your warehouse, so you can apply the same semantics to your own copy of the raw events and reduce data drift.Amplitude exports the definitions, not transformed rows. Your raw event tables stay as they are. Each sync writes a full snapshot of your current definitions, so you can rebuild transformed and custom events downstream and keep a history of how those definitions change over time.
Availability
Amplitude rolls out this capability gradually. If you don't see Data Configuration in the Snowflake export setup, contact your Amplitude account team to enable it.
What Amplitude exports
- Event Transformations: your merge and merged-event transformations, the two transformation types Amplitude Data surfaces as transformations in the UI.
- Custom Events: your custom event definitions, including the filters and grouping that define each one.
Amplitude exports active definitions only. It doesn't export disabled transformations or deleted custom events, and it doesn't export user property, channel classifier, or derived property transformations.
Set up the data configuration export
Required user permissions
You need admin/manager privileges in Amplitude and a Snowflake role that can create tables in the AMPLITUDE schema of your target database. The connection test doesn't verify this grant, so confirm it before your first sync to avoid a silent export failure.
- In Amplitude Data, click Catalog and select the Destinations tab.
- In the Warehouse Destinations section, click Snowflake.
- In the export setup, select Data Configuration, then choose Event Transformations, Custom Events, or both.
- Set how often Amplitude refreshes the snapshot with the schedule picker. Daily works well for most teams.
- Enter or reuse your Snowflake credentials, run the generated SQL grant in Snowflake, and click Finish.
Amplitude writes each data type to its own table in the AMPLITUDE schema and appends a new snapshot on every run. To read the current definitions, query the rows with the most recent snapshot_at. To inspect a past state, filter snapshot_at to the snapshot you want. Amplitude keeps about one year of snapshots.
Export historical Amplitude data to Snowflake
To export your historical data from Amplitude into Snowflake, navigate to the integration, then click Export Data and select a date range.
This process can take from a single day to several weeks. It depends on your data volume, warehouse size, cluster count, network bandwidth, and the number of concurrent historical data exports you have, among other factors.
Reduce your Snowflake computation costs
To reduce Snowflake computation costs when receiving data from Amplitude, try these methods:
The effectiveness of these recommendations depends on the frequency with which you export to your Snowflake instance. Remember to use a dedicated warehouse for Amplitude data to prevent performance issues with other Snowflake operations.
- Modify warehouse size and number of clusters for your dedicated Amplitude warehouse. For shorter export cadences (15 to 30 minutes), try starting with
xsmallorsmall. Then upgrade as needed. - When backfilling data into Snowflake, start with a small warehouse and upgrade as needed.
- You can also try reducing the auto suspend time to 60s. This option might not be available within the Snowflake UI, but you can manually set it through a direct Snowflake query.
Snowflake export format
Data locations
Amplitude exports data to your Snowflake account using the following naming convention:
| Component | Format | Description |
|---|---|---|
| Database | DB_{org_id} | Uses your Amplitude organization ID for the database name. |
| Schema | SCHEMA_{project_id} | Uses your Amplitude project ID for the schema name. |
| Events Table | EVENTS_{project_id} | Uses your Amplitude project ID for the events table name. |
| Merged Users Table | MERGE_IDS_{project_id} | Uses your Amplitude project ID for the merged users table name. |
For example, if your organization ID is 12345 and your project ID is 67890:
- Events:
DB_12345.SCHEMA_67890.EVENTS_67890. - Merged Users:
DB_12345.SCHEMA_67890.MERGE_IDS_67890.
Each project can have only one Snowflake Export destination for each data type. This means one destination for events and one destination for merged IDs per project.
Event table
Event table schema
The Event table schema includes the following columns:
| Column | Type | Description |
|---|---|---|
amplitude_attribution_ids | ARRAY | Hashed attribution ids on the event |
amplitude_id | NUMBER | The original Amplitude ID for the user. Use this field to automatically handle merged users. Example: 2234540891 |
app | NUMBER | Project ID found in your project's Settings page. Example: 123456 |
city | VARCHAR | City. Example: “San Francisco” |
client_event_time | TIMESTAMP_NTZ | Local timestamp (UTC) of when the device logged the event. Example: 2015-08-10T12:00:00.000000 |
client_upload_time | TIMESTAMP_NTZ | The local timestamp (UTC) of when the device uploaded the event. Example: 2015-08-10T12:00:00.000000 |
country | VARCHAR | Country. Example: "United States" |
data | VARIANT | Dictionary where Amplitude stores certain fields such as first_event and merged_amplitude_id |
device_carrier | VARCHAR | Device Carrier. Example: Verizon |
device_family | VARCHAR | Device family. Example: Apple iPhone |
device_id | VARCHAR | The device specific identifier. Example: C8F9E604-F01A-4BD9-95C6-8E5357DF265D |
device_type | VARCHAR | Device type. Example: Apple iPhone 5s |
dma | VARCHAR | Designated marketing area (DMA). Example; San Francisco-Oakland-San Jose, CA |
event_id | NUMBER | A counter that distinguishes events. Example: 1 |
event_properties | VARIANT | |
event_time | TIMESTAMP_NTZ | Amplitude timestamp (UTC) which is the client_event_time adjusted by the difference between server_received_time and client_upload_time, specifically: event_time = client_event_time + (server_received_time - client_upload_time) Amplitude uses this timestamp to organize events on Amplitude charts. NOTE: If the difference between server_received_time and client_upload_time is less than 60 seconds, Amplitude doesn't adjust event_time, and it equals client_event_time. Example: 2015-08-10T12:00:00.000000 |
event_type | VARCHAR | Event type |
group_properties | VARIANT | |
groups | VARIANT | Group types. For more information, go to the Accounts documentation. |
ip_address | VARCHAR | IP address. Example: "123.11.111.11" |
language | VARCHAR | |
library | VARCHAR | |
location_lat | FLOAT | Latitude. Example: 12.3456789 |
location_lng | FLOAT | Longitude. Example: -123.4567890 |
os_name | VARCHAR | OS name. Example: ios |
os_version | VARCHAR | OS version. |
paying | VARCHAR | True if the user has ever logged any revenue, otherwise (none). Note: You can modify the property value with the Identify API. Example: true |
platform | VARCHAR | |
processed_time | TIMESTAMP_NTZ | |
region | VARCHAR | Region. Example: California |
sample_rate | NUMBER | |
server_received_time | TIMESTAMP_NTZ | |
server_upload_time | TIMESTAMP_NTZ | Amplitude timestamp (UTC) of when Amplitude servers received the event. Example: 2015-08-10T12:00:00.000000 |
session_id | NUMBER | The session start time in milliseconds since epoch. Example: 1396381378123 |
start_version | VARCHAR | App version when Amplitude first tracked the user. Example: 1.0.0 |
user_id | VARCHAR | A readable ID you specify. Use something that doesn't change; for that reason, avoid using the user's email address. |
user_properties | VARIANT | |
uuid | VARCHAR | A unique identifier per row (event sent). Example: bf0b9b2a-304d-11e6-934f-22000b56058f |
version_name | VARCHAR | The app version. Example: 1.0.0 |
Event table clustering
The exported events table uses these clustering keys by default (in order):
TO_DATE(EVENT_TIME)TO_DATE(SERVER_UPLOAD_TIME)EVENT_TYPEAMPLITUDE_ID
This optimizes query performance for time-based queries. You can modify the clustering keys to match your query patterns.
Merged User table
Merged User table schema
The Merged User table schema contains the following:
| Column | Type | Description |
|---|---|---|
amplitude_id | NUMBER(38,0) | The Amplitude ID that merges into a user's original Amplitude ID. |
merge_event_time | TIMESTAMP | The time of the event when Amplitude associates a user's new Amplitude ID with their original Amplitude ID. |
merge_server_time | TIMESTAMP | The server time of the event when Amplitude associates a user's new Amplitude ID with their original Amplitude ID. |
merged_amplitude_id | NUMBER(38,0) | The Amplitude ID originally assigned when Amplitude first creates the user. |
Merged User table clustering
Amplitude clusters the merged IDs table by DATE_TRUNC('HOUR', MERGE_SERVER_TIME). This optimizes queries that filter by when user merges occurred. You can modify the clustering keys to match your query patterns.
Event transformations table
Amplitude writes transformation definitions to {database}.AMPLITUDE.EVENT_TRANSFORMATIONS_{project_id}, where {database} is the database you connected and {project_id} is your Amplitude project ID. Each row is one transformation definition at a given snapshot.
| Column | Type | Description |
|---|---|---|
project_id | NUMBER | Amplitude project ID. |
snapshot_at | TIMESTAMP | When Amplitude produced this snapshot. |
export_job_id | STRING | Identifier of the export run that wrote this snapshot. |
transformation_id | STRING | Stable transformation identifier. |
name | STRING | Transformation display name in the UI. |
description | STRING | Free-form description. |
transform_type | STRING | Transformation type, such as merged event or merged-event derived property. |
merged_props | VARIANT | Mapping from target property to source properties for merge-type transforms. |
merge_priority | VARIANT | Encoded priority rules for merge resolution. |
merged_events | VARIANT | Definition of which events merge to create the transformation. |
derived_props | VARIANT | Properties the transformation creates. This isn't the same as Amplitude's standalone Derived Properties feature. |
derived_props_for_merge_events | VARIANT | Merge-event-specific derived-property mapping used during merge transformations. |
is_enabled | BOOLEAN | Whether the transformation is active. Amplitude doesn't apply disabled transformations. |
created_at | TIMESTAMP | When someone created the transformation. |
created_by | STRING | User who created the transformation. |
last_updated_at | TIMESTAMP | When someone last modified the transformation. |
last_updated_by | STRING | User who last modified the transformation. |
VARIANT hold semi-structured JSON. Query them with Snowflake's semi-structured data functions{:target="_blank" rel="noopener noreferrer"}.Custom events table
Amplitude writes custom event definitions to {database}.AMPLITUDE.CUSTOM_EVENTS_{project_id}. Each row is one custom event definition at a given snapshot.
| Column | Type | Description |
|---|---|---|
project_id | NUMBER | Amplitude project ID. |
snapshot_at | TIMESTAMP | When Amplitude produced this snapshot. |
export_job_id | STRING | Identifier of the export run that wrote this snapshot. |
custom_event_id | STRING | Stable custom event identifier. |
name | STRING | Canonical custom event name. |
display_name | STRING | Human-friendly name shown in the UI. |
description | STRING | Free-form description. |
definition | VARIANT | Definition expression, such as filters and groups, for the custom event. |
category | STRING | Optional category or group. |
classifications | ARRAY | Additional tags or classifications. |
suggestion_status | STRING | Suggestion state, such as suggested, accepted, or rejected. |
has_properties | BOOLEAN | Whether the custom event has custom properties defined. |
is_official | BOOLEAN | Whether this is an org-official custom event. |
is_restricted | BOOLEAN | Whether access is limited to certain users or roles. |
is_autotrack | BOOLEAN | Whether autotagging generated this custom event. |
is_hidden | BOOLEAN | Whether the custom event is hidden from standard UI lists. |
created_at | TIMESTAMP | When someone created the custom event. |
created_by | STRING | User who created the custom event. |
last_updated_at | TIMESTAMP | When someone last modified the custom event. |
last_updated_by | STRING | User who last modified the custom event. |
Was this helpful?