Google Cloud Storage (GCS)
Overview
This destination writes data to a GCS bucket.
The Airbyte GCS destination allows you to sync data to cloud storage buckets. Each stream is written to its own directory under the bucket.
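As a rough illustration of the per-stream layout described above, the sketch below builds one object prefix per stream. The function name `stream_prefix`, the `bucket_path` parameter, and the exact path format are assumptions made for illustration; the connector's actual naming scheme may differ.

```python
import posixpath


def stream_prefix(bucket_path: str, stream_name: str) -> str:
    """Return a hypothetical object prefix (directory) for one stream."""
    if not stream_name.strip("/"):
        raise ValueError("stream_name must not be empty")
    # GCS object names use "/" as the separator regardless of the local OS.
    # Strip stray slashes so the pieces join cleanly, and skip an empty bucket path.
    parts = [p.strip("/") for p in (bucket_path, stream_name) if p.strip("/")]
    return posixpath.join(*parts) + "/"


# Two streams synced under the same (hypothetical) bucket path.
for stream in ("users", "orders"):
    print(stream_prefix("airbyte/sync", stream))
```

Under this assumed layout, every object for a stream shares one prefix, so a single stream's data can be listed or removed without touching the others.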
Getting started
Requirements
- Allow connections from the Airbyte server to your GCS bucket (if they exist in separate VPCs).
- A GCP bucket with credentials (for the COPY strategy).
Setup guide
- Fill in the GCS info
- GCS Bucket Name
- See this for instructions on how to create a GCS bucket. The bucket cannot have a retention policy. Set Protection Tools to none or Object versioning.
- GCS Bucket Region
- HMAC Key Access ID
- See this on how to generate an access key. For more information on HMAC keys, please refer to the GCP docs.
- We recommend creating an Airbyte-specific user or service account. This user or account will require the following permissions for the bucket:
storage.multipartUploads.abort
storage.multipartUploads.create
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
You can set these by going to the permissions tab in the GCS bucket, adding the email address of the service account or user, and granting the permissions listed above.
- Secret Access Key
- Corresponding key to the above access ID.
- Make sure your GCS bucket is accessible from the machine running Airbyte. This depends on your networking setup. The easiest way to verify if Airbyte is able to connect to your GCS bucket is via the check connection tool in the UI.
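Before running the connection check, it can help to compare the permissions granted to the Airbyte account against the list in the setup guide. The following is a minimal sketch of that comparison. The name `missing_permissions` and the example `granted` list are hypothetical; how you obtain the granted permissions (for example, by reading them from the bucket's permissions tab) is outside the sketch.

```python
# Permissions the setup guide lists as required for the bucket.
REQUIRED_PERMISSIONS = frozenset({
    "storage.multipartUploads.abort",
    "storage.multipartUploads.create",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.get",
    "storage.objects.list",
})


def missing_permissions(granted):
    """Return the required permissions absent from `granted`, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Hypothetical account that can read and list objects but not write them.
granted = ["storage.objects.get", "storage.objects.list"]
for permission in missing_permissions(granted):
    print("missing:", permission)
```

An empty result means the account holds every permission listed in the setup guide. It does not by itself confirm network reachability, which the check connection tool covers.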
Sync mode support
Features
Feature | Support | Notes |
---|---|---|
Full Refresh Sync | ✅ | Warning: this mode deletes all previously synced data in the configured bucket path. |
Incremental - Append Sync | ✅ | Warning: Airbyte provides at-least-once delivery. Depending on your source, you may see duplicated data. Learn more here |
Incremental - Append + Deduped | ❌ | |
Namespaces | ❌ | Setting a specific bucket path is equivalent to having separate namespaces. |
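Because Incremental - Append offers at-least-once delivery and Append + Deduped is not supported, consumers of the synced files may want to remove duplicates themselves. The sketch below shows one simple way to do that after the records have been loaded, assuming each record is a dict with a unique key field. The function name `dedupe` and the `id` field are hypothetical and not part of the connector.

```python
def dedupe(records, key):
    """Keep the last record seen for each value of `key`.

    Results are ordered by the first appearance of each key value.
    """
    latest = {}
    for record in records:
        # A later record with the same key overwrites the earlier one;
        # the dict keeps the key's original insertion position.
        latest[record[key]] = record
    return list(latest.values())


# Hypothetical records in which id 1 was delivered twice.
records = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "a-updated"},
]
for record in dedupe(records, key="id"):
    print(record)
```

Keeping the last record per key is one possible policy. If your source emits a cursor or timestamp field, comparing on that field may be a safer choice than relying on arrival order.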