Auto Insights in GCP
This guide walks you through setting up Auto Insights in GCP, enabling automatic data analysis and insights. Follow the steps to configure and optimize your setup efficiently.
Prerequisites
A VPC dedicated to AACP has been configured as described in Step 4: Configure Virtual Private Network.
A service account has been created and the base IAM policy attached to it, as described in Step 3: Configure IAM.
PDP provisioning has been triggered successfully as mentioned in Step 6: Trigger Private Data Handling Provisioning.
Project Setup
Step 1: Configure IAM
Step 1a: IAM Binding to the Service Account
Assign the following additional roles to the aac-automation-sa service account created in the Create a Service Account section:
roles/compute.loadBalancerAdmin (Compute Load Balancer Admin)
roles/compute.instanceAdmin.v1 (Compute Instance Admin (v1))
roles/compute.storageAdmin (Compute Storage Admin)
roles/container.clusterAdmin (Kubernetes Engine Cluster Admin)
roles/storage.admin (Storage Admin)
roles/redis.admin (Cloud Memorystore Redis Admin)
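If you prefer to script the role bindings rather than use the console, the following is a minimal sketch that only builds the corresponding `gcloud projects add-iam-policy-binding` command strings. It assumes the service account lives in the same project and uses the default email format; the project ID `my-project-id` and the helper name `binding_commands` are illustrative placeholders, not part of the product.

```python
import shlex

# Role IDs copied from the list above.
ADDITIONAL_ROLES = [
    "roles/compute.loadBalancerAdmin",
    "roles/compute.instanceAdmin.v1",
    "roles/compute.storageAdmin",
    "roles/container.clusterAdmin",
    "roles/storage.admin",
    "roles/redis.admin",
]


def binding_commands(project_id, sa_name="aac-automation-sa"):
    """Return one gcloud command string per additional role."""
    # Assumption: the service account was created in the same project,
    # so its email follows the default <name>@<project>.iam.gserviceaccount.com form.
    member = f"serviceAccount:{sa_name}@{project_id}.iam.gserviceaccount.com"
    commands = []
    for role in ADDITIONAL_ROLES:
        args = [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}",
            f"--role={role}",
        ]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # "my-project-id" is a placeholder; substitute your own project ID.
    for cmd in binding_commands("my-project-id"):
        print(cmd)
```

Review the printed commands before running them; the script itself does not change any IAM policy.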
Step 2: Configure Subnet
Note
Designer Cloud shares a subnet configuration with Machine Learning, Auto Insights, and App Builder. If you are deploying more than one of those applications, you only need to configure the subnets once.
Auto Insights in a private data processing environment requires the following subnet groups:
aac-gke-node (required) - The GKE cluster uses this subnet to execute Alteryx software jobs (connectivity, conversion, processing, publishing).
aac-public (required) - This group doesn’t run any services, but is used by the aac-gke-node group for egress out of the cluster.
aac-private (required) - This group runs services private to the PDP.
Step 2a: Create Subnets in the VPC
Configure subnets in the aac_vpc VPC.
Create subnets according to the example below. You can adjust the subnet size and secondary subnet size to match your network architecture.
The address spaces are designed to accommodate a fully scaled-out data processing environment. You can choose a smaller address space if required, but you could run into scaling issues under heavy processing loads.
Important
The Subnet Name is not flexible; it must match the table below.
You may select any region from the Supported Regions list. However, you must use the same region for the Subnet Region now and when you reach the Trigger Provisioning step later.
| Subnet Name | Subnet Size | Secondary Subnet Name | Secondary Subnet Size | Comments |
|---|---|---|---|---|
| aac_gke_node | 10.0.0.0/22 | aac-gke-pod | 10.4.0.0/14 | GKE cluster, GKE pod, and GKE service subnets |
| | | aac-gke-service | 10.64.0.0/20 | |
| aac_public | 10.10.0.0/25 | N/A | N/A | Public egress |
| aac_private | 10.10.1.0/24 | N/A | N/A | Internal communication inside the VPC |
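If you adjust the subnet sizes, it can help to confirm that the primary and secondary ranges still do not overlap and to see how many addresses each one provides. The sketch below does this with Python's standard `ipaddress` module, using the example CIDR blocks from the table; the helper names `find_overlaps` and `address_counts` are illustrative only.

```python
import ipaddress
from itertools import combinations

# Example ranges from the table above; replace with your own if you resize them.
RANGES = {
    "aac_gke_node": "10.0.0.0/22",
    "aac-gke-pod": "10.4.0.0/14",
    "aac-gke-service": "10.64.0.0/20",
    "aac_public": "10.10.0.0/25",
    "aac_private": "10.10.1.0/24",
}


def find_overlaps(ranges):
    """Return the pairs of range names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


def address_counts(ranges):
    """Return the total number of addresses in each CIDR block."""
    return {
        name: ipaddress.ip_network(cidr).num_addresses
        for name, cidr in ranges.items()
    }


if __name__ == "__main__":
    print("Overlapping ranges:", find_overlaps(RANGES))
    for name, count in address_counts(RANGES).items():
        print(f"{name}: {count} addresses")
```

An empty overlap list means the ranges can coexist in the same VPC. The counts are raw block sizes and do not subtract the addresses that GCP reserves in each subnet.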
Step 2b: Subnet Route Tables
Note
You must configure the VPC in your project with a network connection to the internet.
The <gateway id> can be either a NAT gateway or an internet gateway, depending on your network architecture.
The route table below is shown as an example.
Address Prefix | Next Hop |
---|---|
/22 CIDR Block (aac-gke-node) /24 CIDR Block (aac-private) /25 CIDR Block (aac-public) 0.0.0.0/0 | <gateway id> |
Private Data Processing
Caution
Changing or removing any AAC-provisioned public cloud resources after Private Data Handling has been set up can cause inconsistencies. These inconsistencies may lead to errors during job execution or when deprovisioning the Private Data Handling setup.
Step 1: Trigger Auto Insights Deployment
Auto Insights provisioning is triggered from the Admin Console in AAC. You need Workspace Admin privileges within a workspace to see it.
From the AAC landing page, select the circle icon on the top right with your initials in it. Select Admin Console from the menu.
Select Private Data Handling from the left navigation menu.
Select the Auto Insights checkbox and then select Save.
Selecting Update triggers the deployment of the cluster and resources in the GCP project. This runs a set of validation checks to verify that the GCP project is configured correctly.
Once the initial validation checks complete, provisioning will commence. A message box on the screen will periodically refresh with status updates.
Note
The provisioning process takes approximately 35–40 minutes to complete.
After provisioning completes, you can view the created resources (for example, VM instances and node pools) through the Google Cloud console. Do not modify these resources yourself. Manual changes might cause private data processing to malfunction.
Step 2: Update IAM Role of Kubernetes Service Account
After Private Data Processing is created successfully, a Kubernetes service account, credential-pod-sa, is created so that the Kubernetes credential service can retrieve private data access credentials from the key vault.
Go to Key Management and select the key ring and key created in Step 5: Create Key Ring and Key.
Select PERMISSIONS and then GRANT ACCESS.
Provide New Principal
principal://iam.googleapis.com/projects/<project number>/locations/global/workloadIdentityPools/<project id>.svc.id.goog/subject/ns/credential/sa/credential-pod-sa
Note
Replace <project number> and <project id> with your project's number and ID.
For Role, assign the Cloud KMS CryptoKey Encrypter/Decrypter and Secret Manager Admin roles.
Select Save.
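To avoid typos when filling in the New Principal field, the principal string from the steps above can be assembled programmatically. The following is a minimal sketch; the function name `workload_identity_principal` and the sample project number and ID are hypothetical, while the namespace `credential` and the service account `credential-pod-sa` are taken from the principal shown above.

```python
def workload_identity_principal(
    project_number,
    project_id,
    namespace="credential",
    service_account="credential-pod-sa",
):
    """Build the workload identity principal for the Kubernetes service account."""
    number = str(project_number)
    # A GCP project number is numeric; reject a project ID passed by mistake.
    if not number.isdigit():
        raise ValueError("project_number must contain digits only")
    return (
        f"principal://iam.googleapis.com/projects/{number}"
        f"/locations/global/workloadIdentityPools/{project_id}.svc.id.goog"
        f"/subject/ns/{namespace}/sa/{service_account}"
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own project number and project ID.
    print(workload_identity_principal("123456789012", "my-project-id"))
```

Paste the printed value into the New Principal field when granting access on the key.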