Create a destination for Microsoft Azure + Databricks

Last updated: Oct 15, 2024
HEALTH TECH VENDOR
IMPLEMENTATION

For cloud connectivity with Redox, you decide which cloud provider and cloud product(s) to use. Then, you'll need to create a cloud destination in your Redox organization.

You'll need to perform some steps in your cloud product(s) and some in Redox. You can perform Redox setup in our dashboard or with the Redox Platform API.

Cloud products

This article is for this combination of cloud products:

  • Microsoft Azure (with Data Lake)
  • Databricks

Configure in Microsoft Azure

  1. Navigate to the Microsoft Azure dashboard and log in. Review Azure's quickstart guide to get started.
  2. Create an application through Microsoft Entra ID (formerly Azure Active Directory). Review Azure's help article. This is where you'll get the client ID and tenant ID, which you'll need for Redox setup later.
  3. Create a new secret for your application. This is where you'll get the client secret value, which you'll need for Redox setup later.
  4. Create a new storage account. Set the primary service to the Data Lake Storage option.
  5. Add a new container. You'll need the name of the container for Redox setup later.
  6. Assign the Storage Blob Data Owner role to the application you created in step #2.
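
Before moving on to Redox setup, it can help to confirm that the storage account and container names you plan to enter follow Azure's naming rules, since a typo here only surfaces later when data fails to land. The sketch below is one way to check them in a terminal. The function names and example values are hypothetical and are not part of Azure or Redox tooling.

```shell
# Sketch: sanity-check Azure resource names before entering them in Redox.
# Function names are hypothetical helpers; the rules follow Azure's naming rules.

# Storage account names: 3-24 characters, lowercase letters and digits only.
valid_storage_account_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{3,24}$'
}

# Container names: 3-63 characters; lowercase letters, digits, and hyphens;
# must start and end with a letter or digit; no consecutive hyphens.
valid_container_name() {
  case "$1" in
    *--*) return 1 ;;
  esac
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$'
}

# Example values (placeholders, not real resources).
valid_storage_account_name "redoxlake01" && echo "storage account name looks valid"
valid_container_name "redox-data" && echo "container name looks valid"
```

Along with these two names, keep the client ID, tenant ID, and client secret value from steps 2 and 3 at hand for the Redox setup.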

Create a cloud destination in Redox

Next, create a cloud destination in your Redox organization. This destination will be where your data is pushed to.

In the dashboard

  1. From the Product type field, select Databricks.
  2. For the configure destination step, populate these fields. Then click the Next button.
    1. Storage account name: Enter the name of the storage account you created in Azure. Locate this value in the Azure dashboard.
    2. Container name: Enter the name of the container you created in Azure. Locate this value in the Azure container configuration.
    3. File name prefix (optional): Enter any prefix you want prepended to new files when they're created in the Data Lake container. Add / to put the files in a subdirectory. For example, redox/ puts all the files in the redox directory.
  3. For the auth credential step, either a drop-down list of existing auth credentials displays or a new auth credential form opens. Select an existing auth credential or complete the form to create a new one. Learn how to create an auth credential for OAuth 2.0 2-legged.
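
To illustrate how the optional file name prefix affects where files land, here is a minimal sketch. It assumes the prefix is simply concatenated in front of the file name, as the field description suggests. The helper and the file name message.json are made up for illustration and are not the names Redox actually generates.

```shell
# Hypothetical helper: combine a file name prefix with a file name.
# Assumes plain concatenation; "message.json" is a placeholder file name.
object_path() {
  printf '%s%s\n' "$1" "$2"
}

object_path "redox/" "message.json"   # prints redox/message.json (inside the redox directory)
object_path "redox-" "message.json"   # prints redox-message.json (top level of the container)
```

Under that assumption, only a prefix ending in / places files in a subdirectory; any other prefix just becomes the start of the file name.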

With the Redox Platform API

  1. In your terminal, prepare the /v1/authcredentials request.
  2. Specify these values in the request.
    • Locate the clientId and clientSecret values in the Microsoft Azure dashboard.
      Example: Create auth credential for Azure + Databricks
      bash
      curl 'https://api.redoxengine.com/platform/v1/authcredentials' \
        --request POST \
        --header "Authorization: Bearer $API_TOKEN" \
        --header 'accept: application/json' \
        --header 'content-type: application/json' \
        --data '{
          "organization": "<Redox_organization_id>",
          "name": "<human_readable_name_for_auth_credential>",
          "environmentId": "<Redox_environment_ID>",
          "authStrategy": "OAuth_2.0_2-legged",
          "url": "https://login.microsoftonline.com/<tenant_ID_from_Azure>/oauth2/v2.0/token",
          "grantType": "client_credentials",
          "clientId": "<client_id_from_Azure>",
          "keyId": "<client_secret_from_Azure>",
          "scope": "https://storage.azure.com/.default"
        }'
  3. You should get a successful response with details for the new auth credential.
  4. In your terminal, prepare the /v1/environments/{environmentId}/destinations request.
  5. Specify these values in the request.
    • Set authCredential to the auth credential ID from the response you received in step #3.
    • Populate cloudProviderSettings with the settings below (adjust values based on the storage account and container setup in Azure configuration).
      • The fileNamePrefix is optional. If added, it gets prepended to the created file path in the Data Lake container, and it can include / to indicate a directory path.
        Example: Values for Azure + Databricks cloudProviderSettings
        json
        {
          "cloudProviderSettings": {
            "typeId": "azure",
            "productId": "databricks",
            "settings": {
              "storageAccountName": "<storage_account_name_from_Azure>",
              "containerName": "<container_name_from_Azure>",
              "fileNamePrefix": "<optional_file_name_prefix>"
            }
          }
        }
  6. You should get a successful response with details for the new destination for Microsoft Azure and Databricks.
  7. Your new destination will now be able to receive messages. We push data to the Data Lake storage account as a JSON file, which is ingested into Microsoft Azure.
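
The destination request body from steps 4 and 5 can be assembled in a terminal along these lines. This is a sketch under stated assumptions rather than a complete request: destination_body is a hypothetical helper, authCredential is assumed to sit at the top level of the body as a plain ID string, the request is assumed to be a POST, and a real request may need additional fields that this article doesn't list. The helper leaves out fileNamePrefix when no prefix is given, since that setting is optional.

```shell
# Hypothetical helper: render a destination request body as compact JSON.
# Arguments: auth credential ID, storage account name, container name,
# and an optional file name prefix.
# Assumes the values contain no characters that need JSON escaping.
destination_body() {
  settings=$(printf '"storageAccountName":"%s","containerName":"%s"' "$2" "$3")
  # Only include fileNamePrefix when a prefix was supplied.
  if [ -n "${4:-}" ]; then
    settings=$(printf '%s,"fileNamePrefix":"%s"' "$settings" "$4")
  fi
  printf '{"authCredential":"%s","cloudProviderSettings":{"typeId":"azure","productId":"databricks","settings":{%s}}}\n' "$1" "$settings"
}

# Example with placeholder values.
destination_body "<auth_credential_id>" "redoxlake01" "redox-data" "redox/"

# Possible usage (the POST method and variable names are assumptions):
# curl "https://api.redoxengine.com/platform/v1/environments/$ENVIRONMENT_ID/destinations" \
#   --request POST \
#   --header "Authorization: Bearer $API_TOKEN" \
#   --header 'content-type: application/json' \
#   --data "$(destination_body "$AUTH_CREDENTIAL_ID" "$STORAGE_ACCOUNT" "$CONTAINER" "$PREFIX")"
```

Note the double quotes around the Authorization header in the usage comment: the shell only expands $API_TOKEN inside double quotes, not single quotes.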