DNS Zone Setup Best Practices on Azure

Deep dive for using DNS with Cloudera Data Services on Azure

In Cloudera deployments on public cloud, DNS is one of the key configuration elements. Get it wrong and your deployment may become wholly unusable, with users unable to access the Cloudera data services. Even when DNS is merely set up less than ideally, connectivity and performance issues may arise. In this blog, we’ll take you through our tried and tested best practices for setting up DNS for use with Cloudera on Azure.

To give you a feel for the DNS dependencies, these are the Azure managed services used in a Cloudera deployment on Azure:

  • AKS cluster: data warehouse, data engineering, machine learning, and DataFlow
  • MySQL database: data engineering
  • Storage account: all services
  • Azure Database for PostgreSQL: data lake and data hub clusters
  • Key vault: all services

Typical customer governance restrictions and the impact

Most Azure users use private networks with a firewall for egress control, and most restrict the use of wildcard rules on that firewall. Cloudera resources are created on the fly, so their endpoints can often only be covered by wildcard rules, and those rules may be declined by the security team.

Most Azure users use a hub-spoke network topology. DNS servers are usually deployed in the hub virtual network or in an on-prem data center rather than in the Cloudera VNET. That means if DNS is not configured correctly, the deployment will fail.

Most Cloudera customers deploying on Azure allow the use of service endpoints; a smaller set of organizations do not. A service endpoint is the simpler way to let resources on a private network access managed services on Azure. If service endpoints are not allowed, the firewall and private endpoints are the other two options. Most cloud users do not like opening firewall rules because that introduces the risk of exposing private data on the internet. That leaves private endpoints as the only option, which in turn requires additional DNS configuration for the private endpoints.

Connectivity from private network to Azure managed services

Firewall to Internet

Traffic is routed from the firewall directly to the Azure managed service endpoint on the internet.

Service endpoint

Azure provides service endpoints for resources on private networks to access the managed services on the internet without going through the firewall. That can be configured at a subnet level. Since Cloudera resources are deployed in different subnets, this configuration must be enabled on all subnets.

The DNS records of managed services that use service endpoints are on the internet and managed by Microsoft. The IP address of the service is a public IP that is routable from the subnet. Please refer to the Microsoft documentation for details.

Not all managed services support service endpoints. In a Cloudera deployment scenario, only storage accounts, PostgreSQL DB, and Key Vault support service endpoints.

Fortunately, most users allow service endpoints. If a customer doesn’t allow service endpoints, they have to use private endpoints, which require DNS configuration similar to what is described in the following sections.
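Because service endpoints are enabled per subnet, it can help to check that none of the Cloudera subnets was missed. The following is a minimal sketch of such a check, not a Cloudera or Azure tool. The subnet names and the service labels ("storage", "postgresql", "key_vault") are placeholders chosen for illustration; in practice the input would come from your own inventory of the VNET.

```python
# Services that, according to the text above, support service endpoints
# in a Cloudera deployment. The labels are illustrative, not Azure identifiers.
REQUIRED_SERVICE_ENDPOINTS = {"storage", "postgresql", "key_vault"}


def missing_service_endpoints(subnets):
    """Map each subnet name to the required service endpoints it lacks.

    `subnets` maps a subnet name to the set of service endpoints enabled on it.
    Subnets that have everything enabled are left out of the result.
    """
    gaps = {}
    for name, enabled in subnets.items():
        missing = REQUIRED_SERVICE_ENDPOINTS - set(enabled)
        if missing:
            gaps[name] = sorted(missing)
    return gaps


# Hypothetical inventory of two Cloudera subnets.
example_subnets = {
    "cloudera-subnet-1": {"storage", "postgresql", "key_vault"},
    "cloudera-subnet-2": {"storage"},
}
print(missing_service_endpoints(example_subnets))
```

An empty result means every listed subnet has all three service endpoints enabled.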

Private Endpoint

Creating a private endpoint creates a network interface with a private IP address. A private link service is associated with that network interface, so that other resources in the private network can access the service through the private IP address.

The key here is that the private resources must be able to resolve the service’s FQDN to that private IP address. There are two options for storing the DNS record:

  • Azure managed public DNS zones: these always exist, but the type of IP address they store for a private endpoint depends on the service. For example: 
    • Storage account private endpoint: the public DNS zone stores the public IP address of that service.
    • AKS API server private endpoint: the public DNS zone stores the private IP of that service.
  • Azure private DNS zones: the DNS records are synchronized to the Azure Default DNS of each linked VNET. 

Private endpoints are available for all Azure managed services that are used in Cloudera deployments. 

As a consequence, storage accounts use either service endpoints or private endpoints. With private endpoints, the private DNS zone becomes a mandatory configuration, because the public DNS zone will always return a public IP. 

For AKS, these two DNS alternatives are both suitable. The challenges of private DNS zones will be discussed next.
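One quick way to see which of these cases applies is to resolve a service FQDN from a machine inside the VNET and check whether the answer is a private or a public address. The sketch below uses only the Python standard library. The function names are ours, and any FQDN passed to it (for example a storage account name) is whatever applies in your environment.

```python
import ipaddress
import socket


def is_private_address(ip):
    """Return True if `ip` lies in a private (non-internet-routable) range."""
    return ipaddress.ip_address(ip).is_private


def resolve_and_classify(fqdn, port=443):
    """Resolve `fqdn` with the host's configured DNS and label each address.

    Returns a dict such as {"10.20.0.4": "private"}. Raises socket.gaierror
    if the name cannot be resolved at all.
    """
    infos = socket.getaddrinfo(fqdn, port, proto=socket.IPPROTO_TCP)
    addresses = {info[4][0] for info in infos}
    return {
        ip: "private" if is_private_address(ip) else "public"
        for ip in sorted(addresses)
    }
```

If a storage account FQDN resolves to a public address from inside the Cloudera VNET even though a private endpoint exists, the query is most likely being answered by the public DNS zone rather than by the private DNS zone.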

Challenges of private DNS zone on Azure private network

Important Assumptions

As mentioned above for the typical scenario, most Azure users are using a hub-and-spoke network architecture, and deploy custom private DNS on hub VNET.

The DNS records of a private DNS zone are synchronized only to the Azure default DNS of the VNETs linked to that zone. 

Simple Architecture Use Cases

One VNET scenario with private DNS zone:

When a private endpoint is created, Cloudera on Azure will register the private endpoint in the private DNS zone. The DNS record will be synchronized to the Azure Default DNS of the linked VNET. 

If users use custom private DNS, they can configure a conditional forwarder to the Azure Default DNS for the domain suffix of the FQDN.

Hub-and-spoke VNET with Azure default DNS:

A hub-spoke VNET with Azure default DNS is still acceptable. The only problem is that resources on an unlinked VNET will not be able to access AKS. But since the AKS cluster is used by Cloudera, that does not pose any major issues.

The Challenge Part

The most popular network architecture among Azure consumers is a hub-spoke network with custom private DNS servers deployed either on the hub VNET or on the on-premises network. 

Because the Cloudera VNET uses the custom private DNS server on the hub VNET, the Cloudera resources on the Cloudera VNET send their DNS queries for the FQDN of the private endpoint to that server. But the DNS records are not synchronized to the Azure Default DNS of the hub VNET, so the custom private DNS server cannot find the DNS record for the private endpoint, and the provisioning will fail.

With the DNS server deployed in the on-prem network, there is no Azure default DNS associated with the on-prem network at all, so the DNS server cannot find the DNS record for the FQDN of the private endpoint either.
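The failure mode described above can be reproduced with a small toy model. This is a simplified sketch of the behavior as this article describes it, not of Azure's actual implementation: a private DNS zone is visible only through the Azure default DNS of the VNETs linked to it, and a custom DNS server can only ask the Azure default DNS of the network it sits in. All zone names, VNET names, and IP addresses are made up.

```python
from dataclasses import dataclass, field


@dataclass
class PrivateDnsZone:
    name: str
    records: dict = field(default_factory=dict)     # FQDN -> private IP
    linked_vnets: set = field(default_factory=set)  # VNETs linked to the zone


def azure_default_dns(vnet, fqdn, zones):
    """Answer as the Azure default DNS of `vnet`: only linked zones are visible."""
    if vnet is None:  # on-premises network: no Azure default DNS exists
        return None
    for zone in zones:
        if vnet in zone.linked_vnets and fqdn in zone.records:
            return zone.records[fqdn]
    return None


def resolve_via_custom_dns(server_location, fqdn, zones):
    """A custom DNS server asks the Azure default DNS of the network it sits in."""
    return azure_default_dns(server_location, fqdn, zones)


# Hypothetical zone created for a private endpoint, linked to the Cloudera VNET only.
fqdn = "api.privatelink.example.internal"
zone = PrivateDnsZone(
    name="privatelink.example.internal",
    records={fqdn: "10.20.0.4"},
    linked_vnets={"cloudera-vnet"},
)

print(resolve_via_custom_dns("hub-vnet", fqdn, [zone]))  # None: hub VNET is not linked
print(resolve_via_custom_dns(None, fqdn, [zone]))        # None: DNS server is on-prem

zone.linked_vnets.add("hub-vnet")
print(resolve_via_custom_dns("hub-vnet", fqdn, [zone]))  # 10.20.0.4
```

The last call shows why linking the hub VNET to the zone helps when the DNS server is in the hub, and the on-prem call shows why linking cannot help when the DNS server sits outside Azure.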

Configuration best practices

Against this background, there are four options to consider.

Option 1: Disable Private DNS Zone

Use Azure managed public DNS zone instead of a private DNS zone. 

  • For data warehouse: create data warehouses through the Cloudera command line interface with the parameter “privateDNSZoneAKS” set to “None”.

  • For Liftie-based data services: the entitlement “LIFTIE_AKS_DISABLE_PRIVATE_DNS_ZONE” must be set. Customers can request this entitlement either through a JIRA ticket or by having their Cloudera solution engineer make the request on their behalf.

The sole drawback of this option is that it does not apply to data engineering, since that data service will create and use a MySQL private DNS zone on the fly. There is at present no option to disable private DNS zones for data engineering.

Option 2: Pre-create Private DNS Zones

Pre-create private DNS zones and link both Cloudera and hub VNETs to them. 

The advantage of this approach is that both data warehouse and Liftie-based data services support pre-created private DNS zones. There are, however, a few drawbacks:

  • For Liftie, the private DNS zone needs to be configured when registering the environment. Once past the environment registration stage, it cannot be configured. 
  • Data engineering needs a private DNS zone for MySQL, and it doesn’t support pre-configured private DNS zones.
  • On-premises networks can’t be linked to a private DNS zone. If the DNS server is on an on-prem network, this option does not provide a workable solution.

Option 3: Create DNS Server as a Forwarder

Create a couple of DNS servers (for high availability) behind a load balancer in the Cloudera VNET, and configure them to conditionally forward to the Azure Default DNS of the Cloudera VNET. Then configure a conditional forwarder on the company’s custom private DNS server that points to the DNS servers in the Cloudera VNET.

The drawback of this option is that additional DNS servers are required, which leads to additional administration overhead for the DNS team.
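The two-hop forwarding in this option can be illustrated with a small sketch of how a conditional forwarder picks its target: the longest matching domain suffix wins, and everything else goes to the server's normal upstream. This is an illustration of the rule, not the configuration syntax of any particular DNS server. The addresses 10.20.1.10 (forwarder in the Cloudera VNET) and 10.0.0.2 (corporate upstream) are hypothetical, and the storage suffix is just one example of a private endpoint domain. 168.63.129.16 is the Azure-provided DNS address, which is reachable only from inside a VNET.

```python
AZURE_DEFAULT_DNS = "168.63.129.16"  # Azure-provided DNS, reachable only from inside a VNET


def pick_forwarder(fqdn, rules, default):
    """Return the forwarder for the longest matching domain suffix, else `default`.

    `rules` maps a domain suffix to the IP address queries for it are forwarded to.
    """
    name = fqdn.rstrip(".").lower()
    best_suffix = ""
    target = default
    for suffix, forwarder in rules.items():
        suffix = suffix.strip(".").lower()
        matches = name == suffix or name.endswith("." + suffix)
        if matches and len(suffix) > len(best_suffix):
            best_suffix, target = suffix, forwarder
    return target


# Hop 1: rules on the company's custom private DNS server (hypothetical addresses).
corporate_rules = {"privatelink.blob.core.windows.net": "10.20.1.10"}
# Hop 2: rules on the forwarder inside the Cloudera VNET.
cloudera_forwarder_rules = {"privatelink.blob.core.windows.net": AZURE_DEFAULT_DNS}

name = "myaccount.privatelink.blob.core.windows.net"  # hypothetical account name
hop1 = pick_forwarder(name, corporate_rules, default="10.0.0.2")
hop2 = pick_forwarder(name, cloudera_forwarder_rules, default="10.0.0.2")
print(hop1, "->", hop2)  # 10.20.1.10 -> 168.63.129.16
```

The second hop is what makes the private DNS zone visible: the query reaches the Azure Default DNS from inside the Cloudera VNET, which is linked to the zone.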

Option 4: Azure-Managed DNS Resolver

Create a dedicated /28 subnet in the Cloudera VNET for the inbound endpoint of an Azure DNS Private Resolver. Configure a conditional forwarder from the custom private DNS to the inbound endpoint of the resolver.
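When planning the dedicated /28, it can be handy to find a free block in the VNET address space before creating anything. The following sketch uses Python's standard `ipaddress` module. The VNET range and the already-used subnets are hypothetical, and the figure of five reserved addresses per subnet reflects Azure's documented behavior at the time of writing.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # addresses Azure reserves in every subnet


def carve_resolver_subnet(vnet_cidr, used_cidrs, prefix=28):
    """Return the first /`prefix` block in the VNET that overlaps no used subnet."""
    vnet = ipaddress.ip_network(vnet_cidr)
    used = [ipaddress.ip_network(cidr) for cidr in used_cidrs]
    for candidate in vnet.subnets(new_prefix=prefix):
        if not any(candidate.overlaps(existing) for existing in used):
            return candidate
    raise ValueError(f"no free /{prefix} left in {vnet_cidr}")


# Hypothetical Cloudera VNET with two subnets already allocated.
subnet = carve_resolver_subnet("10.20.0.0/16", ["10.20.0.0/24", "10.20.1.0/24"])
print(subnet)                                            # 10.20.2.0/28
print(subnet.num_addresses - AZURE_RESERVED_PER_SUBNET)  # 11
```

A /28 holds 16 addresses, of which 11 remain once Azure's reserved addresses are subtracted.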

Summary

Bringing all things together, consider these best practices for setting up your DNS with Cloudera on Azure:

  • For the storage account, key vault, and PostgreSQL DB
    • Use service endpoints as the first choice.
    • If service endpoints are not allowed, pre-create private DNS zones and link them to the VNET where the DNS server is deployed. Configure conditional forwarders from the custom private DNS to the Azure default DNS.
    • If the custom private DNS is deployed in the on-premises network, use the Azure DNS resolver or another DNS server as a DNS forwarder on the Cloudera VNET. Conditionally forward DNS lookups from the private DNS to the resolver endpoint.
  • For the data warehouse, DataFlow, or machine learning data services
    • Disable the private DNS zone and use the public DNS zone instead. 
  • For the data engineering data service
    • Configure the Azure DNS resolver or another DNS server as a DNS forwarder on the Cloudera VNET. Conditionally forward DNS lookups from the private DNS to the resolver endpoint. Please refer to the Microsoft documentation for the details of setting up an Azure DNS Private Resolver.
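The summary above can also be read as a small decision table. The sketch below encodes it as a function purely for illustration. The service labels and the returned strings are our own shorthand, not Cloudera or Azure identifiers, and real environments may have constraints this does not capture.

```python
MANAGED_SERVICES = {"storage_account", "key_vault", "postgresql"}
AKS_DATA_SERVICES = {"data_warehouse", "dataflow", "machine_learning"}


def recommend_dns_setup(service, service_endpoints_allowed=True, dns_on_prem=False):
    """Return this article's recommendation for one service as a short label."""
    if service in MANAGED_SERVICES:
        if service_endpoints_allowed:
            return "service_endpoint"
        if dns_on_prem:
            return "resolver_or_forwarder_in_cloudera_vnet"
        return "precreated_private_dns_zone_linked_to_dns_vnet"
    if service in AKS_DATA_SERVICES:
        return "disable_private_dns_zone_use_public_zone"
    if service == "data_engineering":
        return "resolver_or_forwarder_in_cloudera_vnet"
    raise ValueError(f"unknown service: {service!r}")


print(recommend_dns_setup("storage_account"))
print(recommend_dns_setup("key_vault", service_endpoints_allowed=False, dns_on_prem=True))
print(recommend_dns_setup("data_engineering"))
```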

For more background reading on network and DNS specifics for Azure, have a look at our documentation for the various data services: DataFlow, Data Engineering, Data Warehouse, and Machine Learning. We’re also happy to discuss your specific needs; in that case please reach out to your Cloudera account manager or get in touch.

Dongkai Yu
Solutions Engineer

