From Cloud-First to Cloud-Smart to Repatriation

From Cloud-First to Cloud-Smart to Repatriation

VMware Explore 2024 happened this week in Las Vegas. I think many people were curious about what Hock Tan, CEO of Broadcom, had to say during the general session. He delivered interesting statements and let everyone in the audience know that “the future of enterprise is private – private cloud, private AI, fueled by your own private data“. On social media, the following slide about “repatriation” made quite some noise:

VMware Explore 2024 Keynote Repatriation

The information on this slide came from Barcley’s CIO Survey in April 2024 and it says that 8 out of 10 CIOs today are planning to move workloads from the public cloud back to their on-premises data centers. It is interesting, and in some cases even funny, that other vendors in the hardware and virtualization business are chasing this ambulance now. Cloud migrations are dead, let us do reverse cloud migrations now. Hybrid cloud is dead, let us do hybrid multi-clouds now and provide workload mobility. My social media walls are full of such postings now. It seems Hock Tan presented the Holy Grail to the world.

Where is this change of mind from? Why did only 43% during COVID-19 plan a reverse cloud migration and now “suddenly” more than 80%?

I could tell you the story now about cloud-first not being cool anymore, that organizations started to follow a smarter cloud approach, and then concluded that cloud migrations are still not happening based on their expectations (e.g., costs and complexity). And that it is time now to bring workloads back on-premises. It is not that simple.

I looked at Barclay’s CIO survey and the chart (figure 20 in the survey) that served as a source for Hock Tan’s slide:

Barclays CIO Survey April 2024 Cloud RepatriationWe must be very careful with our interpretation of the results. Just because someone is “planning” a reverse cloud migration, does it mean they are executing? And if they execute such an exercise, is this going to be correctly reflected in a future survey?

And which are the workloads and services that are brought back to an enterprise’s data center? Are we talking about complete applications? Or is it more about load balancers, security appliances, databases and storage, and specific virtual machines? And if we understand the workloads, what are the real reasons to bring them back? Figure 22 of the survey shows “Workloads that Respondents Intend to Move Back to Private Cloud / On-Premise from Public Cloud”:

Barclays CIO Survey April 2024 Workload to migrate

Okay, we have a little bit more context now. Just because some workloads are potentially migrated back to private clouds, what does it mean for public cloud vs. private cloud spend? Question #11 of the survey “What percentage of your workloads and what percentage of your total IT spend are going towards the public cloud, and how have those evolved over time?” focuses on this matter.

Barclays CIO Survey April 2024 Percentage of Workloads and Spend My interpretation? Just because one slide or illustration talks about repatriation does not mean, that the entire world is just doing reverse migrations now. Cloud migrations and reverse cloud migrations can happen at the same time. You could bring one application or some databases back on-premises but decide to move all your virtual desktops to the public cloud in parallel. We could still bring workloads back to our data center and increase public cloud spend. 

Sounds like cloud-smart again, doesn’t it? Maybe I am an organization that realized that the applications A, B, C, and D shouldn’t run in Azure, AWS, Google, and Oracle anymore, but the applications W, X, Y, and Z are better suited for these hyperscalers.

What else?

I am writing about my views and my opinions here. There is more to share. During the pandemic, everything had to happen very quickly, and everyone suddenly had money to speed up migrations and application modernization projects. After that, I think it is a natural thing that everything was slowing down a bit after this difficult and exhausting phase.

Some of the IT teams are probably still documenting all their changes and new deployments on an internal wiki, and their bosses started to hire FinOps specialists to analyze their cloud spend. It is no shocking surprise to me that some of the financial goals haven’t been met and result in a reverse cloud migration a few years later.

But that is not all. Try to think about the past years. What else happened?

Yes, we almost forgot about Artificial Intelligence (AI) and Sovereign Clouds.

Before 2020, not many of us were thinking about sovereign clouds, data privacy, and AI.

Most enterprises are still hosting their data on-premises behind their own firewall. And some of this data is used to train or finetune models. We see (internal) chatbots popping up using Retrieval Augmented Generation (RAG), which delivers answers based on actual data and proprietary information.

Okay. What else? 

Yep, there is more. There are new technologies and offerings available that were not here before. We just covered AI and ML (machine learning) workloads that became a potential cost or compliance concern.

The concept of sovereign clouds has gained traction due to increasing concerns about data sovereignty and compliance with local regulations.

The adoption of hybrid and hybrid multi-cloud strategies has been a significant trend from 2020 to 2024. Think about VMware’s Cloud Foundation approach with Azure, Google, Oracle etc., AWS Outposts, Azure Stack, Oracle’s DRCC, or Nutanix’s.

Enterprises started to upskill and train their people to deliver their own Kubernetes platforms.

Edge computing has emerged as a crucial technology, particularly for industries like manufacturing, telecommunications, and healthcare, where real-time data processing is critical.

Conclusion

Reverse cloud migrations are happening for many different reasons like cost management, performance optimization, data security and compliance, automation and operations, or because of lock-in concerns.

Yes, (cloud) repatriation became prominent, but I think this is just a reflection of the maturing cloud market – and not an ambulance.

And no, it is not a better moment to position your hybrid multi-cloud solutions, unless you understand the services and workloads that need to be migrated from one cloud to another. Just because some CIOs plan to bring back some workloads on-premises, does it mean/imply that they will do it? What about the sunk cost fallacy?

Perhaps IT leaders are going to be more careful in the future and are trying to find other ways for potential cost savings and strategic benefits to achieve their business outcomes – and keep their workloads in the cloud versus repatriating them.

Businesses are adopting a more nuanced workload-centric strategy.

What’s your opinion?

Distributed Hybrid Infrastructure Offerings Are The New Multi-Cloud

Distributed Hybrid Infrastructure Offerings Are The New Multi-Cloud

Since VMware belongs to Broadcom, there was less focus and messaging on multi-cloud or supercloud architectures. Broadcom has drastically changed the available offerings and VMware Cloud Foundation is becoming the new vSphere. Additionally, we have seen big changes regarding the partnerships with hyperscalers (the Azures and AWSes of this world) and the VMware Cloud partners and providers. So, what happened to multi-cloud and how come that nobody (at Broadcom) talks about it anymore?

What is going on?

I do not know if it’s only me, but I do not see the term “multi-cloud” that often anymore. Do you? My LinkedIn feed is full of news about artificial intelligence (AI) and how Nvidia employees got rich. So, I have to admit that I lost track of hybrid clouds, multi-clouds, or hybrid multi-cloud architectures. 

Cloud-Inspired and Cloud-Native Private Clouds

It seems to me that the initial idea of multi-cloud has changed in the meantime and that private clouds are becoming platforms with features. Let me explain.

Organizations have built monolithic private clouds in their data centers for a long time. In software engineering, the word “monolithic” describes an application that consists of multiple components, which form something larger. To build data centers, we followed the same approach by using different components like compute, storage, and networking. And over time, IT teams started to think about automation and security, and the integration of different solutions from different vendors.

The VMware messaging was always pointing in the right direction: They want to provide a cloud operating system for any hardware and any cloud (by using VMware Cloud Foundation). On top of that, build abstraction layers and leverage a unified control plane (aka consistent automation and operations).

And I told all my customers since 2020 that they need to think like a cloud service provider, get rid of silos, implement new processes, and define a new operating model. That is VMware by Broadcom’s messaging today and this is where they and other vendors are headed: a platform with features that provide cloud services.

In other words, and this is my opinion, VMware Cloud Foundation is today a platform with different components like vSphere, vSAN, NSX, Aria, and so on. Tomorrow, it is still called VMware Cloud Foundation, a platform that includes compute, storage, networking, automation, operations, and other features. No more other product names, just capabilities, and services like IaaS, CaaS, DRaaS or DBaaS. You just choose the specs of the underlying hardware and networking, deploy your private clouds, and then start to build and consume your services.

Replace the name “VMware Cloud Foundation” in the last paragraph with AWS Outposts or Azure Stack. Do you see it now? Distributed unmanaged and managed hybrid cloud offerings with a (service) consumption interface on top.

That is the shift from monolithic data centers to cloud-native private clouds.

From Intercloud to Multi-Cloud

It is not the first time that I write about interclouds, that not many of us know. In 2012, there was this idea that different clouds and vendors need to be interoperable and agree on certain standards and protocols. Think about interconnected private and public clouds, which allow you to provide VM mobility or application portability. Can you see the picture in front of you? What is the difference today in 2024?

In 2023, I truly believed that VMware figured it out when they announced VMware Cloud on Equinix Metal (VMC-E). To me, VMC-E was different and special because of Equinix, who is capable of interconnecting different clouds, and at the same time could provide a baremetal-as-a-service (BMaaS) offering.

Workload Mobility and Application Portability

Almost 2 years ago, I started to write a book about this topic, because I wanted to figure out if workload mobility and application portability are things, that enterprises are really looking for. I interviewed many CIOs, CTOs, chief architects and engineers around the globe, and it became VERY clear: it seems nobody was changing anything to make app portability a design requirement.

Almost all of the people I have spoken to, told me, that a lot of things must happen that could trigger a cloud-exit and therefore they see this as a nice-to-have capability that helps them to move virtual machines or applications faster from one cloud to another.

VMware Workload Mobility

And I have also been told that a lift & shift approach is not providing any value to almost all of them.

But when I talked to developers and operations teams, the answers changed. Most of them did not know that a vendor could provide mobility or portability. Anyway, what has changed now?

Interconnected Multi-Clouds and Distributed Hybrid Clouds

I mentioned it already before. Some vendors have realized that they need to deliver a unified and integrated programmable platform with a control plane. Ideally, this control plane can be used on-premises, as a SaaS solution, or both. And according to Gartner, these are the leaders in this area (Magic Quadrant for Distributed Hybrid Infrastructure):

Gartner Magic-Quadrant-for-Distributed-Hybrid-Infrastructure

In my opinion, VMware and Nutanix are providing a hybrid multi-cloud approach.

AWS and Microsoft are providing hybrid cloud solutions. In Microsoft’s case, we see Azure Stack HCI, Azure Kubernetes Service (AKS incl. Hybrid AKS) and Azure Arc extending Microsoft’s Azure services to on-premises data centers and edge locations.

The only vendor, that currently offers true multi-cloud capabilities, is Oracle. Oracle has Dedicated Region Cloud@Customer (DRCC) and Roving Edge, but also partnerships with Microsoft and Google that allow customers to host Oracle databases in Azure and Google Cloud data centers. Both partnerships come with a cross-cloud interconnection.

That is one of the big differences and changes for me at the moment. Multi-cloud has become less about mobility or portability, a single global control plane, or the same Kubernetes distribution in all the clouds, but more about bringing different services from different cloud providers closer together.

This is the image I created for the VMC-E blog. Replace the words “AWS” and “Equinix” with “Oracle” and suddenly you have something that was not there before, an interconnected multi-cloud.

What’s Next?

Based on the conversations with my customers, it does not feel that public cloud migrations are happening faster than in 2020 or 2022 and we still see between 70 and 80% of the workloads hosted on-premises. While we see customers who are interested in a cloud-first approach, we see many following a hybrid multi-cloud and/or multi-cloud approach. It is still about putting the right applications in the right cloud based on the right decisions. This has not changed.

But the narrative of such conversations has changed. We will see more conversations about data residency, privacy, security, gravity, proximity, and regulatory requirements. Then there are sovereign clouds.

Lastly, enterprises are going to deploy new platforms for AI-based workloads. But that could still take a while.

Final Thoughts

As enterprises continue to navigate the above mentioned complexities, the need for flexible, scalable, and secure infrastructure solutions will only grow. There are a few compelling solutions that bridge the gap between traditional on-premises systems and modern cloud environments.

And since most enterprises are still hosting their workloads on-premises, they have to decide if they want to stretch the private cloud to the public cloud, or the other way around. Both options can co-exist, but would make it too big and too complex. What’s your conclusion?

AZ-104 Study Guide – Azure Storage

AZ-104 Study Guide – Azure Storage

 

If you are looking for the full AZ-104 study guide: https://www.cloud13.ch/2023/10/31/az-104-study-guide-microsoft-azure-administrator/ 

It is clear to me that networking is probably the most complex topic in Azure. The concept is very different from the on-premises world,  you have so many options and a lot of topics to understand. Let us focus on Azure storage as the next topic. As always, I will follow John Savill’s guidance and look for the documentation online.

Storage Accounts

An Azure storage account contains all of your Azure Storage data objects: blobs, files, queues, and tables. The storage account provides a unique namespace for your Azure Storage data that’s accessible from anywhere in the world over HTTP or HTTPS. Data in your storage account is durable and highly available, secure, and massively scalable.

When naming your storage account, keep these rules in mind:

  • Storage account names must be between 3 and 24 characters in length and may contain numbers and lowercase letters only.
  • Your storage account name must be unique within Azure. No two storage accounts can have the same name.

Azure Storage Redundancy

Data in an Azure Storage account is always replicated three times in the primary region. Azure Storage offers two options for how your data is replicated in the primary region:

  • Locally redundant storage (LRS) copies your data synchronously three times within a single physical location in the primary region. LRS is the least expensive replication option, but isn’t recommended for applications requiring high availability or durability.Diagram showing how data is replicated in a single data center with LRS
    • Zone-redundant storage (ZRS) copies your data synchronously across three Azure availability zones in the primary region. For applications requiring high availability, Microsoft recommends using ZRS in the primary region, and also replicating to a secondary region.

    Diagram showing how data is replicated in the primary region with ZRS

    Redundancy in a secondary region

    For applications requiring high durability, you can choose to additionally copy the data in your storage account to a secondary region that is hundreds of miles away from the primary region. If your storage account is copied to a secondary region, then your data is durable even in the case of a complete regional outage or a disaster in which the primary region isn’t recoverable.

    Azure Storage offers two options for copying your data to a secondary region:

    • Geo-redundant storage (GRS) copies your data synchronously three times within a single physical location in the primary region using LRS. It then copies your data asynchronously to a single physical location in the secondary region. Within the secondary region, your data is copied synchronously three times using LRS.

    Diagram showing how data is replicated with GRS or RA-GRS

    • Geo-zone-redundant storage (GZRS) copies your data synchronously across three Azure availability zones in the primary region using ZRS. It then copies your data asynchronously to a single physical location in the secondary region. Within the secondary region, your data is copied synchronously three times using LRS.

    Diagram showing how data is replicated with GZRS or RA-GZRS

    Geo-redundant storage (with GRS or GZRS) replicates your data to another physical location in the secondary region to protect against regional outages. With an account configured for GRS or GZRS, data in the secondary region is not directly accessible to users or applications, unless a failover occurs. The failover process updates the DNS entry provided by Azure Storage so that the secondary endpoint becomes the new primary endpoint for your storage account. During the failover process, your data is inaccessible. After the failover is complete, you can read and write data to the new primary region.

    The following table describes key parameters for each redundancy option:

    Parameter LRS ZRS GRS/RA-GRS GZRS/RA-GZRS
    Percent durability of objects over a given year at least 99.999999999% (11 9’s) at least 99.9999999999% (12 9’s) at least 99.99999999999999% (16 9’s) at least 99.99999999999999% (16 9’s)
    Availability for read requests At least 99.9% (99% for cool or archive access tiers) At least 99.9% (99% for cool access tier)

    At least 99.9% (99% for cool or archive access tiers) for GRS

    At least 99.99% (99.9% for cool or archive access tiers) for RA-GRS

    At least 99.9% (99% for cool access tier) for GZRS

    At least 99.99% (99.9% for cool access tier) for RA-GZRS

    Availability for write requests At least 99.9% (99% for cool or archive access tiers) At least 99.9% (99% for cool access tier) At least 99.9% (99% for cool or archive access tiers) At least 99.9% (99% for cool access tier)
    Number of copies of data maintained on separate nodes Three copies within a single region Three copies across separate availability zones within a single region Six copies total, including three in the primary region and three in the secondary region Six copies total, including three across separate availability zones in the primary region and three locally redundant copies in the secondary region

    Azure Blobs

    Azure Storage offers three types of blob storage:

    • Block Blobs. Block blobs are composed of blocks and are ideal for storing text or binary files, and for uploading large files efficiently.
    • Append Blobs. Append blobs are also made up of blocks, but they are optimized for append operations, making them ideal for logging scenarios.
    • Page blobs. Page blobs are made up of 512-byte pages up to 8 TB in total size and are designed for frequent random read/write operations. Page blobs are the foundation of Azure IaaS Disks

    Overview of Azure page blobs

    Page blobs are a collection of 512-byte pages, which provide the ability to read/write arbitrary ranges of bytes. Hence, page blobs are ideal for storing index-based and sparse data structures like OS and data disks for Virtual Machines and Databases. For example, Azure SQL DB uses page blobs as the underlying persistent storage for its databases. Moreover, page blobs are also often used for files with Range-Based updates.

    Key features of Azure page blobs are its REST interface, the durability of the underlying storage, and the seamless migration capabilities to Azure. These features are discussed in more detail in the next section. In addition, Azure page blobs are currently supported on two types of storage: Premium Storage and Standard Storage. Premium Storage is designed specifically for workloads requiring consistent high performance and low latency making premium page blobs ideal for high performance storage scenarios. Standard storage accounts are more cost effective for running latency-insensitive workloads.

    Azure page blobs are the backbone of the virtual disks platform for Azure IaaS. Both Azure OS and data disks are implemented as virtual disks where data is durably persisted in the Azure Storage platform and then delivered to the virtual machines for maximum performance. Azure Disks are persisted in Hyper-V VHD format and stored as a page blob in Azure Storage. In addition to using virtual disks for Azure IaaS VMs, page blobs also enable PaaS and DBaaS scenarios such as Azure SQL DB service, which currently uses page blobs for storing SQL data, enabling fast random read-write operations for the database. Another example would be if you have a PaaS service for shared media access for collaborative video editing applications, page blobs enable fast access to random locations in the media. It also enables fast and efficient editing and merging of the same media by multiple users.

    The following visual illustrates the guidelines to choose the various Azure data transfer tools depending upon the network bandwidth available for transfer, data size intended for transfer, and frequency of the transfer.

    Azure data transfer tools

    Premium block blob storage accounts

    Premium block blob storage accounts make data available via high-performance hardware. Data is stored on solid-state drives (SSDs) which are optimized for low latency. SSDs provide higher throughput compared to traditional hard drives. File transfer is much faster because data is stored on instantly accessible memory chips. All parts of a drive accessible at once. By contrast, the performance of a hard disk drive (HDD) depends on the proximity of data to the read/write heads.

    Access tiers for blob data

    Data stored in the cloud grows at an exponential pace. To manage costs for your expanding storage needs, it can be helpful to organize your data based on how frequently it will be accessed and how long it will be retained. Azure storage offers different access tiers so that you can store your blob data in the most cost-effective manner based on how it’s being used. Azure Storage access tiers include:

    • Hot tier – An online tier optimized for storing data that is accessed or modified frequently. The hot tier has the highest storage costs, but the lowest access costs.
    • Cool tier – An online tier optimized for storing data that is infrequently accessed or modified. Data in the cool tier should be stored for a minimum of 30 days. The cool tier has lower storage costs and higher access costs compared to the hot tier.
    • Cold tier – An online tier optimized for storing data that is rarely accessed or modified, but still requires fast retrieval. Data in the cold tier should be stored for a minimum of 90 days. The cold tier has lower storage costs and higher access costs compared to the cool tier.
    • Archive tier – An offline tier optimized for storing data that is rarely accessed, and that has flexible latency requirements, on the order of hours. Data in the archive tier should be stored for a minimum of 180 days.

    Object replication for block blobs

    Object replication asynchronously copies block blobs between a source storage account and a destination account. Some scenarios supported by object replication include:

    • Minimizing latency. Object replication can reduce latency for read requests by enabling clients to consume data from a region that is in closer physical proximity.
    • Increase efficiency for compute workloads. With object replication, compute workloads can process the same sets of block blobs in different regions.
    • Optimizing data distribution. You can process or analyze data in a single location and then replicate just the results to additional regions.
    • Optimizing costs. After your data has been replicated, you can reduce costs by moving it to the archive tier using life cycle management policies.

    Diagram showing how object replication works

    Append Blobs

    An append blob is composed of blocks and is optimized for append operations. When you modify an append blob, blocks are added to the end of the blob only, via the Append Block operation. Updating or deleting of existing blocks is not supported. Unlike a block blob, an append blob does not expose its block IDs.

    Each block in an append blob can be a different size, up to a maximum of 4 MiB, and an append blob can include up to 50,000 blocks. The maximum size of an append blob is therefore slightly more than 195 GiB (4 MiB X 50,000 blocks).

    Azure Files

    Azure Files offers fully managed file shares in the cloud that are accessible via the industry standard Server Message Block (SMB) protocol, Network File System (NFS) protocol, and Azure Files REST API. Azure file shares can be mounted concurrently by cloud or on-premises deployments.

    SMB Azure file shares are accessible from Windows, Linux, and macOS clients. NFS Azure file shares are accessible from Linux clients. Additionally, SMB Azure file shares can be cached on Windows servers with Azure File Sync for fast access near where the data is being used.

    Active Directory as Authentication Source

    On-premises Active Directory Domain Services (AD DS) integration with Azure Files provides the methods for storing directory data while making it available to network users and administrators. Security is integrated with AD DS through logon authentication and access control to objects in the directory. With a single network logon, administrators can manage directory data and organization throughout their network, and authorized network users can access resources anywhere on the network. AD DS is commonly adopted by enterprises in on-premises environments or on cloud-hosted VMs, and AD DS credentials are used for access control. 

    Files AD workflow diagram

    Azure File Sync

    Azure File Sync enables centralizing your organization’s file shares in Azure Files, while keeping the flexibility, performance, and compatibility of a Windows file server. While some users may opt to keep a full copy of their data locally, Azure File Sync additionally can transform Windows Server into a quick cache of your Azure file share. You can use any protocol that’s available on Windows Server to access your data locally, including SMB, NFS, and FTPS. You can have as many caches as you need across the world.

    An Azure hybrid file services topology diagram.

    Azure Queue Storage

    Azure Queue Storage is a service for storing large numbers of messages. You access messages from anywhere in the world via authenticated calls using HTTP or HTTPS. A queue message can be up to 64 KB in size. A queue may contain millions of messages, up to the total capacity limit of a storage account.

    Azure Table Storage

    Azure Table storage stores large amounts of structured data. The service is a NoSQL datastore which accepts authenticated calls from inside and outside the Azure cloud. Azure tables are ideal for storing structured, non-relational data. Common uses of Table storage include:

    • Storing TBs of structured data capable of serving web scale applications
    • Storing datasets that don’t require complex joins, foreign keys, or stored procedures and can be denormalized for fast access
    • Quickly querying data using a clustered index
    • Accessing data using the OData protocol and LINQ queries with WCF Data Service .NET Libraries

    You can use Table storage to store and query huge sets of structured, non-relational data, and your tables will scale as demand increases.

    Tables storage component diagram

     

    Azure Managed Disks

    Azure managed disks are block-level storage volumes that are managed by Azure and used with Azure Virtual Machines. Managed disks are like a physical disk in an on-premises server but, virtualized. With managed disks, all you have to do is specify the disk size, the disk type, and provision the disk. Once you provision the disk, Azure handles the rest.

    The available types of disks are ultra disks, premium solid-state drives (SSD), standard SSDs, and standard hard disk drives (HDD). For information about each individual disk type, see Select a disk type for IaaS VMs.

    Disk type comparison

    The following table provides a comparison of the five disk types to help you decide which to use.

    Ultra disk Premium SSD v2 Premium SSD Standard SSD Standard HDD
    Disk type SSD SSD SSD SSD HDD
    Scenario IO-intensive workloads such as SAP HANA, top tier databases (for example, SQL, Oracle), and other transaction-heavy workloads. Production and performance-sensitive workloads that consistently require low latency and high IOPS and throughput Production and performance sensitive workloads Web servers, lightly used enterprise applications and dev/test Backup, non-critical, infrequent access
    Max disk size 65,536 GiB 65,536 GiB 32,767 GiB 32,767 GiB 32,767 GiB
    Max throughput 4,000 MB/s 1,200 MB/s 900 MB/s 750 MB/s 500 MB/s
    Max IOPS 160,000 80,000 20,000 6,000 2,000, 3,000*
    Usable as OS Disk? No No Yes Yes Yes

    * Only applies to disks with performance plus (preview) enabled.

    Note: You can adjust ultra disk IOPS and throughput performance at runtime without detaching the disk from the virtual machine. After a performance resize operation has been issued on a disk, it can take up to an hour for the change to take effect. Up to four performance resize operations are permitted during a 24-hour window.

    AZ-104 Study Guide – Azure Networking Part 2

    AZ-104 Study Guide – Azure Networking Part 2

     

    If you are looking for the full AZ-104 study guide: https://www.cloud13.ch/2023/10/31/az-104-study-guide-microsoft-azure-administrator/ 

    In part 1, I covered vNets, public IPs, vNet peering, NSGs and Azure Firewall. Part 2 is about DNS services, S2S VPN, Express Route, vWAN, Endpoints, Azure Load Balancers and Azure App Gateway.

    Azure DNS Services

    Azure DNS is a hosting service for DNS domains that provides name resolution by using Microsoft Azure infrastructure. By hosting your domains in Azure, you can manage your DNS records by using the same credentials, APIs, tools, and billing as your other Azure services.

    Azure Private DNS provides a reliable, secure DNS service to manage and resolve domain names in a virtual network without the need to add a custom DNS solution. By using private DNS zones, you can use your own custom domain names rather than the Azure-provided names available today.

    DNS overview

    The records contained in a private DNS zone aren’t resolvable from the Internet. DNS resolution against a private DNS zone works only from virtual networks that are linked to it.

    When you create a private DNS zone, Azure stores the zone data as a global resource. This means that the private zone is not dependent on a single VNet or region. You can link the same private zone to multiple VNets in different regions. If service is interrupted in one VNet, your private zone is still available. 

    The Azure DNS private zones auto registration feature manages DNS records for virtual machines deployed in a virtual network. When you link a virtual network with a private DNS zone with this setting enabled, a DNS record gets created for each virtual machine deployed in the virtual network.

    Screenshot of enable auto registration on add virtual network link page.Note: Auto registration works only for virtual machines. For all other resources like internal load balancers, you can create DNS records manually in the private DNS zone linked to the virtual network.

    You can also configure Azure DNS to resolve host names in your public domain. For example, if you purchased the contoso.xyz domain name from a domain name registrar, you can configure Azure DNS to host the contoso.xyz domain and resolve www.contoso.xyz to the IP address of your web server or web app.

    Diagram of DNS deployment environment using the Azure portal.

    Quickstart guides:

    Site-to-Site VPN

    First of all, to create a site-to-site (S2S) VPN, you need a VPN Gateway.

    A VPN gateway is a type of virtual network gateway. A VPN gateway sends encrypted traffic between your virtual network and your on-premises location across a public connection. You can also use a VPN gateway to send traffic between virtual networks. When you create a VPN gateway, you use the -GatewayType value ‘Vpn’.

    A Site-to-site (S2S) VPN gateway connection is a connection over IPsec/IKE (IKEv1 or IKEv2) VPN tunnel. S2S connections can be used for cross-premises and hybrid configurations. A S2S connection requires a VPN device located on-premises that has a public IP address assigned to it. For information about selecting a VPN device, see the VPN Gateway FAQ – VPN devices.

    Diagram of site-to-site VPN Gateway cross-premises connections.

    VPN Gateway can be configured in active-standby mode using one public IP or in active-active mode using two public IPs. In active-standby mode, one IPsec tunnel is active and the other tunnel is in standby. In this setup, traffic flows through the active tunnel, and if some issue happens with this tunnel, the traffic switches over to the standby tunnel. Setting up VPN Gateway in active-active mode is recommended in which both the IPsec tunnels are simultaneously active, with data flowing through both tunnels at the same time. An additional advantage of active-active mode is that customers experience higher throughputs.

    You can create more than one VPN connection from your virtual network gateway, typically connecting to multiple on-premises sites. When working with multiple connections, you must use a RouteBased VPN type (known as a dynamic gateway when working with classic VNets). Because each virtual network can only have one VPN gateway, all connections through the gateway share the available bandwidth. This type of connection is sometimes referred to as a “multi-site” connection.

    Diagram of site-to-site VPN Gateway cross-premises connections with multiple sites.

    How-to guides:

    Diagram of a site-to-site VPN connection coexisting with an ExpressRoute connection for two different sites.

    ExpressRoute

    ExpressRoute lets you extend your on-premises networks into the Microsoft cloud over a private connection with the help of a connectivity provider. With ExpressRoute, you can establish connections to Microsoft cloud services, such as Microsoft Azure and Microsoft 365.

    Connectivity can be from an any-to-any (IP VPN) network, a point-to-point Ethernet network, or a virtual cross-connection through a connectivity provider at a colocation facility. ExpressRoute connections don’t go over the public Internet. This allows ExpressRoute connections to offer more reliability, faster speeds, consistent latencies, and higher security than typical connections over the Internet. For information on how to connect your network to Microsoft using ExpressRoute, see ExpressRoute connectivity models.

    ExpressRoute connection overview

    Across on-premises connectivity with ExpressRoute Global Reach

    You can enable ExpressRoute Global Reach to exchange data across your on-premises sites by connecting your ExpressRoute circuits. For example, if you have a private data center in California connected to an ExpressRoute circuit in Silicon Valley and another private data center in Texas connected to an ExpressRoute circuit in Dallas.

    With ExpressRoute Global Reach, you can connect your private data centers together through these two ExpressRoute circuits. Your cross data-center traffic will traverse through the Microsoft network.

    Diagram that shows a use case for Express Route Global Reach.

    For more information, see ExpressRoute Global Reach.

    Virtual WAN

    Azure Virtual WAN is a networking service that brings many networking, security, and routing functionalities together to provide a single operational interface. Some of the main features include:

    • Branch connectivity (via connectivity automation from Virtual WAN Partner devices such as SD-WAN or VPN CPE)
    • Site-to-site VPN connectivity
    • Remote user VPN connectivity (point-to-site)
    • Private connectivity (ExpressRoute)
    • Intra-cloud connectivity (transitive connectivity for virtual networks)
    • VPN ExpressRoute inter-connectivity
    • Routing, Azure Firewall, and encryption for private connectivity

    The Virtual WAN architecture is a hub and spoke architecture with scale and performance built in for branches (VPN/SD-WAN devices), users (Azure VPN/OpenVPN/IKEv2 clients), ExpressRoute circuits, and virtual networks. It enables a global transit network architecture, where the cloud hosted network ‘hub’ enables transitive connectivity between endpoints that may be distributed across different types of ‘spokes’.

    Virtual WAN diagram.

    Learning Module: Introduction to Azure Virtual WAN

    Service Endpoints

    Service endpoints provide the following benefits:

    • Improved security for your Azure service resources: VNet private address spaces can overlap. You can’t use overlapping spaces to uniquely identify traffic that originates from your VNet. Service endpoints enable securing of Azure service resources to your virtual network by extending VNet identity to the service. Once you enable service endpoints in your virtual network, you can add a virtual network rule to secure the Azure service resources to your virtual network. The rule addition provides improved security by fully removing public internet access to resources and allowing traffic only from your virtual network.
    • Optimal routing for Azure service traffic from your virtual network: Today, any routes in your virtual network that force internet traffic to your on-premises and/or virtual appliances also force Azure service traffic to take the same route as the internet traffic. Service endpoints provide optimal routing for Azure traffic.Endpoints always take service traffic directly from your virtual network to the service on the Microsoft Azure backbone network. Keeping traffic on the Azure backbone network allows you to continue auditing and monitoring outbound Internet traffic from your virtual networks, through forced-tunneling, without impacting service traffic. For more information about user-defined routes and forced-tunneling, see Azure virtual network traffic routing.
    • Simple to set up with less management overhead: You no longer need reserved, public IP addresses in your virtual networks to secure Azure resources through IP firewall. There are no Network Address Translation (NAT) or gateway devices required to set up the service endpoints. You can configure service endpoints through a single selection on a subnet. There’s no extra overhead to maintaining the endpoints.

    Private Endpoints

    A private endpoint is a network interface that uses a private IP address from your virtual network. This network interface connects you privately and securely to a service that’s powered by Azure Private Link. By enabling a private endpoint, you’re bringing the service into your virtual network.

    The service could be an Azure service such as:

    Azure Load Balancer (Layer 4)

    With Azure Load Balancer, you can scale your applications and create highly available services. Load balancer supports both inbound and outbound scenarios. Load balancer provides low latency and high throughput, and scales up to millions of flows for all TCP and UDP applications.

    Diagram depicts public and internal load balancers directing traffic to web and business tiers.

    Key scenarios that you can accomplish using Azure Standard Load Balancer include:

    Azure App Gateway (Layer 7)

    Azure Application Gateway is a web traffic (OSI layer 7) load balancer that enables you to manage traffic to your web applications. Traditional load balancers operate at the transport layer (OSI layer 4 – TCP and UDP) and route traffic based on source IP address and port, to a destination IP address and port.

    Application Gateway can make routing decisions based on additional attributes of an HTTP request, for example URI path or host headers. For example, you can route traffic based on the incoming URL. So if /images is in the incoming URL, you can route traffic to a specific set of servers (known as a pool) configured for images. If /video is in the URL, that traffic is routed to another pool that’s optimized for videos.

    imageURLroute

    The App Gateway features can be found here: https://learn.microsoft.com/en-us/azure/application-gateway/features

    Here is a service comparison from all the load balancing options:

    Azure Load Balancing Comparison

    Global Load Balancing

    Azure load-balancing services can be categorized along two dimensions: global versus regional and HTTP(S) versus non-HTTP(S).

    Global vs. regional

    • Global: These load-balancing services distribute traffic across regional back-ends, clouds, or hybrid on-premises services. These services route end-user traffic to the closest available back-end. They also react to changes in service reliability or performance to maximize availability and performance. You can think of them as systems that load balance between application stamps, endpoints, or scale-units hosted across different regions/geographies.
    • Regional: These load-balancing services distribute traffic within virtual networks across virtual machines (VMs) or zonal and zone-redundant service endpoints within a region. You can think of them as systems that load balance between VMs, containers, or clusters within a region in a virtual network.

    The following table summarizes the Azure load-balancing services.

    Service Global/Regional Recommended traffic
    Azure Front Door Global HTTP(S)
    Azure Traffic Manager Global Non-HTTP(S)
    Azure Application Gateway Regional HTTP(S)
    Azure Load Balancer Regional or Global Non-HTTP(S)

    Azure load-balancing services

    Here are the main load-balancing services currently available in Azure:

    • Azure Front Door is an application delivery network that provides global load balancing and site acceleration service for web applications. It offers Layer 7 capabilities for your application like SSL offload, path-based routing, fast failover, and caching to improve performance and high availability of your applications. 
    • Traffic Manager is a DNS-based traffic load balancer that enables you to distribute traffic optimally to services across global Azure regions, while providing high availability and responsiveness. Because Traffic Manager is a DNS-based load-balancing service, it load balances only at the domain level. For that reason, it can’t fail over as quickly as Azure Front Door, because of common challenges around DNS caching and systems not honoring DNS TTLs.
    • Application Gateway provides application delivery controller as a service, offering various Layer 7 load-balancing capabilities. Use it to optimize web farm productivity by offloading CPU-intensive SSL termination to the gateway.
    • Load Balancer is a high-performance, ultra-low-latency Layer 4 load-balancing service (inbound and outbound) for all UDP and TCP protocols. It’s built to handle millions of requests per second while ensuring your solution is highly available. Load Balancer is zone redundant, ensuring high availability across availability zones. It supports both a regional deployment topology and a cross-region topology.
    AZ-104 Study Guide – Azure Networking Part 1

    AZ-104 Study Guide – Azure Networking Part 1

    If you are looking for the full AZ-104 study guide: https://www.cloud13.ch/2023/10/31/az-104-study-guide-microsoft-azure-administrator/ 

    Learn about the various Azure networking services available that provide connectivity to your resources in Azure, deliver and protect applications, and help secure your network.

    Virtual Networks

    A virtual network cannot span regions or subscriptions, it has its boundaries, but can span (subnets) across availability zones in a region. Virtual networks are always IPv4 ranges, IPv6 ranges are optional.

    Azure Virtual Network is a service that provides the fundamental building block for your private network in Azure. An instance of the service (a virtual network) enables many types of Azure resources to securely communicate with each other, the internet, and on-premises networks. These Azure resources include virtual machines (VMs).

    A virtual network is similar to a traditional network that you’d operate in your own datacenter. But it brings extra benefits of the Azure infrastructure, such as scale, availability, and isolation.

    Diagram of resources created in virtual network quickstart.

    A training module on designing and implementing core Azure networking infrastructure, including virtual networks can be found here: https://learn.microsoft.com/en-us/training/modules/introduction-to-azure-virtual-networks/

    Public IPs

    Public IP addresses enable Internet resources to communicate with Azure resources and enable Azure resources to communicate outbound with Internet and public-facing Azure services. A public IP address in Azure is dedicated to a specific resource, until it’s unassigned by a network engineer. A resource without a public IP assigned can communicate outbound through network address translation services, where Azure dynamically assigns an available IP address that isn’t dedicated to the resource.

    In Azure Resource Manager, a public IP (this is also bound to a region) address is a resource that has its own properties. Some of the resources you can associate a public IP address resource with:

    • Virtual machine network interfaces
    • Virtual machine scale sets
    • Public Load Balancers
    • Virtual Network Gateways (VPN/ER)
    • NAT gateways
    • Application Gateways
    • Azure Firewall
    • Bastion Host
    • Route Server

    Public IP addresses are created with one of the following SKUs:

    Public IP address Standard Basic
    Allocation method Static For IPv4: Dynamic or Static; For IPv6: Dynamic.
    Idle Timeout Have an adjustable inbound originated flow idle timeout of 4-30 minutes, with a default of 4 minutes, and fixed outbound originated flow idle timeout of 4 minutes. Have an adjustable inbound originated flow idle timeout of 4-30 minutes, with a default of 4 minutes, and fixed outbound originated flow idle timeout of 4 minutes.
    Security Secure by default model and be closed to inbound traffic when used as a frontend. Allow traffic with network security group (NSG) is required (for example, on the NIC of a virtual machine with a Standard SKU Public IP attached). Open by default. Network security groups are recommended but optional for restricting inbound or outbound traffic.
    Availability zones Supported. Standard IPs can be nonzonal, zonal, or zone-redundant. Zone redundant IPs can only be created in regions where 3 availability zones are live. IPs created before availability zones aren’t zone redundant. Not supported.
    Routing preference Supported to enable more granular control of how traffic is routed between Azure and the Internet. Not supported.
    Global tier Supported via cross-region load balancers. Not supported.

    Azure Public IP

    vNet Peering

    What if I have more than one network or vNets that should communicate with each other? Azure has the idea of peering that can happen in the same region or globally.

    Diagram of Azure resources created in tutorial.

    Virtual network peering enables you to seamlessly connect two or more Virtual Networks in Azure. The virtual networks appear as one for connectivity purposes. The traffic between virtual machines in peered virtual networks uses the Microsoft backbone infrastructure. Like traffic between virtual machines in the same network, traffic is routed through Microsoft’s private network only.

    Azure supports the following types of peering:

    • Virtual network peering: Connecting virtual networks within the same Azure region.

    • Global virtual network peering: Connecting virtual networks across Azure regions.

    Note: Network traffic between peered virtual networks is private. Traffic between the virtual networks is kept on the Microsoft backbone network. No public Internet, gateways, or encryption is required in the communication between the virtual networks.

    Important: Peering connections are non-transitive! From the FAQ:

    If I peer VNetA to VNetB and I peer VNetB to VNetC, does that mean VNetA and VNetC are peered?

    No. Transitive peering is not supported. You must manually peer VNetA to VNetC.

    Network Security Groups

    You can use an Azure network security group (NSG) to filter network traffic between Azure resources in an Azure virtual network. A network security group contains security rules that allow or deny inbound network traffic to, or outbound network traffic from, several types of Azure resources. For each rule, you can specify source and destination, port, and protocol.

    Diagram of NSG processing.

    A network security group contains as many rules as desired, within Azure subscription limits. Each rule specifies the following properties:

    Property Explanation
    Name A unique name within the network security group. The name can be up to 80 characters long. It must begin with a word character, and it must end with a word character or with ‘_’. The name may contain word characters or ‘.’, ‘-‘, ‘_’.
    Priority A number between 100 and 4096. Rules are processed in priority order, with lower numbers processed before higher numbers, because lower numbers have higher priority. Once traffic matches a rule, processing stops. As a result, any rules that exist with lower priorities (higher numbers) that have the same attributes as rules with higher priorities aren’t processed.
    Azure default security rules are given the highest number with the lowest priority to ensure that custom rules are always processed first.
    Source or destination Any, or an individual IP address, classless inter-domain routing (CIDR) block (10.0.0.0/24, for example), service tag, or application security group. If you specify an address for an Azure resource, specify the private IP address assigned to the resource. Network security groups are processed after Azure translates a public IP address to a private IP address for inbound traffic, and before Azure translates a private IP address to a public IP address for outbound traffic. Fewer security rules are needed when you specify a range, a service tag, or application security group. The ability to specify multiple individual IP addresses and ranges (you can’t specify multiple service tags or application groups) in a rule is referred to as augmented security rules. Augmented security rules can only be created in network security groups created through the Resource Manager deployment model. You can’t specify multiple IP addresses and IP address ranges in network security groups created through the classic deployment model.
    Protocol TCP, UDP, ICMP, ESP, AH, or Any. The ESP and AH protocols aren’t currently available via the Azure portal but can be used via ARM templates.
    Direction Whether the rule applies to inbound, or outbound traffic.
    Port range You can specify an individual or range of ports. For example, you could specify 80 or 10000-10005. Specifying ranges enables you to create fewer security rules. Augmented security rules can only be created in network security groups created through the Resource Manager deployment model. You can’t specify multiple ports or port ranges in the same security rule in network security groups created through the classic deployment model.
    Action Allow or deny

    Azure Firewall and Routing

    Azure Firewall is a cloud-native and intelligent network firewall security service that provides the best of breed threat protection for your cloud workloads running in Azure. It’s a fully stateful firewall as a service with built-in high availability and unrestricted cloud scalability. It provides both east-west and north-south traffic inspection. To learn what’s east-west and north-south traffic, see East-west and north-south traffic.

    Firewall Standard overview

    Azure Firewall includes the following features:

    • Built-in high availability
    • Availability Zones
    • Unrestricted cloud scalability
    • Application FQDN filtering rules
    • Network traffic filtering rules
    • FQDN tags
    • Service tags
    • Threat intelligence
    • DNS proxy
    • Custom DNS
    • FQDN in network rules
    • Deployment without public IP address in Forced Tunnel Mode
    • Outbound SNAT support
    • Inbound DNAT support
    • Multiple public IP addresses
    • Azure Monitor logging
    • Forced tunneling
    • Web categories
    • Certifications

    Azure Firewall now supports three different SKUs to cater to a wide range of customer use cases and preferences.

    • Azure Firewall Premium is recommended to secure highly sensitive applications (such as payment processing). It supports advanced threat protection capabilities like malware and TLS inspection. 
    • Azure Firewall Standard is recommended for customers looking for Layer 3–Layer 7 firewall and needs autoscaling to handle peak traffic periods of up to 30 Gbps. It supports enterprise features like threat intelligence, DNS proxy, custom DNS, and web categories.
    • Azure Firewall Basic is recommended for SMB customers with throughput needs of 250 Mbps.

    Table of Azure Firewall Sku features.

    A training module for Azure Firewalls can be found here: https://learn.microsoft.com/en-us/training/modules/introduction-azure-firewall/ 

    User-Defined Route

    To send traffic through this Azure Firewall, you can create a user-defined route (UDR).

    Click here for part 2.

     

    AZ-104 Study Guide – What is an Azure Landing Zone?

    AZ-104 Study Guide – What is an Azure Landing Zone?

    If you are looking for the full AZ-104 study guide: https://www.cloud13.ch/2023/10/31/az-104-study-guide-microsoft-azure-administrator/ 

    When I discuss a hybrid or multi-cloud architecture with my customers, especially organizations that just started to journey to the cloud, they mention the so-called cloud landing zones. I remember, a few years ago, when I asked my customers and the internal specialist what a “landing zone” is, I received an answer like this:

    A landing zone is a concept or architecture, which helps you to get started with your cloud migrations based on best or leading practices. You start small, and you expand later – you put your apps in the right place.

    Does this sound familiar? If everyone is saying almost the same, then there’s no need to investigate this term or definition further, right?

    Since I want to better understand Azure and pass my first solutions architect exam, I thought I would google “landing zone” to check, if it’s only me who has doubts about the correct and full understanding of these so-called landing zones.

    Guess what? It is almost 2024 and still a lot of people do not know exactly what landing zones are. As always, you find some content from people like John Savill and Thomas Maurer.

     

    An Azure landing zone is part of the Cloud Adoption Frame (CAF) for Azure and describes an environment that follows key design principles. These design principles are about application migration, application modernization, and innovation. These landing zones use Azure subscriptions to isolate specific application and platform resources.

    An Azure landing zone architecture is scalable and modular to meet various deployment needs. A repeatable infrastructure allows you to apply configurations and controls to every subscription consistently. Modules make it easy to deploy and modify specific Azure landing zone architecture components as your requirements evolve. The Azure landing zone conceptual architecture represents an opinionated target architecture for your Azure landing zone. You should use this conceptual architecture as a starting point and tailor the architecture to meet your needs.

    A conceptual architecture diagram of an Azure landing zone.

    According to Microsoft, an Azure landing zone is the foundation for a successful cloud environment.

    Landing Zone Options

    There are different options or approaches when it comes to the implementation of landing zones:

    After studying the Azure landing zone documentation, I think my past conversations (mentioned at the beginning) were mostly about the so-called “migration landing zones”, which focus on deploying foundation infrastructure resources in Azure, that are then used to migrate virtual machines in to.

    C A F Migration landing zone, image describes what gets installed as part of C A F guidance for initial landing zone.

    The CAF Migration blueprint lays out a landing zone for your workloads. You still need to perform the assessment and migration of your Virtual Machines / Databases on top of this foundational architecture.

    Azure Landing Zones and Azure VMware Solution

    If you are looking for VMware-related information about Azure VMware Solution and Azure landing zones, have a look at the following resources:

    1. Azure landing zone review for Microsoft Azure VMware Solution
    2. Landing zone considerations for Azure VMware Solution

    Azure Landing Zone Accelerators

    When you browse through the Azure documentation, you will find out that Azure landing zones have two different kinds of subscriptions:

    • Platform landing zone: A platform landing zone is a subscription that provides shared services (identity, connectivity, management) to applications in application landing zones. Consolidating these shared services often improves operational efficiency. One or more central teams manage the platform landing zones. In the conceptual architecture (see figure 1), the “Identity subscription”, “Management subscription”, and “Connectivity subscription” represent three different platform landing zones. The conceptual architecture shows these three platform landing zones in detail. It depicts representative resources and policies applied to each platform landing zone.

    There’s a ready-made deployment experience called the Azure landing zone portal accelerator. The Azure landing zone portal accelerator deploys the conceptual architecture and applies predetermined configurations to key components such as management groups and policies. It suits organizations whose conceptual architecture aligns with the planned operating model and resource structure.

    • Application landing zone: An application landing zone is a subscription for hosting an application. You pre-provision application landing zones through code and use management groups to assign policy controls to them. In the conceptual architecture, the “Landing zone A1 subscription” and “Landing zone A2 subscription” represent two different application landing zones. The conceptual architecture shows only the “Landing zone A2 subscription” in detail. It depicts representative resources and policies applied to the application landing zone.

    Application landing zone accelerators help you deploy application landing zones. 

    Note: There is Azure VMware Solution landing zone accelerator available

    Azure Landing Zone Review

    Review your Azure platform readiness so adoption can begin, and assess your plan to create a landing zone to host workloads that you plan to build in or migrate to the cloud. This assessment is designed for customers with two or more years experience. If you are new to Azure, this assessment will help you identify investment areas for your adoption strategy.

    The official Microsoft assessments website describing this service and checklist further can be found here: https://learn.microsoft.com/en-us/assessments/21765fea-dfe6-4bc4-8bb7-db9df5a6f6c0/ 

    Conclusion

    AWS defines a landing zone as a “well-architected, multi-account AWS environment that is scalable and secure“. An AWS or Azure landing zone is a starting point from which customers can quickly launch and deploy workloads and apps with the right security and governance in mind and in place. It is not about technical decisions only, but it also involves business decisions to be made about account structure, identities, networking, access management, and security with an organization’s growth and business goals for the future.

    Landing zones have a lot to do with the right guardrails and policies in place.