Oracle Cloud Infrastructure 2025 Networking Professional Study Guide

Oracle Cloud Infrastructure 2025 Networking Professional Study Guide

When I first stepped into the world of cloud networking, it wasn’t through Oracle, AWS, or Azure. It was about 13 years ago, working at a small cloud service provider that ran its own infrastructure stack. We didn’t use hyperscale magic, we built everything ourselves.

Our cloud was stitched together with VMware vCloud Director, Cisco Nexus 1000v, physical Cisco routers and switches, and a good amount of BGP. We managed our own IP transits, IP peerings, created interconnects, configured static and dynamic routing, and deployed site-to-site VPNs for customers.

Years later, after moving into cloud-native networking and skilling up on Oracle Cloud Infrastructure (OCI), I realized how many of the same concepts apply, but with better tools, faster provisioning, and scalable security. OCI offers powerful services for building modern network topologies: Dynamic Routing Gateways, Service Gateways, FastConnect, Network Firewalls, and Zero Trust Packet Routing (ZPR).

This study guide is for anyone preparing for the OCI 2025 Networking Professional certification. 

Exam Objectives

Review the exam topics:

  • Design and Deploy OCI Virtual Cloud Networks (VCN)
  • Plan and Design OCI Networking Solutions and App Services
  • Design for Hybrid Networking Architectures
  • Transitive Routing
  • Implement and Operate Secure OCI Networking and Connectivity Solutions
  • Migrate Workloads to OCI
  • Troubleshoot OCI Networking and Connectivity Issues

VCN – Your Virtual Cloud Network

Think of a VCN as your private, software-defined data center in the cloud. It is where everything begins. Subnets, whether public or private, live inside it. You control IP address ranges (CIDRs), route tables, and security lists, which together determine who can talk to what and how. Every other networking component in OCI connects back to the VCN, making it the central nervous system of your cloud network.

Internet Gateway – Letting the Outside World In (and Out)

If your VCN needs to connect to the public internet – say, to allow inbound HTTP traffic to a web server or to allow your compute instances to fetch updates – you’ll need an Internet Gateway. It attaches to your VCN and enables this connectivity.

This image shows a simple layout of a VCN with a public subnet that uses an internet gateway.

But it is just one piece of the puzzle. You still need to configure route tables and security rules correctly. Otherwise, traffic won’t flow. 

Local Peering Gateway – Talking Across VCNs (in the Same Region)

When you have got multiple VCNs in the same OCI region, maybe for environment isolation or organizational structure, a Local Peering Gateway (LPG) allows them to communicate privately. No internet, no extra costs. Just fast, internal traffic. It’s especially useful when designing multi-VCN architectures that require secure east-west traffic flow within a single region.

This image shows the basic layout of two VCNs that are locally peered, each with a local peering gateway.

Dynamic Routing Gateway – The Multi-Path Hub

The Dynamic Routing Gateway (DRG) is like the border router for your VCN. If you want to connect to on-prem via VPN, FastConnect, or peer across regions, you’re doing it through the DRG. It supports advanced routing, enables transitive routing, and connects you to just about everything external. It’s your ticket to hybrid and multi-region topologies.

Remote Peering Connection – Cross-Region VCN Peering

Remote Peering Connections (RPCs) let you extend your VCN communication across regions. Let’s say you have got a primary environment in US East and DR in Germany, you’ll need a DRG in each region and an RPC between them. It’s all private, secure, and highly performant. And it’s one of the foundations for multi-region, global OCI architectures.

This image shows the basic layout of two VCNs that are remotely peered, each with a remote peering connection on the DRG

Note: Without peering, a given VCN would need an internet gateway and public IP addresses for the instances that need to communicate with another VCN in a different region. 

Service Gateway – OCI Services Without Public Internet

The Service Gateway is gold! It allows your VCN to access OCI services like Object Storage or Autonomous Database without going over the public internet. Traffic stays on the Oracle backbone, meaning better performance and tighter security. No internet gateway or NAT gateway is required to reach those specific services.

This image shows the basic layout of a VCN with a service gateway

NAT Gateway – Internet Access

A NAT Gateway allows outbound internet access for private subnets, while keeping those instances hidden from unsolicited inbound traffic. When a host in the private network initiates an internet-bound connection, the NAT device’s public IP address becomes the source IP address for the outbound traffic. The response traffic from the internet therefore uses that public IP address as the destination IP address. The NAT device then routes the response to the host in the private network that initiated the connection.

This image shows the basic layout of a VCN with a NAT gateway and internet gateway

Private Endpoints – Lock Down Your Services

With Private Endpoints, you can expose services like OKE, Functions, or Object Storage only within a VCN or peered network. It’s the cloud-native way to implement zero-trust within your OCI environment, making sure services aren’t reachable over the public internet unless you explicitly want them to be. You can think of the private endpoint as just another VNIC in your VCN. You can control access to it like you would for any other VNIC: by using security rules

This diagram shows a VCN with a private endpoint for a resource.

The private endpoint gives hosts within your VCN and your on-premises network access to a single resource within the Oracle service of interest (for example, one database in Autonomous Database Serverless). Compare that private access model with a service gateway (explained before):

If you created five Autonomous Databases for a given VCN, all five would be accessible through a single service gateway by sending requests to a public endpoint for the service. However, with the private endpoint model, there would be five separate private endpoints: one for each Autonomous Database, and each with its own private IP address.

The list of supported services with a service gateway can be found here.

Oracle Services Network (OSN) – The Private Path to Oracle

The Oracle Services Network is the internal highway for communication between your VCN and Oracle-managed services. It underpins things like the Service Gateway and ensures your service traffic doesn’t touch the public internet. When someone says “use OCI’s backbone,” this is what they’re talking about.

Network Load Balancer – Lightweight, Fast, Private

Network Load Balancer is a load balancing service which operates at Layer-3 and Layer-4 of the Open Systems Interconnection (OSI) model. This service provides the benefits of high availability and offers high throughput while maintaining ultra low latency. You have three modes in Network Load Balancer in which you can operate:

  • Full Network Address Translation (NAT) mode 
  • Source Preservation mode
  • Transparent (Source/Destination Preservation) mode

The Network Load Balancer service supports three primary network load balancer policy types:

  1. 5-Tuple Hash: Routes incoming traffic based on 5-Tuple (source IP and port, destination IP and port, protocol) Hash. This is the default network load balancer policy.
  2. 3-Tuple Hash: Routes incoming traffic based on 3-Tuple (source IP, destination IP, protocol) Hash.
  3. 2-Tuple Hash: Routes incoming traffic based on 2-Tuple (source IP, destination IP) Hash.

Site-to-Site VPN – The Hybrid Gateway

Connecting your on-premises network to OCI? The Site-to-Site VPN offers a quick, secure way to do it. It uses IPSec tunnels, and while it’s great for development and backup connectivity, you might find bandwidth a bit constrained for production workloads. That’s where FastConnect steps in.

When you set up Site-to-Site VPN, it has two redundant IPSec tunnels. Oracle encourages you to configure your CPE device to use both tunnels (if your device supports it).

This image shows Scenario B: a VCN with a regional private subnet and a VPN IPSec connection.

FastConnect – Dedicated, Predictable Connectivity

FastConnect gives you a private, dedicated connection between your data center and OCI. It’s the go-to solution when you need stable, high-throughput performance. It comes via Oracle partners, 3rd party providers, or colocations and bypasses the public internet entirely. In hybrid setups, FastConnect is the gold standard.

This image shows a colocation setup where you have two physical connections and virtual circuits to the FastConnect location.

Have a look at the FastConnect Redundancy Best Practices!

IPsec over FastConnect

You can also layer IPSec encryption over FastConnect, giving you the security of VPN and the performance of FastConnect. This is especially useful for compliance or regulatory scenarios that demand encryption at every hop, even over private circuits.

Diagram showing the termination ends of both virtual circuit and IPSec tunnel

Note: IPSec over FastConnect is available for all three connectivity models (partner, third-party provider, colocation with Oracle) and multiple IPSec tunnels can exist over a single FastConnect virtual circuit.

FastConnect – MACsec Encryption

FastConnect natively supports line-rate encryption between the FastConnect edge device and your CPE without concern for the cryptographic overhead associated with other methods of encryption, such as IPsec VPNs. With MACsec, customers can secure and protect all their traffic between on-premises and OCI from threats, such as intrusions, eavesdropping, and man-in-the-middle attacks. 

Border Gateway Protocol (BGP) – The Routing Protocol of the Internet

If you are using FastConnect, Site-to-Site VPN, or any complex DRG routing scenario, you are likely working with BGP. OCI uses BGP to dynamically exchange routes between your on-premises network and your DRG.

BGP enables route prioritization, failover, and smarter traffic engineering. You’ll need to understand concepts like ASNs, route advertisements, and local preference.

BGP is also essential in multi-DRG and transitive routing topologies, where path selection and traffic symmetry matter.

Transitive Routing

You can have a VCN that acts as a hub, routing traffic between spokes. This is crucial for building scalable, shared services architectures. Using DRG attachments and route rules, you can create full-mesh or hub-and-spoke topologies with total control. Transit Routing can also be used to transit from one OCI region to another OCI region leveraging the OCI backbone.

The three primary transit routing scenarios are:

  • Access between several networks through a single DRG with a firewall between networks
  • Access to several VCNs in the same region
  • Private access to Oracle services

This image shows the basic hub and spoke layout of VCNs along with the gateways required.

Inter-Tenancy Connectivity – Across Tenants

In multi-tenant scenarios, for example between business units or regions, inter-tenancy connectivity allows you to securely link VCNs across OCI accounts. This might involve shared DRGs or peering setups. It’s increasingly relevant for large enterprises where cloud governance splits resources across different tenancies but still needs seamless interconnectivity.

Network Firewall – Powered by Palo Alto Networks

The OCI Network Firewall is a managed, cloud-native network security service. It acts as a stateful, Layer 3 to 7 firewall that inspects and filters network traffic at a granular, application-aware level. You can think of it as a fully integrated, Oracle-managed instance of Palo Alto’s firewall technology with all the power of Palo Alto, but integrated into OCI’s networking fabric.

In this example, routing is configured from an on-premises network through a dynamic routing gateway (DRG) to the firewall. Traffic is routed from the DRG, through the firewall, and then from the firewall subnet to a private subnet

Diagram of routing from a DRG through a firewall, and then to a private subnet.

In this example, routing is configured from the internet to the firewall. Traffic is routed from the internet gateway (IGW), through the firewall, and then from the firewall subnet to a public subnet.

This diagram shows routing from the internet, through a firewall, and then to a public subnet.

In this example, routing is configured from a subnet to the firewall. Traffic is routed from Subnet A, through the firewall, and then from the firewall subnet to Subnet B.

This diagram shows routing from Subnet A, through a firewall, and then to Subnet B.

Zero Trust Packet Routing (ZPR)

Oracle Cloud Infrastructure Zero Trust Packet Routing (ZPR) protects sensitive data from unauthorized access through intent-based security policies that you write for the OCI resources that you assign security attributes to. Security attributes are labels that ZPR uses to identify and organize OCI resources. ZPR enforces policy at the network level each time access is requested, regardless of potential network architecture changes or misconfigurations.

ZPR is built on top of existing network security group (NSG) and security control list (SCL) rules. For a packet to reach a target, it must pass all NSG and SCL rules, and ZPR policy. If any NSG, SCL, or ZPR rule or policy doesn’t allow traffic, the request is dropped. 

Wrapping Up

OCI’s networking stack is deep, flexible, and modern. Whether you are an enterprise architect, a security specialist, or a hands-on cloud engineer, mastering these building blocks is key. Not just to pass the OCI 2025 Network Professional certification, but to design secure, scalable, and resilient cloud networks. 🙂

 

A Closer Look at VMware NSX Security

A Closer Look at VMware NSX Security

A customer of mine asked me a few days ago: “Is it not possible to get NSX Security features without the network virtualization capabilities?”. I wrote it already in my blog “VMware is Becoming a Leading Cybersecurity Vendor” that you do not NSX’s network virtualization editions or capabilities if you are only interested in “firewalling” or NSX security features.

If you google “nsx security”, you will not find much. But there is a knowledge base article that describes the NSX Security capabilities from the “Distributed Firewall” product line: Product offerings for NSX-T 3.2 Security (87077).

Believe it or not, there are customers that haven’t started their zero-trust or “micro-segmentation” journey yet. Segmentation is about preventing lateral (east-west) movement. The idea is to divide the data center infrastructure into smaller security zones and that the traffic between the zones (and between workloads) is inspected based on the organization’s defined policies.

Perimeter Defense vs Micro-Segmentation

If you are one of them and want to deliver east-west traffic introspection using distributed firewalls, then these NSX Security editions are relevant for you:

VMware NSX Distributed Firewall

  • NSX Distributed Firewall (DFW)
  • NSX DFW with Threat Prevention
  • NSX DFW with Advanced Threat Prevention

VMware NSX Gateway Firewall

  • NSX Gateway Firewall (GFW)
  • NSX Gateway Firewall with Threat Prevention
  • NSX Gateway Firewall with Advanced Threat Prevention

Network Detection and Response

  • Network Detection and Response (standalone on-premises offering)

Note: If you are an existing NSX customer using network virtualization, please have a look at Product offerings for VMware NSX-T Data Center 3.2.x (86095).

VMware NSX Distributed Firewall

The NSX Distributed Firewall is a hypervisor kernel-embedded stateful firewall that lets you create access control policies based on vCenter objects like datacenters and clusters, virtual machine names and tags, IP/VLAN/VXLAN addresses, as well as user group identity from Active Directory.

If a VM gets vMotioned to another physical host, you do not need to rewrite any firewall rules.

The distributed nature of the firewall provides a scale-out architecture that automatically extends firewall capacity when additional hosts are added to a data center.

Should you be interested in “firewalling” only, want to implement access controls for east-west traffic (micro-segmentation) only, but do not need threat prevention (TP) capabilities, then “NSX Distributed Firewall Edition” is perfect for you.

So, which features does the NSX DFW edition include?

The NSX DFW edition comes with these capabilities:

  • L2 – L4 firewalling
  • L7 Application Identity-based firewalling
  • User Identity-based firewalling
  • NSX Intelligence (flow visualization and policy recommendation)
  • Aria Operations for Logs (formerly known as vRealize Log Insight)

What is the difference between NSX DFW and NSX DFW with TP?

With “NSX DFW with TP”, you would get the following additional features:

  • Distributed Intrusion Detection Services (IDS)
  • Distributed Behavioral IDS
  • Distributed Intrusion Prevention Service (IPS)
  • Distributed IDS Event Forwarding to NDR

Where does the NSX Distributed Firewall sit?

This question comes up a lot because customers understand that this is not an agent-based solution but something that is built into the VMware ESXi hypervisor.

The NSX DFW sits in the virtual patch cable, between the VM and the virtual distributed switch (VDS):

NSX Distributed Firewall

Note: Prior to NSX-T Data Center 3.2, VMs must have their vNIC connected to an NSX overlay or VLAN segment to be DFW-protected. In NSX-T Data Center 3.2, distributed firewall protects workloads that are natively connected to a VDS distributed port group (DVPG).

VMware NSX Gateway Firewall

The NSX Gateway Firewall extends the advanced threat prevention (ATP) capabilities of the NSX Distributed Firewall to physical workloads in your private cloud. It is a software-only, L2 – L7 firewall that includes capabilities such as IDS and IPS, URL filtering and malware detection as well as routing and VPN functionality.

If you are not interested in ATP capabilities yet, you can start with the “NSX Gateway Firewall” edition. What is the difference between all NSX GFW editions?

VMware NSX GFW Editions

The NSX GFW can be deployed as a virtual machine or with an ISO image that can run on a physical server and it shares the same management console as the NSX Distributed Firewall.

API Security with Spring Cloud Gateway and Tanzu Service Mesh

API Security with Spring Cloud Gateway and Tanzu Service Mesh

Today, more than ever, both humans and machines consume or process data. We, humans, consume data through multiple applications that are hosted in different clouds from different devices like smartphones, laptops, and tablets. Companies are building applications that need to look good and work well on any platform/device.

At the same time, developers are building new applications following cloud-native principles. A cloud-native architecture is a design pattern for applications that are built for the cloud. Most cloud-native apps are organized as microservices which are used to break up larger applications into loosely coupled units that can be managed by smaller teams. Resilience and scale are achieved through horizontal scaling, distributed processing, and automated placement of failed components.

Different people have a different understanding of “cloud-native” and the chances are high that you will get different answers. Let us look at the official definition from CNCF:

“Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.

These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.”

12-Factor App

A widely accepted methodology for building cloud-based applications is the “Twelve-Factor Application”. It uses declarative formats for automation to minimize time and costs. It should offer maximum portability between execution environments and be suitable for the deployment on modern cloud platforms. The 12-factor methodology can be applied with any programming language and may use any combination of backing servers (caching, queuing, databases).

Interestingly, we now see other factors like API-first, telemetry, and security complementing this list.

While doing research for my book about “workload mobility and application portability”, I saw the term “API-first” many times.

Then I started to remember that VMware acquired Mesh7 a while ago and they announced Tanzu Service Mesh Enterprise last year at VMworld Europe (now known as VMware Explore). API security was even one of their main topics during the networking & security solutions keynote presented by Tom Gillis.

VMworld 2021 API Security

That is why I thought it is time to better understand this topic and write a piece about APIs. Let us start with some basics first.

What is an API?

An application programming interface (API) is a way for two or more software components to communicate with each other using a set of defined protocols and definitions. APIs are here to make the developer’s life easier.

I bet you have seen parts of Google Maps already embedded in different websites when you were looking for a specific business or restaurant location. Most websites and developers would use Google Maps in this case, because it just makes sense for us, right? That is why Google exposes the Google Maps API so developers can embed Google Maps objects very easily in a standardized way. Or have you seen anyone who wants to develop their own version of Google Maps?

In the case of enterprises, APIs are a very elegant way to share data with customers or other external users. Such public APIs like Google Maps APIs can be used by partners who then can access your data. And we all know that data is the new oil. Companies can make a lot of money today by sharing their data.

Even when using private APIs (internal use only), you decide who can access your API and data. This is one of the reasons why API security and API management become more important. You want to provide secure access when sensitive data is being exposed.

What is an API Gateway?

For microservices-based apps, it makes sense to implement an API gateway, because it can act as a single entry point for all API calls made to your system. And it doesn’t matter if your system/application is hosted on-premises, in the public cloud, or a combination of both. The API gateway takes care of the request (API call) and returns the requested data.

API Gateway Diagram

Image Source: https://www.tibco.com/reference-center/what-is-an-api-gateway 

API gateways can also handle other tasks like authentication, rate management, and statistics. This is important for example when you want to monetize some of your APIs by offering a service to consumers or other companies.

What is Spring Cloud Gateway for VMware Tanzu?

Spring Cloud Gateway for VMware Tanzu provides a simple way to route internal and external API requests to application services that expose APIs. This solution is based on the open-source Spring Cloud Gateway project and provides a library for building API gateways on top of Spring and Java.

Because it is intended that Spring Cloud Gateway sits between a requester and the resource that is being requested, it is in the position to intercept, analyze and modify requests.

Revitalize Legacy Apps with APIs

Before we had microservices, there were monolithic applications. An all-in-one application architecture, where all services are installed on the same virtual machine and depend on each other.

There are multiple reasons why such a monolith cannot be broken up into smaller pieces and modernized. Sometimes it’s not (technically) possible, not worth it, or it just takes too long. Hence many companies still use such monolithic (legacy) applications. The best example here is the mainframe which often still runs business-critical applications.

I always thought that my customers only have two options when modernizing applications:

  • Start from scratch (throw the old app away)
  • Refactor/Rewrite an application

Rewriting an application needs time and costs money. Imagine that you would refactor 50 of your applications, split these monoliths up in microservices, connect these hundreds or thousands of microservices, and at the same time must take care of security (e.g., vulnerabilities).

So, what are you going to do now?

APIs seem to provide a very cost-effective way to integrate some of the older applications with newer ones. With this approach, one can abstract away the data and services from the underlying (legacy) application infrastructure. APIs can extend the life of a legacy application and could be the start of a phased application modernization approach.

Tanzu Service Mesh Enterprise

At the moment, we only have an API gateway that sits in front of our microservices. Multiple (micro)services in an aggregated fashion create the API you want to expose to your internal or external customers. The question now is, how you do plan to expose this API when your microservices are distributed over one or more private or public clouds?

When we talk about APIs, we talk about data in motion. That is why we must secure this data that is sent from its source to any location. And you want to secure the application and data without increasing the application latency and decreasing the user’s experience.

Now it makes sense to me why VMware acquired Mesh7 in March 2021 and announced Tanzu Service Mesh Enterprise about 6 months later with these additional features:

  • API Security. API security is achieved through API vulnerability detection and mitigation, API baselining, and API drift detection (including API parameters and schema validation)
  • Personally Identifiable Information (PII) segmentation and detection. PII data is segmented using attribute-based access control (ABAC) and is detected via proper PII data detection and tracking, and end-user detection mechanisms.
  • API Security Visibility. API security is monitored using API discovery, security posture dashboards, and rich event auditing.

Final Words

APIs are used to connect different applications. They are also used to aggregate services or functions that can be consumed by other businesses or partners. Modern and containerized applications bring a large number of APIs with them, that can be hosted in any cloud.

With Spring Cloud Gateway and Tanzu Service Mesh Enterprise, VMware can deliver application connectivity services that enable improved developer experience and more secure operations.

It took me almost a year to realize the strengths of these (combined) products and why VMware for example acquired Mesh7. But it makes sense to me now. Even I do not completely understand all the key features of Spring Cloud Gateway and Tanzu Service Mesh.

Why AWS Developers Love VMware’s Lift and Learn Approach with VMware Cloud on AWS

Why AWS Developers Love VMware’s Lift and Learn Approach with VMware Cloud on AWS

Learn why AWS developers love VMware Cloud on AWS and want to present it to their internal platform team.

I had booth duty at the AWS Swiss Cloud Day 2022 and had the chance to finally talk to people that normally do not talk to VMware folks like me. I believe I had not a single infrastructure or cloud architect talking to me the whole day and I have been approached by Linux administrators and developers only. After I explained to them our partnership and capabilities with AWS, they were mind blown!

Michael, what is VMware’s business with AWS?”

Why are you here at the event, you are only a hypervisor company, right?

Haha, what are you guys doing here?

What is the reason for VMware coming here? You are a competitor of AWS, no?

Developers don’t want to do ops

Look, the developers did not know, that I have no developer background and spent most of my time with data centers. I already built true hybrid clouds almost 10 years ago before we had all the different hyperscalers and providers like Amazon Web Services. After I passed the AWS Solutions Architect Associate and AWS Developer Associate exams a few months ago, I finally understood better how complex software development and cloud migrations must be.

It is said that developers do not want to deal with operational concerns. And other developers want to understand the production environment to make sure that their code work. Additionally, we have the shift-left approach that puts more pressure on the developer’s shoulders, they do not have time for ops.

But after talking to a few developers, I had a light-bulb moment and the following truths came to the surface:

  • Developers had no clue how VMware can ease some of their pain
  • Developers liked my talk about infrastructure and ops
  • I need to bring more business cards to such events!!!

Developers are interested in infrastructure

Remember the questions from above? To answer the questions about VMware’s relevance or relationship with AWS, I used the first 2min to explain VMware Cloud on AWS to them. Yes, I started talking about infrastructure and not about Tanzu, developer experience, our open source projects, and contributions, or Tanzu Labs. The people visiting us at the booth were impressed that VMware and AWS have even specialists only focusing on this solution. Still, they were not convinced yet that VMware can do something good for them.

VMC on AWS Overview

Okay, I got it. So what? What is the value?

How would someone with a VMware background answer such a question? Most of us usually see this situation as the right moment to talk about use cases like:

  • Data center exit or refresh (infrastructure modernization)
  • Burst Capacity
  • Low latency to AWS native services
  • Application modernization
  • Cloud migrations

So, which of these use cases are relevant and important to developers?

The developer’s story

The developers confirmed some statements of mine:

  • Cloud migrations take long and are not easy
  • Lift & shift migrations involve a lot of manual tasks
  • They either have to refactor their app on-premises first and then move to the public cloud or start from scratch on AWS

I say it again, software development is complex. Developers need to modernize existing applications on-premises and then migrate them somehow to AWS because you cannot always start from scratch.

Imagine this: You have an application that was deployed and operated for years in your data centers. Most probably you don’t even understand all the dependencies and configurations anymore since the years have passed. Maybe you are not even the guy who initially developed this application.

Note: The only thing that can be assumed, is, that your infrastructure is most likely running on a VMware-based cloud.

Now you need to start modernizing this application, which takes months or even years. When you are done with your task, you have to figure out how to bring this application over to AWS. Because you had to spend all your time refactoring this application, there was no time to build new AWS skills. At least not during normal office hours.

Lift and shift is easy, right?

Nope. When it would be easy, why does the migration in most cases take longer than expected and cost more than expected? When you have to exit a data center for any reason and need to bring some of your workloads over to a public cloud like AWS, then a lift and shift approach is the best and fastest approach. But somehow organizations do not see much value in using this approach during their cloud adoption. At least not with VMware.

But if a consulting firm or AWS themselves tell the customer, that lift and shift is a good idea, their customers suddenly see the benefit even if they have to add millions to their estimated budget. Consulting firms are not cheap, and neither are lift and shift projects with different underlying technologies like having VMware as the source site on-premises and AWS (or any other public cloud provider) as the destination. But hey, good for your company if they have this extra money.

AWS Lift and Shift

Lift and shift brings no innovation

Different organizations have different agendas and goals. For some, solely running their virtual machines and containers, and using cloud native services is enough for them – no matter the costs. Others expect that economies of scale bring the necessary cost advantages over time while they implement and deliver innovation.

That is why some companies see lift and shift as the approach, which brings no innovation. It is complex, not easy, takes longer, costs more and in the end, you don’t use cloud native services (yet).

It is time now to change the perspective and narrative because I get why you think that lift and shift brings no innovation.

Forget Lift and Shift – Do Lift and Learn

So, our use case here is application modernization. A developer needs to modernize and migrate an application, ideally at the same time. No wonder why some of you may think that lift and shift brings no innovation: because you modernize later. 

Developers struggle. They struggle very much. After I explained VMware Cloud on AWS and mentioned, that a lift and learn approach is the better way that makes their life much easier, they asked me for my business card. It took less than 24h until I received my first two e-mails to organize a meeting.

Give developers more time.

Developers and ops teams need to have enough time to skill up, to learn and understand the new things. You have to break and fix things first in the new world before you can truly understand it. They loved the idea of lift and learn:

  1. Lift and shift your applications first with VMware Cloud on AWS. A true hybrid cloud approach, where the technology format is the same (on-prem and on AWS), will speed up your cloud adoption timeline and therefore save costs. Your workload now runs in the public cloud. Check!
  2. Since the cloud migration didn’t take 12 months, but more something like 3-4 months, your developers can use the additional time to learn and understand how to build things on AWS! The developers are happy because they have less pressure now and can play around with new stuff.
  3. After they have understood the new world, they can start modernizing different parts of the application. What has started with a legacy/traditional application, becomes a hybrid application and eventually a fully modernized app over time.

Figure 4. Connectivity examples for AWS Cloud storage services

The stepping stone to becoming cloud native

Some of you may think now that VMware and its solution with VMC on AWS is just a temporary solution before going completely, cloud native. Let us take a step back again quickly.

When I joined VMware in 2018, they talked about 70mio workloads running on their platform. This year at VMware Explore (formerly VMworld) they showed several 85mio VMware-based workloads. This is proof to me, that:

  • the cloud adoption does not happen as fast as expected,
  • on-premises data centers and VMware is not legacy,
  • VMware is more than only a “hypervisor” company,
  • cloud native and container-based workloads do not always make sense and
  • virtual machines are still going to exist for a while.

These are some pointers to why AWS has this partnership with VMware. As you can see, VMware is very strategic and relevant and should be part of every cloud and application modernization conversation.

Call to action

Just because a lot of people say that developers do not care about ops and are not interested in talking to “infrastructure guys” like me, does not mean that this statement/assumption is true. My conversations from AWS Swiss Cloud Day 2022 clearly showed that developers need to know more about the options and value that companies like VMware can bring to the table.

Do not let developers only talk to developers. Do lift and learn.

VMware Explore US 2022 – VMware Projects and Day 2 Announcements

VMware Explore US 2022 – VMware Projects and Day 2 Announcements

Last year at VMworld 2021, VMware mentioned and announced a lot of (new) projects they are working on. What happened to them and which new VMware projects have been mentioned this year at VMware Explore so far?

Project Ensemble – VMware Aria Hub

VMware unveiled their unified multi-cloud management portfolio called VMware Aria, which provides a set of end-to-end solutions for managing the cost, performance, configuration, and delivery of infrastructure and cloud native applications.

VMware Aria is anchored by VMware Aria Hub (formerly known as Project Ensemble), which provides centralized views and controls to manage the entire multi-cloud environment, and leverages VMware Aria Graph to provide a common definition of applications, resources, roles, and accounts.

VMware Aria Graph provides a single source of truth that is updated in near-real time. Other solutions on the market were designed in a slower moving era, primarily for change management processes and asset tracking. By contrast, VMware Aria Graph is designed expressly for cloud-native operations.

VMware Explore US 2022 Session: A Unified Cloud Management Control Plane – Update on Project Ensemble [CMB2210US]

Project Monterey – DPU-based Acceleration for NSX

Last year introduced as Project Monterey and in technology preview, VMware announced the GA version of Monterey called DPU-based Acceleration for NSX yesterday.

Project Arctic – vSphere+ and vSAN+

Project Arctic has been introduced last year as a Technology Preview and was described as “the next step in the evolution of vSphere in a multi-cloud world”. What has started with the idea of bringing VMware Cloud services closer to vSphere, has evolved to a even more interesting and enterprise-ready version called vSphere+ and vSAN+. It includes developer services that consist of the Tanzu Kubernetes Grid runtime, Tanzu Mission Control Essentials and NSX Advanced Load Balancer Essentials. VMware is going to add more and more VMware Cloud add-on services in the future. Additionally, VMware even introduced VMware Cloud Foundation+.

Project Iris – Application Transformer for VMware Tanzu

VMware mentioned Project Iris very briefly last year at VMworld. In February 2022, Project Iris became generally available and is since then known as Application Transformer for VMware Tanzu.

Project Northstar

At VMware Explore on day 1, VMware introduced Project Northstar, which will provide customers a centralized cloud console that gives them instant access to networking and security services, such as network and security policy controls, Network Detection and Response (NDR), NSX Intelligence, Advanced Load Balancing (ALB), Web Application Firewall (WAF), and HCX. Project Northstar will be able to apply consistent networking and security policies across private cloud, hybrid cloud, and multi-cloud environments.

Graphical user interface Description automatically generated

VMware Explore US 2022 Session: Multi-Cloud Networking and Security with NSX [NETB2154US]

Project Watch

At VMware Explore on day 1,VMware unveiled Project Watch, a new approach to multi-cloud networking and security that will provide advanced app-to-app policy controls to help with continuous risk and compliance assessment. In technology preview, Project Watch will help network security and compliance teams to continuously observe, assess, and dynamically mitigate risk and compliance problems in composite multi-cloud applications.

Project Trinidad

Also announced at VMware Explore day 1 and further explained at day 2, Project Trinidad extends VMware’s API security and analytics by deploying sensors on Kubernetes clusters and uses machine learning with business logic inference to detect anomalous behavior in east-west traffic between microservices.

Project Narrows

Project Narrows introduces a unique addition to Harbor, allowing end users to assess the security posture of Kubernetes clusters at runtime. Images previously undetected, will be scanned at the time of introduction to a cluster, so vulnerabilities can now be caught, images may be flagged, and workloads quarantined.

Project Narrows adding dynamic scanning to your software supply chain with Harbor is critical. It allows greater awareness and control of your running workloads than the traditional method of simply updating and storing workloads.

VMware is open sourcing the initial capabilities of Project Narrows on GitHub as the Cloud Native Security Inspector (CNSI) Project.

VMware Explore US 2022 Session: Running App Workloads in a Trusted, Secure Kubernetes Platform [VIB1443USD]

Project Keswick

Also introduced on day 2, Project Keswick is about simplifying edge deployments at scale. It comes as an xLabs project coming out of the Advanced Technology Group in VMware’s Office of the CTO.

Bild

A Keswick deployment is entirely automated and uses Git as a single source of truth for a declarative way to manage your infrastructure and applications through desired state configuration enabled by GitOps. This ensures the infrastructure and applications running at the edge are always exactly what they need to be.

VMware Explore US 2022 Session: Edge Computing: What’s Next? [VIB1457USD]

Project Newcastle

At VMworld 2021, VMware talked the first time (I think) about cryptographic agility and even showed a short demo of a Post Quantum Cryptography (PQC) enabled Unified Access Gateway (using a proxy-based approach): 

Diagram of an HAProxy with TLS Termination and Quantum-Safe Cipher Support as a reverse proxy to communicate with a quantum-safe web browser.

At VMware Explore 2022 day 2, VMware demonstrated what they believe to be the world’s first quantum-safe multi-cloud application!

VMware developed and presented Project Newcastle, a policy-based framework enabling and orchestrating cryptographic transition in modern applications.

Integrated with Tanzu Service Mesh, Project Newcastle gives users greater insight into the cryptography in their applications. But that’s not all — as a platform for cryptographic agility, Project Newcastle automates the process of reconfiguring an application’s cryptography to comply with user-defined policies and industry standards.

Closing Comment

Which VMware projects excite you the most? I’m definitely going with Project Ensemble (Aria Hub) and Project Newcastle!

Interclouds And The Future of Cloud Computing

Interclouds And The Future of Cloud Computing

I am finally taking the time to write this piece about interclouds, workload mobility and application portability. Some of my engagements during the past four weeks led me several times to discussions about interclouds and workload mobility.

Cloud to Cloud Interoperability and Federation

Who has thought back in 2012 that we will have so many (public) cloud providers like AWS, Azure, Google Cloud, IBM Cloud, Oracle Cloud etc. in 2022?

10 years ago, many people and companies were convinced that the future consists of public cloud infrastructure only and that local self-managed data centers are going to disappear.

This vision and perception of cloud computing has dramatically changed over the past few years. We see public cloud providers stretching their cloud services and infrastructure to large data centers or edge locations. It seems they realized, that the future is going to look differently than a lot of people anticipated back then.

I was not aware that the word “intercloud” and the need for it exists for a long time already apparently. Let’s take David Bernstein’s presentation as an example, which I found by googling “intercloud”:

This presentation is about avoiding the mistake of using proprietary protocols and cloud infrastructures that lead to silos and a non-interoperable architecture. He was part of the IEEE Intercloud Working Group (P2302) which was working on a standard for “Intercloud Interoperability and Federation (SIIF)” (draft), which mentioned the following:

Currently there are no implicit and transparent interoperability standards in place in order for disparate
cloud computing environments to be able to seamlessly federate and interoperate amongst themselves.
Proposed P2302 standards are a layered set of such protocols, called “Intercloud Protocols”, to solve the interoperability related challenges. The P2302 standards propose the overall design of decentralized, scalable, self-organizing federated “Intercloud” topology.

David Bernstein Intercloud

I do not know David Bernstein and the IEEE working group personally, but it would be great to hear from some of them, what they think about the current cloud computing architectures and how they envision the future of cloud computing for the next 5 or 10 years.

As you can see, the wish for an intercloud protocol or an intercloud exists since a while. Let us quickly have a look how others define intercloud:

Cisco in 2008 (it seems that David Bernstein worked at Cisco that time). Intercloud is a network of clouds that are linked with each other. This includes private, public, and hybrid clouds that come together to provide a seamless exchange of data.

teradata. Intercloud is a cloud deployment model that links multiple public cloud services together as one holistic and actively orchestrated architecture. Its activities are coordinated across these clouds to move workloads automatically and intelligently (e.g., for data analytics), based on criteria like their cost and performance characteristics.

The Future of Cloud Computing

I found this post on Twitter on May 19th, 2022:

Alvin Cheung Berkeley Intercloud

Alvin Cheung is an associate professor at Berkeley EECS and wrote the following in his Twitter comments:

we argue that cloud computing will evolve to a new form of inter-cloud operation: instead of storing data and running code on a single cloud provider, apps will run on an inter-operating set of cloud providers to leverage their specialized services / hw / geo etc, much like ISPs.

Alvin and his colleagues wrote a publication which states “A Berkeley View on the Future of Cloud Computing” that mentions the following very early in the PDF:

We predict that this market, with the appropriate intermediation, could evolve into one with a far greater emphasis on compatibility, allowing customers to easily shift workloads between clouds.

[…] Instead, we argue that to achieve this goal of flexible workload placement, cloud computing will require intermediation, provided by systems we call intercloud brokers, so that individual customers do not have to make choices about which clouds to use for which workloads, but can instead rely on brokers to optimize their desired criteria (e.g., price, performance, and/or execution location).

We believe that the competitive forces unleashed by the existence of effective intercloud brokers will create a thriving market of cloud services with many of those services being offered by more than one cloud, and this will be sufficient to significantly increase workload portability.

Intercloud Broker

Organizations place their workloads in that cloud which makes the most sense for them. Depending on different regulations, data classification, different cloud services, locations, or pricing, they then decide which data or workload goes to which cloud.

The people from Berkeley do not necessarily promote a multi-cloud architecture, but have the idea of an intercloud broker that places your workload on the right cloud based on different factors. They see the intercloud as an abstraction layer with brokering services:

In my understanding their idea goes towards the direction of an intelligent and automated cloud management platform that takes the decision where a specific workload and its data should be hosted. And that it, for example, migrates the workload to another cloud which is cheaper than the current one.

Cloud Native Technologies for Multi-Cloud

Companies are modernizing/rebuilding their legacy applications or create new modern applications using cloud native technologies. Modern applications are collections of microservices, which are light, fault tolerant and small. These microservices can run in containers deployed on a private or public cloud.

Which means, that a modern application is something that can adapt to any environment and perform equally well.

The challenge today is that we have modern architectures, new technologies/services and multiple clouds running with different technology stacks. And we have Kubernetes as framework, which is available in different formats (DIY or offerings like Tanzu TKG, AKS, EKS, GKE etc.)

Then there is the Cloud Native Computing Foundation (CNCF) and the open source community which embrace the principal of “open” software that is created and maintained by a community.

It is about building applications and services that can run on any infrastructure, which also means avoiding vendor or cloud lock-in.

Challenges of Interoperability and Multiple Clouds

If you discuss multi-cloud and infrastructure independent applications, you mostly end up with an endless list of questions like:

  • How can we achieve true workload mobility or application portability?
  • How do we deal with the different technology formats and the “language” (API) of each cloud?
  • How can we standardize and automate our deployments?
  • Is latency between clouds a problem?
  • What about my stateful data?
  • How can we provide consistent networking and security?
  • What about identity federation and RBAC?
  • Is the performance of each cloud really the same?
  • How should we encrypt traffic between services in multiple clouds?
  • What about monitoring and observability?

Workload Mobility and Application Portability without an Intercloud

VMware has a different view and approach how workload mobility and application portability can be achieved.

Their value add and goal is the same, but with a different strategy of abstracting clouds.

VMware is not building an intercloud but they provide customer a  technology stack (compute, storage, networking), or a cloud operating system if you will, that can run on top of every major public cloud provider like AWS, Azure, Google Cloud, IBM Cloud, Oracle Cloud and Alibaba Cloud.

VMware Workload Mobility

This consistent infrastructure makes it especially for virtual machines and legacy applications extremely easy to be migrated to any location.

What about modern applications and Kubernetes? What about developers who do not care about (cloud) infrastructures?

Project Cascade

At VMworld 2021, VMware announced the technology preview of “Project Cascade” which will provide a unified Kubernetes interface for both on-demand infrastructure (IaaS) and containers (CaaS) across VMware Cloud – available through an open command line interface (CLI), APIs, or a GUI dashboard.

The idea is to provide customers a converged IaaS and CaaS consumption service across any cloud, exposed through different Kubernetes APIs.

VMware Project Cascade

I heard the statement “Kubernetes is complex and hard” many times at KubeCon Europe 2022 and Project Cascade is clearly providing another abstraction layer for VM and container orchestration that should make the lives of developers and operators less complex.

Project Ensemble

Another project in tech preview since VMworld last year is “Project Ensemble“. It is about multi-cloud management platform that provides an app-centric self-service portal with predictive support.

Project Ensemble will deliver a unified consumption surface that meets the unique needs of the cloud administrator and SRE alike. From an architectural perspective, this means creating a platform designed for programmatic consumption and a firm “API First” approach.

I can imagine that it will be a service that leverages artificial intelligence and machine learning to simplify troubleshooting and that is capable in the future to intelligently place or migrate your workloads to the appropriate or best cloud (for example based on cost) including all attached networking and security policies.

Conclusion

I believe that VMware is on the right path by giving customers the option to build a cloud-agnostic infrastructure with the necessary abstraction layers for IaaS and CaaS including the cloud management platform. By providing a common way or standard to run virtual machines and containers in any cloud, I am convinced, VMware is becoming the defacto standard for infrastructure for many enterprises.

VMware Vision and Strategy 2022

By providing a consistent cloud infrastructure and a consistent developer model and experience, VMware bridges the gap between the developers and operators, without the need for an intercloud or intercloud protocol. That is the future of cloud computing.

 

Other relevant resources: