The State of Application Modernization 2025

Every few weeks, I find myself in a conversation with customers or colleagues where the topic of application modernization comes up. Everyone agrees that modernization is more important than ever. The pressure to move faster, build more resilient systems, and increase operational efficiency is not going away.

But at the same time, when you look at what has actually changed since 2020… it is surprising how much has not.

We are still talking about the same problems: legacy dependencies, unclear ownership, lack of platform strategy, organizational silos. New technologies have emerged, sure. AI is everywhere, platforms have matured, and cloud-native patterns are no longer new. And yet, many companies have not even started building the kind of modern on-premises or cloud platforms needed to support next-generation applications.

It is like we are stuck between understanding why we need to modernize and actually being able to do it.

Remind me, why do we need to modernize?

When I joined Oracle in October 2024, some people reminded me that most of us do not know why we are where we are. One could say that it is not important to know that. In my opinion, it very much is. Something has fundamentally changed in the past that has led us to our situation.

In the past, when we moved from physical servers to virtual machines (VMs), apps did not need to change. You could lift and shift a legacy app from bare metal to a VM and it would still run the same way. The platform changed, but the application did not care. It was an infrastructure-level transformation without rethinking the app itself. So, the physical-to-virtual (P2V) transition of an application was smooth and uncomplicated.

But now? The platform demands change.

Cloud-native platforms like Kubernetes, serverless runtimes, or even fully managed cloud services do not just offer a new home. They offer a whole new way of doing things. To benefit from them, you often have to re-architect how your application is built and deployed.

That is the reason why enterprises have to modernize their applications.

What else is different?

User expectations, business needs, and competitive pressure have exploded as well. Companies need to:

  • Ship features faster
  • Scale globally
  • Handle variable load
  • Respond to security threats instantly
  • Reduce operational overhead

A Quick Analogy

Think of it like this: moving from physical servers to VMs was like transferring your VHS tapes to DVDs. Same content, just a better format.

But app modernization? That is like going from DVDs to Netflix. You do not just change the format, but you rethink the whole delivery model, the user experience, the business model, and the infrastructure behind it.

Why Is Modernization So Hard?

If application modernization is so powerful, why is everyone not done with it already? The truth is, it is complex, disruptive, and deeply intertwined with how a business operates. Organizations often underestimate how much effort it takes to replatform systems that have evolved over decades. Here are six common challenges companies face during modernization:

  1. Legacy Complexity – Many existing systems are tightly coupled, poorly documented, and full of business logic buried deep in spaghetti code. 
  2. Skill Gaps – Moving to cloud-native tech like Kubernetes, microservices, or DevOps pipelines requires skills many organizations do not have in-house. Upskilling or hiring takes time and money.
  3. Cultural Resistance – Modernization often challenges organizational norms, team structures, and approval processes. People do not always welcome change, especially if it threatens familiar workflows.
  4. Data Migration & Integration – Legacy apps are often tied to on-prem databases or batch-driven data flows. Migrating that data without downtime is a massive undertaking.
  5. Security & Compliance Risks – Introducing new tech stacks can create blind spots or security gaps. Modernizing without violating regulatory requirements is a balancing act.
  6. Cost Overruns – It is easy to start a cloud migration or container rollout only to realize the costs (cloud bills, consultants, delays) are far higher than expected.

Modernization is not just a technical migration. It’s a transformation of people, process, and platform (technology). That is why it is hard and why doing it well is such a competitive advantage!

Technical Debt Is Also Slowing Things Down

Also known as the silent killer of velocity and innovation: technical debt.

Technical debt is the cost of choosing a quick solution now instead of a better one that would take longer. We have all seen/done it. 🙂 Sometimes it is intentional (you needed to hit a deadline), sometimes it is unintentional (you did not know better back then). Either way, it is a trade-off. And just like financial debt, it accrues interest over time.

Here is the tricky part: technical debt usually doesn’t hurt you right away. You ship the feature. The app runs. Management is happy.

But over time, debt compounds:

  • New features take longer because the system is harder to change

  • Bugs increase because no one understands the code

  • Every change becomes risky because there is no test safety net

Eventually, you hit a wall where your team is spending more time working around the system than building within it. That is when people start whispering: “Maybe we need to rewrite it.”  Or they just leave your company.

Let me say it: Cloud Can Also Introduce New Debt

Cloud-native architectures can reduce technical debt, but only if used thoughtfully.

You can still:

  • Over-complicate microservices

  • Abuse Kubernetes without understanding it

  • Ignore costs and create “cost debt”

  • Rely on too many services and lose track

Use the cloud to eliminate debt by simplifying, automating, and replacing legacy patterns, not just lifting them into someone else’s data center.

It Is More Than Just Moving to the Cloud 

Modernization is about upgrading how your applications are built, deployed, run, and evolved, so they are faster, cheaper, safer, and easier to change. Here are some core areas where I see organizations making real progress:

  • Improving CI/CD. You can’t build modern applications if your delivery process is stuck in 2010.
  • Data Modernization. Migrate from monolithic databases to cloud-native, distributed ones.
  • Automation & Infrastructure as Code. It is the path to resilience and scale.
  • Serverless Computing. It is the “don’t worry about servers” mindset and ideal for many modern workloads.
  • Containerizing Workloads. Containers are a stepping stone to microservices, Kubernetes, and real DevOps maturity.
  • Zero-Trust Security & Cybersecurity Posture. One of the biggest priorities at the moment.
  • Cloud Migration. It is not about where your apps run. It is about how well they run there. “The cloud” should make you faster, safer, and leaner.

As you can see, application modernization is not one thing, it’s many things. You do not have to do all of these at once. But if you are serious about modernizing, these points (and more) must be part of your blueprint. Modernization is a mindset.

Why (replatforming) now?

There are a few reasons why application modernization projects are increasing:

  • The maturity of cloud-native platforms: Kubernetes, managed databases, and serverless frameworks have matured to the point where they can handle serious production workloads. They are no longer “bleeding edge”.
  • DevOps and Platform Engineering are mainstream: We have shifted from siloed teams to collaborative, continuous delivery models. But that only works if your platform supports it.
  • AI and automation demand modern infrastructure: To leverage modern AI tools, event-driven data, and real-time analytics, your backend can’t be a 2004-era database with a web front-end duct-taped to it.

Conclusion

There is no longer much debate: (modern) applications are more important than ever. Yet despite all the talk around cloud-native technologies and modern architectures, the truth is that many organizations are still trying to catch up and are working hard to modernize not just their applications, but also the infrastructure and processes that support them.

The current progress is encouraging, and many companies have learned from the experience of their first modernization projects.

One thing that is becoming harder to ignore is how much the geopolitical situation is starting to shape decisions around application modernization and cloud adoption. Concerns around data sovereignty, digital borders, national cloud regulations, and supply chain security are no longer just legal or compliance issues. They are shaping architecture choices.

Some organizations are rethinking their cloud and modernization strategies, looking at multi-cloud or hybrid models to mitigate risk. Others are delaying cloud adoption due to regional uncertainty, while a few are doubling down on local infrastructure to retain control. It is not just about performance or cost anymore, but also about resilience and autonomy.

The global context (suddenly) matters, and it is influencing how platforms are built, where data lives, and who organizations choose to partner with. If anything, it makes the case even stronger for flexible, portable, cloud-native architectures, so you are not locked into a single region or provider.

From Monolithic Data Centers to Modern Private Clouds

Behind every shift from old-school to new-school, there is a bigger story about people, power, and most of all, trust. And nowhere is that clearer than in the move from traditional monolithic data centers to what we now call a modern private cloud infrastructure.

A lot of people still think this evolution is just about better technology, faster hardware, or fancier dashboards. But it is not. If you zoom out, the core driver is not features or functions, it is trust in the executive vision, and the willingness to break from the past.

Monolithic data centers stall innovation

But here is the problem: monoliths do not scale in a modern world (or cloud). They slow down innovation, force one-size-fits-all models, and lock organizations into inflexible architectures. And as organizations grew, the burden of managing these environments became more political than practical.

The tipping point was not when better tech appeared. It was when leadership stopped trusting that the monolithic data centers with the monolithic applications could deliver what the business actually needed. That is the key. The failure of monolithic infrastructure was not technical – it was cultural.

Hypervisors are not the platform you think

Let us make that clear: hypervisors are not platforms! They are just silos and one piece of a bigger puzzle.

Yes, they play a role in virtualization. Yes, they helped abstract hardware and brought some flexibility. But let us not overstate it, they do not define modern infrastructure or a private cloud. Hypervisors solve a problem from a decade ago. Modern private infrastructure is not about stacking tools, it is about breaking silos, including the ones created by legacy virtualization models.

Private Cloud – Modern Infrastructure

So, what is a modern private infrastructure? What is a private cloud? It is not just cloud-native behind your firewall. It is not just running Kubernetes on bare metal. It is a mindset.

You do not get to “modern” by chasing features or by replacing one vendor’s virtualization solution with another’s. You get there by believing in the principles of openness, automation, decentralization, and speed. And that trust has to start from the top. If your CIO or CTO is still building for audit trails and risk reduction as their north star, you will end up with another monolithic data center stack. Just with fancier logos.

But if leadership leans into trust – trust in people, in automation, in feedback loops – you get a system that evolves. Call it modern. Call it next-gen.

It was never about the technology

We moved from monolithic data centers not because the tech got better (though it did), but because people stopped trusting the old system to serve the new mission.

And as we move forward, we should remember: it is not hypervisors or containers or even clouds that shape the future. It is trust in execution, leadership, and direction. That is the real platform everything else stands on. If your architecture still assumes manual control, ticketing systems, and approvals every step of the way, you are not building a modern infrastructure. You are simply replicating bureaucracy in YAML. A modern infra is about building a cloud that does not need micro-management.

Platform Thinking versus Control

A lot of organizations say they want a platform, but what they really want is control. Big difference.

Platform thinking is rooted in enablement. It is about giving teams consistent experiences, reusable services, and the freedom to ship without opening a support ticket every time they need a VM or a namespace.

And platform thinking only works when there is trust as well:

  • Trust in dev teams to deploy responsibly
  • Trust in infrastructure to self-heal and scale
  • Trust in telemetry and observability to show the truth

Trust is a leadership decision. It starts when execs stop treating infrastructure as a cost center and start seeing it as a product. Something that should deliver value, be measured, and evolve.

It is easy to get distracted. A new storage engine, a new control plane, a new AI-driven whatever. Features are tempting because they are measurable. You can point at them in a dashboard or a roadmap.

But features don’t create trust. People do. The most advanced platform in the world is useless if teams do not trust it to be available, understandable, and usable. 

So instead of asking “what tech should we buy?”, the real question is:

“Do we trust ourselves enough to let go of the old way?”

Because that is what building a modern private cloud is really about.

Trust at Scale

In Switzerland, we like things to work. Predictably. Reliably. On time. With the current geopolitical situation in the world, and especially when it comes to public institutions, that expectation is non-negotiable.

The systems behind those services are under more pressure than ever. Demands are rising and talent is shifting. Legacy infrastructure is getting more fragile and expensive. And at the same time, there is this quiet but urgent question being asked in every boardroom and IT strategy meeting:

Can we keep up without giving up control?

Public sector organizations (not only in Switzerland) face a unique set of constraints:

  • Critical infrastructure cannot go down, ever
  • Compliance and data protection are not just guidelines, they are legal obligations
  • Internal IT often has to serve a wide range of users, platforms, and expectations

So, it is no surprise that many of these organizations default to monolithic, traditional data centers. The logic is understandable: “If we can touch it, we can control it.”

But here is the reality: control does not scale. And legacy does not adapt. Staying “safe” with old infrastructure might feel responsible, but it actually increases long-term risk, cost, and technical debt. There is a temptation to approach modernization as a procurement problem: pick a new vendor, install a new platform, run a few migrations, and check the box. Done.

But transformation doesn’t work that way. You can’t buy your way out of a culture that does not trust change.

I understand, this can feel uncomfortable. Many institutions are structured to avoid mistakes. But modern IT success requires a shift from control to resilience, and it is not about perfection. It is only perfect until you need to adapt again.

How to start?

By now, it is clear: modern private cloud infrastructure is not about chasing trends or blindly “moving to the cloud.” It’s about designing systems that reflect what your organization values: reliability, control, and trust, while giving teams the tools to evolve. But that still leaves the hardest question of all:

Where do we start?

First, be transparent. Transparency is the first ingredient of trust. You can’t fix what you won’t name.

Second, modernizing safely does not mean boiling the ocean. It means starting with a thin slice of the future.

The goal is to identify a use case where you can:

  • Show real impact in under six months

  • Reduce friction for both IT and internal users

  • Create confidence that change is possible without risk

In short, it is about finding use cases with high impact but low risk.

Third, this is where a lot of transformation efforts stall. Organizations try to modernize the tech, but keep the old permission structures. The result? A shinier version of the same bottlenecks. Instead, shift from control to guardrails. Think less about who can approve what, and more about how the system enforces good behavior by default. For example:

  • Implement policy-as-code: rules embedded into the platform, not buried in documents

  • Automate security scans, RBAC, and drift detection

  • Give teams safe, constrained freedom instead of needing to ask for access

Guardrails enable trust without giving up safety. That’s the core of a modern infrastructure (private or public cloud).
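
To make the policy-as-code idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `NamespaceRequest` shape, the three rules, and the 16-core limit are made up, and a real platform would more likely express such rules in a dedicated policy engine. The point is only that the rules live in code and are evaluated automatically, so a compliant request needs no ticket and no manual approval.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class NamespaceRequest:
    """Hypothetical self-service request a team submits to the platform."""
    team: str
    name: str
    cpu_limit: int                      # requested CPU cores
    labels: dict = field(default_factory=dict)


# Each policy is a human-readable rule plus a check that returns True when
# the request complies. The rules and thresholds are illustrative only.
Policy = tuple[str, Callable[[NamespaceRequest], bool]]

POLICIES: list[Policy] = [
    ("name must start with the team prefix",
     lambda r: r.name.startswith(r.team + "-")),
    ("cpu limit must not exceed 16 cores",
     lambda r: r.cpu_limit <= 16),
    ("cost-center label is required",
     lambda r: "cost-center" in r.labels),
]


def evaluate(request: NamespaceRequest) -> list[str]:
    """Return the rules the request violates; an empty list means approved."""
    return [rule for rule, check in POLICIES if not check(request)]


if __name__ == "__main__":
    request = NamespaceRequest(
        team="payments",
        name="payments-api",
        cpu_limit=8,
        labels={"cost-center": "cc-42"},
    )
    violations = evaluate(request)
    print("approved" if not violations else f"rejected: {violations}")
```

A request that stays inside the guardrails is approved by default, and a rejected request tells the team exactly which rule it broke instead of sending them to an approval queue.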

And lastly, make trust measurable. Not just with uptime numbers or dashboards but with real signals:

  • Are teams delivering faster?

  • Are incidents down?

  • etc.

Make this measurable, visible, and repeatable. Success builds trust. Trust creates momentum.
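
As a rough sketch of what "measurable" could look like, the snippet below compares two reporting periods on the two signals named above. The `Period` structure, the choice of median lead time as the measure of delivery speed, and all numbers are assumptions for illustration, not data from a real organization.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Period:
    """Hypothetical per-quarter figures collected from delivery tooling."""
    lead_times_days: list[float]   # commit-to-production time per change
    incidents: int                 # production incidents in the period


def compare(before: Period, after: Period) -> dict:
    """Answer 'are teams delivering faster?' and 'are incidents down?'."""
    lead_before = median(before.lead_times_days)
    lead_after = median(after.lead_times_days)
    return {
        "median_lead_time_before": lead_before,
        "median_lead_time_after": lead_after,
        "delivering_faster": lead_after < lead_before,
        "incidents_down": after.incidents < before.incidents,
    }


if __name__ == "__main__":
    # Made-up example numbers.
    last_quarter = Period(lead_times_days=[10, 14, 9, 21], incidents=7)
    this_quarter = Period(lead_times_days=[4, 6, 5], incidents=3)
    for signal, value in compare(last_quarter, this_quarter).items():
        print(f"{signal}: {value}")
```

Publishing the same comparison every quarter is what makes the signal visible and repeatable.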

Final Thoughts

IT organizations do not need moonshots. They need measured, meaningful modernization. The kind that builds belief internally, earns trust externally, and makes infrastructure feel like an asset again.

The technology matters, but how you introduce it matters even more. 

Private Cloud Autarky – You Are Safe Until The World Moves On

I believe it was 2023 when the term “autarky” was mentioned during my conversations with several customers, who maintained their own data centers and private clouds. Interestingly, this word popped up again recently at work, but I only knew it from photovoltaic systems. And it kept my mind busy for several weeks.

What is autarky?

To understand autarky in the IT world and its implications for private clouds, an analogy from the photovoltaic (solar power) system world offers a clear parallel. Just as autarky in IT means a private cloud that is fully self-sufficient, autarky in photovoltaics refers to an “off-grid” solar setup that powers a home or facility without relying on the external electrical grid or outside suppliers.

Imagine a homeowner aiming for total energy independence – an autarkic photovoltaic system. Here is what it looks like:

  • Solar Panels: The homeowner installs panels to capture sunlight and generate electricity.
  • Battery: Excess power is stored in batteries (e.g., lithium-ion) for use at night or on cloudy days.
  • Inverter: A device converts solar DC power to usable AC power for appliances.
  • Self-Maintenance: The homeowner repairs panels, replaces batteries, and manages the system without calling a utility company or buying parts. 

This setup cuts ties with the power grid – no monthly bills, no reliance on power plants. It is a self-contained energy ecosystem, much like an autarkic private cloud aims to be a self-contained digital ecosystem.

Question: Which partner (installation company) has enough spare parts and how many homeowners can repair the whole system by themselves?

Let’s align this with autarky in IT:

  • Solar Panels = Servers and Hardware: Just as panels generate power, servers (compute, storage, networking) generate the cloud’s processing capability. Theoretically, an autarkic private cloud requires the organization to build its own servers, similar to crafting custom solar panels instead of buying from any vendor.
  • Battery = Spares and Redundancy: Batteries store energy for later; spare hardware (e.g., extra servers, drives, networking equipment) keeps the cloud running when parts fail. 
  • Inverter = Software Stack: The inverter transforms raw power into usable energy, like how a software stack (OS, hypervisor) turns hardware into a functional cloud.
  • Self-Maintenance = Internal Operations: Fixing a solar system solo parallels maintaining a cloud without vendor support – both need in-house expertise to troubleshoot and repair everything.

Let me repeat it: both need in-house expertise to troubleshoot and repair everything. Everything.

The goal is self-sufficiency and independence. So, what are companies doing?

An autarkic private cloud might stockpile Dell servers or Nvidia GPUs upfront, but that first purchase ties you to external vendors. True autarky would mean mining silicon and forging chips yourself – impractical, just like growing your own silicon crystals for panels.

The problem

In practice, autarky for private clouds sounds like an extreme goal. It promises maximum control, which is ideal for scenarios like military secrecy, regulatory isolation, or distrust of global supply chains, but it clashes with the realities of modern IT:

  • Once the last spare dies, you are done. No new tech without breaking autarky.
  • Autarky trades resilience for stagnation. Your cloud stays alive but grows irrelevant.
  • Autarky’s price tag limits it to tiny, niche clouds – not hyperscale rivals.
  • Future workloads are a guessing game. Stockpile too few servers, and you can’t expand. Too many, and you have wasted millions. A 2027 AI boom or quantum shift could make your equipment useless.

But where is this idea of self-sufficiency or sovereign operations coming from? Nowadays? Geopolitical resilience.

Sanctions or trade wars will not starve your cloud. A private (hyperscale) cloud that answers to no one, free from external risks or influence. That is the whole idea.

What is the probability of such sanctions? Who knows… but this is a number that has to be defined for each case depending on the location/country, internal and external customers, and requirements.

If it happens, is it foreseeable, and what does it force you to do? Does it trigger a cloud-exit scenario?

I just know that if there are sanctions, any hyperscaler in your country has the same problems. No matter if it is a public or dedicated region. That is the blast radius. It is not only about you and your infrastructure anymore.

What about private disconnected hyperscale clouds?

When hosting workloads in public clouds, organizations care more about data residency, regulations, and the US CLOUD Act, and less about autarky.

Hyperscale clouds like Microsoft Azure and Oracle Cloud Infrastructure (OCI) are built to deliver massive scale, flexibility, and performance but they rely on complex ecosystems that make full autarky impossible. Oracle offers solutions like OCI Dedicated Region and Oracle Alloy to address sovereignty needs, giving customers more control over their data and operations. However, even these solutions fall short of true autarky and absolute sovereign operations due to practical, technical, and economic realities.

A short explanation from Microsoft gives us a hint why that is the case:

Additionally, some operational sovereignty requirements, like Autarky (for example, being able to run independently of external networks and systems) are infeasible in hyperscale cloud-computing platforms like Azure, which rely on regular platform updates to keep systems in an optimal state.

So, what are customers asking for when they are interested in hosting their own dedicated cloud region in their data centers? Disconnected hyperscale clouds.

But hosting an OCI Dedicated Region in your data center does not change the underlying architecture of Oracle Cloud Infrastructure (OCI). Nor does it change the upgrade or patching process, or the whole operating model.

Hyperscale clouds do not exist in a vacuum. They lean on a web of external and internal dependencies to work:

  • Hardware Suppliers. For example, most public clouds use Nvidia’s GPUs for AI workloads. Without these vendors, hyperscalers could not keep up with the demand.
  • Global Internet Infrastructure. Hyperscalers need massive bandwidth to connect users worldwide. They rely on telecom giants and undersea cables for internet backbone, plus partnerships with content delivery networks (CDNs) like Akamai to speed things up.
  • Software Ecosystems. Open-source tools like Linux and Kubernetes are part of the backbone of hyperscale operations.
  • Operations. Think about telemetry data and external health monitoring.

Innovation depends on ecosystems

The tech world moves fast. Open-source software and industry standards let hyperscalers innovate without reinventing the wheel. OCI’s adoption of Linux or Azure’s use of Kubernetes shows they thrive by tapping into shared knowledge, not isolating themselves. Going it alone would skyrocket costs. Designing custom chips, giving away or sharing operational control or skipping partnerships would drain billions – money better spent on new features, services or lower prices.

Hyperscale clouds are global by nature, and this includes Oracle Dedicated Region and Alloy. In return you get:

  • Innovation
  • Scalability
  • Cybersecurity
  • Agility
  • Reliability
  • Integration and Partnerships

Again, by nature and design, hyperscale clouds – even those hosted in your data center as private clouds (OCI Dedicated Region and Alloy) – are still tied to a hyperscaler’s software repositories, third-party hardware, operations personnel, and global infrastructure.

Sovereignty is real, autarky is a dream

Autarky sounds appealing: a hyperscale cloud that answers to no one, free from external risks or influence. Imagine OCI Dedicated Region or Oracle Alloy as self-contained kingdoms, untouchable by global chaos.

Autarky sacrifices expertise for control, and the result would be a weaker, slower and probably less secure cloud. Self-sufficiency is not cheap. Hyperscalers spend billions of dollars yearly on infrastructure, leaning on economies of scale and vendor deals. Tech moves at lightning speed. New GPUs drop yearly, software patches roll out daily (think about 1,000 updates/patches a month). Autarky means falling behind. It would turn your hyperscale cloud into a relic.

Please note, there are other solutions like air-gapped isolated cloud regions, but those are for a specific industry and set of customers.

From Cloud-First to Cloud-Smart to Repatriation

VMware Explore 2024 happened this week in Las Vegas. I think many people were curious about what Hock Tan, CEO of Broadcom, had to say during the general session. He delivered interesting statements and let everyone in the audience know that “the future of enterprise is private – private cloud, private AI, fueled by your own private data”. On social media, the following slide about “repatriation” made quite some noise:

[Image: VMware Explore 2024 keynote slide on repatriation]

The information on this slide came from the Barclays CIO Survey in April 2024 and it says that 8 out of 10 CIOs today are planning to move workloads from the public cloud back to their on-premises data centers. It is interesting, and in some cases even funny, that other vendors in the hardware and virtualization business are chasing this ambulance now. Cloud migrations are dead, let us do reverse cloud migrations now. Hybrid cloud is dead, let us do hybrid multi-clouds now and provide workload mobility. My social media walls are full of such postings now. It seems Hock Tan presented the Holy Grail to the world.

Where does this change of mind come from? Why did only 43% plan a reverse cloud migration during COVID-19, and now “suddenly” more than 80%?

I could tell you the story now about cloud-first not being cool anymore, that organizations started to follow a smarter cloud approach, and then concluded that cloud migrations are still not happening based on their expectations (e.g., costs and complexity). And that it is time now to bring workloads back on-premises. It is not that simple.

I looked at the Barclays CIO Survey and the chart (figure 20 in the survey) that served as a source for Hock Tan’s slide:

[Image: Barclays CIO Survey April 2024, cloud repatriation]

We must be very careful with our interpretation of the results. Just because someone is “planning” a reverse cloud migration, does it mean they are executing? And if they execute such an exercise, is this going to be correctly reflected in a future survey?

And which are the workloads and services that are brought back to an enterprise’s data center? Are we talking about complete applications? Or is it more about load balancers, security appliances, databases and storage, and specific virtual machines? And if we understand the workloads, what are the real reasons to bring them back? Figure 22 of the survey shows “Workloads that Respondents Intend to Move Back to Private Cloud / On-Premise from Public Cloud”:

[Image: Barclays CIO Survey April 2024, workloads to migrate]

Okay, we have a little bit more context now. Just because some workloads are potentially migrated back to private clouds, what does it mean for public cloud vs. private cloud spend? Question #11 of the survey “What percentage of your workloads and what percentage of your total IT spend are going towards the public cloud, and how have those evolved over time?” focuses on this matter.

[Image: Barclays CIO Survey April 2024, percentage of workloads and spend]

My interpretation? Just because one slide or illustration talks about repatriation does not mean that the entire world is just doing reverse migrations now. Cloud migrations and reverse cloud migrations can happen at the same time. You could bring one application or some databases back on-premises but decide to move all your virtual desktops to the public cloud in parallel. We could still bring workloads back to our data center and increase public cloud spend.
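
A small worked example shows how both things can be true at once. The workloads, locations, and monthly costs below are invented for illustration and are not taken from the Barclays survey; the sketch only demonstrates the arithmetic of repatriating two workloads while public cloud spend still goes up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    location: str      # "public" or "on_prem"
    monthly_cost: int  # hypothetical cost in thousands per month


def public_cloud_spend(workloads: list[Workload]) -> int:
    """Sum the monthly cost of everything running in the public cloud."""
    return sum(w.monthly_cost for w in workloads if w.location == "public")


def repatriated(before: list[Workload], after: list[Workload]) -> list[str]:
    """Names of workloads that moved from the public cloud to on-premises."""
    previous = {w.name: w.location for w in before}
    return sorted(
        w.name for w in after
        if previous.get(w.name) == "public" and w.location == "on_prem"
    )


# Made-up portfolio: one application and one database come back on-premises,
# while the virtual desktops move to the public cloud in the same period.
BEFORE = [
    Workload("app-a", "public", 40),
    Workload("database-b", "public", 30),
    Workload("virtual-desktops", "on_prem", 100),
]
AFTER = [
    Workload("app-a", "on_prem", 35),
    Workload("database-b", "on_prem", 25),
    Workload("virtual-desktops", "public", 90),
]

if __name__ == "__main__":
    print("repatriated:", repatriated(BEFORE, AFTER))
    print("public spend before:", public_cloud_spend(BEFORE))
    print("public spend after:", public_cloud_spend(AFTER))
```

In this made-up portfolio, a survey would count the organization as "repatriating" (two workloads came back), yet its public cloud spend rose from 70 to 90.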

Sounds like cloud-smart again, doesn’t it? Maybe I am an organization that realized that the applications A, B, C, and D shouldn’t run in Azure, AWS, Google, and Oracle anymore, but the applications W, X, Y, and Z are better suited for these hyperscalers.

What else?

I am writing about my views and my opinions here. There is more to share. During the pandemic, everything had to happen very quickly, and everyone suddenly had money to speed up migrations and application modernization projects. I think it is only natural that everything slowed down a bit after this difficult and exhausting phase.

Some of the IT teams are probably still documenting all their changes and new deployments on an internal wiki, and their bosses started to hire FinOps specialists to analyze their cloud spend. It is no surprise to me that some of the financial goals have not been met, resulting in a reverse cloud migration a few years later.

But that is not all. Try to think about the past years. What else happened?

Yes, we almost forgot about Artificial Intelligence (AI) and Sovereign Clouds.

Before 2020, not many of us were thinking about sovereign clouds, data privacy, and AI.

Most enterprises are still hosting their data on-premises behind their own firewall. And some of this data is used to train or fine-tune models. We see (internal) chatbots popping up using Retrieval Augmented Generation (RAG), which delivers answers based on actual data and proprietary information.

Okay. What else? 

Yep, there is more. There are new technologies and offerings available that were not here before. We just covered AI and ML (machine learning) workloads that became a potential cost or compliance concern.

The concept of sovereign clouds has gained traction due to increasing concerns about data sovereignty and compliance with local regulations.

The adoption of hybrid and hybrid multi-cloud strategies has been a significant trend from 2020 to 2024. Think about VMware’s Cloud Foundation approach with Azure, Google, Oracle etc., AWS Outposts, Azure Stack, Oracle’s DRCC, or Nutanix’s.

Enterprises started to upskill and train their people to deliver their own Kubernetes platforms.

Edge computing has emerged as a crucial technology, particularly for industries like manufacturing, telecommunications, and healthcare, where real-time data processing is critical.

Conclusion

Reverse cloud migrations are happening for many different reasons like cost management, performance optimization, data security and compliance, automation and operations, or because of lock-in concerns.

Yes, (cloud) repatriation became prominent, but I think this is just a reflection of the maturing cloud market – and not an ambulance.

And no, it is not a better moment to position your hybrid multi-cloud solutions, unless you understand the services and workloads that need to be migrated from one cloud to another. Just because some CIOs plan to bring some workloads back on-premises, does that mean they will actually do it? What about the sunk cost fallacy?

Perhaps IT leaders are going to be more careful in the future and are trying to find other ways for potential cost savings and strategic benefits to achieve their business outcomes – and keep their workloads in the cloud versus repatriating them.

Businesses are adopting a more nuanced workload-centric strategy.

What’s your opinion?

Distributed Hybrid Infrastructure Offerings Are The New Multi-Cloud

Since VMware belongs to Broadcom, there was less focus and messaging on multi-cloud or supercloud architectures. Broadcom has drastically changed the available offerings and VMware Cloud Foundation is becoming the new vSphere. Additionally, we have seen big changes regarding the partnerships with hyperscalers (the Azures and AWSes of this world) and the VMware Cloud partners and providers. So, what happened to multi-cloud and how come that nobody (at Broadcom) talks about it anymore?

What is going on?

I do not know if it’s only me, but I do not see the term “multi-cloud” that often anymore. Do you? My LinkedIn feed is full of news about artificial intelligence (AI) and how Nvidia employees got rich. So, I have to admit that I lost track of hybrid clouds, multi-clouds, or hybrid multi-cloud architectures. 

Cloud-Inspired and Cloud-Native Private Clouds

It seems to me that the initial idea of multi-cloud has changed in the meantime and that private clouds are becoming platforms with features. Let me explain.

Organizations have built monolithic private clouds in their data centers for a long time. In software engineering, the word “monolithic” describes an application that consists of multiple components, which form something larger. To build data centers, we followed the same approach by using different components like compute, storage, and networking. And over time, IT teams started to think about automation and security, and the integration of different solutions from different vendors.

The VMware messaging was always pointing in the right direction: They want to provide a cloud operating system for any hardware and any cloud (by using VMware Cloud Foundation). On top of that, build abstraction layers and leverage a unified control plane (aka consistent automation and operations).

And I told all my customers since 2020 that they need to think like a cloud service provider, get rid of silos, implement new processes, and define a new operating model. That is VMware by Broadcom’s messaging today and this is where they and other vendors are headed: a platform with features that provide cloud services.

In other words, and this is my opinion, VMware Cloud Foundation is today a platform with different components like vSphere, vSAN, NSX, Aria, and so on. Tomorrow, it is still called VMware Cloud Foundation, a platform that includes compute, storage, networking, automation, operations, and other features. No more other product names, just capabilities, and services like IaaS, CaaS, DRaaS or DBaaS. You just choose the specs of the underlying hardware and networking, deploy your private clouds, and then start to build and consume your services.

Replace the name “VMware Cloud Foundation” in the last paragraph with AWS Outposts or Azure Stack. Do you see it now? Distributed unmanaged and managed hybrid cloud offerings with a (service) consumption interface on top.

That is the shift from monolithic data centers to cloud-native private clouds.

From Intercloud to Multi-Cloud

It is not the first time that I write about interclouds, a concept that not many of us know. In 2012, there was this idea that different clouds and vendors need to be interoperable and agree on certain standards and protocols. Think about interconnected private and public clouds, which allow you to provide VM mobility or application portability. Can you see the picture in front of you? What is the difference today in 2024?

In 2023, I truly believed that VMware figured it out when they announced VMware Cloud on Equinix Metal (VMC-E). To me, VMC-E was different and special because of Equinix, who is capable of interconnecting different clouds, and at the same time could provide a baremetal-as-a-service (BMaaS) offering.

Workload Mobility and Application Portability

Almost two years ago, I started to write a book about this topic, because I wanted to figure out if workload mobility and application portability are things that enterprises are really looking for. I interviewed many CIOs, CTOs, chief architects and engineers around the globe, and it became VERY clear: it seems nobody was changing anything to make app portability a design requirement.

Almost all of the people I have spoken to told me that a lot of things must happen before a cloud exit is triggered. Therefore, they see this as a nice-to-have capability that helps them to move virtual machines or applications faster from one cloud to another.

VMware Workload Mobility

And I have also been told that a lift & shift approach is not providing any value to almost all of them.

But when I talked to developers and operations teams, the answers changed. Most of them did not know that a vendor could provide mobility or portability. Anyway, what has changed now?

Interconnected Multi-Clouds and Distributed Hybrid Clouds

I mentioned it already before. Some vendors have realized that they need to deliver a unified and integrated programmable platform with a control plane. Ideally, this control plane can be used on-premises, as a SaaS solution, or both. And according to Gartner, these are the leaders in this area (Magic Quadrant for Distributed Hybrid Infrastructure):

Gartner Magic-Quadrant-for-Distributed-Hybrid-Infrastructure

In my opinion, VMware and Nutanix are providing a hybrid multi-cloud approach.

AWS and Microsoft are providing hybrid cloud solutions. In Microsoft’s case, we see Azure Stack HCI, Azure Kubernetes Service (AKS incl. Hybrid AKS) and Azure Arc extending Microsoft’s Azure services to on-premises data centers and edge locations.

The only vendor that currently offers true multi-cloud capabilities is Oracle. Oracle has Dedicated Region Cloud@Customer (DRCC) and Roving Edge, but also partnerships with Microsoft and Google that allow customers to host Oracle databases in Azure and Google Cloud data centers. Both partnerships come with a cross-cloud interconnection.

That is one of the big differences and changes for me at the moment. Multi-cloud has become less about mobility or portability, a single global control plane, or the same Kubernetes distribution in all the clouds, but more about bringing different services from different cloud providers closer together.

This is the image I created for the VMC-E blog. Replace the words “AWS” and “Equinix” with “Oracle” and suddenly you have something that was not there before, an interconnected multi-cloud.

What’s Next?

Based on the conversations with my customers, it does not feel that public cloud migrations are happening faster than in 2020 or 2022 and we still see between 70 and 80% of the workloads hosted on-premises. While we see customers who are interested in a cloud-first approach, we see many following a hybrid multi-cloud and/or multi-cloud approach. It is still about putting the right applications in the right cloud based on the right decisions. This has not changed.

But the narrative of such conversations has changed. We will see more conversations about data residency, privacy, security, gravity, proximity, and regulatory requirements. Then there are sovereign clouds.

Lastly, enterprises are going to deploy new platforms for AI-based workloads. But that could still take a while.

Final Thoughts

As enterprises continue to navigate the above-mentioned complexities, the need for flexible, scalable, and secure infrastructure solutions will only grow. There are a few compelling solutions that bridge the gap between traditional on-premises systems and modern cloud environments.

And since most enterprises are still hosting their workloads on-premises, they have to decide if they want to stretch the private cloud to the public cloud, or the other way around. Both options can co-exist, but would make it too big and too complex. What’s your conclusion?

AZ-104 Study Guide – Azure Storage

If you are looking for the full AZ-104 study guide: https://www.cloud13.ch/2023/10/31/az-104-study-guide-microsoft-azure-administrator/ 

It is clear to me that networking is probably the most complex topic in Azure. The concept is very different from the on-premises world: you have so many options and a lot of topics to understand. Let us focus on Azure storage as the next topic. As always, I will follow John Savill’s guidance and look for the documentation online.

Storage Accounts

An Azure storage account contains all of your Azure Storage data objects: blobs, files, queues, and tables. The storage account provides a unique namespace for your Azure Storage data that’s accessible from anywhere in the world over HTTP or HTTPS. Data in your storage account is durable and highly available, secure, and massively scalable.

When naming your storage account, keep these rules in mind:

  • Storage account names must be between 3 and 24 characters in length and may contain numbers and lowercase letters only.
  • Your storage account name must be unique within Azure. No two storage accounts can have the same name.
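The syntax part of these naming rules is easy to check before you try to create an account. The following is a minimal sketch in Python; the function name is my own and not part of any Azure SDK. It only covers length and allowed characters. Global uniqueness can only be verified by Azure itself.

```python
import re

# Rule from the text: 3 to 24 characters, lowercase letters and numbers only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if the name satisfies the local syntax rules.

    Hypothetical helper for illustration. It does NOT check whether the
    name is already taken by another storage account in Azure.
    """
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_storage_account_name("cloud13storage"))   # valid syntax
print(is_valid_storage_account_name("Cloud13-Storage"))  # uppercase and hyphen are not allowed
```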

Azure Storage Redundancy

Data in an Azure Storage account is always replicated three times in the primary region. Azure Storage offers two options for how your data is replicated in the primary region:

  • Locally redundant storage (LRS) copies your data synchronously three times within a single physical location in the primary region. LRS is the least expensive replication option, but isn’t recommended for applications requiring high availability or durability.

    Diagram showing how data is replicated in a single data center with LRS

  • Zone-redundant storage (ZRS) copies your data synchronously across three Azure availability zones in the primary region. For applications requiring high availability, Microsoft recommends using ZRS in the primary region, and also replicating to a secondary region.

    Diagram showing how data is replicated in the primary region with ZRS

    Redundancy in a secondary region

    For applications requiring high durability, you can choose to additionally copy the data in your storage account to a secondary region that is hundreds of miles away from the primary region. If your storage account is copied to a secondary region, then your data is durable even in the case of a complete regional outage or a disaster in which the primary region isn’t recoverable.

    Azure Storage offers two options for copying your data to a secondary region:

    • Geo-redundant storage (GRS) copies your data synchronously three times within a single physical location in the primary region using LRS. It then copies your data asynchronously to a single physical location in the secondary region. Within the secondary region, your data is copied synchronously three times using LRS.

    Diagram showing how data is replicated with GRS or RA-GRS

    • Geo-zone-redundant storage (GZRS) copies your data synchronously across three Azure availability zones in the primary region using ZRS. It then copies your data asynchronously to a single physical location in the secondary region. Within the secondary region, your data is copied synchronously three times using LRS.

    Diagram showing how data is replicated with GZRS or RA-GZRS

    Geo-redundant storage (with GRS or GZRS) replicates your data to another physical location in the secondary region to protect against regional outages. With an account configured for GRS or GZRS, data in the secondary region is not directly accessible to users or applications, unless a failover occurs. The failover process updates the DNS entry provided by Azure Storage so that the secondary endpoint becomes the new primary endpoint for your storage account. During the failover process, your data is inaccessible. After the failover is complete, you can read and write data to the new primary region.

    The following table describes key parameters for each redundancy option:

    | Parameter | LRS | ZRS | GRS/RA-GRS | GZRS/RA-GZRS |
    |---|---|---|---|---|
    | Percent durability of objects over a given year | at least 99.999999999% (11 9’s) | at least 99.9999999999% (12 9’s) | at least 99.99999999999999% (16 9’s) | at least 99.99999999999999% (16 9’s) |
    | Availability for read requests | At least 99.9% (99% for cool or archive access tiers) | At least 99.9% (99% for cool access tier) | At least 99.9% (99% for cool or archive access tiers) for GRS; at least 99.99% (99.9% for cool or archive access tiers) for RA-GRS | At least 99.9% (99% for cool access tier) for GZRS; at least 99.99% (99.9% for cool access tier) for RA-GZRS |
    | Availability for write requests | At least 99.9% (99% for cool or archive access tiers) | At least 99.9% (99% for cool access tier) | At least 99.9% (99% for cool or archive access tiers) | At least 99.9% (99% for cool access tier) |
    | Number of copies of data maintained on separate nodes | Three copies within a single region | Three copies across separate availability zones within a single region | Six copies total, including three in the primary region and three in the secondary region | Six copies total, including three across separate availability zones in the primary region and three locally redundant copies in the secondary region |
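To remember how the redundancy options relate to each other, it helped me to think of them as the result of three yes/no questions. The following Python sketch is only a memory aid under that assumption. The function and parameter names are hypothetical, and a real decision would also consider cost and the availability figures listed above.

```python
def pick_redundancy(zone_resilient: bool,
                    region_resilient: bool,
                    read_from_secondary: bool = False) -> str:
    """Map three requirements to a redundancy option name.

    Hypothetical helper for illustration only:
    - zone_resilient: survive the loss of one availability zone (ZRS-based)
    - region_resilient: keep a copy in a secondary region (GRS or GZRS)
    - read_from_secondary: read access to the secondary region (RA- variants)
    """
    if read_from_secondary and not region_resilient:
        raise ValueError("read access to a secondary region requires geo-redundancy")
    if region_resilient:
        base = "GZRS" if zone_resilient else "GRS"
        return f"RA-{base}" if read_from_secondary else base
    return "ZRS" if zone_resilient else "LRS"


print(pick_redundancy(zone_resilient=False, region_resilient=False))
print(pick_redundancy(zone_resilient=True, region_resilient=True, read_from_secondary=True))
```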

    Azure Blobs

    Azure Storage offers three types of blob storage:

    • Block Blobs. Block blobs are composed of blocks and are ideal for storing text or binary files, and for uploading large files efficiently.
    • Append Blobs. Append blobs are also made up of blocks, but they are optimized for append operations, making them ideal for logging scenarios.
    • Page blobs. Page blobs are made up of 512-byte pages up to 8 TB in total size and are designed for frequent random read/write operations. Page blobs are the foundation of Azure IaaS Disks.

    Overview of Azure page blobs

    Page blobs are a collection of 512-byte pages, which provide the ability to read/write arbitrary ranges of bytes. Hence, page blobs are ideal for storing index-based and sparse data structures like OS and data disks for Virtual Machines and Databases. For example, Azure SQL DB uses page blobs as the underlying persistent storage for its databases. Moreover, page blobs are also often used for files with Range-Based updates.
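Because a page blob is organized in 512-byte pages, writes are expected to start and end on page boundaries. The following Python sketch shows the arithmetic for expanding an arbitrary byte range to the pages that cover it. The helper name is hypothetical and no Azure SDK is involved; it is only meant to make the 512-byte page concept tangible.

```python
PAGE_SIZE = 512  # page size in bytes, as stated in the text


def align_to_pages(offset: int, length: int) -> tuple[int, int]:
    """Return (start, end_inclusive) of the page-aligned range that
    covers the bytes [offset, offset + length).

    Hypothetical helper for illustration only.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Round the start down to the beginning of its page.
    start = (offset // PAGE_SIZE) * PAGE_SIZE
    # Round the exclusive end up to the next page boundary (ceiling division).
    end_exclusive = -(-(offset + length) // PAGE_SIZE) * PAGE_SIZE
    return start, end_exclusive - 1


# 1,000 bytes starting at offset 700 touch pages 1 to 3 of the blob.
print(align_to_pages(700, 1000))  # (512, 2047)
```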

    Key features of Azure page blobs are its REST interface, the durability of the underlying storage, and the seamless migration capabilities to Azure. These features are discussed in more detail in the next section. In addition, Azure page blobs are currently supported on two types of storage: Premium Storage and Standard Storage. Premium Storage is designed specifically for workloads requiring consistent high performance and low latency making premium page blobs ideal for high performance storage scenarios. Standard storage accounts are more cost effective for running latency-insensitive workloads.

    Azure page blobs are the backbone of the virtual disks platform for Azure IaaS. Both Azure OS and data disks are implemented as virtual disks where data is durably persisted in the Azure Storage platform and then delivered to the virtual machines for maximum performance. Azure Disks are persisted in Hyper-V VHD format and stored as a page blob in Azure Storage. In addition to using virtual disks for Azure IaaS VMs, page blobs also enable PaaS and DBaaS scenarios such as Azure SQL DB service, which currently uses page blobs for storing SQL data, enabling fast random read-write operations for the database. Another example would be if you have a PaaS service for shared media access for collaborative video editing applications, page blobs enable fast access to random locations in the media. It also enables fast and efficient editing and merging of the same media by multiple users.

    The following visual illustrates the guidelines to choose the various Azure data transfer tools depending upon the network bandwidth available for transfer, data size intended for transfer, and frequency of the transfer.

    Azure data transfer tools

    Premium block blob storage accounts

    Premium block blob storage accounts make data available via high-performance hardware. Data is stored on solid-state drives (SSDs) which are optimized for low latency. SSDs provide higher throughput compared to traditional hard drives. File transfer is much faster because data is stored on instantly accessible memory chips. All parts of a drive are accessible at once. By contrast, the performance of a hard disk drive (HDD) depends on the proximity of data to the read/write heads.

    Access tiers for blob data

    Data stored in the cloud grows at an exponential pace. To manage costs for your expanding storage needs, it can be helpful to organize your data based on how frequently it will be accessed and how long it will be retained. Azure storage offers different access tiers so that you can store your blob data in the most cost-effective manner based on how it’s being used. Azure Storage access tiers include:

    • Hot tier – An online tier optimized for storing data that is accessed or modified frequently. The hot tier has the highest storage costs, but the lowest access costs.
    • Cool tier – An online tier optimized for storing data that is infrequently accessed or modified. Data in the cool tier should be stored for a minimum of 30 days. The cool tier has lower storage costs and higher access costs compared to the hot tier.
    • Cold tier – An online tier optimized for storing data that is rarely accessed or modified, but still requires fast retrieval. Data in the cold tier should be stored for a minimum of 90 days. The cold tier has lower storage costs and higher access costs compared to the cool tier.
    • Archive tier – An offline tier optimized for storing data that is rarely accessed, and that has flexible latency requirements, on the order of hours. Data in the archive tier should be stored for a minimum of 180 days.
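The minimum retention periods are the part I find easiest to mix up, so here is a small Python sketch that encodes them. It suggests the coldest tier whose minimum retention fits the planned retention. The function name is hypothetical, and the logic deliberately ignores access frequency and pricing, which matter just as much in a real decision.

```python
# Minimum retention in days per tier, taken from the list above.
MIN_RETENTION_DAYS = {"hot": 0, "cool": 30, "cold": 90, "archive": 180}


def suggest_tier(retention_days: int, needs_online_access: bool = True) -> str:
    """Suggest the coldest tier that fits the planned retention.

    Hypothetical helper for illustration only. The archive tier is
    offline, so it is skipped when online access is required.
    """
    for tier in ("archive", "cold", "cool", "hot"):  # coldest first
        if tier == "archive" and needs_online_access:
            continue
        if retention_days >= MIN_RETENTION_DAYS[tier]:
            return tier
    return "hot"


print(suggest_tier(45))          # cool
print(suggest_tier(365, False))  # archive
```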

    Object replication for block blobs

    Object replication asynchronously copies block blobs between a source storage account and a destination account. Some scenarios supported by object replication include:

    • Minimizing latency. Object replication can reduce latency for read requests by enabling clients to consume data from a region that is in closer physical proximity.
    • Increasing efficiency for compute workloads. With object replication, compute workloads can process the same sets of block blobs in different regions.
    • Optimizing data distribution. You can process or analyze data in a single location and then replicate just the results to additional regions.
    • Optimizing costs. After your data has been replicated, you can reduce costs by moving it to the archive tier using life cycle management policies.

    Diagram showing how object replication works

    Append Blobs

    An append blob is composed of blocks and is optimized for append operations. When you modify an append blob, blocks are added to the end of the blob only, via the Append Block operation. Updating or deleting of existing blocks is not supported. Unlike a block blob, an append blob does not expose its block IDs.

    Each block in an append blob can be a different size, up to a maximum of 4 MiB, and an append blob can include up to 50,000 blocks. The maximum size of an append blob is therefore slightly more than 195 GiB (4 MiB × 50,000 blocks).
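The 195 GiB figure follows directly from the two limits. Here is the arithmetic as a short Python sketch, using only the numbers from the paragraph above:

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

MAX_BLOCK_BYTES = 4 * MIB  # maximum size of a single block
MAX_BLOCKS = 50_000        # maximum number of blocks per append blob

max_bytes = MAX_BLOCK_BYTES * MAX_BLOCKS
max_gib = max_bytes / GIB

print(max_bytes)  # 209715200000
print(max_gib)    # 195.3125
```

Keep in mind that you only reach this maximum if every block is written at the full 4 MiB. Many small appends, as in a typical logging scenario, hit the 50,000 block limit much earlier.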

    Azure Files

    Azure Files offers fully managed file shares in the cloud that are accessible via the industry standard Server Message Block (SMB) protocol, Network File System (NFS) protocol, and Azure Files REST API. Azure file shares can be mounted concurrently by cloud or on-premises deployments.

    SMB Azure file shares are accessible from Windows, Linux, and macOS clients. NFS Azure file shares are accessible from Linux clients. Additionally, SMB Azure file shares can be cached on Windows servers with Azure File Sync for fast access near where the data is being used.

    Active Directory as Authentication Source

    On-premises Active Directory Domain Services (AD DS) integration with Azure Files provides the methods for storing directory data while making it available to network users and administrators. Security is integrated with AD DS through logon authentication and access control to objects in the directory. With a single network logon, administrators can manage directory data and organization throughout their network, and authorized network users can access resources anywhere on the network. AD DS is commonly adopted by enterprises in on-premises environments or on cloud-hosted VMs, and AD DS credentials are used for access control. 

    Files AD workflow diagram

    Azure File Sync

    Azure File Sync enables centralizing your organization’s file shares in Azure Files, while keeping the flexibility, performance, and compatibility of a Windows file server. While some users may opt to keep a full copy of their data locally, Azure File Sync additionally can transform Windows Server into a quick cache of your Azure file share. You can use any protocol that’s available on Windows Server to access your data locally, including SMB, NFS, and FTPS. You can have as many caches as you need across the world.

    An Azure hybrid file services topology diagram.

    Azure Queue Storage

    Azure Queue Storage is a service for storing large numbers of messages. You access messages from anywhere in the world via authenticated calls using HTTP or HTTPS. A queue message can be up to 64 KB in size. A queue may contain millions of messages, up to the total capacity limit of a storage account.
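If you want to catch oversized messages before sending them, a simple size check is enough. The following Python sketch assumes that "64 KB" means 64 × 1,024 bytes and that the message is sent as UTF-8 text. Both are my assumptions, and the function name is hypothetical. If your client library Base64-encodes messages, the usable payload is smaller than this check suggests.

```python
# Assumption: 64 KB is interpreted as 64 * 1024 bytes.
MAX_MESSAGE_BYTES = 64 * 1024


def fits_in_queue_message(text: str) -> bool:
    """Return True if the UTF-8 encoded text is within the size limit.

    Hypothetical helper for illustration only. The limit applies to
    bytes, not characters, so non-ASCII text reaches it sooner.
    """
    return len(text.encode("utf-8")) <= MAX_MESSAGE_BYTES


print(fits_in_queue_message("order-12345 created"))
print(fits_in_queue_message("x" * 70_000))
```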

    Azure Table Storage

    Azure Table storage stores large amounts of structured data. The service is a NoSQL datastore which accepts authenticated calls from inside and outside the Azure cloud. Azure tables are ideal for storing structured, non-relational data. Common uses of Table storage include:

    • Storing TBs of structured data capable of serving web scale applications
    • Storing datasets that don’t require complex joins, foreign keys, or stored procedures and can be denormalized for fast access
    • Quickly querying data using a clustered index
    • Accessing data using the OData protocol and LINQ queries with WCF Data Service .NET Libraries

    You can use Table storage to store and query huge sets of structured, non-relational data, and your tables will scale as demand increases.

    Tables storage component diagram

     

    Azure Managed Disks

    Azure managed disks are block-level storage volumes that are managed by Azure and used with Azure Virtual Machines. Managed disks are like a physical disk in an on-premises server, but virtualized. With managed disks, all you have to do is specify the disk size, the disk type, and provision the disk. Once you provision the disk, Azure handles the rest.

    The available types of disks are ultra disks, premium solid-state drives (SSD) v2, premium SSDs, standard SSDs, and standard hard disk drives (HDD). For information about each individual disk type, see Select a disk type for IaaS VMs.

    Disk type comparison

    The following table provides a comparison of the five disk types to help you decide which to use.

    | | Ultra disk | Premium SSD v2 | Premium SSD | Standard SSD | Standard HDD |
    |---|---|---|---|---|---|
    | Disk type | SSD | SSD | SSD | SSD | HDD |
    | Scenario | IO-intensive workloads such as SAP HANA, top tier databases (for example, SQL, Oracle), and other transaction-heavy workloads. | Production and performance-sensitive workloads that consistently require low latency and high IOPS and throughput | Production and performance sensitive workloads | Web servers, lightly used enterprise applications and dev/test | Backup, non-critical, infrequent access |
    | Max disk size | 65,536 GiB | 65,536 GiB | 32,767 GiB | 32,767 GiB | 32,767 GiB |
    | Max throughput | 4,000 MB/s | 1,200 MB/s | 900 MB/s | 750 MB/s | 500 MB/s |
    | Max IOPS | 160,000 | 80,000 | 20,000 | 6,000 | 2,000, 3,000* |
    | Usable as OS Disk? | No | No | Yes | Yes | Yes |

    * Only applies to disks with performance plus (preview) enabled.

    Note: You can adjust ultra disk IOPS and throughput performance at runtime without detaching the disk from the virtual machine. After a performance resize operation has been issued on a disk, it can take up to an hour for the change to take effect. Up to four performance resize operations are permitted during a 24-hour window.
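To practice reading the comparison table, I find it useful to turn it into a small lookup. The Python sketch below picks the first disk type that meets a required IOPS and throughput value. The numbers come from the table above (without the performance plus preview value). The ordering from Standard HDD to Ultra disk as "cheapest first" is my assumption, and the names are hypothetical. Real limits also depend on the provisioned disk size, so treat this as a study aid and not as a sizing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DiskType:
    name: str
    max_iops: int
    max_throughput_mbps: int
    usable_as_os_disk: bool


# Values from the comparison table; order is ASSUMED to be cheapest first.
DISK_TYPES = [
    DiskType("Standard HDD", 2_000, 500, True),
    DiskType("Standard SSD", 6_000, 750, True),
    DiskType("Premium SSD", 20_000, 900, True),
    DiskType("Premium SSD v2", 80_000, 1_200, False),
    DiskType("Ultra disk", 160_000, 4_000, False),
]


def pick_disk_type(iops: int, throughput_mbps: int,
                   os_disk: bool = False) -> Optional[str]:
    """Return the first disk type whose maximums cover the request,
    or None if no type in the table qualifies."""
    for disk in DISK_TYPES:
        if os_disk and not disk.usable_as_os_disk:
            continue
        if disk.max_iops >= iops and disk.max_throughput_mbps >= throughput_mbps:
            return disk.name
    return None


print(pick_disk_type(5_000, 100))                  # Standard SSD
print(pick_disk_type(50_000, 1_000, os_disk=True)) # None
```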