Sometimes success in life depends on little things that seem easy. So easy that they are often overlooked or underestimated for some reason. This also applies to life in IT. For example, just think about this simple question: “Do you have a tested and documented Active Directory disaster recovery plan?”
This is a question that we, the Microsoft Global Compromise Recovery Security Practice, ask our customers whenever we engage in a Compromise Recovery project. The aim of these projects is to evict the attacker from compromised environments by revoking their access, thereby restoring confidence in these environments for our customers. More information can be found here: CRSP: The emergency team fighting cyber attacks beside customers – Microsoft Security Blog
Nine times out of ten the customer replies: “Sure, we have a backup of our Active Directory!”. When we dig a little deeper, however, we often find that while Active Directory is backed up daily, an up-to-date, documented, and regularly tested recovery procedure does not exist. Sometimes people answer: “Well, Microsoft provides instructions on how to restore Active Directory somewhere on docs.microsoft.com, so if anything happens that breaks our entire directory, we can always refer to that article and work our way through. Easy!”. To this we say: an Active Directory recovery can be painful and time-consuming, and it is often not easy.
You might think that the likelihood of needing a full Active Directory recovery is small. Today, however, the risk of a cyberattack against your Active Directory is higher than ever, hence the chances of you needing to restore it have increased. We now even see ransomware encrypting Domain Controllers, the servers that Active Directory runs on. All this means that you must ensure readiness for this event.
Readiness can be achieved by testing your recovery process in an isolated network on a regular basis, just to make sure everything works as expected, while allowing your team to practice and verify all the steps required to perform a full Active Directory recovery.
Consider the security aspects of the backup itself, as it is crucial to store backups safely, preferably encrypted, restricting access to only trusted administrative accounts and no one else!
You must have a secure, reliable, and fast restoration procedure, ready to use when you most need it.
Azure Recovery Services Vault can be an absolute game changer for meeting all these requirements, and we often use it during our Compromise Recovery projects, which is why we are sharing it with you here. Note that the intention here is not to write up a full Business Continuity Plan. Our aim is to help you get started and to show you how you can leverage the power of Azure.
The process described here can also be used to produce a lab containing an isolated clone of your Active Directory. In Compromise Recovery projects, we often use these techniques not only to verify the recovery process but also to give ourselves a cloned Active Directory lab for testing the hardening measures that such a project aims to put in place.
What is needed
This high-level schema shows you all the components that are required:
At least one production DC per domain in Azure
We do assume that you have at least one Domain Controller per domain running on a VM in Azure, which nowadays many of our customers do. This unlocks the features of Azure Recovery Services Vault to speed up your Active Directory recovery.
Note that backing up two Domain Controllers per domain improves redundancy, as you will have multiple backups to choose from when recovering. This is another point in our scenario where the power of Azure Recovery Services Vault comes through, as it allows you to manage multiple backups in a single console, covered by common policies.
Azure Recovery Services Vault
We need to create an Azure Recovery Services Vault. To be more precise, we need a dedicated Recovery Services Vault for all “Tier 0” assets, placed in a dedicated Resource Group. Tier 0 assets are sensitive, highest-level administrative assets, including accounts, groups and servers, control of which would lead to control of your entire environment.
This Vault should reside in the same region as your “Tier 0” servers, and we need a full backup of at least one Domain Controller per domain.
Once you have this Vault, you can include the Domain Controller virtual machine in your Azure Backup.
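The two requirements above (the Vault in the same region as the “Tier 0” servers, and at least one backed-up Domain Controller per domain) lend themselves to a simple coverage check. The following Python sketch assumes a hand-maintained inventory. The `DomainController` record, its field names, and the sample domains and regions are hypothetical illustrations, not output of any Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainController:
    # Hypothetical inventory record, maintained by hand or by your own tooling.
    name: str
    domain: str
    region: str
    backed_up: bool  # True if the VM is protected by the Tier 0 vault


def domains_without_backup(dcs, domains, vault_region):
    """Return the domains that have no backed-up DC in the vault's region."""
    covered = {
        dc.domain
        for dc in dcs
        if dc.backed_up and dc.region == vault_region
    }
    return sorted(set(domains) - covered)


# Sample data (made up for illustration).
inventory = [
    DomainController("dc01", "corp.example.com", "westeurope", True),
    DomainController("dc02", "corp.example.com", "westeurope", False),
    DomainController("dc11", "emea.corp.example.com", "northeurope", True),
]
missing = domains_without_backup(
    inventory,
    ["corp.example.com", "emea.corp.example.com"],
    vault_region="westeurope",
)
print(missing)
```

In this made-up inventory the second domain is reported, because its only backed-up Domain Controller sits in a different region than the Vault.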
Recovery Services Vaults are based on the Azure Resource Manager model, which provides features such as:
- Enhanced capabilities to help secure backup data: With Recovery Services Vaults, Azure Backup provides security capabilities to protect cloud backups. This includes the encryption of backups that we mention above.
- Central monitoring for your hybrid IT environment: With Recovery Services Vaults, you can monitor not only your Azure IaaS virtual machines but also your on-premises assets from a central portal.
- Azure role-based access control (Azure RBAC): Azure RBAC provides fine-grained access management control in Azure. Azure Backup has three built-in RBAC roles to manage recovery points, which allows us to restrict backup and restore access to the defined set of user roles.
- Soft Delete: With soft delete the backup data is retained for 14 additional days after deletion, which means that even if you accidentally remove the backup, or if this is done by a malicious actor, you can recover it. These additional 14 days of retention for backup data in the “soft delete” state don’t incur any cost to you.
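To make the soft delete window concrete, here is a minimal Python sketch of the date arithmetic. It assumes the 14-day retention mentioned above and treats the window as whole calendar days, which is a simplification. The function names are our own, not part of Azure Backup.

```python
from datetime import date, timedelta

# Assumption: the 14 additional days of retention described above.
SOFT_DELETE_RETENTION_DAYS = 14


def soft_delete_deadline(deleted_on: date) -> date:
    """Return the last calendar day on which the deleted backup is still retained."""
    return deleted_on + timedelta(days=SOFT_DELETE_RETENTION_DAYS)


def is_recoverable(deleted_on: date, today: date) -> bool:
    """True while 'today' falls inside the soft delete window."""
    return deleted_on <= today <= soft_delete_deadline(deleted_on)


deleted = date(2024, 3, 1)
print(soft_delete_deadline(deleted))             # 2024-03-15
print(is_recoverable(deleted, date(2024, 3, 10)))  # True
print(is_recoverable(deleted, date(2024, 3, 16)))  # False
```

A check like this can feed an alert, so that a deleted backup is noticed well before the retention window closes.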
Find more information on the benefits in the following article: What is Azure Backup? – Azure Backup | Microsoft Docs
Isolated Restore Virtual Network
Another thing we need is an isolated network portion (the “isolatedSub” in the drawing) to which we restore the DC. This isolated network portion should be in a separate Resource Group from your production resources, along with the newly created Recovery Services Vault.
Isolation means no network connectivity whatsoever to your production networks! If you inadvertently allow a restored Domain Controller, the target of your forest recovery Active Directory cleanup actions, to replicate with your running production Active Directory, this will have a serious impact on your entire IT Infrastructure. Isolation can be achieved by not implementing any peering, and of course by avoiding any other connectivity solutions such as VPN Gateways. Involve your networking team to ensure that this point is correctly covered.
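As an aid for that review with your networking team, the isolation rules above can be expressed as a small check. This Python sketch works on a hypothetical, hand-written description of the restore network (the dictionary layout and the name `isolatedVNet` are assumptions for illustration). `GatewaySubnet` and `AzureBastionSubnet` are the subnet names Azure reserves for gateways and for Bastion. The check is not exhaustive and does not replace a proper network review.

```python
from typing import Any


def isolation_violations(vnet: dict[str, Any]) -> list[str]:
    """List reasons why a restore VNet is not isolated (empty list = none found)."""
    problems: list[str] = []
    # Any peering at all breaks isolation from production.
    for peering in vnet.get("peerings", []):
        problems.append(f"peering to {peering}")
    # A gateway subnet suggests a VPN or ExpressRoute gateway could be deployed.
    if "GatewaySubnet" in vnet.get("subnets", []):
        problems.append("GatewaySubnet present, a VPN or ExpressRoute gateway may exist")
    return problems


# Hypothetical description of the restore network from the drawing.
restore_vnet = {
    "name": "isolatedVNet",
    "peerings": [],
    "subnets": ["isolatedSub", "AzureBastionSubnet"],
}
print(isolation_violations(restore_vnet) or "no connectivity to production found")
```

Running the same function against a description that includes a peering or a `GatewaySubnet` returns one entry per finding, which makes it easy to fail a pre-restore checklist automatically.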
Bastion Host in Isolated Virtual Network
The last thing we need is a secure remote connection to the restored virtual machine, which is the first Domain Controller of the restored Active Directory. To get around the isolation of the restoration VNET, we are going to use a Bastion Host to access this machine.
Azure Bastion is a fully managed Platform as a Service offering that provides secure and seamless RDP and SSH access to your virtual machines directly through the Azure Portal. It avoids public Internet exposure by connecting over private IP addresses only.
Azure Bastion | Microsoft Docs
The Process
Before Azure Recovery Services Vault existed, the first steps of an Active Directory recovery were the most painful part of the process: one had to worry about provisioning a correctly sized and configured recovery machine, transporting the WindowsImageBackup folder to a disk on this machine, and booting from the right Operating System ISO to perform a machine recovery. Now we can bypass all these pain points with just a few clicks:
Perform the Virtual Machine Backup
Creating a backup of your virtual machine in the Recovery Services Vault involves including it in a Backup Policy. This is described here:
Azure Instant Restore Capability – Azure Backup | Microsoft Docs
Restore the Virtual Machine to your isolated Virtual Network
To restore your virtual machine, you use the Restore option in Backup Center, with the option to create a new virtual machine. This is described here:
Restore VMs by using the Azure portal – Azure Backup | Microsoft Docs
Active Directory Recovery Process
Once you have performed the restoration of your Domain Controller virtual machine to the isolated Virtual Network, you can log on to this machine using the Bastion Host, which allows you to start performing the Active Directory recovery as per our classic guidance.
You log in using the built-in administrator account and then follow the steps outlined in the drawing below under “Start of Recovery in isolated VNet”:
All the detailed steps can be found here: Active Directory Forest Recovery Guide | Microsoft Docs. Note that the above process may need to be tailored for your organization.
Studying the chart above, you will see that some dependencies apply. Just think about seemingly trivial things such as the Administrator password that is needed during recovery, the one that you use to log on through the Bastion.
- Who has access to this password?
- Did you store the password in a Vault that is dependent on a running AD service?
- Do you have any other services running on your domain controllers, such as any file services (please note that we do not recommend this)?
- Is DNS running on Domain Controllers, or is there a DNS dependency on another product such as Infoblox?
These are things to consider in advance, to ensure you are ready for recovery of your AD.
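One lightweight way to keep track of these questions is to record them next to your recovery documentation and list whatever is still open. The Python sketch below is only an illustration: the `DependencyCheck` record and the sample answers are hypothetical, and the questions are paraphrased from the list above.

```python
from dataclasses import dataclass


@dataclass
class DependencyCheck:
    # Hypothetical record for one recovery dependency question.
    question: str
    resolved: bool
    note: str = ""


def open_items(checks):
    """Return the questions that still need an answer before a recovery drill."""
    return [check.question for check in checks if not check.resolved]


# Sample answers (made up for illustration).
checks = [
    DependencyCheck("Who has access to the Administrator password?", True,
                    "Two named Tier 0 administrators"),
    DependencyCheck("Is the password stored in a vault that depends on AD?", False),
    DependencyCheck("Do other services run on the Domain Controllers?", True,
                    "None"),
    DependencyCheck("Does DNS depend on another product?", False),
]
for question in open_items(checks):
    print("OPEN:", question)
```

With the sample answers, the password vault and DNS questions are printed as open items.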
Tips and Tricks
To manage a VM in Azure, two things come in handy:
- Serial console: this feature in the Azure portal provides access to a text-based console for Windows virtual machines. The console session provides access to the Virtual Machine independent of the network or operating system state. The serial console can only be accessed by using the Azure portal and is allowed only for users who have an access role of Contributor or higher to the VM or virtual machine scale set. This feature comes in handy when you need to troubleshoot Remote Desktop connection failures, for example when you need to disable the host-based firewall or change IP configuration settings. More information can be found here: Azure Serial Console for Windows – Virtual Machines | Microsoft Docs
- Run Command: this feature uses the virtual machine agent to run PowerShell scripts within an Azure Windows VM. You can use these scripts for general machine or application management. They can help you to quickly diagnose and remediate Virtual Machine access and network issues and get the Virtual Machine back to a good state. More information can be found here: Run scripts in a Windows VM in Azure using action Run Commands – Azure Virtual Machines | Microsoft Docs
Security
We remind you that a Domain Controller is a sensitive, highest-level administrative asset, a “Tier 0” asset, no matter where it is stored (for an overview of our enterprise access model, see: Securing privileged access Enterprise access model | Microsoft Docs). Whether it runs as a virtual machine on VMware, on Hyper-V or in Azure as an IaaS virtual machine, that fact does not change. This means you will have to protect these Domain Controllers and their backups using the maximum level of security restrictions you have at your disposal in Azure. Role Based Access Control is one of the features that can help here to restrict the accounts that have access.
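To illustrate the idea of restricting access with Role Based Access Control, the Python sketch below compares a list of role assignments with an approved set of Tier 0 administrators. The assignment list, the account names, and the approved set are hypothetical sample data, and the function is our own. The three role names are the built-in Azure Backup roles. Broader roles such as Owner also grant access and would need the same review, which this sketch does not cover.

```python
# The three built-in Azure Backup roles.
BACKUP_ROLES = {"Backup Contributor", "Backup Operator", "Backup Reader"}


def unexpected_backup_access(assignments, approved_principals):
    """Return (principal, role) pairs that hold a backup role without approval."""
    return [
        (principal, role)
        for principal, role in assignments
        if role in BACKUP_ROLES and principal not in approved_principals
    ]


# Hypothetical export of role assignments on the Tier 0 vault.
assignments = [
    ("t0-admin-anna", "Backup Contributor"),
    ("t0-admin-ben", "Backup Operator"),
    ("helpdesk-carl", "Backup Operator"),
    ("helpdesk-carl", "Reader"),
]
approved = {"t0-admin-anna", "t0-admin-ben"}
print(unexpected_backup_access(assignments, approved))
```

In this sample only the help desk account is flagged, and only for its backup role.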
Conclusion
A poorly designed disaster recovery plan, lack of documentation, and a team that lacks mastery of the process will delay your recovery, thereby increasing the burden on your administrators when a disaster happens. In turn, this will exacerbate the disastrous impact that cyberattacks can have on your business.
In this article, we gave you a global overview of how the power of Azure Recovery Services Vault can simplify and speed up your Active Directory Recovery process: how easy it is to use, how fast you can recover a machine into an isolated VNET in Azure, and how you can connect to it safely using Bastion to start performing your Active Directory Recovery on a restored Domain Controller.
Finally, ask yourself this question: “Am I able to recover my entire Active Directory in the event of a disaster?” If you cannot answer this question with a resounding “yes”, then it is time to act and make sure that you can.
Authors: Erik Thie & Simone Oor, Compromise Recovery Team
To learn more about Microsoft Security solutions visit our website. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.