Real-world Case:
Online migrate 7115 (multi PBs) SQL Server databases on-premise to hundreds Azure SQL Managed Instances with near zero downtime (equals cutover time).
Why This Solution
This solution involves some extra works in order to set up a home-made, migration runtime environment. If you only need to migrate a few or dozens databases from a few SQL Servers on-premise to a few SQL Managed Instances in Azure, it would be more practical to simply use Azure DMS portal or manually run some DMS PowerShell commands for your migration tasks, but when you deal with more than hundreds or thousands databases, the efficiency and the built-in error handling features provided by this solution make it well worth the efforts. More databases, more useful.
Major Migration Tools:
PowerShell Script
SQL Database
New update: If you are looking for migrating your databases using Azure Data Studio, which is still not very efficient when migrating hundreds or thousands database all together online, please be aware that:
It's currently in preview
Automating migrations with Azure Data Studio using PowerShell / CLI isn't supported.
If your database backups are in an on-premises SMB network file share, you to install self-hosted integration runtime to access and migrate database backups
The maximum number of concurrent database migrations per self-hosted integration runtime on a single machine is limited.(no more than 10 according to Microsoft's recommendation)
You can't use an existing self-hosted integration runtime created from Azure Data Factory for database migrations with DMS. Initially, the self-hosted integration runtime should be created using the Azure SQL Migration extension in Azure Data Studio and can be reused for further database migrations.
Concurrently migrating multiple databases to the same SQL Server on Azure VM results in migration failures for most databases. Ensure you only migrate a single database to a SQL Server on Azure VM at any point in time.
The Highlight of The Solution:
Automatically ONLINE migrate hundreds or even thousands database from multiple SQL Servers on-premises to multiple Azure SQL Managed Instances using a PROVEN and EFFICIENT method with minimum downtime.
The runtime environment diagram of this solution:
The solution components :
Site-to-site connectivity between Azure subscriptions and on-premises source servers through either ExpressRoute or VPN.
DMS subnets with enough IP addresses in the same Azure region as the target database.
Azure Premium DMS Instances in the subscription .
Azure standard storage accounts.
DMS configuration database
Migration console VM
An SMB network share
Credential for DMS to access the source SQL Servers and SMB network share
Credential for DMS to access the target SQL Managed Instances
Azure App ID for DMS
An Azure Key Vault
Assumption:
This article focuses on HOW to online migrate a large number of SQL Server databases on-premise to Azure using DMS and assume all the following pre-migration tasks have been done using different tools:
There are a variety of Microsoft and third-party services and tools available for you to plan successfully for database migration and to complete its various phases and tasks. Please refer to the related MS articles for guidance.
GitHub Repo (currently private, available upon request):
https://github.com/glindba/Azure-DMS-With-PowerShell.git
Step 1: Prepare Azure Resources For DMS
1. Site-to-site Connections
A reliable and fast data flow path from the on-premises source to Azure is essential for a successful database migration. We used to have multiple production environments running in different data centers across US and Canada. For the larger migration data loads, we used Azure ExpressRoute to extend our on-premises networks to Microsoft cloud and enabled BGP (Border Gateway Protocol) which effectively ensures that the Azure VNet is an extension of the on-premises network. Our network admins configured the data center side networks to allow the traffic from the DMS subnet IP ranges. Initially, we purchased ER circuit with 5Gbps bandwidth which provided us up to 180GB/Hour upload speed. Later on, we upgraded it to 10Gbps to achieve a faster (up to 240GB/Hour)migration. For certain ER bandwidth option, your may observe a different ER performance in your environment depending on the gateway and routing configuration. Work with your Cloud Admins to optimize it where it's required. The SLA for Dedicated Circuit within an Azure ExpressRoute Service is 99.95% monthly.
For the smaller data centers, we connected multiple on-premises networks to Azure through site-to-site VPN gateways. Each Azure VNet can only contains one VPN gateway, but a single VPN gateway supports connections to multiple locations, including other Azure VNets or other cloud environments. We first deployed a generation1 VPN gateway (VpnGw1AZ) with Aggregate Throughput 650Mbps, later upgraded to VpnGw2AZ (1Gbps). Although VPN connections go over the public Internet and the Aggregate Throughput Benchmark is not a guaranteed throughput, we didn't experience any reliability issues during our migration process. However, we did run into VPN related issues twice because some other teams within our organization started putting a heavy traffics on the same VPN while the DMS activities were uploading full database backup files to the Azure blob containers. Aggregate Throughput Benchmark is based on measurements of multiple tunnels aggregated through a single gateway and it is S2S + P2S combined. If lot of P2S connections or other workloads share the same connection, the DMS migration can be negatively impacted due to throughput limitations. This was the main reason why we had to upgrade the gateway to a higher SKU.
2.DMS Subnets
Create a Microsoft Azure Virtual Network(VNet) for Azure DMS by using the Azure Resource Manager(ARM) deployment model and ensure that the Network Security Group (NSG) rules defined on the DMS subnets don't block the outbound port 443 of Service Tag for Service Bus, Storage and Azure Monitor. DMS activities write out all the database backups to Azure storage blob containers and monitor the restore status via ARM. During the migration, you may notice lots of traffic from DMS subnets egressing via your Azure VNet firewall to some Microsoft public IP addresses like 20.44.16.113 or 137.116.44.148. This is normal.
At the beginning, we believed one dedicated subnet with 123 free IP addresses in our production VNet should be sufficient for DMS deployment. We later ended up having to create two more DMS subnets with extra 246 IP addresses as we found more DMS instances actually needed for larger migrations.
The DMS subnets should be in the same Azure region as the target database and their IP address ranges should be allowed access through NSG rules defined on SQLMI subnets. In our case, the target SQLMI subnets and DMS subnets are in the same VNet
3.Azure Database Migration Service Limits
By default, the maximum number of DMS services per subscription, per region is 10. We initially requested a quota increase for Premium DMS to 100, then increased again to 150, 200 and finally 250 in total.
Relationship between DMS projects, activities(tasks), source databases and target databases
As showed above, multiple projects can be created in one DMS instance, multiple activities can be created in a single project. each activity can handle multiple databases. So in theory, a single DMS instance can migrate many databases simultaneously, but our experience shows that if one DMS instance migrate more than 4 databases, it has much more chances running into a variety of issues (for example, 100% CPU load on DMS VMs). Please read Part4 of this article for details about error handling.
4.Configure Target SQL Managed Instance For DMS
How to properly prepare and configure Azure resource for production SQLMI deployment is a complex and different topic. Here I only focus on the items that are mandatory for a successful database migration through DMS.
Create A SQL Server Authenticated Login
The solution PowerShell scripts need a login to run some T-SQL queries on all the target SQ Managed Instances, below is the T-SQL script for this task:
USE [master]
GO
if not exists (select name from sys.syslogins where name = 'azdms')
Begin
CREATE LOGIN [azdms] WITH PASSWORD=N'4************', DEFAULT_DATABASE=[master], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF
ALTER SERVER ROLE [sysadmin] ADD MEMBER [azdms]
End
else
Begin
select 'Found existing login name azdms,changing password'
ALTER LOGIN [azdms] WITH PASSWORD=N'4***********3'
End
Migrate Encryption Certificate
If the source databases are protected by Transparent Data Encryption(TDE) or the backup files are encrypted. the corresponding certificate must be migrated before DMS can start the database restore. This can be done through the PowerShell commands, For detailed steps, see Migrate Migrate a certificate of a TDE-protected database to Azure SQL Managed Instance.
We did experience some failures that were simply due to the missing certificate on the target SQLMIs. Unfortunately, so far there is no system catalog views or tables which we can query to verify the certificate. The only way to check its existence is to test restore a encrypted full database backup to the target SQLMI. Hopefully Microsoft would soon provide a more efficient way for this task.
Verify the number of data files on the target SQLMIs is not too big causing the DMS activities hit this error below:
Non retriable error occurred while restoring backup with index 1 - 41935 Managed Instance has reached the total capacity of underlying Azure storage account. Azure Premium Storage account is limited to 35TB of allocated space.
RESTORE DATABASE is terminating abnormally.
Normally, the SQLMI has two storage limits:
Managed Instance General Purpose user storage cannot exceed 8TB(Now 16TB).
Maximum of 280 data files on the General Purpose instance. This limitation is due to how Azure premium disk storage is allocated internally. 280 files placed on the smallest 128GB disk will reach beyond the 35TB of maximum internal Azure premium storage limit. Please note that this limit is internal to Azure storage under the hood, and that customers can still only use up to 8TB of storage for the Managed Instance.
Adjust or Disable Throttling Limit of Concurrent Operations
By default, Azure have a throttling limit at 160 operations per subscription. This limit has to be increased and disabled completely. Otherwise DMS will fail to restore databases on the target SQLMIs with error messages like this:
Error message: 'Cannot process create or update request. Too many create or update operations in progress for subscription
Note: This quota change can only be deployed by Microsoft Azure support team.
5.Azure Storage Accounts for DMS
To better manage the data upload performance when running 250 DMS instances to migrate hundreds databases to Azure simultaneously, we created 10 new Standard Performance tier, Azure Storage Accounts in the same region as the DMS subnets and tried our best to evenly distribute the backup files among them. DMS activity(task) creates a dedicated blob container for all backup files from one source database when the migration starts. So each and every databases has its own blob container in the storage account. The names of these containers look random like this "be777d1e-ec38-41af-acfa-4015de4f217d". You can use DMS PowerShell commands to retrieve the mapping information between the source database name and the target blob container name.
The networking firewall policies on the DMS storage accounts should allow access from target SQLMI subnets through Azure Storage Service Endpoint
6. Prepare SMB Network Share For DMS
Currently DMS only supports reading the source database backups from an SMB network share. It would be nice if DMS could also retrieve backup files from Azure storage blob containers. Depending on the migration workload, the disk or network I/O performance could be a bottleneck slowing down the file uploading process.
In our case, we created the SMB share folder on Dell EMC Isilon with four (4) nodes handling incoming storage access traffics. All DMS instances use the same SMB share URL, but the load is being balanced by the DNS that returns IP of nodes with less usage.
7. Prepare The Source SQL Server for DMS
Ensure that the service account running the source SQL Server instance can write backup files to the DMS SMB network share and the computer account for the source servers also has read/write access to the same share.
Create a SQL Server login for a Windows account with Sysadmin role. This Windows account should has full control privilege on the SMB network share because it will be used by Azure DMS to impersonate its credential to upload the backup files to Azure Storage container for restore operation. I used my own Windows domain account for this task, but it's better to create a new window domain account dedicated for DMS activities
Open the Windows Firewall (by default it is TCP port 1433) on source SQL Server machines to allow Azure DMS access. If the SQL Server instances are listening on some other port, add that to the firewall.
Azure DMS does not initiate any backups, and instead uses existing backups for the migration. It's mandatory to take backups using the WITH CHECKSUM option. Each backup can be written to either a separate backup file or multiple backup files. However, appending multiple backups (i.e. full and t-log) into a single backup media is not supported. It's strongly recommended to create a separate subfolder dedicated for receiving full and T-Log backup files from only one source database. Doing so can avoid putting too many backup files in a single folder which will negatively impact DMS activity's performance when scanning all the files in the folder to find right backup files. Finally, using compressed backups to improve the overall upload performance and reduce the likelihood of experiencing potential issues associated with migrating large backups.
Since SQLMI doesn't support multiple transaction log files, if any of you source databases have more than one log file, consolidate them to only one transaction log file before starting migration.
Comments