Microsoft Fabric Database Mirroring
By Marvin Magdaluyo on May 20, 2024
How does Fabric Mirroring works?
Fabric Mirroring leverages Change Data Capture (CDC) technology to convert data into appropriate Delta tables and store it in OneLake. Initially, a snapshot of the data is created, and subsequently, it is kept synchronized in near real-time with each transaction, whether it involves creating new tables or inserting, updating, or deleting data. The replication process employs intelligent logic to detect changes in the source data, optimizing compute usage by replicating only when necessary. Users have granular control over what data is mirrored into Fabric. Additionally, detailed monitoring provides insights into mirroring operations and the last refresh time of the replica in Fabric OneLake.
Fabric Mirrored Snowflake:
To better simulate Data Warehouse mirroring, I used the AdventureWorks DW Database and loaded it into Snowflake DW. Outlined below are the step-by-step procedures. (See shared drive for the scripts used.)
1. Restore the AdventureWorks Database in SQL Server.
2. From SQL Server, script all the schemas and tables.
3. Create the AdventureWorks Database in Snowflake.
4. Apply the extracted script in Snowflake to create all tables. Consider the following adjustments or you can refer to SnowFlake_Tables_Schemas.sql:
Remove all indexes, as Snowflake uses data clustering.
I used the data type Number instead of Money.
I used the data type Boolean instead of Bit.
I used the data type nvarchar(16777216) instead of nvarchar(max).
I used the data type varbinary (8388608) instead of varbinary (max).
5. Extract all the data from each table using Bulk Copy Protocol (BCP) and write it into CSV files. Although there are various methods to upload and extract data into Snowflake, I used the existing SQL Server tools without downloading any ODBC driver. (See BCP_OUT.txt command)
6. Load the CSV files into Snowflake. (See Snowflake_File_Format.txt)
7. Set up Microsoft Fabric Mirroring, which took less than 15 minutes, and synchronized all the data.
First step is to create a dedicated Workspace for your Data Warehouse.
Follow below (Step 1 to 9).
GO TO DATA WAREHOUSE AND CHOOSE MIRRORED SNOWFLAKE > NEW CONNECTION
SUPPLY CONNECTION DETAILS AND SELECT DATABASE ON DROPDOWN
SELECT TABLES TO BE REPLICATED OR AUTOMATICALLY MIRROR ALL TABLES AND CLICK APPLY CHANGES
SELECT MONITOR REPLICATION > VIEW DETAILS
GO TO YOUR WORKSPACE AND CHOOSE SQL ANALYTICS ENDPOINT
What are the Platforms supported?
Currently, the following database are available in preview:
1. Azure Cosmos DB
2. Azure SQL Database
3. Snowflake
You can check all above link for it's Security Considerations, Network Requirements, Limitations, Cost and support.
I was curious about the status of replication when the resource is set to auto suspend or turn off (16 hours). After checking, I was amazed to discover that the replication status remains active.