Azure SQL DB Cross Database Queries with Elastic Queries – the Good, the Bad & the Ugly

Post applies to:
SQL Server · Azure SQL VM · SQL DB Singleton · SQL DB Managed Instance · Synapse (SQL DW)

A major perceived limitation of Azure SQL DB Singleton databases is their lack of support for the incredibly useful and much-used cross-DB/server queries written with the 3 and 4 part naming convention.

Unfortunately Azure SQL DB Singletons are contained databases under a logical server/instance. Each database actually has its own IP address (more on this later), so Singleton databases can't see each other – even those under the same logical server – and an error is thrown if you try to use the traditional 3 or 4 part object names.
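As an illustration (using the sample database names from later in this post), a traditional 3 part name query fails as soon as you run it against a Singleton DB:

```sql
-- Connected to [elasticQryDB1]; [elasticQryDB2] is on the SAME logical server.
-- The 3 part name is still rejected:
SELECT * FROM elasticQryDB2.dbo.db2Country;

-- Fails with an error along the lines of:
-- Msg 40515: Reference to database and/or server name in
-- 'elasticQryDB2.dbo.db2Country' is not supported in this version of SQL Server.
```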

The Good

However there is a way to do cross database queries between Singleton DBs using something called Elastic Queries. These allow you to query tables/views in databases outside of the Singleton DB you’re connected to.

Elastic Query running against [elasticQryDB1] successfully querying the [db2Country] table in the [elasticQryDB2] database.

Note that no 3 or 4 part object name is used for [db2Country] – it is referenced as if it were a local table.
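The screenshot isn't reproduced here, but the query it shows is along these lines (the full set-up scripts are in the sample section below):

```sql
-- Connected to [elasticQryDB1]. [db2Country] is an External Table that
-- transparently forwards the query to [elasticQryDB2] behind the scenes:
SELECT CountryId, Country
FROM dbo.db2Country;   -- no 3 or 4 part name needed
```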

The Bad

Elastic queries are good as far as they go but definitely have shortcomings compared to full-blown cross-DB/server queries:

  • They don’t actually switch on 3 and 4 part object names – even when Elastic Queries are set up you’ll still get the error if you use 3 or 4 part object names
  • Every remote table/view that you want to query needs an External Table object created which points/connects to the remote table/view
    • And the External Table depends on an existing External Data Source connection object (see sample script section 3 below), albeit only one of these is needed per remote/target DB
  • You can only reference external tables & views
    • So no external Proc & UDF calls using Elastic Queries
    • Although you can use [sp_execute_remote] to execute remote SQL (using an existing External Data Source), including Proc executions & UDF references
  • Creating External Tables is not entirely intuitive due to the plumbing that needs to be set up first
  • There is a significant overhead in creating an External Table definition for *every* remote object you want to query
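As a sketch of the [sp_execute_remote] workaround mentioned above – this assumes the [datasource_elasticQryDB2] External Data Source from the sample scripts below already exists, and [uspGetCountries] is a hypothetical proc on the remote DB (not created by those scripts):

```sql
-- Execute arbitrary T-SQL on the remote database via an existing
-- External Data Source (no External Table required):
EXEC sp_execute_remote
    @data_source_name = N'datasource_elasticQryDB2',
    @stmt = N'SELECT CountryId, Country FROM dbo.db2Country';

-- Remote proc execution works the same way
-- ([uspGetCountries] is hypothetical, for illustration only):
EXEC sp_execute_remote
    N'datasource_elasticQryDB2',
    N'EXEC dbo.uspGetCountries';
```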

On this final point – the per-object overhead – it’s feasible that a script could be run against the target DB to automatically generate the External Table CREATE statements. But for full automation – to keep on top of schema changes in the remote/target database – this script would need to be produced and then run against the Singleton database where the External Table definitions reside. In an environment with no SQL Agent this could be the final nail in the coffin for many DBAs who wanted to attempt using Elastic Queries.
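A rough sketch of such a generator: run against the remote/target DB, it emits a CREATE EXTERNAL TABLE statement per user table. The data source name is an assumption (it matches the sample scripts below), and the type mapping is deliberately simplified – it won't cover every data type edge case:

```sql
-- Generate CREATE EXTERNAL TABLE statements for every user table.
-- Run against the remote/target DB; execute the output on the source DB.
-- [datasource_elasticQryDB2] is assumed to match your External Data Source.
SELECT
    'CREATE EXTERNAL TABLE [' + s.name + '].[' + t.name + '] (' + CHAR(13) +
    STRING_AGG(
        CAST('    [' + c.name + '] ' + UPPER(ty.name) +
            CASE
                WHEN ty.name IN ('varchar','nvarchar','char','nchar')
                    THEN '(' + CASE WHEN c.max_length = -1 THEN 'MAX'
                                    ELSE CAST(c.max_length /
                                        CASE WHEN ty.name LIKE 'n%' THEN 2 ELSE 1 END
                                        AS VARCHAR(10)) END + ')'
                WHEN ty.name IN ('decimal','numeric')
                    THEN '(' + CAST(c.precision AS VARCHAR(5)) + ',' +
                         CAST(c.scale AS VARCHAR(5)) + ')'
                ELSE ''
            END +
            CASE WHEN c.is_nullable = 1 THEN ' NULL' ELSE ' NOT NULL' END
            AS NVARCHAR(MAX)),
        ',' + CHAR(13)) WITHIN GROUP (ORDER BY c.column_id) + CHAR(13) +
    ') WITH (DATA_SOURCE = [datasource_elasticQryDB2]);'
FROM sys.tables t
JOIN sys.schemas s ON s.schema_id = t.schema_id
JOIN sys.columns c ON c.object_id = t.object_id
JOIN sys.types ty ON ty.user_type_id = c.user_type_id
GROUP BY s.name, t.name;
```

Even with a generator like this, the resulting script still has to be shipped to and run on the source database – which is exactly the automation gap described above.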

The Ugly

Elastic queries are good but they aren’t as rich as full-blown 3 and 4 part naming convention queries for the following reasons:

  • On Azure SQL Database Singleton the External Data Source connection object only supports connections to other Azure SQL DB Singleton databases – so no sneaky queries back to your on-prem DB!
  • Elastic Queries only allow SELECT statements
    • other DML statements (INSERT/UPDATE/DELETE) aren’t supported
  • External Table definitions use an embedded impersonation account for the connection (which is used in the DATABASE SCOPED CREDENTIAL – see sample TSQL below)
    • This can unintentionally elevate users’ permissions
  • External Table definitions aren’t dynamic
    • New columns added to the remote table aren’t available to query without amending the External Table definition
    • Worse still, if a column is removed/renamed in the remote table the External Table definition becomes invalid and needs to be amended, otherwise an error will be thrown
  • When configuring Elastic Queries the remote/target database needs to allow network access for the querying DB so you will have to either:
    1. Add the IP address of the database where the External Table definitions live (i.e. the database needing to reference the remote table) to the remote/target database’s server or database firewall
    2. OR configure Private Endpoint access
    3. OR switch on “Allow Azure services and resources to access this server” in the remote/target database’s server firewall settings – not recommended =8O
    4. You need to do one of these even if the source & target databases are on the same server
      • remember that Singleton DBs are fully contained – each database has its own IP address
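For option 1, the database-level firewall rule can be added with T-SQL run in the remote/target database itself. The IP address below is a placeholder – use the outbound IP reported in the connection-refused error when the Elastic Query first fails:

```sql
-- Run in the remote/target database ([elasticQryDB2]).
-- Lets the source database's outbound IP through the database firewall:
EXEC sp_set_database_firewall_rule
    @name = N'Allow elasticQryDB1',
    @start_ip_address = '203.0.113.10',   -- placeholder IP
    @end_ip_address   = '203.0.113.10';
```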

Elastic Query Sample

These scripts demonstrate using an Elastic Query to reference another database on the same Azure SQL DB logical server. They expect to find 2 separate Azure SQL DB Singleton databases – [elasticQryDB1] and [elasticQryDB2].

The scripts build a [db1Customer] table on [elasticQryDB1] and a [db2Country] table on [elasticQryDB2].

The [db1Customer] table has a notional FK of [CountryId] that maps to the PK in the [db2Country] table on [elasticQryDB2]. Referential Integrity is not enforced by Elastic Query.

Script Pre-requisites

  1. An Azure SQL DB Singleton database named [elasticQryDB1]
  2. An Azure SQL DB Singleton database named [elasticQryDB2]
  3. Both databases attached to an Azure SQL logical server named [sqlsrv-elasticquery]
  4. On the [sqlsrv-elasticquery] SQL Server, under its “firewalls and virtual networks” settings, switch on “Allow Azure services and resources to access this server”

DO NOT DO #4 ON YOUR ENVIRONMENTS – including non-prod – unless you understand the implications!

We’re just doing this for simplicity here – you should add the local database’s IP address to either the remote/target Server or Database firewall rules or use Service Endpoints.

Script 0: Run against [master]

/*
CREATE SERVER LEVEL LOGIN USED IN [elasticQryDB2] FOR REMOTE ACCESS BY [elasticQryDB1]:
*/
IF EXISTS (select * from sys.sql_logins where name = 'remote_user') 
	DROP LOGIN remote_user;
CREATE LOGIN remote_user WITH PASSWORD = 'Pass@12345';
GO

Script 1: Run against [elasticQryDB1]

/*
CREATE [db1Customer] TABLE & POPULATE SOME SAMPLE DATA
*/
IF (select object_id('db1Customer')) IS NOT NULL DROP TABLE db1Customer;
CREATE TABLE dbo.db1Customer (
	CustomerId INT NOT NULL PRIMARY KEY,
	CustomerName NVARCHAR(50),
	CountryID INT
);
 
INSERT INTO dbo.db1Customer(CustomerId, CustomerName, CountryId) VALUES 
	(101, 'Steve', 201),
	(102, 'Mike', 202),
	(103, 'Bob', 203),
	(104, 'Bill', 204),
	(105, 'Ben', 205 );


SELECT * FROM dbo.db1Customer

Script 2: Run against remote database [elasticQryDB2]

/*
CREATE [db2Country] TABLE & POPULATE SOME SAMPLE DATA
*/
IF (select object_id('db2Country')) IS NOT NULL DROP TABLE db2Country;
CREATE TABLE dbo.db2Country (
	CountryId INT NOT NULL PRIMARY KEY,
	Country NVARCHAR(50)
);
 
INSERT INTO dbo.db2Country(CountryId, Country) VALUES  
	(201, 'United States'),
	(203, 'Greece'),
	(204, 'France'),
	(205, 'Germany'),
	(206, 'Ireland');

SELECT * FROM db2Country


/*
MAP [remote_user] LOGIN TO DB USER IN [elasticQryDB2] & ADD TO [db_owner] ROLE
*/
IF EXISTS (select * from sys.database_principals where name = 'remote_user') 
      DROP USER remote_user;
CREATE USER remote_user FOR LOGIN remote_user WITH DEFAULT_SCHEMA=[dbo];
GO
-- Add user to the database owner role
ALTER ROLE db_owner ADD MEMBER remote_user;
GO

Script 3: Run against [elasticQryDB1]

/*
ELASTIC QUERY needs an "External Table" which points/connects to the remote table.
The External Table needs some plumbing before it can be defined, including an
External Data Source connection object.
*/


/*
STEP 0: TIDY UP
Drop External Table, Data Source, Database Scoped Credential & Master Key if they exist
*/
IF EXISTS(select * from sys.external_tables where name = 'db2Country')
	DROP EXTERNAL TABLE [dbo].db2Country;
IF EXISTS (select * from sys.external_data_sources where name = 'datasource_elasticQryDB2') 
	DROP EXTERNAL DATA SOURCE [datasource_elasticQryDB2];
IF EXISTS (select * from sys.database_scoped_credentials where name = 'elasticQryDB2_remote_user') 
	DROP DATABASE SCOPED CREDENTIAL [elasticQryDB2_remote_user];
IF EXISTS (SELECT * FROM sys.symmetric_keys WHERE name = '##MS_DatabaseMasterKey##')
	DROP MASTER KEY;
GO


/*
STEP 1: CREATE MASTER KEY
The database master key is a symmetric key used to protect the private keys of certificates
and asymmetric keys that are present in the database i.e. it's used purely internally for 
encrypting/protecting other forms of encryption within the server
*/
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Use2A_Strong@PassWord>'
GO

 
/*
STEP 2: CREATE DATABASE SCOPED CREDENTIAL WHICH REFERENCES A USER IN THE REMOTE DB
The database scoped credential contains the login used to connect to your external 
data source. 
The user name and password should be for an account that has access to the tables in the
external database.
*/
CREATE DATABASE SCOPED CREDENTIAL [elasticQryDB2_remote_user]
WITH IDENTITY = 'remote_user',
SECRET = 'Pass@12345';
GO


/*
STEP 3: CREATE EXTERNAL DATA SOURCE OF TYPE RDBMS 
i.e. a connection object to the DB holding the remote table
*/
CREATE EXTERNAL DATA SOURCE [datasource_elasticQryDB2]
WITH (
	TYPE=RDBMS,
	LOCATION='sqlsrv-elasticquery.database.windows.net',
	DATABASE_NAME='elasticQryDB2',
	CREDENTIAL = [elasticQryDB2_remote_user]
);
GO


/*
STEP 4: CREATE EXTERNAL TABLE DEFINITION
This instruction creates an external table for Elastic Database query.
The EXTERNAL TABLE name & definition *must* be identical to the remote table definition.
*/
CREATE EXTERNAL TABLE [dbo].db2Country ( --This is the name of the remote/target table
	[CountryId] INT,   --Column names & datatypes must be same as remote table
	[Country] NVARCHAR(50)
) WITH (DATA_SOURCE = [datasource_elasticQryDB2]);
GO


/*
STEP 5: TEST EXTERNAL TABLE DEFINITION WORKING 
*/
SELECT * FROM dbo.db2Country; 

--...and in a JOIN:
SELECT
  db1Customer.CustomerId,
  db1Customer.CustomerName,
  db2Country.Country
FROM dbo.db1Customer 
  INNER JOIN dbo.db2Country ON db1Customer.[CountryId] = db2Country.[CountryId];

One thought on “Azure SQL DB Cross Database Queries with Elastic Queries – the Good, the Bad & the Ugly”

  1. Hi. Interesting read. However, I don’t see how the Private Endpoint & Private Link approach will help with the access. If all parties are Azure SQL Databases, they won’t sit in any VNet for outbound connectivity, so will not benefit from the Private Endpoint, right? I’ve tried this (somewhat against my better judgement) and it does not seem to work indeed.

    Also, adding the IP of the database that holds the EXTERNAL TABLE definition to the firewall of the databases holding the actual table will work, but only for as long as that IP address will remain the same. I’m not aware of any guarantee that this will, in fact, remain stable for all eternity.

    So from the looks of it, this only really works when opening up the firewall to at least all IP ranges used by SQL in the datacenter of choice, which is still too large of a list for my taste. Or am I missing something?

    Thanks in advance for any additional insight here, because I also find it hard to believe that there’s no better way to solve this.

