Prerequisites
For this getting started guide, you will need the following resources:
- A SQL Server database containing betl.
- A SQL Server source database (e.g. AdventureWorks or Contoso: https://www.microsoft.com/en-us/download/details.aspx?id=18279).
- A SQL Server target database. For this example, let's call it the raw data warehouse (rdw).
Please install these manually.
Ingest object tree for an on-premise source
I made a template from the dbo.obj_tree_ms_sql_server view, to which I added the database parameter.
Note that this metadata query is slightly different for other database vendors, e.g. Oracle, PostgreSQL and MySQL.
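For comparison, a rough equivalent of this metadata query on PostgreSQL can be built on the standard information_schema catalog. This is a sketch for illustration only, not part of BETL; the obj_type_id mapping mirrors the template below and is an assumption:

```sql
-- Hedged sketch: an approximate PostgreSQL equivalent of the object tree
-- query, built on the standard information_schema catalog. The obj_type_id
-- values mirror the MS SQL template (10 = table, 20 = view) for illustration.
select
    null                         as src_obj_id
  , c.table_schema               as schema_name
  , c.table_name                 as obj_name
  , case t.table_type
        when 'BASE TABLE' then 10   -- table
        when 'VIEW'       then 20   -- view
    end                          as obj_type_id
  , c.ordinal_position
  , c.column_name
  , c.is_nullable
  , c.data_type
  , c.character_maximum_length   as max_len
  , c.numeric_precision
  , c.numeric_scale
from information_schema.columns c
join information_schema.tables t
  on  t.table_schema = c.table_schema
  and t.table_name   = c.table_name
where c.table_schema not in ('pg_catalog', 'information_schema');
```

This query needs a live PostgreSQL server to run, and unlike sys.columns it does not expose primary key sorting or column defaults directly; those would require joins on pg_catalog views.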
-- begin {{template_name}} {{schema_name}}.{{obj_name}}[{{obj_id}}]
-- exec [dbo].[parse_handlebars] {{obj_id}}, '{{template_name}}'
select
null src_obj_id
, isnull(o.object_id, db.database_id) external_obj_id
, dbo.server_type() server_type
, @@SERVERNAME server_name
, db.name db_name
, s.name [schema_name]
, o.name as obj_name
, case
when o.type = 'U' then 10
when o.type = 'V' then 20
when s.name is not null then 30
when db.name is not null then 40
else 50 -- server
end obj_type_id
, c.column_id ordinal_position
, c.name column_name
, null column_type_id
, c.is_nullable is_nullable
, t.name data_type
, c.max_length max_len
--, case when DATA_TYPE in ('int', 'bigint', 'smallint', 'tinyint', 'bit') then cast(null as int) else numeric_precision end numeric_precision
, convert(tinyint, CASE -- int/decimal/numeric/real/float/money
WHEN c.system_type_id IN (48, 52, 56, 59, 60, 62, 106, 108, 122, 127) THEN c.precision
END) AS NUMERIC_PRECISION
, convert(int, CASE -- datetime/smalldatetime
WHEN c.system_type_id IN (40, 41, 42, 43, 58, 61) THEN NULL
ELSE ODBCSCALE(c.system_type_id, c.scale) END) AS NUMERIC_SCALE
, case when ic.is_descending_key = 0 then 'ASC' when ic.is_descending_key = 1 then 'DESC' else null end [primary_key_sorting]
, convert(nvarchar(4000),
OBJECT_DEFINITION(c.default_object_id)) AS [default_value]
, null _source
from
{{db_name}}.sys.databases db
left join {{db_name}}.sys.schemas s on db.database_id = db_id()
left join {{db_name}}.sys.objects o on o.schema_id = s.schema_id
and o.type in ( 'U','V') -- only tables and views
and o.object_id not in
(
select major_id
from {{db_name}}.sys.extended_properties
where name = N'microsoft_database_tools_support'
and minor_id = 0 and class = 1) -- exclude ssms diagram objects
left join {{db_name}}.sys.columns c on c.object_id = o.object_id
left join {{db_name}}.sys.types t on c.user_type_id = t.user_type_id
left join {{db_name}}.sys.indexes i on
i.object_id = o.object_id
and i.is_primary_key = 1
left join {{db_name}}.sys.index_columns ic on
ic.object_id = o.object_id
and ic.column_id = c.column_id
where
isnull(s.name,'') not in ( 'sys', 'INFORMATION_SCHEMA', 'guest')
and isnull(s.name,'') not like 'db[_]%'
and db.name not in ('master','model','msdb','tempdb')
union all
select null, suser_sid()
, dbo.server_type()
, @@SERVERNAME server_name
, '{{db_name}}'
, null
, suser_sname()
, 60 -- user
, null
, null
, null
, null
, null
, null
, null
, null
, null
, null
, null
-- end {{template_name}} {{schema_name}}.{{obj_name}}[{{obj_id}}]
I added this template to the table static.Template with the template name obj_tree_ms_sql_server. Now we can test the template:
declare @sql as varchar(max)
exec parse_handlebars 'ContosoRetailDW', 'obj_tree_ms_sql_server' , @output = @sql OUTPUT;
exec (@sql)
-- set results to text in SSMS to better read the output
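To see what parse_handlebars actually does, it helps to look at the expanded SQL: every {{…}} placeholder is replaced with the object's metadata. For ContosoRetailDW the generated statement starts roughly like the sketch below; the obj_id and header comment values are assumptions and depend on your metadata:

```sql
-- Illustrative sketch of the expanded template (obj_id value is assumed):
-- begin obj_tree_ms_sql_server dbo.ContosoRetailDW[123]
select
    null src_obj_id
  , isnull(o.object_id, db.database_id) external_obj_id
  -- ... remaining columns exactly as in the template above ...
from
    ContosoRetailDW.sys.databases db
    left join ContosoRetailDW.sys.schemas s on db.database_id = db_id()
    -- ... every {{db_name}} placeholder has become ContosoRetailDW ...
```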
Side note: to adjust the amount of logging that is returned, you can set the log level to verbose or debug.
-- display log_level and other user bound properties
my_info
-- set log_level to verbose
verbose
-- set log_level to debug
debug
-- to clear and initialise the entire meta data use this:
clear_meta_data
-- ingest the object tree of the betl database and all databases on localhost
ingest_current_db
The next step is to combine the object tree query with the ingest code:
declare @sql as varchar(max)
exec parse_handlebars 'ContosoRetailDW', 'ingest_obj_tree_ms_sql_server' , @output = @sql OUTPUT;
exec (@sql )
select * from obj_ext
Load tables into staging database
Preparation
- Create an empty staging database called sqldb_staging and a schema called con. (We create a schema for every source system.)
-- make sure that you have a target staging database with a con schema
IF NOT EXISTS (SELECT * FROM sys.databases WHERE name = 'sqldb_staging')
create database sqldb_staging
GO
Connect to the new sqldb_staging database using SSMS and create the schema:
IF NOT EXISTS ( SELECT schema_id FROM sqldb_staging.sys.schemas WHERE name = 'con')
exec('create schema [con] AUTHORIZATION [dbo]')
After this database and schema are created, they must be added to the metadata:
-- update meta db with databases and schemas in LOCALHOST
exec betl.dbo.refresh 'LOCALHOST', 1
-- update meta db with schemas in [sqldb_staging]
exec betl.dbo.refresh '[sqldb_staging]'
Copy tables into staging using truncate/insert templates
Please continue reading Getting started with BETL in Azure to see how you can use templates to build the staging layer and refresh the raw data warehouse. The big difference with Azure is that on premise you can use cross-database queries and linked servers, so you can build everything using stored procedures. However, I prefer to use SSIS for orchestration, because orchestration is easier with a visual design tool that supports monitoring and batch handling.
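As a preview of that step, a truncate/insert staging template typically looks something like the sketch below. The placeholder names ({{target_schema}}, {{target_obj}}, {{src_db}}, {{src_schema}}, {{src_obj}}, {{column_list}}) are hypothetical illustrations, not the actual BETL template variables:

```sql
-- Hypothetical sketch of a truncate/insert staging template.
-- All {{...}} placeholder names here are assumptions for illustration,
-- not confirmed BETL template variables.
truncate table [sqldb_staging].[{{target_schema}}].[{{target_obj}}];

insert into [sqldb_staging].[{{target_schema}}].[{{target_obj}}] ({{column_list}})
select {{column_list}}
from [{{src_db}}].[{{src_schema}}].[{{src_obj}}];
```

Truncate/insert keeps the staging layer a simple full copy of the source: no history, no deltas, so a failed load can be rerun safely.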