SQL SERVER ARCHITECTURE
Database Engine
The Database Engine component of SQL Server is the core service for storing, processing and securing data. SQL Server supports up to 50 instances of the Database Engine on a single computer. An instance of the Database Engine is a copy of the sqlservr.exe executable that runs as an OS service and handles all application requests to perform work on the data in any of the databases managed by that instance.
Applications connect to the instance through a network connection if the application and instance are on separate computers. If the application and instance are on the same computer, the SQL Server connection can run as either a network connection or an in-memory connection. Once a connection has been established, the application sends T-SQL statements across it; the instance resolves the statements into operations against the data and objects in its databases and, if the login's credentials have the required permissions, performs the work. Any data retrieved is returned to the application, along with any messages such as errors.
Pages and extents
The page is the fundamental unit of data storage in SQL Server. The disk space allocated to a data file (.mdf or .ndf) in a database is
logically divided into pages numbered contiguously from 0 to n. Disk I/O operations are performed at the page level, i.e. SQL Server reads
or writes whole data pages.
The page size is 8 KB, i.e. databases have 128 pages per MB. Each page begins with a 96-byte header that is used to store system information
about the page including the page number, page type, the amount of free space on the page and the allocation unit ID of the object that
owns the page. The following page types are used in the data files of a SQL Server database:
Data - data rows with all data (except large object data types)
Index - index entries
Text/Image - large object data types (text, ntext, image, nvarchar(max), varchar(max), varbinary(max) and xml)
Global Allocation Map (GAM)/Shared Global Allocation Map (SGAM) - information about whether extents are allocated
Page Free Space (PFS) - information about page allocation and free space available on pages
Index Allocation Map - information about extents used by a table or index per allocation unit
Extents are a collection of eight physically contiguous pages and are used to efficiently manage the pages. All pages are stored in extents.
Databases have 16 extents per MB.
SQL Server has two types of extents
Uniform extents are owned by a single object - all eight pages in the extent can only be used by the owning object
Mixed extents are shared by up to eight objects - each of the eight pages in the extent can be owned by a different object
SQL Server uses two types of allocation maps to record the allocation of extents
Global Allocation Map (GAM) - GAM pages record what extents have been allocated. Each GAM covers 64,000 extents (4 GB of data) and has
1-bit for each extent in the interval it covers. If the bit is 1, the extent is free and if 0, the extent is allocated.
Shared Global Allocation Map (SGAM) - SGAM pages record which extents are currently being used as mixed extents and also have at least
one unused page. Each SGAM covers 64,000 extents (4 GB of data) and has 1-bit for each extent in the interval it covers. If the bit
is 1, the extent is being used as a mixed extent and has a free page. If the bit is 0, the extent is not used as a mixed extent or it is a
mixed extent and all its pages are being used.
Current use of extent                     GAM bit setting   SGAM bit setting
Free, not being used                      1                 0
Uniform extent, or full mixed extent      0                 0
Mixed extent with free pages              0                 1
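Page headers can be inspected from T-SQL. A minimal sketch using the sys.dm_db_page_info dynamic management function (available in
SQL Server 2019 and later); the file and page numbers are illustrative
SELECT page_id, page_type_desc, object_id
FROM sys.dm_db_page_info(DB_ID(), 1, 0, 'DETAILED'); -- page 0 of file 1 in the current database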
Memory management
SQL Server dynamically acquires and frees memory as required, and the amount of memory to be allocated to SQL Server need not be specified.
SQL Server builds a buffer pool in memory to hold pages read from the database and tries to minimize the number of physical reads and
writes between the disk and the buffer pool. SQL Server tries to reach a balance between
keeping the buffer pool from becoming so big that the entire system is low on memory
minimizing physical I/O to the database files by maximizing the size of the buffer pool
When SQL Server is using memory dynamically, it queries the system periodically to determine the amount of free physical memory and, when
free memory is low, releases memory to the OS to prevent the OS from paging. SQL Server adds memory only when its workload requires more
memory; a server at rest does not increase the size of its virtual address space.
SQL Server uses the memory notification API QueryMemoryResourceNotification to determine when the SQL Server Memory Manager may allocate
memory and release memory.
When SQL Server starts, it computes the size of virtual address space for the buffer pool based on a number of parameters such as amount of
physical memory on the system, number of server threads and various startup parameters and reserves the computed amount, but it acquires
(commits) only the required amount of physical memory for the current load.
The instance then continues to acquire memory as needed to support the workload. As more users connect and run queries, the additional
physical memory is acquired on demand. An instance continues to acquire physical memory until it either reaches its max server memory
allocation target or the OS indicates there is no longer an excess of free memory; it frees memory when it has more than the min server
memory setting and the OS indicates that there is a shortage of free memory.
SQL Server can free and acquire several megabytes of memory each second, allowing it to quickly adjust to memory allocation changes.
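The OS-level memory state that SQL Server reacts to can be checked with the sys.dm_os_sys_memory view
SELECT total_physical_memory_kb, available_physical_memory_kb, system_memory_state_desc
FROM sys.dm_os_sys_memory; -- e.g. 'Available physical memory is high'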
min and max server memory
The min and max server memory configuration options establish upper and lower limits to the amount of memory used by the buffer pool and
other caches of the SQL Server Database Engine. The buffer pool does not immediately acquire the memory specified in min server memory and
starts with only the memory required to initialize. As the workload increases, it keeps acquiring memory and does not free any of the
acquired memory until it reaches min server memory. Once min server memory is reached, the buffer pool then uses the standard algorithm to
acquire and free memory and never drops its memory allocation below the level specified in min server memory and never acquires more memory
than max server memory.
The SQL Server process can acquire more memory than specified by the max server memory option. Both internal and external components can allocate
memory outside of the buffer pool, which consumes additional memory, but the memory allocated to the buffer pool usually represents the
largest portion of memory consumed by SQL Server.
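Both limits are set with sp_configure (values in MB; they are advanced options, so advanced options must be enabled first) - a short sketch
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'min server memory', 1024; -- lower limit: 1 GB
EXEC sp_configure 'max server memory', 8192; -- upper limit: 8 GB
RECONFIGURE;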
Buffer management
Buffer management is a key component in achieving I/O efficiency. The buffer management component consists of two mechanisms: the buffer
manager to access and update database pages and the buffer cache (also called the buffer pool), to reduce database file I/O.
The buffer cache is divided into 8 KB pages (buffers), the same size as a data or index page. The buffer manager manages reading data or
index pages from the disk into the buffer cache and writing modified pages back to disk. A page remains in the buffer cache until the buffer
manager needs the area to read in more data. Data in the buffer cache can be modified multiple times before being written back to disk.
When SQL Server starts, it computes the size of virtual address space for the buffer cache based on a number of parameters and reserves this
computed amount (called the memory target) but it acquires only the required amount of memory for the current load. The sys.dm_os_sys_info
dynamic management view can be queried to return the memory target (committed_target_kb) and the memory currently committed to the
buffer cache (committed_kb).
select committed_target_kb, committed_kb from sys.dm_os_sys_info;
The interval between SQL Server startup and when the buffer cache obtains its memory target is called ramp-up which depends on the number
and type of client requests. Ramp-up is expedited by transforming single page read requests into aligned eight page requests (making up one
extent). This allows the ramp-up to finish much faster, especially on machines with a lot of memory.
Because the buffer manager uses most of the memory in the SQL Server process, it cooperates with the memory manager to allow other
components to use its buffers. The buffer manager interacts primarily with the following components:
Resource manager to control overall memory usage
Database manager and the SQL Server Operating System (SQLOS) for low-level file I/O operations
Log manager for write-ahead logging
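To see how the buffer cache is currently distributed across databases, the sys.dm_os_buffer_descriptors view can be aggregated
(requires VIEW SERVER STATE permission)
SELECT DB_NAME(database_id) AS database_name, COUNT(*) AS cached_pages,
COUNT(*) * 8 / 1024 AS cached_mb -- 8 KB per page
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_pages DESC;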
Transaction log
Every SQL Server database has a transaction log that records all transactions and their modifications to the database and is used to
recover the database in case of system failure.
The transaction log supports the following operations:
Individual transaction recovery - if an application issues a ROLLBACK or if the Database Engine detects an error such as the loss of
communication with a client, the log records are used to roll back the modifications made by an incomplete transaction
Recovery of all incomplete transactions - If a server fails, the databases may be left in a state where some changes were never written
from the buffer cache to the data files and there may be updates from incomplete transactions in the data files. When an instance of SQL
Server is started, it runs a recovery of each database and modifications recorded in the log which may not have been written to the data
files are rolled forward and any incomplete transactions found in the log are rolled back to ensure the database integrity is preserved
Rolling a restored database, file, filegroup, or page forward to the point of failure
Supporting transactional replication
Supporting high availability and disaster recovery solutions - Always On availability groups, database mirroring and log shipping
The SQL Server Database Engine divides each physical log file internally into a number of virtual log files (VLFs) with no fixed size
or fixed number. The Database Engine chooses the size of the virtual log files dynamically while it is creating or extending log files,
trying to keep their number to a minimum. The size of the virtual files after a log file has been extended is the sum of the size of the
existing log and the size of the new file increment. The size or number of virtual log files cannot be configured.
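The VLF layout of the current database can be listed with the sys.dm_db_log_info dynamic management function (SQL Server 2016 SP2 and later)
SELECT file_id, vlf_size_mb, vlf_active, vlf_status
FROM sys.dm_db_log_info(DB_ID());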
Log truncation
Log truncation frees space in the log file for reuse by the transaction log by deleting inactive VLFs. If a transaction log were never
truncated, it would eventually fill all the disk space that is allocated to its physical log files. Before the log can be truncated, a
checkpoint operation must occur. A checkpoint writes the current in-memory modified pages (dirty pages) and transaction log information
from memory to disk and marks the inactive portion of the log as reusable.
Truncation occurs automatically after the following events
Under the simple recovery model, after a checkpoint
Under the full or bulk-logged recovery model, if a checkpoint has occurred since the previous backup, truncation occurs after a log
backup (unless it is a copy-only log backup)
The active portion of the transaction log can never be truncated. The record at the start of the active portion of the log is identified by
the minimum recovery log sequence number (MinLSN).
The recovery model chosen for a database determines how much of the transaction log in front of the active portion must be retained.
Although the log records in front of the MinLSN play no role in recovering active transactions, they are required to roll forward
modifications when using log backups to restore a database to the point of failure.
In case of a crash, data can be recovered by restoring the last database backup, and then restoring every log backup since the database
backup. This means that the sequence of log backups must contain every log record that was written since the database backup. When a
sequence of transaction log backups is maintained, no log record can be truncated until after it has been written to a log backup. The log
records before the MinLSN are only needed to maintain a sequence of transaction log backups.
In the simple recovery model, a sequence of transaction log backups is not being maintained. All log records before the MinLSN can be
truncated at any time, except while a BACKUP statement is being processed. Log backups themselves are not supported under the simple
recovery model (the NO_LOG and TRUNCATE_ONLY options of BACKUP LOG, which older versions accepted here, were discontinued in SQL Server 2008).
SELECT * FROM sys.dm_db_log_space_usage; -- returns space usage information for the transaction log
Shrinking the log
Log truncation does not reduce the size of the physical log file which needs to be shrunk to free the unused space. A log file can be
shrunk only while the database is online and at least one VLF is free.
When a transaction log file is shrunk, inactive VLFs are removed from the end of the log file to reduce the log to approximately the target
size. But if the storage space is required again after a log shrink, the transaction log will grow again and may introduce performance
overhead during log growth operations.
-- shrink a log file to a specific target size
ALTER DATABASE db1 SET RECOVERY SIMPLE; -- the log is first truncated by setting the database recovery model to SIMPLE
DBCC SHRINKFILE (db1_log, 1); -- shrink the truncated log file to 1 MB
ALTER DATABASE db1 SET RECOVERY FULL; -- reset the database recovery model
Add or enlarge a log file
One transaction log file is sufficient unless log space is running out and disk space is also running out on the volume that holds the log
file. Space can be gained by enlarging the existing log file or by adding a log file to the database, typically on a different disk.
ALTER DATABASE db1 MODIFY FILE (NAME = db1_log, SIZE = 200MB); -- enlarge a log file
-- add two log files
ALTER DATABASE db1 ADD LOG FILE
( NAME = db1_log2, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\db1_log2.ldf',
SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB ),
( NAME = db1_log3, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\db1_log3.ldf',
SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB );
DATA STRUCTURES
Databases
Each instance of SQL Server can contain one or many databases. Within a database, there are one or many object ownership groups called
schemas and within each schema there are objects like tables, views and stored procedures. Some objects such as certificates and asymmetric
keys are contained within the database but not within a schema.
-- create a database specifying data and transaction log files (size in MB, by default)
CREATE DATABASE db1 ON
(NAME = db1_dat, FILENAME = 'D:\Program Files\SQL Server\Data\db1.mdf', SIZE = 1000, MAXSIZE = 200000, FILEGROWTH = 5)
LOG ON
(NAME = db1_log, FILENAME = 'D:\Program Files\SQL Server\DATA\db1.ldf', SIZE = 50MB, MAXSIZE = 2500MB, FILEGROWTH = 5MB);
select * from sys.databases; -- list all databases
Database files and filegroups
At a minimum, every SQL Server database has two OS files - a data file and a log file. Data files contain data and objects such as tables,
indexes, stored procedures, and views. Log files contain the information that is required to recover/rollback all transactions performed
on the database. Data files can be grouped together in filegroups for allocation and administration purposes.
The three types of files are
Primary - The primary data file contains the startup information for the database and points to the other files in the database. User data
and objects can be stored in this file or in secondary data files. Every database has one primary data file. The recommended file name
extension for primary data files is .mdf
Secondary - Secondary data files are optional, are user-defined and store user data. Secondary files can be used to spread data across
multiple disks by putting each file on a different disk drive. Additionally, if a database exceeds the maximum size for a single Windows
file, secondary data files can be used so the database can continue to grow. The recommended extension for secondary data files is .ndf
Transaction log - There must be at least one log file for each database and the recommended extension is .ldf
Database detach and attach
The data and log files of a database can be detached and then reattached to the same or another instance of SQL Server. Detaching and
attaching a database is useful for moving the database to a different instance of SQL Server or computer.
EXEC sp_detach_db 'db1', 'true'; -- detach database ('true' skips UPDATE STATISTICS)
CREATE DATABASE testdb ON (FILENAME = 'C:\mssql\testdb_data.mdf'), (FILENAME = 'C:\mssql\testdb_log.ldf')
FOR ATTACH; -- attach database
A database cannot be detached if
the database is replicated and published - if replicated, the database must be unpublished and publishing must be disabled by running
sp_replicationdboption or replication must be removed by running sp_removedbreplication
a database snapshot exists on the database - before detaching the database, all of its snapshots must be dropped
the database is being mirrored in a database mirroring session - the database cannot be detached unless the session is terminated
the database is suspect - a suspect database must be put into emergency mode before detaching
the database is a system database
System Databases
master - stores all the system-level information for an SQL Server instance
msdb - used by SQL Server Agent to schedule jobs and alerts
model - template for all new databases
resource - read-only database that contains system objects
tempdb - workspace for temporary objects and intermediate resultsets
Tables
SQL Server can have as many as two billion tables per database and 1,024 columns per table. The number of rows and total size of the table
are limited only by the available storage.
The maximum number of bytes per row is 8,060. If tables are created with varchar, nvarchar or varbinary columns in which the total defined
width exceeds 8,060 bytes, the table is created, but a warning message appears. Trying to insert more than 8,060 bytes into such a row or to
update a row so that its total row size exceeds 8,060 produces an error message and the statement fails.
CREATE TABLE statements that include a sql_variant column can generate the following warning: The total row size (xx) for table 'yy'
exceeds the maximum number of bytes per row (8060). Rows that exceed the maximum number of bytes will not be added. This warning occurs
because sql_variant can have a maximum length of 8016 bytes and when a sql_variant column contains values close to the maximum, it can
overshoot the row's maximum size limit.
CREATE TABLE t1 ( c1 int primary key identity(1,1), c2 varchar(255), INDEX ix_1 NONCLUSTERED (c2));
EXEC sp_rename 'sales.t1', 't2'; -- rename table
ALTER TABLE t1 ADD col1 int NOT NULL, col2 varchar(20); -- add columns
ALTER TABLE t1 DROP COLUMN col1; -- drop column
EXEC sp_rename 'dbo.t1.col1', 'col2', 'COLUMN'; -- rename column
ALTER TABLE dbo.t1 ALTER COLUMN col1 DECIMAL(5,2); -- modify column
A computed column is a virtual column that is not physically stored in the table, unless the column is marked PERSISTED. A computed column
expression uses data from other columns to calculate a value for the column to which it belongs.
CREATE TABLE dbo.products -- table with computed column
(
product_id int IDENTITY (1,1) NOT NULL,
quantity smallint,
unit_price money,
inventory_value AS quantity * unit_price );
ALTER TABLE dbo.products ADD retail_value AS (quantity * unit_price * 1.5); -- add a computed column
ALTER TABLE dbo.products DROP COLUMN retail_value; -- drop a computed column
Temporary tables
A local temp table name is prefixed with a single number sign (#table_name) and a global temp table name with a double number sign (##table_name).
Differences between local and global temporary tables:
Visibility - Local temporary tables are visible only in the current session; global temporary tables are visible to all sessions.
Lifetime - A local temporary table created in a stored procedure is dropped automatically when the stored procedure completes. All other local
temporary tables are dropped automatically at the end of the current session. Global temporary tables are automatically dropped
when the session that created the table ends and all other tasks have stopped referencing them.
IF OBJECT_ID('tempdb..#Results') IS NOT NULL -- check if temp table exists
DROP TABLE #Results;
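On SQL Server 2016 and later the existence check can be written more compactly
DROP TABLE IF EXISTS #Results;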
System tables
sys.sysobjects
select 'drop table ' + name from sys.sysobjects where type = 'U' and name like '%[_]old'; -- generate drop statements for tables ending in _old ([_] escapes the LIKE wildcard)
sys.sysprocesses
select cmd, * from sys.sysprocesses where blocked > 0; -- find all blocked statements
In-memory tables
Memory-optimized tables are created using the MEMORY_OPTIMIZED = ON option in the CREATE TABLE statement. The entire table resides in
memory. A copy is maintained on disk, but for durability purposes only. Non-durable in-memory tables are also supported, with no logging of
data and no persistence to disk. Transactions on these tables do not require any disk I/O, but the data will not be recovered if there is a
server crash. A minimal example follows the list below.
Data in memory-optimized tables can be accessed
Through natively compiled stored procedures.
Through interpreted Transact-SQL, outside of a natively-compiled stored procedure. These T-SQL statements may be either inside
interpreted stored procedures or they may be ad-hoc T-SQL statements.
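A minimal sketch of a durable memory-optimized table; it assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup, and the
table and column names are illustrative
CREATE TABLE dbo.session_cache (
    session_id int NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
    payload nvarchar(4000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA); -- SCHEMA_ONLY makes the table non-durable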
Indexes
Each table can contain a maximum of 999 nonclustered indexes (249 in SQL Server 2005 and earlier) and 1 clustered index. A clustered index
is automatically created when a PRIMARY KEY is defined for a table and a nonclustered index when a UNIQUE constraint is defined. If a
PRIMARY KEY is defined on a table with an existing clustered index, SQL Server enforces the primary key using a nonclustered index.
Columns that are of the large object (LOB) data types ntext, text, varchar(max), nvarchar(max), varbinary(max), xml, or image cannot
be specified as key columns for an index.
CREATE [UNIQUE] [CLUSTERED | NONCLUSTERED] INDEX idx1 ON table_or_view (col1, col2, ...)
WITH options
ON [filegroup];
Options
FILLFACTOR=n - specifies a percentage that indicates how full the Database Engine should make the leaf level of each index page during
index creation or rebuild. The default is 0. If fillfactor is 100 or 0, leaf pages are filled to capacity
IGNORE_DUP_KEY - for multiple-row insert operations, when set to ON, only the rows violating a unique index fail and a warning message is
issued. When set to OFF, an error message is issued and the entire INSERT transaction is rolled back.
Included columns
Non-key columns can be included in the leaf-level of non-clustered indexes using the INCLUDE clause to cover more queries. Non-key columns
have the following benefits
they can be data types not allowed as index key columns
they are not considered by the Database Engine when calculating the number of index key columns or index key size
An index with non-key columns can significantly improve query performance when all columns in the query are included in the index either
as key or non-key columns. The query optimizer can then locate all the column values within the index; the table or clustered index data is
not accessed, resulting in fewer disk I/O operations.
CREATE NONCLUSTERED INDEX ix_address_postalcode ON dbo.address (postalcode)
INCLUDE (addr_line1, addr_line2, city, state);
XML indexes
XML indexes can be created on xml data type columns. They index all tags, values and paths over the XML instances in the column and benefit
query performance. XML instances are stored in xml type columns as BLOBs and without an index, these BLOBs are shredded at run time to
evaluate a query which can be time-consuming. If XML BLOBs are commonly queried in the application, it helps to index the xml type columns
with the additional cost of index maintenance during data modification.
XML indexes fall into the following two categories
Primary XML index - The first index on the xml type column must be the primary XML index
Secondary XML index - Using the primary XML index, the following types of secondary indexes are supported - PATH, VALUE, and PROPERTY
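For example, the xml column of the dbo.words table created earlier could be indexed as follows (index names are illustrative)
CREATE PRIMARY XML INDEX pxml_words_nouns ON dbo.words (nouns);
CREATE XML INDEX sxml_words_nouns_path ON dbo.words (nouns)
    USING XML INDEX pxml_words_nouns FOR PATH; -- secondary index; VALUE and PROPERTY are analogous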
Full-text indexes
A full-text index includes one or more character-based columns of the following data types - char, varchar, nchar, nvarchar, text, ntext,
image, xml or varbinary(max) and FILESTREAM. A full-text index can contain up to 1024 columns.
A full-text catalog needs to be created before creating a full-text index. A full-text catalog is a logical container for a group of
full-text indexes. A full-text catalog is a virtual object that does not belong to any filegroup.
CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;
CREATE FULLTEXT INDEX ON dbo.candidate(resume) KEY INDEX uk_candidate WITH STOPLIST = SYSTEM;
Rebuilding and reorganizing indexes
Rebuilding an index drops and re-creates the index. This removes fragmentation, reclaims disk space by compacting the pages based on the
specified or existing fill factor setting and reorders the index rows in contiguous pages.
Reorganizing an index uses minimal system resources. It defragments the leaf level of clustered and nonclustered indexes on tables and views
by physically reordering the leaf-level pages to match the logical, left to right, order of the leaf nodes. Reorganizing also compacts the
index pages. Compaction is based on the existing fill factor value.
Fragmentation exists when indexes have pages in which the logical ordering, based on the key value, does not match the physical ordering
inside the data file and can degrade query performance.
-- find the average fragmentation percentage of all indexes in a table
SELECT a.index_id, name, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(N'HumanResDB'), OBJECT_ID(N'dbo.Employee'), NULL, NULL, NULL) AS a
JOIN sys.indexes AS b ON a.object_id = b.object_id AND a.index_id = b.index_id;
For fragmentation between 5% and 30%, ALTER INDEX REORGANIZE can be used and for > 30% ALTER INDEX REBUILD WITH (ONLINE = ON) can be
used. Rebuilding an index can be executed online or offline, while reorganizing an index is always executed online. To achieve availability
similar to the reorganize option, indexes can be rebuilt online.
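A sketch of both operations against the ix_address_postalcode index created earlier (ONLINE = ON requires Enterprise edition)
ALTER INDEX ix_address_postalcode ON dbo.address REORGANIZE; -- 5-30% fragmentation
ALTER INDEX ix_address_postalcode ON dbo.address REBUILD WITH (ONLINE = ON); -- above 30%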
T-SQL LANGUAGE
T-SQL or Transact-SQL is Microsoft's implementation of the SQL standard. All tools and applications communicate with a SQL database by
sending T-SQL commands. Some of the Microsoft tools that execute T-SQL commands are SSMS, SQL Server Data Tools (SSDT), sqlcmd and Azure
Data Studio.
Datatypes
BINARY/VARBINARY - Binary datatypes of either fixed-length or variable length. Syntax: BINARY(n) or VARBINARY(n)
BIT - Integer datatype 1, 0 or NULL.
BIGINT/INT/SMALLINT/TINYINT - Integer data types of 8, 4, 2 and 1 bytes storage length respectively.
CHAR/VARCHAR - Fixed-length and variable length character datatypes. Syntax: CHAR(n) or VARCHAR(n)
DATETIME/SMALLDATETIME - Date and time data types for representing date and time of day.
DECIMAL/NUMERIC - Numeric datatypes with fixed precision and scale. Syntax: decimal(precision, scale) or numeric (p,s)
FLOAT/REAL - Approximate number data types for use with floating-point numeric data. Syntax: FLOAT(n) or REAL
MONEY/SMALLMONEY - Monetary datatypes for representing monetary or currency values.
NCHAR, NVARCHAR - Unicode datatypes that help in storing data from various languages.
NTEXT/TEXT/IMAGE - Fixed and variable-length datatypes for storing large non-unicode, unicode character and binary data.
ntext - variable-length unicode data with a max length of 2^30 - 1 characters. Storage size is two times the number of
characters entered
text - variable-length non-unicode data in the code page of the server and with a maximum length of 2^31 - 1 characters
image - variable-length binary data from 0 through 2^31 - 1 bytes
table - a special data type used to store a result set for processing at a later time. table is primarily used for temporarily storing a
set of rows that are returned as the table-valued function result set. Functions and variables can be declared to be of type table. table
variables can be used in functions, stored procedures and batches
XML - SQL Server allows storage of native xml with the xml datatype. Creating a typed xml column is a two-step process - creating a schema
and associating it to the column
CREATE XML SCHEMA COLLECTION dbo.nounsSchema AS
'<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="Category" type="xsd:string" />
<xsd:element name="arabic" type="xsd:string" />
<xsd:element name="english" type="xsd:string" />
</xsd:schema>';
CREATE TABLE dbo.words (id int PRIMARY KEY,
nouns xml(dbo.nounsSchema));
Operators
T-SQL uses these operator categories -
Arithmetic
+ - * / %(modulo)
Assignment
=
Bitwise
& (AND), | (OR), ^ (XOR)
Comparison
=, >, <, >=, <=, <> (not equal), != (not equal), !< (not lesser), !> (not greater)
Logical
ALL, AND, ANY, BETWEEN, EXISTS, IN, LIKE, NOT, OR, SOME
String concatenation
+
Unary
+, -, ~ (bitwise NOT)
Statements
Control flow
BEGIN...END - encloses a series of T-SQL statements, so that a group of statements can be executed. BEGIN...END blocks can be nested.
BEGIN
statements
END
IF....ELSE - imposes conditions on the execution of a T-SQL statement
IF boolean expression
BEGIN
sql statements
END
ELSE
BEGIN
sql statements
END
GOTO - alters the flow of execution to a label. GOTO statements can be nested
GOTO label1
...
label1:
THROW - raises an exception and transfers execution to a CATCH block of a TRY...CATCH construct
THROW 51000, 'The record does not exist.', 1;
Msg 51000, Level 16, State 1, Line 1
The record does not exist.
TRY...CATCH - used to implement error handling logic
BEGIN TRY
statements
END TRY
BEGIN CATCH
statements
END CATCH
WHILE - sets a condition for the repeated execution of an SQL statement or statement block. BREAK causes an exit from the innermost WHILE
loop. CONTINUE causes the WHILE loop to restart ignoring any statements after CONTINUE
WHILE boolean expression
BEGIN
sql statements
[BREAK]
[CONTINUE]
END
Cursors
CLOSE - Closes an open cursor by releasing the current resultset and freeing any cursor locks.
CLOSE cursor1
DEALLOCATE - removes a cursor reference. When the last cursor reference is deallocated, the data structures comprising the cursor are
released. Removes the association between a cursor and a cursor name.
DEALLOCATE cursor_name
DECLARE CURSOR - Defines the attributes of a T-SQL cursor, such as its scrolling behaviour and the query used to build the cursor.
DECLARE cursorname CURSOR
[LOCAL | GLOBAL]
[FORWARD_ONLY | SCROLL]
FOR select_statement
[FOR UPDATE OF columnname]
FETCH - Retrieves a specific row from a cursor.
FETCH [NEXT | PRIOR | FIRST | LAST] FROM cursor1 INTO @variable1, @variable2....@variablen
OPEN - opens and populates a cursor by executing the SELECT statement specified in DECLARE CURSOR
OPEN cursor1
DDL
ALTER TABLE - modifies a table definition by altering, adding or dropping columns or constraints
ALTER TABLE t1 add primary key (col1, col2, col3);
ALTER TABLE t1 ADD CONSTRAINT fk_t1_col1 FOREIGN KEY (col1) REFERENCES RefTableName(RefColumnName); -- add a foreign key
ALTER TABLE t1 ADD col1 varchar(50); -- add a column
ALTER TABLE tbl1 WITH NOCHECK ADD CONSTRAINT CK_tbl1_Col1 CHECK (Col1 in (1,2)); -- add a check constraint
ALTER SCHEMA - Change schema
ALTER SCHEMA dbo TRANSFER AS0386.spAuthors1; -- move spAuthors1 from schema AS0386 into dbo
DROP TABLE - Removes a table definition and all data, indexes, triggers, constraints and permission specifications for that table.
DML
INSERT - inserts a new row to a table or a view
MERGE - performs insert, update, or delete operations on a target table based on the results of a join with a source table
/*Using Merge to Insert/Update*/
MERGE dbo.UnitMeasure AS target
USING (SELECT @UnitMeasureCode, @Name) AS source (UnitMeasureCode, Name)
ON (target.UnitMeasureCode = source.UnitMeasureCode)
WHEN MATCHED THEN
UPDATE SET Name = source.Name
WHEN NOT MATCHED THEN
INSERT (UnitMeasureCode, Name)
VALUES (source.UnitMeasureCode, source.Name)
OUTPUT deleted.*, $action, inserted.* INTO #MyTempTable;
/*Using Merge to Update/Delete*/
CREATE PROCEDURE dbo.usp_UpdateInventory @OrderDate datetime AS
MERGE dbo.ProductInventory AS target
USING (SELECT ProductID, SUM(OrderQty) FROM dbo.SalesOrderDetail AS sod
JOIN dbo.SalesOrderHeader AS soh
ON sod.SalesOrderID = soh.SalesOrderID
AND soh.OrderDate = @OrderDate
GROUP BY ProductID) AS source (ProductID, OrderQty)
ON (target.ProductID = source.ProductID)
WHEN MATCHED AND target.Quantity - source.OrderQty <= 0
THEN DELETE
WHEN MATCHED
THEN UPDATE SET target.Quantity = target.Quantity - source.OrderQty,
target.ModifiedDate = GETDATE()
OUTPUT $action, Inserted.ProductID, Inserted.Quantity, Inserted.ModifiedDate, Deleted.ProductID,
Deleted.Quantity, Deleted.ModifiedDate;
TRUNCATE TABLE - removes all rows from a table without logging the individual row deletes. It is faster than DELETE and uses fewer system
and log resources. The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row.
TRUNCATE TABLE removes the data by deallocating the data pages used by the table, and only the page deallocations are recorded in the log.
The counter used by an identity column is reset to the seed for the column. TRUNCATE TABLE cannot be used on a table referenced by
a FOREIGN KEY constraint. Because TRUNCATE TABLE does not log individual row deletions, it cannot activate a DELETE trigger.
UPDATE - changes existing data in a table or view
General
DECLARE - Declares a variable in the body of a batch or a procedure. After declaration, all variables are initialized as NULL.
DECLARE @local_variable data_type;
EXECUTE - Executes a command string or character string within a Transact-SQL batch, or a system/user/extended stored procedure or a
scalar-valued user-defined function.
KILL - ends a user process using the session id. If the specified session id has much work to undo, the statement may take some time to
complete. Active session ID values can be obtained by querying the session_id column of the sys.dm_tran_locks, sys.dm_exec_sessions and
sys.dm_exec_requests dynamic management views or viewing the SPID column that the sp_who system stored procedure returns. If a rollback is
in progress for a specific SPID, the cmd column in the sp_who result set for that SPID indicates KILLED/ROLLBACK.
KILL sessionid;
SELECT - retrieves rows from the database. The main clauses are
[ WITH common_table_expression]
SELECT select_list [ INTO new_table ]
[ FROM table_source ] [ WHERE search_condition ]
[ GROUP BY group_by_expression [WITH CUBE | ROLLUP] ]
[ HAVING search_condition ]
[ ORDER BY order_expression [ ASC | DESC ] ]
[ FOR XML [RAW | AUTO | PATH | EXPLICIT] ]
The UNION, EXCEPT, and INTERSECT operators can be used between queries to combine or compare their results into one result set.
CUBE - Specifies that in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. A GROUP BY
summary row is returned for every possible combination of group and subgroup in the result set.
ROLLUP - Specifies that in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. Groups are
summarized in a hierarchical order, from the lowest level in the group to the highest.
FOR XML - retrieves the query output as XML. One of these modes can be specified:
The RAW mode generates a single <row> element per row in the rowset that is returned by the SELECT statement. XML hierarchy can be
generated by writing nested FOR XML queries.
The AUTO mode generates nesting in the resulting XML by using heuristics based on the way the SELECT statement is specified and the user
has minimal control over the shape of the XML generated. The nested FOR XML queries can be written to generate XML hierarchy beyond the XML
shape that is generated by AUTO mode heuristics.
The EXPLICIT mode allows more control over the shape of the XML. Attributes and elements can be mixed at will in deciding the shape of the
XML. It requires a specific format for the resulting rowset that is generated because of query execution. This rowset format is then mapped
into XML shape. The power of EXPLICIT mode is to mix attributes and elements at will, create wrappers and nested complex properties, create
space-separated values (e.g. an OrderID attribute may have a list of order ID values) and mixed contents.
The PATH mode together with the nested FOR XML query capability provides the flexibility of the EXPLICIT mode in a simpler manner.
select * from vw_nouns FOR XML RAW; -- RAW mode uses a generic row tag
select * from vw_nouns FOR XML RAW, XMLSCHEMA; -- returns the schema of the xml output along with the result
SET - sets the specified variable to the given value
SET @local_variable = expression
WITH common_table_expression(CTE) - specifies a temporary named resultset. This is derived from a simple query and defined within the
execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
with t as
(select distinct * from [DB1].[dbo].[tbl1])
select COUNT(*) from t;
Transactions
BEGIN TRANSACTION - BEGIN TRANSACTION marks the starting point of an explicit, local transaction. It increments @@TRANCOUNT by 1
COMMIT - ensures all of a transaction's modifications are made permanent in the database
COMMIT TRANSACTION - marks the end of a successful implicit or explicit transaction. If @@TRANCOUNT is 1, COMMIT TRANSACTION makes all data
modifications since the start of the transaction permanent, frees the transaction's resources and decrements @@TRANCOUNT to 0. When
@@TRANCOUNT is greater than 1, COMMIT TRANSACTION decrements @@TRANCOUNT only by 1 and the transaction stays active
ROLLBACK - rolls back a user-specified transaction to the beginning of the transaction
ROLLBACK TRANSACTION - rolls back an explicit or implicit transaction to the beginning or to a savepoint
SAVE TRANSACTION - sets a savepoint within a transaction
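A short sketch tying these statements together; the dbo.accounts table is hypothetical
BEGIN TRANSACTION; -- @@TRANCOUNT becomes 1
UPDATE dbo.accounts SET balance = balance - 100 WHERE id = 1;
SAVE TRANSACTION before_credit; -- set a savepoint
UPDATE dbo.accounts SET balance = balance + 100 WHERE id = 2;
IF @@ERROR <> 0
    ROLLBACK TRANSACTION before_credit; -- undo back to the savepoint only
COMMIT TRANSACTION; -- @@TRANCOUNT returns to 0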
Functions
ASCII - returns the ASCII code value of the leftmost character of a character expression.
CASE - evaluates a list of conditions and returns one of the possible expressions.
Simple CASE function:
CASE input expression
WHEN cond1 THEN result1
WHEN cond2 THEN result2
.......
ELSE result_else
END
Searched CASE function:
CASE
WHEN boolean expression THEN result1
.....
ELSE result_else
END
CAST/CONVERT - explicitly convert an expression of one data type to another
CAST (expression AS data_type)
CONVERT (data_type [(length)], expression [,style])
SELECT title, ytd_sales FROM tblBooks WHERE CAST (ytd_sales AS char(20)) LIKE '3%'
SELECT title, ytd_sales FROM tblBooks WHERE CONVERT (char(20), ytd_sales) LIKE '3%'
CHAR - a string function that converts an int ASCII code to a character
CHARINDEX - returns starting position of a specified pattern in a string
DATEDIFF - returns the difference between two dates, counted as the number of specified datepart boundaries crossed between them
DATEDIFF ( datepart, startdate, enddate )
Datepart can be one of the following parameters that specifies the unit of difference to be calculated - Year (abbr. yy, yyyy), quarter
(qq, q), Month (mm, m), dayofyear (dy, y), Day (dd, d), Week (wk, ww), Hour (hh), minute (mi, n), second (ss, s), millisecond (ms)
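For example
SELECT DATEDIFF(day, '2024-01-01', '2024-03-01'); -- 60 (2024 is a leap year)
SELECT DATEDIFF(month, '2024-01-15', '2024-03-01'); -- 2 (month boundaries crossed)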
LEN - returns the number of characters of a given string expression.
-- count the number of Y's in a column
SELECT LEN(col) - LEN(REPLACE(col, 'Y', ''))
LTRIM/RTRIM - return a character expression after removing leading/trailing blanks
PATINDEX - returns the starting position of the first occurrence of a pattern in a string expression. Returns 0 if the pattern is not found.
Allows wild card characters in the pattern
PATINDEX ('%pattern%', string_expression)
Determine non-printable characters in a column:
select fn, patindex('%[^ !-~]%' COLLATE Latin1_General_BIN,fn) as [Position],
substring(fn,patindex('%[^ !-~]%' COLLATE Latin1_General_BIN,fn),1) as [InvalidCharacter],
ascii(substring(fn,patindex('%[^ !-~]%' COLLATE Latin1_General_BIN,fn),1)) as [ASCIICode]
from #vvk1
where patindex('%[^ !-~]%' COLLATE Latin1_General_BIN,fn) > 0;
SUBSTRING - returns a portion of a character, binary, text, or ntext expression
SELECT x = SUBSTRING ('abcdef', 2, 3) -- 'bcd'
UNICODE - returns the integer value, as defined by the Unicode standard, for the first character of the input expression
DECLARE @nstring nchar(12)
SET @nstring = N'Åkergatan 24'
SELECT UNICODE(@nstring), NCHAR(UNICODE(@nstring))
returns the result set:
------ -
197 Å
Aggregate functions
AVG, COUNT, COUNT_BIG, MAX, MIN, SUM, VAR, VARP, STDEV, STDEVP
Ranking functions
RANK - Displays the rank value for each row in the resultset
SELECT Name, Salary, RANK() OVER(ORDER BY Salary DESC) as Ranking FROM Employees;
DENSE_RANK - Similar to the RANK function without missing ranking values
SELECT Name, Dept, Salary, DENSE_RANK() OVER(PARTITION BY Dept ORDER BY Salary DESC) as DenseRanking FROM Employees;
ROW_NUMBER - returns the sequential number of a row within a partition of a result set
SELECT ROW_NUMBER() OVER(ORDER BY name ASC) AS Row#, name, recovery_model_desc FROM sys.databases WHERE database_id < 5;
NTILE - distributes the rows in an ordered partition into a specified number of groups
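Using the same illustrative Employees table as the examples above
SELECT Name, Salary, NTILE(4) OVER(ORDER BY Salary DESC) as Quartile FROM Employees;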
System functions
These scalar functions perform operations and return information about values, objects and settings in SQL Server.
@@ERROR - returns the error number for the last T-SQL statement executed
@@FETCH_STATUS - returns the status of the last cursor FETCH statement issued against any cursor currently opened by the connection
@@IDENTITY - returns the last-inserted identity value
@@ROWCOUNT - returns the number of rows affected by the last statement
@@TRANCOUNT - number of BEGIN TRANSACTION statements that have occurred on the current connection
NEWID() - Creates a unique value of type uniqueidentifier.
System statistical functions
These scalar functions return statistical information about the system
@@CONNECTIONS - returns the number of attempted connections (successful or not) since the server was last started
@@CPU_BUSY - the amount of cpu time(ticks) that SQL Server has spent in active operation since its start
@@IDLE - time SQL server has been idle
@@IO_BUSY - time spent on performing I/O
@@TIMETICKS - number of microseconds per tick (computer-dependent)
Variables
A T-SQL local variable is an object that can hold a single data value of a specific type.
The DECLARE statement initializes a Transact-SQL variable by
assigning a name - the name must have a single @ as the first character
assigning a system-supplied or user-defined data type and a length
setting the value to NULL
DECLARE @var1 int; -- declare a variable
SET @var1 = 10; -- set variable to a value
SELECT @var1 = MAX(emp_id) FROM hr.employee; -- set variable to the value of an expression
table variables
Variables can be declared to be of type table. table variables can be used in functions, stored procedures and batches.
table variables that are used in stored procedures cause fewer stored procedure recompilations than when temporary tables are used.
Transactions involving table variables last only for the duration of an update on the table variable and require less locking and logging
resources.
declare @t1 table (rowid int identity(1,1), col1 char(2), col2 char(2));
insert into @t1 (col1, col2) select ovf, xfer from shipments where status = 'U';
declare @rows int, @rowid int, @ovfnbr char(2), @xfernbr char(2);
select @rows = count(1) from @t1;
while (@rows > 0)
begin
select top 1 @rowid = rowid, @ovfnbr = col1, @xfernbr = col2 from @t1;
/* perform row processing */
delete from @t1 where rowid = @rowid;
select @rows = count(1) from @t1;
end
Stored procedures
A user-defined procedure can be created in a user-defined database or in all system databases except the Resource database. The procedure
can be developed in either T-SQL or as a reference to a Microsoft .NET Framework common runtime language (CLR) method.
There are three ways of returning data from a procedure to a calling program
result sets - if a SELECT statement is included in the body of a stored procedure (not a SELECT .. INTO or INSERT .. SELECT), the rows
specified by the SELECT statement will be sent directly to the client
output parameters - by specifying the OUTPUT keyword for a parameter in the procedure definition, the procedure can return the current
value of the parameter to the calling program when the procedure exits. To save the value of the parameter in a variable that can be used
in the calling program, the calling program must use the OUTPUT keyword when executing the procedure
return codes - a procedure can return an integer value called a return code to indicate the execution status of a procedure using the
RETURN statement
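A minimal sketch of the output-parameter mechanism; the procedure name is illustrative and hr.employee is the same illustrative table
used in the Variables section above
create proc dbo.usp_GetMaxEmpId @maxid int output as
    select @maxid = max(emp_id) from hr.employee;
go
declare @result int;
exec dbo.usp_GetMaxEmpId @maxid = @result output; -- OUTPUT is required on the call as well
select @result as max_emp_id;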
Sample procedure
create proc dbo.usp_kwi_ExtractCustRecords
@parm1 char(23),
@parm2 char(23) as
set nocount on
declare @telephoneno varchar(16), @emailaddress varchar(50);
declare @transsttime datetime, @transendtime datetime;
set @transsttime = cast(@parm1 as datetime)
set @transendtime = cast(@parm2 as datetime)
create table #tempcust
( telephoneno varchar(16),
emailaddress varchar(50) )
declare cursorcust cursor for
select telephone_no1, email_address
from customer cu
where cu.entry_date_time >= @transsttime
and cu.entry_date_time <= @transendtime
open cursorcust
fetch next from cursorcust into @telephoneno, @emailaddress;
while @@fetch_status = 0
begin
insert into #tempcust values (@telephoneno, @emailaddress );
fetch next from cursorcust into @telephoneno, @emailaddress;
end
select * from #tempcust
cleanup:
close cursorcust;
deallocate cursorcust;
drop table #tempcust;
return;
go
System stored procedures
sp_lock - displays info about locks
exec sp_lock; -- displays all locks
exec sp_lock 53; -- display locks for a session id
sp_monitor - displays statistics about SQL Server
sp_send_dbmail - sends emails
EXEC msdb.dbo.sp_send_dbmail
@profile_name = 'WHI', @recipients = 'vkandasamy@paypal.com', @subject = 'OE Gen 4.5: New Makes Alert',
@body = 'Automated Message: New Makes have been received in the OE Gen 4.5 process';
sp_spaceused - displays space usage details
EXEC sp_spaceused @updateusage = N'TRUE'; -- for current DB, returns current values
EXEC sp_spaceused N'dbo.vendor'; -- find space used by a table
EXEC sp_MSforeachtable @command1="EXEC sp_spaceused '?'"; -- find space used by all tables
sp_who/sp_who2 - provide user, session and process information
EXEC sp_who; -- displays statistics for all users/sessions
EXEC sp_who 'janetl'; -- show info on a single user login
EXEC sp_who '10'; -- show info by session id
Columns in sp_who output:
spid - session id
status - process status. The possible values are
dormant. SQL Server is resetting the session
running. The session is running one or more batches. When Multiple Active Result Sets (MARS) is enabled, a session can run multiple
batches
background. The session is running a background task, such as deadlock detection
rollback. The session has a transaction rollback in process
pending. The session is waiting for a worker thread to become available
runnable. The session's task is in the runnable queue of a scheduler while waiting to get a time quantum
spinloop. The session's task is waiting for a spinlock to become free
suspended. The session is waiting for an event, such as I/O, to complete
blk - session id of the blocking process if one exists, else 0
cmd - command executing for the process
Extended stored procedures
Extended stored procedures allow developers to create their own external routines in a programming language such as C. They appear to users
as normal stored procedures and are executed in the same way. Parameters can be passed to extended stored procedures and they can return
results and return status.
Extended stored procedures are dynamic-link libraries (DLLs) that SQL Server can dynamically load and execute. Extended stored procedures
run directly in the address space of SQL Server and are programmed using the SQL Server Open Data Services API.
An extended stored procedure works as follows
1. When a client executes an extended stored procedure, the request is transmitted in tabular data stream (TDS) format from the client
application through the Net-Libraries and Open Data Services to SQL Server
2. SQL Server searches for the DLL associated with the extended stored procedure, and loads the DLL if it is not already loaded
3. SQL Server calls the requested extended stored procedure (implemented as a function inside the DLL)
4. The extended stored procedure passes result sets and return parameters back to the server by using the Open Data Services API
User-defined functions
A stored procedure returns only an int datatype value while a function can return any datatype.
Calling a function
select @var1 = [dbo].[fnConvertBintoStr] (@var2,@var3);
Scalar-valued function
Scalar functions return a single data value of the type defined in the RETURNS clause. The return type can be any data type except text, ntext, image, cursor, and timestamp.
ALTER FUNCTION [dbo].[fnGetCountryForState]
(@InState char(2))
RETURNS char(2)
AS
BEGIN
declare @outCtryCde char(2);
if @InState in ('AB','BC','MB','NB','NL','NT', 'NS','NU','ON','PE','PQ','QC','SK','YT','NF')
set @outCtryCde = 'CA';
else
set @outCtryCde = 'US';
return @outCtryCde;
END
Table-valued function
Table-valued functions (TVFs) return a table data type.
CREATE FUNCTION [dbo].[fnGetLocationInfo]
(
@RegNumber char(9),
@CompanyCde CHAR(3),
@OrigCde CHAR(2),
@OrigNbr CHAR(2)
)
RETURNS TABLE
AS
RETURN
( SELECT rtrim(LCTN_ADDR1) + ' ' + rtrim(LCTN_ADDR2) as LCTN_ADDR,
LCTN_CITY_NME,
lctn_st_cde = case lctn_st_cde
when 'PQ' then 'QC'
when 'NF' then 'NL'
else lctn_st_cde end,
LCTN_ZIP_CODE,
LCTN_COUNTRY_CDE FROM SH_LOCATION
where REG_NBR = @RegNumber and COMPANY_CDE = @CompanyCde
AND LCTN_TYP_CDE = @OrigCde and LCTN_TYP_NBR = @OrigNbr);
SERVER ADMINISTRATION
Installing SQL Server
Editions
SQL Server is available in Enterprise, Standard, Developer and Express editions. The Developer edition is free to use for non-production
environments. The Express edition is free to use for small-scale applications and supports a maximum of 4 cores and a database size of 10GB.
SQL Server Management Studio (SSMS) is a free tool that can be used to configure and manage SQL Server instances, deploy and monitor
data-tier components used by applications and build queries and scripts.
SQL Server Configuration Manager is a tool to manage the services associated with SQL Server, to configure the network protocols used and
to manage the network connectivity configuration from client computers. SQL Server Configuration Manager is a Microsoft Management Console
(MMC) snap-in that is available from the Start menu (All Programs -> SQL Server -> Configuration Tools) or can be added to any other MMC
display.
Default and named instances
SQL Server supports multiple instances of SQL Server on a single server or processor, but only one instance can be the default instance.
All others must be named instances. A computer can run multiple instances of SQL Server concurrently and each instance runs independently
of other instances.
While installing SQL Server, a name can be assigned to the instance enabling the installation of multiple instances on a single machine.
Each named instance is identified by the network name of the computer plus the instance name, in the format computer_name\instance_name.
If a name is not specified, a default instance is created which can be accessed by just using the computer_name. The default instance name
is MSSQLSERVER and a client is not required to specify the name of the instance to make a connection.
Default instances of SQL Server use TCP port 1433 by default. Named instances, however, dynamically assign an unused TCP port number the
first time the instance is started. The SQL Server Browser service listens on UDP port 1434 and tells connecting applications which TCP
port a named instance is listening on.
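For example, connecting with sqlcmd (server and instance names are illustrative) - a default instance needs only the computer name, a
named instance needs computer_name\instance_name, and a TCP port can be given explicitly to bypass the Browser service
sqlcmd -S MYSERVER
sqlcmd -S MYSERVER\SQLEXPRESS
sqlcmd -S tcp:MYSERVER,1433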
Managing services
Database Engine service - the executable process that is the SQL Server Database Engine. The Database Engine can be the default instance
(limit one per computer) or can be one of many named instances. The default instance is listed as SQL Server (MSSQLSERVER). Named instances
are listed as SQL Server (instance_name) and SQL Server Express is installed as SQL Server (SQLEXPRESS) by default.
SQL Server Agent service - a Windows service that executes scheduled administrative tasks called jobs and alerts.
SQL Server Browser service - a Windows service that listens for incoming requests for SQL Server resources and provides clients information
about SQL Server instances installed on the computer. A single instance of the SQL Server Browser service is used for all instances
installed on the computer.
SQL Server Configuration Manager can be used to determine which instances are installed on the computer, start, pause, resume or stop the
services (DB engine/Agent/Browser) and view/change service properties. Pausing prevents new users from connecting to the DB engine but
already connected users can continue to work until their connections are broken.
By default, only members of the local administrators group can start, stop, pause, resume or restart a service.
The Database Engine can be stopped by using the SHUTDOWN statement.
SHUTDOWN; -- wait for currently running statements to finish and then stop the DB Engine
SHUTDOWN WITH NOWAIT; -- stop immediately
Server configuration
SQL Server resources can be managed and optimized through configuration options by using SSMS or the sp_configure system stored procedure.
Commonly used options can be viewed/changed using SSMS by right-clicking the server in Object Explorer and choosing Properties. Some
options are accessible through the SQL Server configuration manager. All options are accessible using sp_configure.
sp_configure
The sp_configure system procedure displays or changes global configuration settings for the current server. All users can display options
and the ALTER SETTINGS server-level permission is required for changing options.
RECONFIGURE updates the currently configured value (the config_value column in the sp_configure result set) of a configuration option
changed with sp_configure. RECONFIGURE does not always update the currently running value (the run_value column) because some configuration
options require a server stop and restart to update the currently running value.
sp_configure; -- display all options
sp_configure 'show advanced options', 1;
RECONFIGURE;
sp_configure 'max server memory', 4096; -- set max server memory to 4 GB
RECONFIGURE;
Server memory configuration
SQL Server uses memory dynamically, but the amount of memory managed by the SQL Server Memory Manager for a SQL Server process used by an
instance of SQL Server can be configured by two options - min server memory (default 0 MB) and max server memory (default 2048TB).
SELECT physical_memory_in_use_kb/1024 'physical_memory_in_use_MB', -- currently allocated memory
large_page_allocations_kb/1024 'large_page_allocations_MB',
locked_page_allocations_kb/1024 'locked_page_allocations_MB',
virtual_address_space_reserved_kb/1024 'VAS_reserved_MB',
virtual_address_space_committed_kb/1024 'VAS_committed_MB',
virtual_address_space_available_kb/1024 'VAS_available_MB',
page_fault_count,
memory_utilization_percentage,
process_physical_memory_low,
process_virtual_memory_low
FROM sys.dm_os_process_memory;
Collations in SQL Server
Collations can be set at server, database, column, expression or identifier levels. The default server collation is determined during SQL Server setup based on the OS locale and it becomes the default collation of the system databases and all user databases.
SELECT CONVERT (varchar, SERVERPROPERTY('collation')); -- server collation
SELECT name, collation_name FROM sys.databases; -- database collation
SELECT name, collation_name FROM sys.columns WHERE name = N'colname'; -- column collation
SELECT * FROM fn_helpcollations(); -- list of supported collations
Collation options are specified by appending them to the collation name
_CS/_CI - case sensitive/insensitive
_AS/_AI - accent sensitive/insensitive
The COLLATE clause defines a collation of a database or table column or a collation cast operation when applied to character string
expression. Collation name can be either a Windows collation name or a SQL collation name. If not specified during database creation, the
database is assigned the default collation of the instance of SQL Server. If not specified during table creation, the column is assigned
the default collation of the database.
CREATE TABLE t1 (
LatinColumn nvarchar(15) COLLATE latin1_general_cs_as,
FrenchColumn nvarchar(15) COLLATE french_100_cs_as);
Connection handling
Network Protocols
SQL Server uses any of these protocols for listening to connections - Shared memory, Named pipes, TCP/IP, VIA.
The protocol must be enabled on both the client and server to work. The server can listen for requests on all enabled protocols at the
same time. Client computers can pick one or try the protocols in the order listed in SQL Server Configuration Manager.
Connecting to SQL Server over the Internet
Users can connect to an instance of SQL Server over the internet using SQL Query Analyzer or a client application based on ODBC or
DB-Library.
To share data over the Internet, the client and server must be connected to the Internet. In addition, TCP/IP or Multiprotocol
Net-Libraries must be used. If Multiprotocol Net-Library is used, it must be ensured that TCP/IP support is enabled. If the server is
registered with Domain Name System (DNS), it can be connected to using its registered name.
User connections
The user connections option specifies the maximum number of simultaneous user connections that are allowed on an instance of SQL Server.
The default is 0, which means that the maximum (32,767) user connections are allowed.
SELECT @@max_connections;
SELECT * FROM sys.dm_tran_locks WHERE resource_type = 'DATABASE' AND resource_database_id = DB_ID('dbname'); -- current connections
SELECT * FROM sys.dm_exec_requests WHERE database_id = DB_ID('dbname'); -- connections executing tasks
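The option itself is changed with sp_configure. A hedged sketch (500 is an arbitrary value; user connections is an advanced option and the
change takes effect only after the instance restarts):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'user connections', 500; -- 0 (the default) allows the maximum of 32,767
RECONFIGURE; -- run_value changes only after a restart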
DBCC Commands
T-SQL provides DBCC statements that act as database console commands for SQL Server. DBCC commands take input parameters and return
values. All DBCC command parameters can accept both Unicode and DBCS literals.
DBCC commands are grouped into four categories
Maintenance
DBCC SHRINKDATABASE - shrinks the size of all the data and log files in the specified database. DBCC SHRINKDATABASE operations can be
stopped at any point in the process and any completed work is kept.
DBCC SHRINKDATABASE ( dbname|dbid|0 [, target_percent] [, NOTRUNCATE|TRUNCATEONLY] );
0 - current database is used
target_percent - percentage of free space to be left in the database file after shrinking
NOTRUNCATE - applicable only to data files, the free space at the end of file is not released to the OS
TRUNCATEONLY - releases all free space at the end of the file to the operating system
DBCC SHRINKFILE - shrinks the size of the specified data or log file for the current database
DBCC DBREINDEX - rebuilds one or more indexes for a table
DBCC DBREINDEX ("HumanResources.Employee", PK_Employee_BusinessEntityID,80);
80 is the fillfactor, the percentage of space on each index page for storing data
DBCC UPDATEUSAGE - update usage statistics for database/table
DBCC UPDATEUSAGE (dbname) WITH NO_INFOMSGS;
Validation
DBCC CHECKDB - Checks the logical and physical integrity of all the objects in the specified database.
DBCC CHECKDB('dbname');
DBCC CHECKDB ('dbname', REPAIR_ALLOW_DATA_LOSS | REPAIR_FAST | REPAIR_REBUILD); -- repair options require single-user mode (see the sketch after this list)
REPAIR_ALLOW_DATA_LOSS - tries to repair all reported errors. These repairs can cause some data loss
REPAIR_FAST - maintains syntax for backward compatibility only. No repair actions are performed
REPAIR_REBUILD - performs repairs that have no possibility of data loss
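A minimal sketch of a typical repair sequence - the repair options require the database to be in single-user mode (dbname is a placeholder):
ALTER DATABASE dbname SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CHECKDB ('dbname', REPAIR_REBUILD);
ALTER DATABASE dbname SET MULTI_USER;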
Informational
DBCC USEROPTIONS - returns the SET options active (set) for the current connection
DBCC USEROPTIONS
Miscellaneous
DBCC HELP - returns syntax information for the specified DBCC command. Requires membership in the sysadmin fixed server role
DBCC HELP ('CHECKDB');
DBCC HELP ('?'); -- returns all DBCC statements for which help is available
Shrinking the tempdb
The tempdb is a temporary workspace. Among other uses, SQL Server uses the tempdb for
Storage of explicitly created temporary tables
Worktables that hold intermediate results created during query processing and sorting
Materialized static cursors
Storage of version records when snapshot isolation levels or read-committed snapshot isolation levels are used
By default, the tempdb database is configured to autogrow as needed and may unexpectedly grow in time to a size larger than the desired
size. A simple restart of SQL Server resets the size of tempdb to its last configured size. The configured size is the last explicit size
that was set by using a file size changing operation such as ALTER DATABASE with the MODIFY FILE option or the DBCC SHRINKFILE or DBCC
SHRINKDATABASE statements.
The tempdb can be shrunk by using the DBCC SHRINKDATABASE or DBCC SHRINKFILE commands without restarting the server, or by restarting
SQL Server after reducing its configured size.
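For example (a minimal sketch, assuming the default logical name tempdev for the primary tempdb data file; target sizes are arbitrary):
USE tempdb;
DBCC SHRINKFILE (tempdev, 100); -- shrink the data file to 100 MB
DBCC SHRINKDATABASE (tempdb, 10); -- leave 10% free space in the database files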
System catalog views
Catalog views return information that is used by the SQL Server Database Engine and are the most general interface to the catalog metadata.
All user-available catalog metadata is exposed through catalog views.
Databases and files catalog views
sys.databases - has one row per database in the SQL Server instance
name - unique within an SQL Server instance
database_id - unique within an SQL Server instance
sys.database_files - per database view containing one row per file of the database
file_id - id of the file within the database
type - 0(rows), 1(log), 2(filestream)
name - logical name of the file
physical_name - OS file name
size - current size in 8-KB pages
Object catalog views
sys.columns - contains a row for each column of objects that have columns, such as tables and views
object_id - id of the object to which the column belongs
name - unique within the object
column_id - unique within the object
user_type_id - id of the type of the column
collation_name - collation for character-based columns
sys.indexes - contains a row per index or heap of a table, view, or table-valued function
object_id - id of the object to which the index belongs
name - unique within the object
index_id - unique within the object
type - 0(Heap), 1(Clustered), 2(Nonclustered), 3(XML), 4(Spatial)
sys.objects - has a row for each user-defined object created within a database
name - object name
object_id - unique within a database
schema_id - id of the schema that the object is contained in
parent_object_id - id of the object to which this object belongs (0 if not a child object)
type - U(User table), V(view), S(system table), IT(internal table), P(procedure)
create_date
modify_date
sys.partitions - contains a row for each partition. All tables and indexes contain at least one partition
partition_id - unique within a database
object_id - ID of the object to which this partition belongs
index_id - ID of the index within the object to which this partition belongs
hobt_id - ID of the data heap or B-tree that contains the rows for this partition
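sys.partitions also exposes a rows column, so these views can be joined for an approximate row count per table and index. A sketch:
SELECT o.name AS table_name, i.name AS index_name, p.partition_number, p.rows
FROM sys.partitions p
JOIN sys.objects o ON p.object_id = o.object_id
JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id
WHERE o.type = 'U' -- user tables only
ORDER BY o.name, i.index_id;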
Monitoring the Server
Dynamic management views
Dynamic management views and functions return server state information that can be used to monitor the health of a server instance,
diagnose problems and tune performance.
sys.dm_db_index_physical_stats - Fragmentation details for indexes
sys.dm_db_log_space_usage - Returns space usage information for the transaction log
sys.dm_exec_procedure_stats - Find when stored procs were executed
SELECT TOP 1000 b.name, a.* From sys.dm_exec_procedure_stats a, sys.objects b
where a.object_id = b.object_id and DB_NAME(a.database_id) = 'db1';
sys.dm_exec_requests - all requests running on the server
select top 100 session_id, start_time, status, command, sql_handle, total_elapsed_time/1000 'elapsed_time_seconds'
from sys.dm_exec_requests order by total_elapsed_time desc; -- long running queries
sys.dm_exec_connections - returns information about the connections established to the SQL Server instance
session_id - session associated with this connection
most_recent_sql_handle - sql handle of the last request executed on this connection
sys.dm_exec_sessions - returns one row per authenticated session on SQL Server. It shows information about all active user connections and
internal tasks
status - Running(currently running one or more requests), Sleeping(currently running no requests)
reads/writes - number of reads/writes performed by requests in this session
select * from sys.dm_exec_sessions;
sys.dm_exec_sql_text - Returns the text of the SQL batch identified by the specified sql_handle
select * from sys.dm_exec_sql_text (sql_handle);
Finding SQL text from sp_id
declare @spid int, @stmt_start int, @stmt_end int, @sql_handle binary(20);
set @spid = 59; -- fill this in
select top 1
@sql_handle = sql_handle,
@stmt_start = case stmt_start when 0 then 0 else stmt_start / 2 end,
@stmt_end = case stmt_end when -1 then -1 else stmt_end / 2 end
from master.dbo.sysprocesses
where spid = @spid
order by ecid;
SELECT
SUBSTRING( text,
COALESCE(NULLIF(@stmt_start, 0), 1),
CASE @stmt_end
WHEN -1 THEN DATALENGTH(text)
ELSE (@stmt_end - @stmt_start)
END
)
FROM ::fn_get_sql(@sql_handle);
sys.dm_os_process_memory - information about memory allocations and the process address space
select * FROM sys.dm_os_process_memory;
sys.dm_os_sys_info - returns information about resources available to and consumed by SQL Server
sys.dm_tran_locks - reports information about locks
Extended events
Extended events is a lightweight performance monitoring system that uses minimal performance resources and provides GUIs to create, modify,
display and analyze session data.
Extended events can synchronously generate event data (and asynchronously process that data) which provides a flexible solution for event
handling. Extended events also provides the following features:
a unified approach to handling events across the server system, while enabling users to isolate specific events for troubleshooting
purposes
integration with, and support for existing ETW tools
a fully configurable event handling mechanism that is based on T-SQL
the ability to dynamically monitor active processes with minimal effect on those processes
a default system health session that collects system data that can be used to help troubleshoot performance issues
A new event session can be created using SSMS by clicking Management -> Extended events -> Sessions and using the New Session dialog or
the New Session Wizard.
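Sessions can also be created in T-SQL with CREATE EVENT SESSION. A minimal sketch that captures statements running longer than one second
(the session name, threshold and file path are illustrative):
CREATE EVENT SESSION long_statements ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
ACTION (sqlserver.session_id, sqlserver.sql_text)
WHERE duration > 1000000) -- duration is reported in microseconds
ADD TARGET package0.event_file (SET filename = N'C:\temp\long_statements.xel');
ALTER EVENT SESSION long_statements ON SERVER STATE = START;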
The system_health session
The system_health session is an Extended events session that is included by default with SQL Server. This session starts automatically when
the Database Engine starts, runs without any noticeable performance effects and collects system data that can be used to help troubleshoot
performance issues.
The session collects information including
the sql_text and session_id for any sessions that encounter an error that has a severity >= 20 or a memory-related error
any deadlocks that are detected, including the deadlock graph
the callstack, sql_text and session_id for any sessions that have waited on latches for > 15 seconds or locks for > 30 seconds
The session uses the ring buffer target and event file target to store the data which can be viewed using the Extended events UI in SSMS
or using T-SQL.
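For example, the event file can be read with the sys.fn_xe_file_target_read_file function (assuming the default file location under the
instance's LOG directory):
SELECT CAST(event_data AS xml) AS event_xml
FROM sys.fn_xe_file_target_read_file('system_health*.xel', NULL, NULL, NULL);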
TRANSACTION/QUERY MANAGEMENT
Isolation levels
The SET TRANSACTION ISOLATION LEVEL statement controls the locking and row versioning behavior of T-SQL statements issued by a connection.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED/READ COMMITTED/REPEATABLE READ/SNAPSHOT/SERIALIZABLE;
READ UNCOMMITTED - Transactions do not issue shared locks to prevent other transactions from modifying data read by the current
transaction and are also not blocked by exclusive locks that would prevent reading rows that have been modified but not committed by other
transactions.
READ COMMITTED - this option is the SQL Server default and its behavior depends on the setting of the READ_COMMITTED_SNAPSHOT database
option:
If READ_COMMITTED_SNAPSHOT is set to OFF (the default), shared locks are used to prevent other transactions from modifying rows while
the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other
transactions until the other transaction is completed
If READ_COMMITTED_SNAPSHOT is set to ON, row versioning is used to present each statement with a transactionally consistent snapshot
of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions
REPEATABLE READ - shared locks are placed on all data read by the transaction and are held until the transaction completes, preventing
other transactions from modifying the rows. Other transactions can insert new rows that match the search conditions of statements issued
by the current transaction, which can result in phantom reads if a statement is retried
SNAPSHOT - the statements in a transaction get a snapshot of the committed data as it existed at the start of the transaction. Except when
a database is being recovered, SNAPSHOT transactions do not request locks when reading data and do not block other transactions from
writing data. Transactions writing data do not block SNAPSHOT transactions from reading data (see the sketch after this list)
SERIALIZABLE - the most restrictive of the isolation levels since it locks entire ranges of keys and holds them until the transaction
completes. Statements cannot read data that has been modified but not yet committed by other transactions and other transactions cannot
modify data read by the current transaction. Other transactions cannot insert new rows with key values that would fall in the range of
keys read by the current transaction
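A sketch of using SNAPSHOT isolation, which must first be allowed at the database level (db1 and table1 are placeholders):
ALTER DATABASE db1 SET ALLOW_SNAPSHOT_ISOLATION ON;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM dbo.table1; -- sees committed data as of the start of the transaction
COMMIT;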
Locking
SQL Server has multi-granular locking that allows different types of resources to be locked by a transaction. The Database Engine often
has to acquire locks at multiple levels of granularity to fully protect a resource and this group of locks is called a lock hierarchy.
For example, to fully protect a read of an index, SQL Server may have to acquire share locks on rows and intent share locks on the pages and table.
The following are the resources that SQL Server can lock.
Resource Description
RID A row identifier used to lock a single row within a heap
KEY A row lock within an index used to protect key ranges in serializable transactions
PAGE An 8KB page in a database, such as data or index pages
EXTENT A contiguous group of eight pages
HoBT A heap or B-tree. A lock protecting a B-tree (index) or the heap data pages in a table that does not have a clustered index
TABLE The entire table, including all data and indexes
FILE A database file
APPLICATION An application-specified resource
METADATA Metadata locks
ALLOCATION_UNIT An allocation unit
DATABASE The entire database
Locking modes
Shared(S) - used for read operations (SELECT). Other transactions cannot modify the data while S locks exist on the resource; the locks
are released as soon as the read operation completes, unless the transaction isolation level is set to repeatable read or higher
Update(U) - used on resources that can be updated. Prevents a common form of deadlock that occurs when multiple sessions are reading,
locking and potentially updating resources later. Only one transaction can obtain a U lock to a resource at a time. If a transaction
modifies a resource, the U lock is converted to an X lock
Exclusive(X) - used for DML operations such as INSERT, UPDATE and DELETE. Ensures that multiple updates cannot be made to the same
resource at the same time
Intent - placed on a higher-level resource before a lock is placed on a lower-level resource. Intent locks serve two purposes
prevent other transactions from modifying the higher-level resource in a way that would invalidate the lock at the lower level
improve the efficiency of SQL Server in detecting lock conflicts at the higher level of granularity
Bulk Update (BU) - used when bulk copying data into a table
Schema - used when an operation dependent on the schema of a table is executing
Schema modification locks - used when a DDL operation or a DML operation like truncation is executing on a table
Schema stability locks - used when queries are compiling
Key-range - protects the range of rows read by a query when using the serializable transaction isolation level
Locking information
sys.dm_tran_locks
Each row in sys.dm_tran_locks represents a currently active request to the lock manager for a lock that has been granted or is waiting to
be granted.
select * from sys.dm_tran_locks;
Columns:
resource_type - DATABASE, FILE, OBJECT, PAGE, KEY, EXTENT, RID, APPLICATION, METADATA, HOBT or ALLOCATION_UNIT
resource_subtype - a subtype of the resource_type
resource_database_id - id of the database
resource_associated_entity_id - can be an object ID, Hobt ID or an Allocation Unit ID depending on the resource type
resource_lock_partition - id of the lock partition for a partitioned lock resource, 0 for nonpartitioned resources
request_mode - for granted requests, this is the granted mode; for waiting requests, this is the mode being requested
request_type - the value is LOCK
request_status - GRANTED, CONVERT, WAIT, LOW_PRIORITY_CONVERT, LOW_PRIORITY_WAIT or ABORT_BLOCKERS
request_session_id - Session ID that currently owns this request
request_owner_type - entity type that owns the request(TRANSACTION, SESSION, SHARED_TRANSACTION_WORKSPACE, EXCLUSIVE_TRANSACTION_WORKSPACE)
-- display current lock requests
SELECT
dtl.request_session_id AS SPID,
DB_NAME(dtl.resource_database_id) AS DBName,
dtl.resource_type AS LockedResource,
CASE
WHEN dtl.resource_type = 'OBJECT' THEN o.object_id
WHEN dtl.resource_type = 'KEY' THEN i.object_id
END AS LockedObjId,
CASE
WHEN dtl.resource_type = 'OBJECT' THEN o.name
WHEN dtl.resource_type = 'KEY' THEN i.name
END AS LockedObjName,
dtl.request_mode AS LockType,
dtl.request_status AS LockStatus,
es.program_name AS ProgramName,
es.original_login_name AS OrigLoginName,
st.text AS SQLText
FROM sys.dm_tran_locks dtl
left join sys.partitions p ON dtl.resource_associated_entity_id = p.partition_id AND dtl.resource_type = 'KEY'
left join sys.indexes i on p.object_id = i.object_id and p.index_id = i.index_id
left join sys.objects o ON dtl.resource_associated_entity_id = o.object_id AND dtl.resource_type = 'OBJECT'
join sys.dm_exec_sessions es ON dtl.request_session_id = es.session_id
JOIN sys.dm_exec_connections dc ON dc.session_id = es.session_id
CROSS APPLY sys.dm_exec_sql_text(dc.most_recent_sql_handle) AS st
WHERE resource_database_id = db_id()
ORDER BY dtl.request_session_id;
Blocks and Deadlocks
A process is referred to as blocked when it is waiting for a resource to be released by another process.
Deadlocks can involve resources other than locks, e.g. worker threads, memory, parallel query execution and MARS (Multiple Active Result
Sets) resources. SQL Server performs periodic
deadlock detection through a lock monitor thread that periodically initiates a search through all of the tasks in an instance. When a
deadlock is detected, one of the threads is chosen as the deadlock victim and terminated.
Locking information can be viewed through - Activity Monitor, SQL Server Profiler (Locks event), sys.dm_tran_locks. Locked processes can be
killed using Activity Monitor.
Deadlock Graphs
SQL Server Profiler and SSMS use a deadlock wait-for graph to describe a deadlock. The deadlock wait-for graph contains process nodes,
resource nodes and edges representing the relationships between the processes and the resources.
Process node - a thread that performs a task, e.g. INSERT, UPDATE or DELETE
Resource node - a database object, e.g. a table, index or row
Edge - a relationship between a process and a resource. A request edge occurs when a process waits for a resource. An owner edge occurs
when a resource waits for a process. The lock mode (X, S etc.) is included in the edge description
Deadlock graphs can be retrieved from the system_health session in SSMS by clicking Management > Extended Events > Sessions > system_health
> package0.event_file and View Target Data
Query processing
The SQL Server Database Engine processes queries on local tables, partitioned tables and tables distributed across multiple servers.
The Database Engine can process T-SQL statements using two distinct processing modes
Row mode execution - used with traditional RDBMS tables, where data is stored in row format. When a query is executed and accesses data in row store tables, the execution tree operators and child operators read each required row, across all the columns specified in the table schema. From each row that is read, SQL Server then retrieves the columns that are required for the result set as referenced by a SELECT statement, JOIN predicate or filter predicate
Batch mode execution - used to process multiple rows together (hence the term batch). Each column within a batch is stored as a vector in a separate area of memory, so batch mode processing is vector-based. Batch mode processing also uses algorithms that are optimized for the multi-core CPUs and increased memory throughput that are found on modern hardware
Join operations
SQL Server employs three types of join operations
Nested Loops join - If one join input is small and the other input is fairly large and indexed on its join columns, an index nested
loops join is the fastest join operation because they require the least I/O and the fewest comparisons
Merge join - If the two join inputs are not small but are sorted on their join column, a merge join is the fastest join operation
Hash join - Hash joins can efficiently process large, unsorted, nonindexed inputs and are useful for intermediate results in complex
queries
Parallel Query Processing
SQL Server can perform a query or index operation in parallel by using several operating system worker threads and provides parallel
queries to optimize query execution and index operations in computers that have more than one CPU.
SQL Server automatically detects the best degree of parallelism for each instance of a parallel query execution or index DDL operation.
The max degree of parallelism (MAXDOP) option can be used to limit the number of processors to use in parallel plan execution. The MAXDOP
option can be overridden for individual query and index operations by specifying the MAXDOP query hint or MAXDOP index option.
Setting the option to 0 (default) enables SQL Server to use all available processors up to a maximum of 64 processors.
SELECT ProductID, OrderQty, SUM(LineTotal) AS Total FROM Sales.SalesOrderDetail
WHERE UnitPrice < $5.00
GROUP BY ProductID, OrderQty ORDER BY ProductID, OrderQty
OPTION (MAXDOP 2);
Distributed queries
Distributed queries access data from multiple heterogeneous data sources, which can be stored on either the same or different computers.
SQL Server supports distributed queries by using OLE DB, the Microsoft specification of an API for universal data access.
The OPENQUERY function executes a pass-through query on the specified linked server. OPENQUERY can also be referenced as the target table
of an INSERT, UPDATE or DELETE statement, subject to the capabilities of the OLE DB provider.
select * from openquery(linkedserver, 'select * from t1');
Linked servers
Linked servers enable the Database Engine to execute distributed, heterogeneous queries against OLE DB data sources like another instance
of SQL Server or another RDBMS. Linked servers can be configured by using SSMS or the sp_addlinkedserver procedure.
A linked server definition specifies the following
An OLE DB provider - a DLL that manages and interacts with a specific data source. OLE DB providers exist for a variety of files and
file formats including text files and spreadsheets
An OLE DB data source - identifies the specific database that can be accessed through OLE DB
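A minimal sp_addlinkedserver sketch (the server and database names are placeholders; sp_addlinkedsrvlogin maps local logins to the remote
server):
EXEC sp_addlinkedserver @server = N'LNK1', @srvproduct = N'',
@provider = N'SQLNCLI', @datasrc = N'remotehost\instance1';
EXEC sp_addlinkedsrvlogin @rmtsrvname = N'LNK1', @useself = 'TRUE'; -- connect using the caller's own credentials
SELECT * FROM LNK1.db1.dbo.t1; -- four-part name
SELECT * FROM OPENQUERY(LNK1, 'SELECT * FROM db1.dbo.t1'); -- pass-through query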
Query optimization
By default, the query optimizer updates statistics as necessary to improve the query plan. In some cases query performance can be improved
by using UPDATE STATISTICS or the stored procedure sp_updatestats to update statistics more frequently than the default updates. Updating
statistics ensures that queries compile with up-to-date statistics but it causes queries to recompile.
The query optimizer uses operators to build a query plan to create the result specified in the query or to perform the operation specified
in the DML statement. The query plan is a tree consisting of physical operators and can be viewed by using the SET SHOWPLAN statements,
the graphical execution plan options in SSMS or the SQL Server Profiler Showplan event classes.
set showplan_all on; -- returns execution information for each statement without executing it
set statistics profile on; -- displays profile information for the statement after the resultset
set statistics xml on; -- returns execution information in an XML document
Operators
Operators are classified as
Logical operators - describe the relational algebraic operation used to process a statement
Physical operators - implement the operation described by logical operators. Each physical operator is an object or routine that performs
an operation (access columns or rows from a table, index or view, perform calculations, aggregations, data integrity checks or joins).
Physical operators have costs associated with them.
The physical operators initialize, collect data, and close and can answer the following three method calls:
Init() - causes a physical operator to initialize itself and set up any required data structures. The physical operator may receive many
Init() calls, though typically receives only one
GetNext() - causes a physical operator to get the first, or subsequent row of data. The physical operator may receive zero or many
GetNext() calls
Close() - causes a physical operator to perform some clean-up operations and shut itself down. A physical operator only receives one Close()
call
The GetNext() method returns one row of data, and the number of times it is called appears as ActualRows in the Showplan output.
The ActualRebinds and ActualRewinds counts that appear in Showplan xml output refer to the number of times that the Init() method is
called. ActualRebinds = 1 and ActualRewinds = 0 unless the operator is on the inner side of a loop join, when the sum of the number of
rebinds and rewinds equals the number of rows processed on the outer side of the join. A rebind means that one or more of the correlated
parameters of the join changed and the inner side must be reevaluated. A rewind means that none of the parameters changed and the prior
inner result set may be reused.
List of operators
Clustered Index Scan - scans the clustered index specified in the argument column. When an optional WHERE predicate is present, only those
rows that satisfy the predicate are returned
Clustered Index Seek - uses the seeking ability of indexes to retrieve rows from a clustered index. The argument contains the name of the
clustered index being used and the SEEK predicate
Eager spool - the eager spool operator builds its spool file in an 'eager' manner i.e when the spool's parent operator asks for the first
row it consumes all rows from its input operator and stores them in a hidden temporary object stored in the tempdb database. If rebinding
is needed, the spooled data is discarded and the spool object is rebuilt by rescanning the input
Index Scan - scans the nonclustered index specified in the argument
Index Seek - uses the seeking ability of indexes to retrieve rows from a nonclustered index
Filter - scans the input, returning only those rows that satisfy the filter expression
Key lookup - a bookmark lookup on a table with a clustered index, always accompanied by the nested loops operator
Lazy Spool - the lazy spool operator builds its spool file in a 'lazy' manner i.e each time the spool's parent operator asks for a row,
it gets a row from its input operator and stores it in the spool in the tempdb database, rather than consuming all rows at once.
Merge Join - performs join operations. The argument contains a MERGE predicate if the operation is performing a one-to-many join, or a
MANY-TO-MANY MERGE predicate if performing a many-to-many join and a list of columns used to perform the operation.
Nested loops - nested loops joins perform a search on the inner table for each row of the outer table, typically using an index
Result(Select) - data returned at the end of the plan
Sort - sorts all incoming rows. The argument contains either a DISTINCT ORDER BY or an ORDER BY predicate with a list of the columns being
sorted
Table Scan - retrieves all rows from the table specified in the argument column
IMPORTING/EXPORTING DATA
SQL Server Integration Services (SSIS)
SSIS is a platform for building data integration and data transformation solutions. SSIS can be used to create business solutions by
loading/downloading files, loading data warehouses, cleansing and mining data and managing SQL Server objects and data. SSIS can extract
and transform data from a variety of sources such as XML data files, flat files and relational data sources and load the data into one or
more destinations.
SSIS includes a rich set of built-in tasks and transformations, graphical tools for building packages and the Catalog database to store,
run and manage packages.
SSIS can be installed by the SQL Server setup wizard by choosing it from the components list. SSIS is not provided with the Express
edition.
SSIS packages
A package is an organized collection of connections, control flow elements, data flow elements, event handlers, variables, parameters and
configurations that are assembled using the graphical design tools provided by SSIS or built programmatically. The completed package can
be saved to SQL Server, the SSIS Package Store or the file system.
Contents of a package
Tasks and containers (control flow)
A control flow consists of one or more tasks and containers that execute when the package runs. Precedence constraints are used to connect
the tasks and containers and control the order and define the conditions for running the next task or container in the package control
flow. A subset of tasks and containers can also be grouped and run repeatedly as a unit within the package control flow.
Data sources and destinations (data flow)
A data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data and the
paths that link sources, transformations and destinations. Before adding a data flow to a package, the package control flow must include a
Data Flow task. The Data Flow task is the executable within the SSIS package that creates, orders and runs the data flow. A separate
instance of the data flow engine is opened for each Data Flow task in a package.
Connection managers (connections)
A package typically includes at least one connection manager. A connection manager is a link between a package and a data source that
defines the connection string for accessing the data that the tasks, transformations and event handlers in the package use. SSIS includes
connection types for data sources such as text and XML files, relational databases and Analysis Services databases and projects.
Packages may also include additional objects like variables, event handlers, configurations, log providers and parameters.
Creating a package
In SQL Server Data Tools (SSDT), a new package can be created by using one of the following methods:
Using the package template included in SSIS
Using a custom template - to use custom packages as templates for new packages, they can be copied to the DataTransformationItems
folder
Copy an existing package - if a new package is created by copying an existing package or by using a custom package as a template, the
name and the GUID of the existing package are copied as well and should be updated to help differentiate it from the package from which it
was copied. The GUID can be regenerated in the ID property and value of the Name property can be updated by using the Properties window in
SSDT
Using a custom package that has been designated as a template
Running the SQL Server Import and Export Wizard - the Import and Export Wizard creates a complete package for a simple import or
export and configures the connections, source and destination and adds any data transformations that are required. The package can
optionally be saved to run later
The packages created in SSDT using SSIS Designer are saved to the file system and to save them to SQL Server or to the package store, a
copy of the package needs to be saved.
Creating a new Integration Services project
1. On the Windows Start menu, search for and select Visual Studio (SSDT)
2. In Visual Studio, select File > New > Project and in the New Project dialog box, expand the Business Intelligence node under Installed
and select Integration Services Project in the Templates pane
3. In the Name box, change the default name. To use a folder that already exists, clear the Create directory for solution check box
4. Accept the default location or browse to locate the folder to use. In the Project Location dialog box, select the folder and then
Select Folder and select OK
5. By default, an empty package titled Package.dtsx is created and added to the project under SSIS Packages
6. In Solution Explorer, right-click Package.dtsx and rename the default package
Creating a package in SSDT using the Package Template
1. In SSDT, open the Integration Services project in which the package needs to be created
2. In Solution Explorer, right-click the SSIS Packages folder and then click New SSIS Package
3. Optionally, add control flow, data flow tasks, and event handlers to the package
4. On the File menu, click Save Selected Items to save the new package
Import and Export Wizard
The SQL Server Import and Export Wizard uses SSIS to copy data from a source to a destination. The wizard creates an SSIS package in memory
while moving through the pages of the wizard and specifying options. At the end, the SSIS package can be saved in SQL Server itself or as
a file.
The wizard can be started
from the Start menu
from the command prompt by running DTSWizard.exe
from SSMS
from Visual Studio with SQL Server Data Tools (SSDT)
Bulk Import/Export
bcp utility
The bulk copy program utility (bcp) bulk copies data between an SQL Server instance and a data file in a user-specified format. It runs as
a Windows OS command. Prefer the integrated security option (-T) over a SQL Server login (-U/-P) since the user name and password are sent
as clear text with the latter.
bcp {table_name | view_name | "query"} {in data_file | out data_file | queryout data_file | format nul}
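Hedged examples (table, file and server names are placeholders; -c uses character format, -T a trusted connection, -S names the server):
bcp db1.dbo.t1 out C:\temp\t1.dat -c -T -S server1\instance1
bcp db1.dbo.t1 in C:\temp\t1.dat -c -T -S server1\instance1
bcp "SELECT name FROM db1.sys.objects" queryout C:\temp\objects.txt -c -T -S server1\instance1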
BULK INSERT statement
Imports a data file into a table.
--import from the second row
bulk insert tblImageMap from 'C:\imagemap.txt'
with (firstrow=2, fieldterminator='\t', rowterminator = '\n');
LOADING XML DATA
XML data can be transferred into SQL Server in several ways
If the data is in a text or image column in a SQL Server database, the table can be imported by using SSIS and changing the column
type to XML by using the ALTER TABLE statement
the data can be bulk copied from another SQL Server database by using bcp out and then bulk inserting the data by using bcp in
client-side programming can be used to retrieve the XML that is generated at the server with FOR XML and write the XML into an XML
column in a new table
SQL Server stores XML data in Unicode (UTF-16) and XML data retrieved from the server comes out in UTF-16 encoding. If a different
encoding is needed, the required conversion has to be performed.
Bulk loading XML data
XML data can be bulk loaded by using the bulk loading capabilities of SQL Server, such as bcp. OPENROWSET allows loading data into an XML
column from files.
Loading an XML file into a table
DECLARE @SQL nvarchar(1000), @ACES_XML XML;
SET @SQL = 'SELECT @a = CONVERT(xml, BulkColumn, 2) FROM OPENROWSET( BULK ''C:\temp\test.xml'', SINGLE_BLOB) AS TempXML; '
EXEC sp_Executesql @sql , N'@a XML OUTPUT',@ACES_XML OUTPUT
SELECT ref.value('(./PartType/@id)[1]','varchar(10)') parttypeID,
ref.value('(./MfrLabel)[1]','varchar(200)') MfrLabel,
ref.value('(./Part)[1]','varchar(20)') Part
into dbo.Specialty_friction_20120919 FROM @ACES_XML.nodes('parts/App') as R(ref);
Loading an XML file into SQL Server using SQLXML 4.0 Bulk load utility
using System;

namespace XMLLoadSQL
{
class XMLLoadSQL
{
private static Boolean FileValid;
static String sAppPath = "C:\\SQLServer\\XMLBulkLoad";
static void Main(string[] args)
{
SQLXMLBULKLOADLib.SQLXMLBulkLoad objBL = new SQLXMLBULKLOADLib.SQLXMLBulkLoad();
objBL.ConnectionString = "provider=SQLOLEDB;data source=CSCINDAE747976;database=AEDB;integrated security=SSPI";
objBL.ErrorLogFile = "c:\\sqlserver\\XMLBulkLoad\\error.log";
//Validate the data file prior to bulkload
String sOutput = ValidateFile("Nouns.xml", "", "NounsSchema.xml");
Console.WriteLine (sOutput);
if (FileValid == true)
{
// Check constraints and initiate transaction (if needed)
// objBL.CheckConstraints = True
// objBL.Transaction=True
//Execute XML bulkload using file.
objBL.Execute(sAppPath + "\\" + "NounsSchema.xml", sAppPath + "\\" + "Nouns.xml");
objBL= null;
}
}
private static String ValidateFile(String strXmlFile,String strUrn, String strXsdFile)
{
// Create a schema cache and add SampleSchema.xml to it.
MSXML2.XMLSchemaCache60 xs = new MSXML2.XMLSchemaCache60();
xs.add (strUrn, sAppPath + "\\" + strXsdFile);
// Create an XML DOMDocument object.
MSXML2.DOMDocument60 xd = new MSXML2.DOMDocument60();
// Assign the schema cache to the DOM document, schemas collection.
xd.schemas = xs;
// Load XML document as DOM document.
xd.async = false;
xd.load (sAppPath + "\\" + strXmlFile);
// Return validation results in message to the user.
if (xd.parseError.errorCode != 0)
{FileValid = false ;
return ("Validation failed on " + strXmlFile + "=====================" +
"Reason: " + xd.parseError.reason + "Source: " + xd.parseError.srcText +
"Line: " + xd.parseError.line) ; }
else
{FileValid = true;
return ("Validation succeeded for " +
strXmlFile + "======================" + "Contents to be bulkloaded" ); }
}
}
}
XML file:
<ROOT>
<Nouns>
<Category>places</Category>
<arabic>tareeq</arabic>
<english>road</english>
</Nouns>
<Nouns>
<Category>places</Category>
<arabic>burj</arabic>
<english>tower</english>
</Nouns>
<Nouns>
<Category>places</Category>
<arabic>bahr</arabic>
<english>sea</english>
</Nouns>
</ROOT>
XML Schema file:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
<xsd:element name="ROOT" sql:is-constant="1" >
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Nouns" sql:relation="NounsXML" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Category" type="xsd:string" />
<xsd:element name="arabic" type="xsd:string" />
<xsd:element name="english" type="xsd:string" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
BACKUP/RECOVERY
Three recovery models:
Simple - No transaction log backups. Inactive portions of the log are freed after automatic checkpoints. Suitable for small, read-only
databases, where loss of updates is not consequential. Preferable to run differential backups between full backups.
Full - Log is backed up. Data is not removed from the log till it is backed up. Preferable to run frequent log backups and also run
differential backups between full backups so number of log backups to restore is minimal.
Bulk-logged - Reduces log space usage. Enhances performance during large-scale bulk operations.
Creating backup with timestamp
declare @bk_file varchar(100);
set @bk_file = 'R:\SQLBackup\dbname_' + convert(varchar(50),GetDate(),112)
+ '_' + replace(convert(varchar(50),GetDate(),108),':','') + '.bak';
BACKUP DATABASE [dbname] TO DISK = @bk_file WITH NOFORMAT, NOINIT, NAME = N'DBNAME-Full Database Backup', SKIP, NOREWIND, NOUNLOAD,
COMPRESSION, STATS = 10;
Restoring a database from backup
RESTORE VERIFYONLY FROM DISK='C:\backup\db1.bak'; -- verifies if .bak file is valid
RESTORE HEADERONLY FROM DISK='C:\backup\db1.bak'; -- get info on .bak file contents
RESTORE FILELISTONLY FROM DISK='C:\backup\db1.bak'; -- list of db and log files contained in the backup
RESTORE DATABASE db1 FROM DISK='C:\backup\db1.bak';
RESTORE DATABASE db1 FROM DISK='C:\Backup\db1.bak' WITH MOVE N'db1' TO N'C:\vincentk\db1.mdf',
MOVE N'db1_log' TO N'C:\vincentk\db1_log.ldf'; -- move data/log files to a new location
Full Database backups
A full database backup backs up the whole database, including part of the transaction log so that the full database can be restored from
the backup. Full database backups represent the database at the time the backup finished.
BACKUP DATABASE db1 TO DISK = 'Z:\Backups\db1.bak' WITH FORMAT, MEDIANAME = 'Z_Backups', NAME = 'Full Backup of db1';
Differential backups
A differential backup is based on a previous full data backup (base) and captures only the state of any extents (collections of eight
physically contiguous pages) that have changed. Before restoring a differential backup, its base must be restored.
Differential backups are especially useful if a subset of a database is modified more frequently than the rest of the database. Under the
full recovery model, using differential backups can reduce the number of log backups that need to be restored.
BACKUP DATABASE db1 TO DISK = 'Z:\Backups\db1.bak' WITH DIFFERENTIAL;
Transaction log backups
Log backups are relevant only for full/bulk-logged recovery models.
A continuous sequence of transaction log backups is called a log chain. It starts with a full backup and does not end till a new backup
set is started. If one of the backups in the log chain is missing, the restore will fail. A tail-log backup is taken just before the
restore and ensures the log chain is intact; it contains all log records taken since the last log backup.
BACKUP LOG db1 TO DISK = 'Z:\Backups\db1.bak';
A typical backup strategy involves creating a full database backup occasionally (weekly) and optionally, creating a series of differential
backups at a shorter interval (daily) and independent of the database backups, backing up the transaction log at frequent intervals.
BACKUP DATABASE AEDB TO DISK = 'C:\Vincent\AEDB.bak' WITH FORMAT, INIT -- Full backup
BACKUP DATABASE AEDB TO DISK = 'C:\Vincent\AEDB.bak' WITH DIFFERENTIAL, NOFORMAT, NOINIT, -- Differential backup
NAME = 'AEDB Diff bkup', COMPRESSION
BACKUP LOG AEDB TO DISK = N'C:\Vincent\AEDB.bak' WITH NOFORMAT, NOINIT, NAME = N'AEDB Trans Log Backup', -- Log backup
SKIP, NOREWIND, NOUNLOAD, STATS = 10
BACKUP LOG AEDB TO DISK = N'C:\Vincent\AEDB.bak' WITH NO_TRUNCATE , NOFORMAT, NOINIT, -- Tail log backup
NAME = N'AEDB Trans Log Tail Backup', SKIP, NOREWIND, NOUNLOAD, NORECOVERY, STATS = 10
RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak' ---> Restore of full bkup
WITH FILE = 1, NORECOVERY -- NORECOVERY puts the db in Restoring mode
RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak' ---> Restore of last diff bkup
WITH FILE = 6, NORECOVERY
RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak' ---> Restore of log bkup (file 7)
WITH FILE = 7, NORECOVERY
RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak' ---> Restore of log bkup (file 8)
WITH FILE = 8, NORECOVERY
RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak' ---> Restore of Tail log bkup
WITH FILE = 9, RECOVERY
Database checkpoints
A checkpoint creates a known good point from which the SQL Server Database Engine can start applying changes contained in the log during
recovery after a crash, by writing the dirty pages and transaction log information from memory to disk and recording information about the
transaction log.
The Database Engine supports several types of checkpoints
automatic - Issued automatically in the background to meet the upper time limit suggested by the recovery interval server configuration
option
EXEC sp_configure 'recovery interval', minutes -- value in minutes
indirect - Issued in the background to meet a user-specified target recovery time for a given database
ALTER DATABASE … SET TARGET_RECOVERY_TIME = target_recovery_time { SECONDS | MINUTES }
manual - issued when a T-SQL CHECKPOINT command is executed (see the example after this list)
internal - issued by server operations such as backup and database-snapshot creation to guarantee that disk images match the current
state of the log
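Example of a manual checkpoint (the optional argument is the requested checkpoint_duration in seconds):
CHECKPOINT; -- checkpoint the current database
CHECKPOINT 10; -- request completion within about 10 seconds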
Troubleshooting/Recovery
Error codes
1205: Transaction (Process ID xx) was deadlocked on lock | communication buffer resources with another process - possible intra-query
parallel deadlock (MAXDOP 1 may be considered)
3156: File 'Db' cannot be restored to 'D:\..\DATA\Db.mdf'. Use WITH MOVE to identify a valid location for the file - The file path doesn't
match the original machine
3201: Cannot open backup device 'C:\db1.bak'. Operating system error 5 (Access is denied) - SQL Server service account needs permission
on .bak file
3241: The media family on device 'C:\db1.bak' is incorrectly formed. SQL Server cannot process this media family - Attempting to restore a
backup created on a later version of SQL Server to an earlier version
4866: Bulk Insert fails. Column is too long in the data file for row x, column y - Files generated in Unix have a LF at EOL while those
in Windows have a CR and LF
9002: The transaction log for the database is full - To find what is preventing log truncation, use the log_reuse_wait and
log_reuse_wait_desc columns of the sys.databases catalog view (see the query below)
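For example:
SELECT name, log_reuse_wait, log_reuse_wait_desc FROM sys.databases;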
USER MANAGEMENT/SECURITY
Authentication
SQL Server supports two authentication modes.
Windows Authentication - referred to as integrated security because it is integrated with Windows. Users are authenticated using their
local or domain windows user account. Users who are already logged onto Windows do not have to log on separately to SQL Server and SQL
Server validates the login info through the Windows token which has been confirmed valid by the OS. This is known as a trusted connection.
Mixed mode authentication - allows connections using either Windows authentication or a SQL Server login name and password maintained
within SQL Server.
Windows authentication uses a series of encrypted messages to authenticate users. When SQL Server logins are used, login names and
encrypted passwords are passed across the network, which makes them less secure.
Access control
Principals
Principals are entities to whom permissions are granted for SQL Server resources (server, database, database objects etc.).
Windows level - Windows group, Windows domain login, Windows local login
SQL Server level - fixed server role, SQL Server login
Database level - fixed database role, database user, application role
sa login
The SQL Server sa login is a server-level principal. By default, it is created when an instance is installed and its default database is
master. The sa login is a member of the sysadmin fixed server role and has all permissions on the server and cannot be limited. The sa
login cannot be dropped, but it can be disabled so that no one can use it.
dbo user and schema
The dbo user is a special user principal in each database. All SQL Server administrators, members of the sysadmin fixed server role, sa
login and owners of the database enter databases as the dbo user. The dbo user has all permissions in the database and cannot be limited
or dropped. dbo stands for database owner, but the dbo user account is not the same as the db_owner fixed database role and the db_owner
fixed database role is not the same as the user account recorded as the database owner.
The dbo user owns the dbo schema. The dbo schema is the default schema for all users, unless some other schema is specified. The dbo schema
cannot be dropped.
Fixed server roles
Server-level roles are server-wide in their permissions scope. Fixed server roles are provided for convenience and backward compatibility.
Three main roles - sysadmin, securityadmin, serveradmin.
sysadmin - members of the sysadmin fixed server role can perform any activity in the server. Before SQL Server 2008 R2, members of the
local Windows built-in Administrators group were automatically added to the sysadmin fixed server role.
serveradmin - members of the serveradmin role can change server-wide configuration options and shut down the server
securityadmin - members manage logins and their properties. They can GRANT, DENY and REVOKE server-level permissions and also GRANT, DENY
and REVOKE database-level permissions if they have access to a database and can reset passwords for SQL Server logins.
Fixed database roles
They are security principals that are used to give access to a specific database. The two main roles are db_owner and db_securityadmin.
db_owner - members of the db_owner fixed database role can perform all configuration and maintenance activities on the database and can
also drop the database.
db_securityadmin - members can modify role membership and manage permissions
public server and database role
Every login belongs to the public fixed server role and every database user belongs to the public database role. When a login or user has
not been granted or denied specific permissions on a securable, the login or user inherits the permissions granted to public on that
securable. Database users cannot be removed from the public role.
The public roles cannot be dropped but permissions can be revoked from them.
Application roles
An application role is a database principal that enables an application to run with its own, user-like permissions and are used to enable
access to specific data to only those users who connect through a particular application. Unlike database roles, application roles contain
no members and are inactive by default.
Application roles work with both authentication modes and are enabled by using sp_setapprole, which requires a password. Since they are a
database-level principal, they can access other databases only through permissions granted in those databases to guest. Therefore, any
database in which guest has been disabled will be inaccessible to application roles in other databases.
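A minimal sketch (the role name, password and table are illustrative):
CREATE APPLICATION ROLE approle1 WITH PASSWORD = 'AppR0leP@ss!', DEFAULT_SCHEMA = dbo;
GRANT SELECT ON OBJECT::dbo.table1 TO approle1;
-- from the application's connection:
EXEC sp_setapprole 'approle1', 'AppR0leP@ss!'; -- the connection now runs with the role's permissions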
Logins
A login is a security principal and permissions can be granted to logins. The scope of a login is the whole Database Engine and to connect
to a specific database on the instance of SQL Server, a login must be mapped to a database user.
Database users
Access to objects within a database is controlled by database users. A database user can be associated with a SQL Server login, be based
on a Windows user, or just have a password with no login (a contained database user).
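Hedged examples of each variant (names and passwords are placeholders; a user with a password but no login requires a contained database):
CREATE LOGIN login1 WITH PASSWORD = 'StrongP@ssw0rd1'; -- SQL Server login
CREATE USER user1 FOR LOGIN login1; -- database user mapped to the login
CREATE USER [DOMAIN\someone]; -- user for a Windows login of the same name
CREATE USER user2 WITH PASSWORD = 'StrongP@ssw0rd2'; -- contained database user with no login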
Each database includes a guest user whose permissions are inherited by users who have access to the database, but who do not have a user
account in the database. The guest user cannot be dropped, but it can be disabled by revoking its CONNECT permission within any database
other than master or tempdb.
REVOKE CONNECT FROM GUEST;
Securables
Securables are the resources to which the SQL Server Database Engine authorization system regulates access. Access to securables is
controlled by granting or denying permissions to principals (logins and users) or by adding logins and users to roles which have access.
The three main scopes of securables are:
server
database
schema - container that consists of a collection of securables. Each schema can be owned by any database principal. When a user is created,
a default schema can be specified, else it will be dbo by default.
Every instance has a number of fixed server roles and every database has a number of fixed database roles. SQL Server logins and Windows
logins/groups can be added to the fixed server roles to grant them permissions at the server level scope.
Schemas
Schemas are collections of database securables. Default schemas created in a database are dbo, guest, information_schema and sys.
Permissions
Every securable has associated permissions that can be granted to a principal. Permissions in the Database Engine are managed at the
server level assigned to logins and server roles and at the database level assigned to database users and database roles.
Server level - ALTER SETTINGS, ALTER ANY LOGIN, ALTER ANY DATABASE, ALTER TRACE, AUTHENTICATE SERVER, CONTROL SERVER, CREATE ENDPOINT,
SHUTDOWN
Database level - ALTER, ALTER ANY USER, AUTHENTICATE, BACKUP DATABASE, BACKUP LOG, CONTROL, DELETE, EXECUTE, INSERT, SELECT, SHOWPLAN
Permissions can be manipulated by the T-SQL statements - GRANT, DENY and REVOKE. DENY prevents a principal from inheriting a permission
through its group or role memberships.
GRANT INSERT on object::dbo.table1 to user1;
REVOKE DELETE on object::dbo.table1 from user2;
DENY EXECUTE on object::dbo.usp1 from user3;
Information about permissions is visible in the sys.server_permissions and sys.database_permissions catalog views.
SELECT * FROM fn_builtin_permissions(default); -- all permissions
SELECT * FROM fn_builtin_permissions('schema'); -- for a securable
SELECT * FROM fn_my_permissions('table1', 'object'); -- permissions granted to the executing principal on an object
SELECT * FROM sys.database_permissions WHERE major_id = OBJECT_ID('table1'); -- permissions for an object
Services Security
Various SQL Server services can be run under different security accounts to isolate each service. Additionally, each service should use a
local or domain Windows account that has only the privileges required to run that service.
Services can use one of the accounts below:
local user account
local service account
local system account
network service account
domain user account - When using this, avoid using an Administrators account and assign fewest privileges possible.
The service startup account can be changed through the SQL Server Configuration Manager.
Encryption
SQL Server encrypts data with a hierarchical encryption and key management infrastructure. Each layer encrypts the layer below it by using
a combination of certificates, asymmetric keys and symmetric keys. Asymmetric keys and symmetric keys can be stored outside of SQL Server
in an Extensible Key Management (EKM) module.
Encryption mechanisms
T-SQL functions - Individual items can be encrypted as they are inserted or updated using T-SQL functions
Symmetric key encryption - A symmetric key is one key that is used for both encryption and decryption
Asymmetric key encryption - An asymmetric key is made up of a private key and the corresponding public key. Each key can decrypt data
encrypted by the other
Digital Certificates - A public key certificate, usually just called a certificate, is a digitally-signed statement that binds the
value of a public key to the identity of the person, device or service that holds the corresponding private key
Transparent Data encryption (TDE) - TDE is a special case of encryption using a symmetric key. TDE encrypts an entire database using
that symmetric key called the database encryption key. The database encryption key is protected by other keys or certificates which are
protected either by the database master key or by an asymmetric key stored in an EKM module
SQL Server allows to choose from among several encryption algorithms including DES, Triple DES, TRIPLE_DES_3KEY, RC2, RC4, 128-bit RC4,
DESX, 128-bit AES, 192-bit AES, and 256-bit AES.
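A minimal TDE sketch following the key hierarchy above (the database name, certificate name and password are placeholders):
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'StrongP@ssw0rd!';
CREATE CERTIFICATE TDECert WITH SUBJECT = 'TDE certificate';
USE db1;
CREATE DATABASE ENCRYPTION KEY WITH ALGORITHM = AES_256
ENCRYPTION BY SERVER CERTIFICATE TDECert;
ALTER DATABASE db1 SET ENCRYPTION ON;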
HIGH AVAILABILITY/SCALABILITY
Failover Clustering - Failover clusters are made up of one or more server nodes and two or more shared disks. The application is installed
as a cluster resource group and hosted on one node at a time.
Log Shipping - Allows automatically sending transaction log backups from a primary database on a primary server instance to one or more
secondary databases on separate secondary server instances. The transaction log backups are applied to each of the secondary databases
individually. An optional third server instance, known as the monitor server, records the history and status of backup and restore
operations and, optionally, raises alerts.
Database Mirroring - Two databases (principal and mirror) exist on two different SQL Server instances, both with the same data. A mirror
is a dynamically updated copy of a source database that resides on a mirror server. Works only on DBs using the full recovery model.
Databases can be in one of these states - Synchronizing, Synchronized, Suspended, Pending Failover and Disconnected. The principal and
mirror DBs will be of the same state, except for Pending Failover.
Mirroring works at the level of the physical log record and involves redoing every insert, update and delete operation performed on the
principal DB on the mirror DB by sending the log records. SQL Server can automatically repair corrupt pages using database mirroring.
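A minimal setup sketch (host names and port are placeholders; each partner needs a database mirroring endpoint, and the mirror database
must first be restored WITH NORECOVERY):
-- on each partner instance
CREATE ENDPOINT Mirroring STATE = STARTED
AS TCP (LISTENER_PORT = 5022)
FOR DATABASE_MIRRORING (ROLE = PARTNER);
-- on the mirror server (after restoring db1 WITH NORECOVERY)
ALTER DATABASE db1 SET PARTNER = 'TCP://principal_host:5022';
-- on the principal server
ALTER DATABASE db1 SET PARTNER = 'TCP://mirror_host:5022';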
Always On Availability Groups
Always On availability groups (AG) is a high-availability and disaster-recovery solution that provides an enterprise-level alternative
to database mirroring. Introduced in SQL Server 2012, AG maximizes the availability of a set of user databases for an enterprise by
supporting a failover environment for a set of databases, known as availability databases, that fail over together. An AG supports a set
of read-write primary databases and one to eight sets of corresponding secondary databases. Optionally, secondary databases can be made
available for read-only access and/or some backup operations.
An AG fails over at the level of an availability replica. Failovers are not caused by database issues such as a database becoming suspect
due to a loss of a data file, deletion of a database, or corruption of a transaction log.
Features of AG
Supports up to nine availability replicas. An availability replica is an instantiation of an availability group that is hosted by a
specific instance of SQL Server and maintains a local copy of each availability database that belongs to the availability group. Each
availability group supports one primary replica and up to eight secondary replicas
Supports asynchronous-commit and synchronous-commit availability modes
Supports an availability group listener for each AG. An AG listener is a server name to which clients can connect in order to access
a database in a primary or secondary replica of an AG. AG listeners direct incoming connections to the primary replica or to a read-only
secondary replica. The listener provides fast application failover after an availability group fails over
Supports automatic page repair for protection against page corruption
Provides an integrated set of tools to simplify deployment and management including T-SQL DDL statements and SSMS tools for creating
and managing availability groups
Deploying AG requires a Windows Server Failover Clustering (WSFC) cluster. To be enabled for AG, an instance of SQL Server must reside on
a WSFC node, and the WSFC cluster and node must be online. Furthermore, each availability replica of a given AG must reside on a different
node of the same WSFC cluster. The only exception is that while being migrated to another WSFC cluster, an AG can temporarily straddle two
clusters.
AG relies on the WSFC cluster to monitor and manage the current roles of the availability replicas that belong to a given AG and to
determine how a failover event affects the availability replicas. A WSFC resource group is created for every AG created. The WSFC cluster
monitors this resource group to evaluate the health of the primary replica.
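A hypothetical two-replica AG with a listener might be created as follows (replica, database and network names are illustrative, and the
database mirroring endpoints are assumed to exist):

-- On the primary replica instance
CREATE AVAILABILITY GROUP SalesAG
FOR DATABASE SalesDb
REPLICA ON
    N'NODE1' WITH (ENDPOINT_URL = N'TCP://node1.corp.local:5022',
                   AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, FAILOVER_MODE = AUTOMATIC),
    N'NODE2' WITH (ENDPOINT_URL = N'TCP://node2.corp.local:5022',
                   AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, FAILOVER_MODE = AUTOMATIC);
ALTER AVAILABILITY GROUP SalesAG
ADD LISTENER N'SalesAGL' (WITH IP ((N'10.0.0.50', N'255.255.255.0')), PORT = 1433);

-- On the secondary replica instance: join the replica, then the database
ALTER AVAILABILITY GROUP SalesAG JOIN;
ALTER DATABASE SalesDb SET HADR AVAILABILITY GROUP = SalesAG;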
Always On Failover Cluster Instances
Always On Failover Cluster Instances (FCI) leverages WSFC functionality to provide local high availability through redundancy at the
server-instance level. An FCI is a single instance of SQL Server that is installed across WSFC nodes and possibly across multiple subnets.
On the network, an FCI appears to be an instance of SQL Server running on a single computer, but the FCI provides failover from one WSFC
node to another if the current node becomes unavailable.
An FCI runs in a WSFC resource group with one or more WSFC nodes. When the FCI starts up, one of the nodes assumes ownership of the
resource group and brings its SQL Server instance online. The resources owned by this node include -
Network name
IP address
Shared disks
SQL Server Database Engine service
SQL Server Agent service
SQL Server Analysis Services service, if installed
One file share resource, if the FILESTREAM feature is installed
At any time, only the resource group owner (and no other node in the FCI) is running its respective SQL Server services in the resource
group. When a failover occurs, either automatic or planned, the following sequence of events happens
Unless a hardware or system failure occurs, all dirty pages in the buffer cache are written to disk
All respective SQL Server services in the resource group are stopped on the active node
The resource group ownership is transferred to another node in the FCI
The new resource group owner starts its SQL Server services
Client application connection requests are automatically directed to the new active node using the same virtual network name (VNN)
The FCI is online as long as its underlying WSFC cluster is in good quorum health (the majority of the quorum WSFC nodes are available as
automatic failover targets). When the WSFC cluster loses its quorum, whether due to hardware, software, network failure, or improper
quorum configuration, the entire WSFC cluster, along with the FCI, is brought offline. Manual intervention is then required in this
unplanned failover scenario to reestablish quorum in the remaining available nodes in order to bring the WSFC cluster and FCI back online.
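On a clustered instance, the node layout and current owner can be inspected from T-SQL; a small sketch:

-- Lists the WSFC nodes of the FCI and which one currently owns the resource group
SELECT NodeName, status_description, is_current_owner FROM sys.dm_os_cluster_nodes;
-- Confirms the instance is clustered and shows the active node's computer name
SELECT SERVERPROPERTY('IsClustered') AS IsClustered,
       SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS ActiveNode;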
Elements of FCI
An FCI consists of a set of physical servers (nodes) with similar hardware configurations and identical software configurations, including
OS version and patch level, and SQL Server version, patch level, components and instance name. Identical software configuration is
necessary to ensure that the FCI remains fully functional as it fails over between the nodes.
WSFC Resource Group - FCI runs in a WSFC resource group. Each node in the resource group maintains a synchronized copy of the
configuration settings and check-pointed registry keys to ensure full functionality of the FCI after a failover, and only one of the
nodes in the cluster owns the resource group at a time (the active node). The WSFC service manages the server cluster, quorum
configuration, failover policy, and failover operations, as well as the VNN and virtual IP addresses for the FCI. In case of a
hardware/OS/application/service failure or a planned upgrade, the resource group ownership is moved to another node in the FCI. The
number of nodes that are supported in a WSFC resource group depends on the SQL Server edition. Also, the same WSFC cluster can run multiple
FCIs (multiple resource groups), depending on hardware capacity, such as CPUs, memory, and number of disks
SQL Server Binaries - The product binaries are installed locally on each node of the FCI, a process similar to a SQL Server stand-alone
installation. However, during startup, the services are not started automatically but are managed by WSFC
Storage - Unlike an AG, an FCI must use storage shared between all nodes of the FCI for database and log storage. The shared
storage can be in the form of WSFC cluster disks, disks on a SAN, Storage Spaces Direct (S2D), or SMB file shares. This way, all
nodes in the FCI have the same view of instance data whenever a failover occurs. This does mean that the shared storage has the potential
of being the single point of failure, and FCI depends on the underlying storage solution to ensure data protection.
Network Name - The VNN for the FCI provides a unified connection point for the FCI. This allows applications to connect to the VNN
without the need to know the currently active node. When a failover occurs, the VNN is registered to the new active node after it starts,
transparent to the client or application connecting to SQL Server
Virtual IPs - In the case of a multi-subnet FCI, a virtual IP address is assigned to each subnet in the FCI. During a failover, the
VNN on the DNS server is updated to point to the virtual IP address for the respective subnet. Applications and clients can then connect
to the FCI using the same VNN after a multi-subnet failover
FCIs and Availability Groups
A second layer of failover can be set up at the server-instance level by implementing an FCI together with the WSFC cluster. An
availability replica can be hosted by either a standalone instance of SQL Server or an FCI. Only one FCI partner can host a replica for a
given availability group. When an availability replica is running on an FCI, the possible owners list for the availability group will
contain only the active FCI node.
AG does not depend on any form of shared storage. However, if FCI is used to host one or more availability replicas, each of those FCIs
will require shared storage as per standard FCI installation
Comparison of FCI and AG
Regardless of the number of nodes in the FCI, an entire FCI hosts a single replica within an availability group. The following table
describes the distinctions in concepts between nodes in an FCI and replicas within an availability group.
                                     Nodes within an FCI                        Replicas within an availability group
                                     -------------------                        -------------------------------------
Uses WSFC cluster                    Yes                                        Yes
Protection level                     Instance                                   Database
Storage type                         Shared                                     Non-shared
Storage solutions                    Direct attached, SAN, mount points, SMB    Depends on node type
Readable secondaries                 No                                         Yes
Applicable failover policy settings  WSFC quorum, FCI-specific and              WSFC quorum and availability group
                                     availability group settings                settings
Failed-over resources                Server, instance, and database             Database only
TOOLS/UTILITIES
SQL Server Reporting Services (SSRS)
SSRS provides a set of tools and services that create, deploy and manage mobile and paginated reports.
SSRS installation involves server components that store report items, render reports and process subscriptions and other report
services. A report server database needs to be created on a SQL Server instance to complete the initial configuration of report services.
A Windows service, named SQLServerReportingServices, is created as part of the installation.
The report server instance can be accessed and administered through a web portal whose URL, by default, is http://[ComputerName]/reports.
Report Builder
Report Builder is a stand-alone report authoring environment that can be used to create paginated reports and publish them to SSRS. Report
Builder isn't installed with SSRS and needs to be installed separately.
Report Builder can be started from the Reporting Services web portal, which is reached by typing the report server URL in the browser. By
default, the URL is http://<servername>/reports.
Report data
Reporting Services retrieves report data from data sources through a modular and extensible data layer that uses data processing
extensions. To add data to a report, datasets are created that represent the result set from running a query command on a data source.
There are two types of datasets - embedded and shared. An embedded dataset is defined in a report and used only by that report. A shared
dataset is defined on the report server and can be used by multiple reports.
SQL Server Analysis Services (SSAS)
SSAS is an analytical data engine used in decision support and business analytics which provides enterprise-grade semantic data models for
business reports and client applications such as Power BI, Excel, Reporting Services reports and other data visualization tools.
A typical workflow includes creating a tabular or multidimensional data model project in Visual Studio, deploying the model as a database
to a server instance, setting up recurring data processing and assigning permissions to allow data access by end-users. The semantic data
model can be accessed by client applications supporting Analysis Services as a data source.
SQL Server Agent
SQL Server Agent is a Windows service that executes scheduled administrative tasks called jobs. SQL Server Agent uses SQL Server to store
job information. Jobs contain one or more job steps and each step contains its own task, such as backing up a database.
Jobs can be run on a schedule, in response to a specific event or on demand.
SQL Server Agent Components
SQL Server Agent uses the following components to define the tasks to perform, when to perform them and how to report their success or failure.
Jobs
A job is a specified series of actions (steps) that can be run one or more times and monitored for success or failure. A job can
run on one local server or on multiple remote servers. Jobs can be run according to one or more schedules, in response to one or more
alerts or by executing the sp_start_job stored procedure.
Each job step runs in a specific security context. For job steps that use T-SQL, the EXECUTE AS statement can be used to set the security
context. For other types of job steps, a proxy account can be used to set the security context.
Schedules
A schedule specifies when a job runs. More than one job can run on the same schedule and more than one schedule can apply to the same
job. A schedule can define the following conditions for the time when a job runs
whenever SQL Server Agent starts
whenever CPU utilization of the computer is at a level defined as idle
one time, at a specific date and time
on a recurring schedule
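As a minimal sketch, a job with a daily schedule can be created with the msdb stored procedures (job, step and schedule names and the
backup path are hypothetical):

USE msdb;
EXEC dbo.sp_add_job @job_name = N'Nightly backup';
EXEC dbo.sp_add_jobstep @job_name = N'Nightly backup', @step_name = N'Back up SalesDb',
     @subsystem = N'TSQL',
     @command = N'BACKUP DATABASE SalesDb TO DISK = ''D:\backup\SalesDb.bak'';';
EXEC dbo.sp_add_schedule @schedule_name = N'Daily 1am',
     @freq_type = 4,               -- daily
     @freq_interval = 1,           -- every 1 day
     @active_start_time = 010000;  -- 01:00:00
EXEC dbo.sp_attach_schedule @job_name = N'Nightly backup', @schedule_name = N'Daily 1am';
EXEC dbo.sp_add_jobserver @job_name = N'Nightly backup';  -- target the local server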
Alerts
An alert is an automatic response to a specific event. For example, an event can be a job that starts or system resources that reach a
specific threshold.
An alert can respond to SQL Server events, SQL Server performance conditions or Microsoft Windows Management Instrumentation (WMI) events
and can notify one or more operators or run a job.
Operators
An operator defines contact information for an individual. SQL Server can notify operators of alerts through e-mail, pager (through e-mail)
or net send.
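A hypothetical operator and alert wired together with the msdb stored procedures:

USE msdb;
EXEC dbo.sp_add_operator @name = N'DBA team', @email_address = N'dba@example.com';
-- Fire on any severity 17 error (insufficient resources)
EXEC dbo.sp_add_alert @name = N'Severity 17 errors', @severity = 17;
EXEC dbo.sp_add_notification @alert_name = N'Severity 17 errors',
     @operator_name = N'DBA team',
     @notification_method = 1;  -- 1 = e-mail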
Resource Governor
Resource Governor is used to manage SQL Server workload and system resource consumption. Resource Governor enables specifying limits on the
amount of CPU, physical IO, and memory that incoming application requests can use.
ALTER RESOURCE GOVERNOR RECONFIGURE; - enables Resource Governor
ALTER RESOURCE GOVERNOR DISABLE; - disables Resource Governor
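Beyond enabling it, Resource Governor is typically configured with a resource pool, a workload group and a classifier function; a sketch
under hypothetical names:

-- Cap reporting workloads at 20% CPU
CREATE RESOURCE POOL ReportPool WITH (MAX_CPU_PERCENT = 20);
CREATE WORKLOAD GROUP ReportGroup USING ReportPool;
GO
USE master;  -- the classifier function must live in master and be schema-bound
GO
CREATE FUNCTION dbo.rg_classifier() RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
    RETURN (CASE WHEN APP_NAME() = N'ReportApp' THEN N'ReportGroup' ELSE N'default' END);
END;
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.rg_classifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;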
Data Collector
Data Collector enables collection of datasets from multiple SQL Server instances. The collected data is stored in a Management Data
Warehouse (MDW). There are three system data collection sets - disk usage, server activity and query statistics.
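The system collection sets can be listed and started from msdb once the MDW has been configured; a small sketch:

USE msdb;
-- List the collection sets and whether they are running
SELECT collection_set_id, name, is_running FROM dbo.syscollector_collection_sets;
-- Start the Disk Usage system collection set (collection_set_id 1 on a default install)
EXEC dbo.sp_syscollector_start_collection_set @collection_set_id = 1;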
Maintained by: VINCENT KANDASAMY, Database Architect/Administrator (kandasf@hotmail.com)
Last updated: Mar 19, 13:30