SQL Server
SQL Server architecture
    Database Engine
    Pages and Extents
    Memory management
        Buffer management
    Transaction log
        Log truncation
Data structures
    Databases
        Database files and filegroups
    Tables
    Indexes
T-SQL language
    Datatypes
    Operators
    Statements
    Functions
    Variables
    Stored procedures
    User-defined functions
Server administration
    Installing SQL Server
        Default and named instances
    Managing services
    Server configuration
    Connection handling
    DBCC commands
    System catalog views
    Monitoring the server
        Dynamic management views
        Extended events
Transaction/Query management
    Isolation levels
    Locking
        Locking modes
        Locking information
    Query processing
        Distributed queries
    Query optimization
        Operators
Importing/Exporting data
    SQL Server Integration Services (SSIS)
        SSIS packages
    Loading XML data
Backup/Recovery
    Troubleshooting/Recovery
        Error codes
User management/Security
    Authentication
    Access control
        Principals
        Securables
        Permissions
    Encryption
High availability/Scalability
    Always On Availability Groups
    Failover Cluster instances
Tools/Utilities
    SQL Server Reporting Services (SSRS)
    SQL Server Agent


SQL SERVER ARCHITECTURE

Database Engine

The Database Engine component of SQL Server is the core service for storing, processing and securing data. SQL Server supports up to 50 instances of the Database Engine on a single computer. An instance of the Database Engine is a copy of the sqlservr.exe executable that runs as an OS service and handles all application requests to perform work on the data in any of the databases managed by that instance.

Applications connect to the instance through a network connection if the application and instance are on separate computers. If the application and instance are on the same computer, the SQL Server connection can run as either a network connection or an in-memory connection. When a connection has been completed, an application sends T-SQL statements across the connection that are resolved by the instance into operations against the data and objects in the databases, and if the required permissions have been granted to the login credentials, performs the work. Any data retrieved is returned to the application, along with any messages such as errors.

Pages and extents

The page is the fundamental unit of data storage in SQL Server. The disk space allocated to a data file (.mdf or .ndf) in a database is logically divided into pages numbered contiguously from 0 to n. Disk I/O operations are performed at the page level, i.e. SQL Server reads or writes whole data pages.

The page size is 8 KB, i.e. databases have 128 pages per MB. Each page begins with a 96-byte header that is used to store system information about the page, including the page number, page type, the amount of free space on the page and the allocation unit ID of the object that owns the page. The following page types are used in the data files of a SQL Server database:
  • Data - data rows with all data (except large object data types)
  • Index - index entries
  • Text/Image - large object data types (text, ntext, image, nvarchar(max), varchar(max), varbinary(max) and xml)
  • Global Allocation Map (GAM)/Shared Global Allocation Map (SGAM) - information about whether extents are allocated
  • Page Free Space (PFS) - information about page allocation and free space available on pages
  • Index Allocation Map - information about extents used by a table or index per allocation unit
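
For illustration, the header of an individual page can be dumped with the undocumented but widely used DBCC PAGE command, after trace flag 3604 is turned on to send DBCC output to the client. This is a diagnostic sketch, not a supported interface (db1 is a placeholder database name):
	DBCC TRACEON (3604);	-- redirect DBCC PAGE output to the client
	DBCC PAGE ('db1', 1, 0, 2);	-- arguments: database, file id, page number, print option (0-3)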

    Extents are a collection of eight physically contiguous pages and are used to efficiently manage the pages. All pages are stored in extents. Databases have 16 extents per MB.

    SQL Server has two types of extents

  • Uniform extents are owned by a single object - all eight pages in the extent can only be used by the owning object
  • Mixed extents are shared by up to eight objects - each of the eight pages in the extent can be owned by a different object

    SQL Server uses two types of allocation maps to record the allocation of extents

  • Global Allocation Map (GAM) - GAM pages record what extents have been allocated. Each GAM covers 64,000 extents (4 GB of data) and has 1-bit for each extent in the interval it covers. If the bit is 1, the extent is free and if 0, the extent is allocated.
  • Shared Global Allocation Map (SGAM) - SGAM pages record which extents are currently being used as mixed extents and also have at least one unused page. Each SGAM covers 64,000 extents (4 GB of data) and has 1-bit for each extent in the interval it covers. If the bit is 1, the extent is being used as a mixed extent and has a free page. If the bit is 0, the extent is not used as a mixed extent or it is a mixed extent and all its pages are being used.
    	Current use of extent			GAM bit setting		SGAM bit setting
    	Free, not being used				1			0
    	Uniform extent, or full mixed extent		0			0
    	Mixed extent with free pages			0			1

    Memory management

    SQL Server acquires and frees memory dynamically as required, so the amount of memory to allocate to SQL Server need not be specified. SQL Server builds a buffer pool in memory to hold pages read from the database and tries to minimize the number of physical reads and writes between the disk and the buffer pool. SQL Server tries to reach a balance between
  • keeping the buffer pool from becoming so big that the entire system is low on memory
  • minimizing physical I/O to the database files by maximizing the size of the buffer pool

    When SQL Server is using memory dynamically, it queries the system periodically to determine the amount of free memory; if free memory is low, it releases memory to the OS to prevent the OS from paging. SQL Server adds memory only when its workload requires more memory; a server at rest does not increase the size of its virtual address space.

    SQL Server uses the memory notification API QueryMemoryResourceNotification to determine when the SQL Server Memory Manager may allocate memory and release memory.

    When SQL Server starts, it computes the size of virtual address space for the buffer pool based on a number of parameters, such as the amount of physical memory on the system, the number of server threads and various startup parameters. It reserves the computed amount of virtual address space but acquires (commits) only the required amount of physical memory for the current load.

    The instance then continues to acquire memory as needed to support the workload. As more users connect and run queries, the additional physical memory is acquired on demand. An instance continues to acquire physical memory until it either reaches its max server memory allocation target or the OS indicates there is no longer an excess of free memory; it frees memory when it has more than the min server memory setting and the OS indicates that there is a shortage of free memory.

    SQL Server can free and acquire several megabytes of memory each second, allowing it to quickly adjust to memory allocation changes.

    min and max server memory

    The min server memory and max server memory configuration options establish lower and upper limits on the amount of memory used by the buffer pool and other caches of the SQL Server Database Engine. The buffer pool does not immediately acquire the memory specified in min server memory; it starts with only the memory required to initialize. As the workload increases, it keeps acquiring memory and does not free any of the acquired memory until it reaches min server memory. Once min server memory is reached, the buffer pool uses the standard algorithm to acquire and free memory, never dropping its allocation below min server memory and never acquiring more than max server memory.

    The SQL Server process can acquire more memory than specified by the max server memory option, because both internal and external components can allocate memory outside of the buffer pool; however, the memory allocated to the buffer pool usually represents the largest portion of the memory consumed by SQL Server.

    Buffer management

    Buffer management is a key component in achieving I/O efficiency. The buffer management component consists of two mechanisms: the buffer manager to access and update database pages and the buffer cache (also called the buffer pool), to reduce database file I/O.

    The buffer cache is divided into 8 KB pages (buffers), the same size as a data or index page. The buffer manager manages reading data or index pages from the disk into the buffer cache and writing modified pages back to disk. A page remains in the buffer cache until the buffer manager needs the area to read in more data. Data in the buffer cache can be modified multiple times before being written back to disk.

    When SQL Server starts, it computes the size of virtual address space for the buffer cache based on a number of parameters and reserves this computed amount (called the memory target), but it acquires only the required amount of memory for the current load. The sys.dm_os_sys_info dynamic management view can be queried to return the memory reserved as the memory target and the memory currently committed in the buffer cache.
    	select committed_target_kb, committed_kb from sys.dm_os_sys_info;
    The interval between SQL Server startup and when the buffer cache obtains its memory target is called ramp-up, whose duration depends on the number and type of client requests. Ramp-up is expedited by transforming single-page read requests into aligned eight-page requests (making up one extent). This allows the ramp-up to finish much faster, especially on machines with a lot of memory.
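
    The current contents of the buffer cache can be observed through the sys.dm_os_buffer_descriptors dynamic management view (requires the VIEW SERVER STATE permission). As a sketch, the following query counts the 8 KB pages each database currently holds in the buffer pool:
    	SELECT DB_NAME(database_id) AS database_name, COUNT(*) AS cached_pages, COUNT(*) * 8 / 1024 AS cached_mb
    	FROM sys.dm_os_buffer_descriptors
    	GROUP BY database_id
    	ORDER BY cached_pages DESC;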

    Because the buffer manager uses most of the memory in the SQL Server process, it cooperates with the memory manager to allow other components to use its buffers. The buffer manager interacts primarily with the following components:
  • Resource manager to control overall memory usage
  • Database manager and the SQL Server Operating System (SQLOS) for low-level file I/O operations
  • Log manager for write-ahead logging

    Transaction log

    Every SQL Server database has a transaction log that records all transactions and their modifications to the database and is used to recover the database in case of system failure.

    The transaction log supports the following operations:
  • Individual transaction recovery - if an application issues a ROLLBACK or if the Database Engine detects an error such as the loss of communication with a client, the log records are used to roll back the modifications made by an incomplete transaction
  • Recovery of all incomplete transactions - if a server fails, the databases may be left in a state where some changes were never written from the buffer cache to the data files and there may be updates from incomplete transactions in the data files. When an instance of SQL Server is started, it runs a recovery of each database: modifications recorded in the log that may not have been written to the data files are rolled forward, and any incomplete transactions found in the log are rolled back, ensuring database integrity is preserved
  • Rolling a restored database, file, filegroup, or page forward to the point of failure
  • Supporting transactional replication
  • Supporting high availability and disaster recovery solutions - Always On availability groups, database mirroring and log shipping

    The SQL Server Database Engine divides each physical log file internally into a number of virtual log files (VLFs) with no fixed size or fixed number. The Database Engine chooses the size of the virtual log files dynamically while it is creating or extending log files, trying to keep their number to a minimum. The size of the virtual files after a log file has been extended is the sum of the size of the existing log and the size of the new file increment. The size or number of virtual log files cannot be configured.
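
    The VLFs of a database can be inspected with the sys.dm_db_log_info dynamic management function (available from SQL Server 2016 SP2; earlier versions used the undocumented DBCC LOGINFO):
    	SELECT file_id, vlf_begin_offset, vlf_size_mb, vlf_active, vlf_status
    	FROM sys.dm_db_log_info (DB_ID());	-- one row per VLF in the current database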

    Log truncation

    Log truncation frees space in the log file for reuse by the transaction log by deleting inactive VLFs. If a transaction log were never truncated, it would eventually fill all the disk space that is allocated to its physical log files. Before the log can be truncated, a checkpoint operation must occur. A checkpoint writes the current in-memory modified pages (dirty pages) and transaction log information from memory to disk and marks the inactive portion of the log as reusable.

    Truncation occurs automatically after the following events
  • Under the simple recovery model, after a checkpoint
  • Under the full or bulk-logged recovery model, if a checkpoint has occurred since the previous backup, truncation occurs after a log backup (unless it is a copy-only log backup)

    The active portion of the transaction log can never be truncated. The record at the start of the active portion of the log is identified by the minimum recovery log sequence number (MinLSN).

    The recovery model chosen for a database determines how much of the transaction log in front of the active portion must be retained. Although the log records in front of the MinLSN play no role in recovering active transactions, they are required to roll forward modifications when using log backups to restore a database to the point of failure.

    In case of a crash, data can be recovered by restoring the last database backup, and then restoring every log backup since the database backup. This means that the sequence of log backups must contain every log record that was written since the database backup. When a sequence of transaction log backups is maintained, no log record can be truncated until after it has been written to a log backup. The log records before the MinLSN are only needed to maintain a sequence of transaction log backups.

    In the simple recovery model, a sequence of transaction logs is not being maintained. All log records before the MinLSN can be truncated at any time, except while a BACKUP statement is being processed. NO_LOG and TRUNCATE_ONLY are the only BACKUP LOG options that are valid for a database that is using the simple recovery model.
    	SELECT * FROM sys.dm_db_log_space_usage;	- returns space usage information for the transaction log
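    When the log is not truncating as expected, the log_reuse_wait_desc column of sys.databases reports what is holding truncation up (for example LOG_BACKUP, ACTIVE_TRANSACTION or REPLICATION):
    	SELECT name, log_reuse_wait_desc FROM sys.databases;	- reason log truncation is waiting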
    Shrinking the log

    Log truncation does not reduce the size of the physical log file which needs to be shrunk to free the unused space. A log file can be shrunk only while the database is online and at least one VLF is free.

    When a transaction log file is shrunk, inactive VLFs are removed from the end of the log file to reduce the log to approximately the target size. But if the storage space is required again after a log shrink, the transaction log will grow again and may introduce performance overhead during log growth operations.
    	-- shrink a log file to a specific target size
    	ALTER DATABASE db1 SET RECOVERY SIMPLE;	- the file is first truncated by setting the database recovery model to SIMPLE
    	DBCC SHRINKFILE (db1_Log, 1);	- shrink the truncated log file to 1 MB
    	ALTER DATABASE db1 SET RECOVERY FULL;	- reset the database recovery model
    Add or enlarge a log file

    One transaction log file is sufficient unless log space is running out and disk space is also running out on the volume that holds the log file. Space can be gained by enlarging the existing log file or by adding a log file to the database, typically on a different disk.
    	ALTER DATABASE db1 MODIFY FILE (name=db1_log, size = 200 MB);	- enlarge a log file
    	
    	-- add two log files
    	ALTER DATABASE db1 ADD LOG FILE 
    	( NAME = db1_log2, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\db1_log2.ldf',
    		SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB ),
    	( NAME = db1_log3, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\db1_log3.ldf',
    		SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB );




    DATA STRUCTURES

    Databases

    Each instance of SQL Server can contain one or many databases. Within a database, there are one or many object ownership groups called schemas and within each schema there are objects like tables, views and stored procedures. Some objects such as certificates and asymmetric keys are contained within the database but not within a schema.
    	-- create a database specifying data and transaction log files (size in MB, by default)
    	CREATE DATABASE db1 ON
    	(NAME = db1_dat, FILENAME = 'D:\Program Files\SQL Server\Data\db1.mdf', SIZE = 1000, MAXSIZE = 200000, FILEGROWTH = 5)
    	LOG ON
    	(NAME = db1_log, FILENAME = 'D:\Program Files\SQL Server\DATA\db1.ldf', SIZE = 50MB, MAXSIZE = 2500MB, FILEGROWTH = 5MB);
    	
    	select * from sys.databases;	- list all databases
    Database files and filegroups

    At a minimum, every SQL Server database has two OS files - a data file and a log file. Data files contain data and objects such as tables, indexes, stored procedures, and views. Log files contain the information that is required to recover/rollback all transactions performed on the database. Data files can be grouped together in filegroups for allocation and administration purposes.

    The three types of files are

  • Primary - The primary data file contains the startup information for the database and points to the other files in the database. User data and objects can be stored in this file or in secondary data files. Every database has one primary data file. The recommended file name extension for primary data files is .mdf

  • Secondary - Secondary data files are optional, are user-defined and store user data. Secondary files can be used to spread data across multiple disks by putting each file on a different disk drive. Additionally, if a database exceeds the maximum size for a single Windows file, secondary data files can be used so the database can continue to grow. The recommended extension for secondary data files is .ndf

  • Transaction log - There must be at least one log file for each database and the recommended extension is .ldf
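
    The files and filegroups of the current database can be listed through the sys.database_files and sys.filegroups catalog views:
    	SELECT name, type_desc, physical_name, size * 8 / 1024 AS size_mb FROM sys.database_files;	- size is stored in 8 KB pages
    	SELECT name, type_desc, is_default FROM sys.filegroups;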

    Database detach and attach

    The data and log files of a database can be detached and then reattached to the same or another instance of SQL Server. Detaching and attaching a database is useful for moving the database to a different instance of SQL Server or computer.
    	EXEC sp_detach_db 'db1', 'true';	- detach database
    	CREATE DATABASE testdb ON (FILENAME = 'C:\mssql\testdb_data.mdf'), (FILENAME = 'C:\mssql\testdb_log.ldf') 
    				FOR ATTACH;	- attach database
    A database cannot be detached if
  • the database is replicated and published - if replicated, the database must be unpublished and publishing must be disabled by running sp_replicationdboption or replication must be removed by running sp_removedbreplication
  • a database snapshot exists on the database - before detaching the database, all of its snapshots must be dropped
  • the database is being mirrored in a database mirroring session - the database cannot be detached unless the session is terminated
  • the database is suspect - a suspect database must be put into emergency mode before detaching
  • the database is a system database

    System Databases

    master - stores all the system-level information for an SQL Server instance
    msdb - used by SQL Server Agent to schedule jobs and alerts
    model - template for all new databases
    resource - read-only database that contains system objects
    tempdb - workspace for temporary objects and intermediate resultsets


    Tables

    SQL Server can have as many as two billion tables per database and 1,024 columns per table. The number of rows and total size of the table are limited only by the available storage.

    The maximum number of bytes per row is 8,060. If tables are created with varchar, nvarchar or varbinary columns in which the total defined width exceeds 8,060 bytes, the table is created, but a warning message appears. Trying to insert more than 8,060 bytes into such a row or to update a row so that its total row size exceeds 8,060 produces an error message and the statement fails.

    CREATE TABLE statements that include a sql_variant column can generate the following warning: The total row size (xx) for table 'yy' exceeds the maximum number of bytes per row (8060). Rows that exceed the maximum number of bytes will not be added. This warning occurs because sql_variant can have a maximum length of 8016 bytes, and when a sql_variant column contains values close to the maximum, it can overshoot the row's maximum size limit.
    	CREATE TABLE t1 ( c1 int primary key identity(1,1), c2 varchar(255), INDEX ix_1 NONCLUSTERED (c2));
    	
    	EXEC sp_rename 'sales.t1', 't2';	- rename table
    	
    	ALTER TABLE t1 ADD col1 int NOT NULL, col2 varchar(20);	- add columns
    	ALTER TABLE t1 DROP COLUMN col1;			- drop column
    	EXEC sp_rename 'dbo.t1.col1', 'col2', 'COLUMN';		- rename column
    	ALTER TABLE dbo.t1 ALTER COLUMN col1 DECIMAL(5,2);	- modify column
    A computed column is a virtual column that is not physically stored in the table, unless the column is marked PERSISTED. A computed column expression uses data from other columns to calculate a value for the column to which it belongs.
    	CREATE TABLE dbo.products	- table with computed column
    	(
    	product_id int IDENTITY (1,1) NOT NULL,
    	quantity smallint,
    	unit_price money,
    	inventory_value AS quantity * unit_price );
    	
    	ALTER TABLE dbo.products ADD retail_value AS (quantity * unit_price * 1.5);	- add a computed column
    	ALTER TABLE dbo.products DROP COLUMN retail_value;	- drop a computed column
    Temporary tables

    A local temp table name is prefixed with single number sign (#table_name) and a global temp table with a double number sign (##table_name).

    Differences between local and global temporary tables:
    Visibility - Local temporary tables are visible only in the current session; global temporary tables are visible to all sessions.
    Lifetime - A local temporary table created in a stored procedure is dropped automatically when the stored procedure completes. All other local temporary tables are dropped automatically at the end of the current session. Global temporary tables are automatically dropped when the session that created the table ends and all other tasks have stopped referencing them.
    	IF OBJECT_ID('tempdb..#Results') IS NOT NULL	-- check if temp table exists
    	     DROP TABLE #Results;
    System tables

    sys.sysobjects
    	select 'drop table ' + name from sys.sysobjects where type = 'U' and name like '%_old'; -- create drop statements for tables
    sys.sysprocesses
    	select cmd,* from sys.sysprocesses where blocked > 0;  -- find all blocked statements


    In-memory tables

    Memory-optimized tables are created using the MEMORY_OPTIMIZED = ON option in the create table statement. The entire table resides in memory. A copy is maintained on disk, but for durability purposes only. Non-durable in-memory tables are also supported, with no logging of data and no persistence to disk. Transactions on these tables do not require any disk IO, but the data will not be recovered if there is a server crash.

    Data in memory-optimized tables can be accessed
  • Through natively compiled stored procedures.
  • Through interpreted Transact-SQL, outside of a natively-compiled stored procedure. These T-SQL statements may be either inside interpreted stored procedures or they may be ad-hoc T-SQL statements.
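
    A minimal sketch of creating a durable memory-optimized table (the table and columns are hypothetical, and the database is assumed to already have a MEMORY_OPTIMIZED_DATA filegroup):
    	CREATE TABLE dbo.session_cache
    	(	session_id int NOT NULL PRIMARY KEY NONCLUSTERED,
    		payload nvarchar(200) )
    	WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);	-- SCHEMA_ONLY would create a non-durable table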


    Indexes

    Each table can contain a maximum of 999 nonclustered indexes and 1 clustered index. A clustered index is automatically created when a PRIMARY KEY is defined for a table and a nonclustered index when a UNIQUE constraint is defined. If a PRIMARY KEY is defined on a table with an existing clustered index, SQL Server enforces the primary key using a nonclustered index.

    Columns that are of the large object (LOB) data types ntext, text, varchar(max), nvarchar(max), varbinary(max), xml, or image cannot be specified as key columns for an index.
    	CREATE [CLUSTERED /NONCLUSTERED /UNIQUE] INDEX idx1 on table/view (col1,col2 etc.)
    		WITH options
    		ON [filegroup];
    Options

    FILLFACTOR=n - specifies a percentage that indicates how full the Database Engine should make the leaf level of each index page during index creation or rebuild. The default is 0. If fillfactor is 100 or 0, leaf pages are filled to capacity
    IGNORE_DUP_KEY - for multiple-row insert operations, when set to ON, only the rows violating a unique index fail and a warning message is issued. When set to OFF, an error message is issued and the entire INSERT transaction is rolled back.
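
    Both options can be supplied in the WITH clause, e.g. (the index and table names here are hypothetical)
    	CREATE UNIQUE NONCLUSTERED INDEX ix_emp_email ON dbo.employee (email)
    		WITH (FILLFACTOR = 80, IGNORE_DUP_KEY = ON);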

    Included columns

    Non-key columns can be included in the leaf-level of non-clustered indexes using the INCLUDE clause to cover more queries. Non-key columns have the following benefits
  • they can be data types not allowed as index key columns
  • they are not considered by the Database Engine when calculating the number of index key columns or index key size

    An index with non-key columns can significantly improve query performance when all columns in the query are included in the index either as key or non-key columns. The query optimizer can then locate all the column values within the index; the table or clustered index data is not accessed, resulting in fewer disk I/O operations.
    	CREATE NONCLUSTERED INDEX ix_address_postalcode	ON dbo.address (postalcode)
    		INCLUDE (addr_line1, addr_line2, city, state);
    XML indexes

    XML indexes can be created on xml data type columns. They index all tags, values and paths over the XML instances in the column and benefit query performance. XML instances are stored in xml type columns as BLOBs and without an index, these BLOBs are shredded at run time to evaluate a query which can be time-consuming. If XML BLOBs are commonly queried in the application, it helps to index the xml type columns with the additional cost of index maintenance during data modification.

    XML indexes fall into the following two categories
  • Primary XML index - The first index on the xml type column must be the primary XML index
  • Secondary XML index - Using the primary XML index, the following types of secondary indexes are supported - PATH, VALUE, and PROPERTY
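
    A sketch, assuming the dbo.words table created in the Datatypes section below (it has a clustered primary key, which a primary XML index requires):
    	CREATE PRIMARY XML INDEX pxml_words_nouns ON dbo.words (nouns);
    	CREATE XML INDEX sxml_words_nouns_path ON dbo.words (nouns)
    		USING XML INDEX pxml_words_nouns FOR PATH;	-- secondary XML index of type PATH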

    Full-text indexes

    A full-text index includes one or more character-based columns of the following data types - char, varchar, nchar, nvarchar, text, ntext, image, xml or varbinary(max) and FILESTREAM. A full-text index can contain up to 1024 columns.

    A full-text catalog needs to be created before creating a full-text index. A full-text catalog is a logical container for a group of full-text indexes. A full-text catalog is a virtual object that does not belong to any filegroup.
    	CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;
    	CREATE FULLTEXT INDEX ON dbo.candidate(resume) KEY INDEX uk_candidate  WITH STOPLIST = SYSTEM;
    Rebuilding and reorganizing indexes

    Rebuilding an index drops and re-creates the index. This removes fragmentation, reclaims disk space by compacting the pages based on the specified or existing fill factor setting and reorders the index rows in contiguous pages.

    Reorganizing an index uses minimal system resources. It defragments the leaf level of clustered and nonclustered indexes on tables and views by physically reordering the leaf-level pages to match the logical, left to right, order of the leaf nodes. Reorganizing also compacts the index pages. Compaction is based on the existing fill factor value.

    Fragmentation exists when indexes have pages in which the logical ordering, based on the key value, does not match the physical ordering inside the data file and can degrade query performance.
    	-- find the average fragmentation percentage of all indexes in a table 
    	SELECT a.index_id, name, avg_fragmentation_in_percent
    	FROM sys.dm_db_index_physical_stats (DB_ID(N'HumanResDB'), OBJECT_ID(N'dbo.Employee'), NULL, NULL, NULL) AS a
    	JOIN sys.indexes AS b ON a.object_id = b.object_id AND a.index_id = b.index_id;
    For fragmentation between 5% and 30%, ALTER INDEX REORGANIZE can be used and for > 30% ALTER INDEX REBUILD WITH (ONLINE = ON) can be used. Rebuilding an index can be executed online or offline, while reorganizing an index is always executed online. To achieve availability similar to the reorganize option, indexes can be rebuilt online.
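
    For example, using the ix_address_postalcode index created above (ONLINE = ON requires an edition that supports online index operations):
    	ALTER INDEX ix_address_postalcode ON dbo.address REORGANIZE;	- fragmentation between 5% and 30%
    	ALTER INDEX ix_address_postalcode ON dbo.address REBUILD WITH (ONLINE = ON);	- fragmentation above 30%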




    T-SQL LANGUAGE

    T-SQL or Transact-SQL is Microsoft's implementation of the SQL standard. All tools and applications communicate with a SQL database by sending T-SQL commands. Some of the Microsoft tools that execute T-SQL commands are SSMS, SQL Server Data Tools (SSDT), sqlcmd and Azure Data Studio.

    Datatypes

    BINARY/VARBINARY - Binary datatypes of either fixed-length or variable length. Syntax: BINARY(n) or VARBINARY(n)

    BIT - Integer datatype 1, 0 or NULL.

    BIGINT/INT/SMALLINT/TINYINT - Integer data types of 8, 4, 2 and 1 bytes storage length respectively.

    CHAR/VARCHAR - Fixed-length and variable length character datatypes. Syntax: CHAR(n) or VARCHAR(n)

    DATETIME/SMALLDATETIME - Date and time data types for representing date and time of day.

    DECIMAL/NUMERIC - Numeric datatypes with fixed precision and scale. Syntax: decimal(precision, scale) or numeric (p,s)

    FLOAT/REAL - Approximate number data types for use with floating-point numeric data. Syntax: FLOAT(n) or REAL

    MONEY/SMALLMONEY - Monetary datatypes for representing monetary or currency values.

    NCHAR/NVARCHAR - Unicode datatypes that help in storing data from various languages.

    NTEXT/TEXT/IMAGE - Fixed and variable-length datatypes for storing large non-unicode, unicode character and binary data.
  • ntext - variable-length unicode data with a maximum length of 2^30 - 1 characters. Storage size is two times the number of characters entered
  • text - variable-length non-unicode data in the code page of the server and with a maximum length of 2^31 - 1 characters
  • image - variable-length binary data from 0 through 2^31 - 1 bytes

    TABLE - a special data type used to store a result set for processing at a later time, primarily for temporarily storing a set of rows that are returned as a table-valued function result set. Functions and variables can be declared to be of type table. table variables can be used in functions, stored procedures and batches

    XML - SQL Server allows storage of native XML with the xml datatype. Creating a typed xml column is a two-step process - creating a schema collection and associating it with the column
    	CREATE XML SCHEMA COLLECTION dbo.nounsSchema AS
    	'<?xml version="1.0">
    	   <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
                   <xsd:element name="Category" type="xsd:string" />
                   <xsd:element name="arabic" type="xsd:string" />
                   <xsd:element name="english"  type="xsd:string" />
    		</xsd:schema>';
    
    	CREATE TABLE dbo.words (id int PRIMARY KEY,
    				nouns xml(dbo.nounsSchema));

    Operators

    T-SQL uses these operator categories -

  • Arithmetic + - * / %(modulo)
  • Assignment =
  • Bitwise & (AND), | (OR), ^ (XOR)
  • Comparison =, >, <, >=, <=, <> (not equal), != (not equal), !< (not lesser), !> (not greater)
  • Logical ALL, AND, ANY, BETWEEN, EXISTS, IN, LIKE, NOT, OR, SOME
  • String concatenation +
  • Unary +, -, ~ (bitwise NOT)

    Statements

    Control flow

    BEGIN...END - encloses a series of T-SQL statements, so that a group of statements can be executed. BEGIN...END blocks can be nested.
    	BEGIN
    		statements
    	END
    IF...ELSE - imposes conditions on the execution of a T-SQL statement
    	IF boolean expression
    		BEGIN 
    			sql statements
    		END
    	ELSE
    		BEGIN
    			sql statements
    		END
    GOTO - alters the flow of execution to a label. GOTO statements can be nested
    	GOTO label1
    	...
    	label1:
    THROW - raises an exception and transfers execution to a CATCH block of a TRY...CATCH construct
    	THROW 51000, 'The record does not exist.', 1;
    	
    	Msg 51000, Level 16, State 1, Line 1
    	The record does not exist.
    TRY...CATCH - used to implement error handling logic
    	BEGIN TRY
    		statements
    	END TRY
    	BEGIN CATCH
    		statements
    	END CATCH
    WHILE - sets a condition for the repeated execution of an SQL statement or statement block. BREAK causes an exit from the innermost WHILE loop. CONTINUE causes the WHILE loop to restart ignoring any statements after CONTINUE
    	WHILE boolean expression
    		BEGIN 
    			sql statements
    			[BREAK] 	
    			[CONTINUE]
    		END
    Cursors

    CLOSE - Closes an open cursor by releasing the current resultset and freeing any cursor locks.
    	CLOSE cursor1
    DEALLOCATE - removes a cursor reference. When the last cursor reference is deallocated, the data structures comprising the cursor are released. Removes the association between a cursor and a cursor name.
    	DEALLOCATE cursor_name
    DECLARE CURSOR - Defines the attributes of a T-SQL cursor, such as its scrolling behaviour and the query used to build the cursor.
    	DECLARE cursorname CURSOR
    		[LOCAL | GLOBAL]
    		[FORWARD_ONLY | SCROLL]
    		FOR select_statement
    		[FOR UPDATE OF columnname]
    FETCH - Retrieves a specific row from a cursor.
    	FETCH [NEXT | PRIOR | FIRST | LAST]	FROM cursor1 INTO @variable1, @variable2....@variablen
    OPEN - Open and populate a cursor

    DDL

    ALTER TABLE - modifies a table definition by altering, adding or dropping columns or constraints

    	ALTER TABLE t1 add primary key (col1, col2, col3);
    	ALTER TABLE t1 ADD CONSTRAINT fk_t1_col1 FOREIGN KEY (col1) REFERENCES RefTableName(RefColumnName);	- add a foreign key
    	ALTER TABLE t1 ADD col1 varchar(50);	- add a column
    	ALTER TABLE tbl1 WITH NOCHECK ADD CONSTRAINT CK_tbl1_Col1 CHECK (Col1 in (1,2));	- add a check constraint
    ALTER SCHEMA - transfers a securable (e.g. a table or procedure) from one schema to another
    	ALTER SCHEMA dbo TRANSFER AS0386.spAuthors1;	- move spAuthors1 from schema AS0386 to dbo
    DROP TABLE - Removes a table definition and all data, indexes, triggers, constraints and permission specifications for that table.

    DML

    INSERT - inserts a new row to a table or a view

    MERGE - performs insert, update, or delete operations on a target table based on the results of a join with a source table
    	/*Using Merge to Insert/Update*/	
    	MERGE dbo.UnitMeasure AS target
    	USING (SELECT @UnitMeasureCode, @Name) AS source (UnitMeasureCode, Name)
    		ON (target.UnitMeasureCode = source.UnitMeasureCode)
    	WHEN MATCHED THEN 
    	UPDATE SET Name = source.Name
    	WHEN NOT MATCHED THEN	
    	    INSERT (UnitMeasureCode, Name)
    	    VALUES (source.UnitMeasureCode, source.Name)
    	    OUTPUT deleted.*, $action, inserted.* INTO #MyTempTable;
    
    	/*Using Merge to Update/Delete*/	
    	CREATE PROCEDURE dbo.usp_UpdateInventory @OrderDate datetime AS
    		MERGE dbo.ProductInventory AS target
    			USING (SELECT ProductID, SUM(OrderQty) FROM dbo.SalesOrderDetail AS sod
    			JOIN dbo.SalesOrderHeader AS soh
    				ON sod.SalesOrderID = soh.SalesOrderID
    				AND soh.OrderDate = @OrderDate
    				GROUP BY ProductID) AS source (ProductID, OrderQty)
    			ON (target.ProductID = source.ProductID)
    			WHEN MATCHED AND target.Quantity - source.OrderQty <= 0
    				THEN DELETE	
    			WHEN MATCHED 
    				THEN UPDATE SET target.Quantity = target.Quantity - source.OrderQty, 
    					target.ModifiedDate = GETDATE()
    			OUTPUT $action, Inserted.ProductID, Inserted.Quantity, Inserted.ModifiedDate, Deleted.ProductID,
    				Deleted.Quantity, Deleted.ModifiedDate;
    TRUNCATE TABLE - removes all rows from a table without logging the individual row deletes. It is faster than DELETE and uses fewer system and transaction log resources. The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row. TRUNCATE TABLE removes the data by deallocating the data pages used by the table, and only the page deallocations are recorded in the log. The counter used by an identity column for new rows is reset to the seed for the column. TRUNCATE TABLE cannot be used on a table referenced by a FOREIGN KEY constraint. Because TRUNCATE TABLE is not logged, it cannot activate a trigger.

    UPDATE - changes existing data in a table or view

    General

    DECLARE - Declares a variable in the body of a batch or a procedure. After declaration, all variables are initialized as NULL.
    	DECLARE @local_variable data_type;
    EXECUTE - Executes a command string or character string within a Transact-SQL batch, or a system/user/extended stored procedure or a scalar-valued user-defined function.

    KILL - ends a user process using the session id. If the specified session id has a lot of work to undo, the statement may take some time to complete. Active session ID values can be obtained by querying the session_id column of the sys.dm_tran_locks, sys.dm_exec_sessions and sys.dm_exec_requests dynamic management views or by viewing the SPID column that the sp_who system stored procedure returns. If a rollback is in progress for a specific SPID, the cmd column in the sp_who result set for that SPID indicates KILLED/ROLLBACK.
    	KILL sessionid;
    SELECT - retrieves rows from the database. The main clauses are
    	[ WITH common_table_expression] 
    	SELECT select_list [ INTO new_table ] 
    	[ FROM table_source ] [ WHERE search_condition ] 
    	[ GROUP BY group_by_expression [WITH CUBE | ROLLUP] ] 
    	[ HAVING search_condition ] 
    	[ ORDER BY order_expression [ ASC | DESC ] ] 
    	[ FOR XML [RAW | AUTO | PATH | EXPLICIT] ]
    The UNION, EXCEPT, and INTERSECT operators can be used between queries to combine or compare their results into one result set.

    CUBE - Specifies that in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. A GROUP BY summary row is returned for every possible combination of group and subgroup in the result set.

    ROLLUP - Specifies that in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. Groups are summarized in a hierarchical order, from the lowest level in the group to the highest.

    FOR XML - retrieves the query output as XML. One of these modes can be specified:
    The RAW mode generates a single <row> element per row in the rowset that is returned by the SELECT statement. XML hierarchy can be generated by writing nested FOR XML queries.
    The AUTO mode generates nesting in the resulting XML by using heuristics based on the way the SELECT statement is specified and the user has minimal control over the shape of the XML generated. The nested FOR XML queries can be written to generate XML hierarchy beyond the XML shape that is generated by AUTO mode heuristics.
    The EXPLICIT mode allows more control over the shape of the XML. Attributes and elements can be mixed at will in deciding the shape of the XML. It requires a specific format for the rowset that is generated by query execution, and this rowset format is then mapped into the XML shape. The power of EXPLICIT mode is to mix attributes and elements at will, create wrappers and nested complex properties, create space-separated values (e.g. an OrderID attribute may have a list of order ID values) and mixed contents.
    The PATH mode together with the nested FOR XML query capability provides the flexibility of the EXPLICIT mode in a simpler manner.
    	select * from vw_nouns FOR XML RAW;  -- RAW mode uses a generic row tag
    	select * from vw_nouns FOR XML RAW, XMLSCHEMA;  -- returns the schema of the xml output along with the result
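    A short PATH-mode sketch, where column aliases shape the resulting XML (an alias starting with @ becomes an attribute):
    	select name as [@name], database_id as [id] from sys.databases for xml path('db'), root('dbs');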
    SET - sets the specified variable to the given value
    	SET @local_variable = expression
    WITH common_table_expression(CTE) - specifies a temporary named resultset. This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
    	with t as
    		(select distinct * from [DB1].[dbo].[tbl1])
    			select COUNT(*) from t;
    Transactions

    BEGIN TRANSACTION - BEGIN TRANSACTION marks the starting point of an explicit, local transaction. It increments @@TRANCOUNT by 1

    COMMIT - ensures all of a transaction's modifications are made permanent in the database

    COMMIT TRANSACTION - marks the end of a successful implicit or explicit transaction. If @@TRANCOUNT is 1, COMMIT TRANSACTION makes all data modifications since the start of the transaction permanent, frees the transaction's resources and decrements @@TRANCOUNT to 0. When @@TRANCOUNT is greater than 1, COMMIT TRANSACTION decrements @@TRANCOUNT only by 1 and the transaction stays active

    ROLLBACK - rolls back a user-specified transaction to the beginning of the transaction

    ROLLBACK TRANSACTION - rolls back an explicit or implicit transaction to the beginning or to a savepoint

    SAVE TRANSACTION - sets a savepoint within a transaction
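
    A sketch combining these statements with a savepoint (the dbo.accounts table is hypothetical):
    	BEGIN TRANSACTION;
    		UPDATE dbo.accounts SET balance = balance - 100 WHERE acct_id = 1;
    		SAVE TRANSACTION before_credit;
    		UPDATE dbo.accounts SET balance = balance + 100 WHERE acct_id = 2;
    		IF @@ERROR <> 0
    			ROLLBACK TRANSACTION before_credit;	-- undoes only the work done after the savepoint
    	COMMIT TRANSACTION;	-- @@TRANCOUNT returns to 0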


    Functions

    ASCII - returns the ASCII code value of the leftmost character of a character expression.

    CASE - evaluates a list of conditions and returns one of the possible expressions.

    	Simple CASE function:
    		CASE input expression
    			WHEN cond1 THEN result1
    			WHEN cond2 THEN result2
    			.......
    			ELSE result_else
    		END
    
    	Searched CASE function:
    		CASE
    			WHEN boolean expression THEN result1
    			.....
    			ELSE result_else
    		END
    CAST/CONVERT - explicitly convert an expression of one data type to another
    	CAST (expression AS data_type)
    	CONVERT (data_type [(length)], expression [,style])
    
    	SELECT title, ytd_sales FROM tblBooks WHERE CAST (ytd_sales AS char(20)) LIKE '3%'
    	SELECT title, ytd_sales FROM tblBooks WHERE CONVERT (char(20), ytd_sales) LIKE '3%'
    CHAR - a string function that converts an int ASCII code to a character

    CHARINDEX - returns the starting position of a specified expression in a string (no wildcards; use PATINDEX for patterns)

    DATEDIFF - returns the number of datepart boundaries crossed between two dates
    	DATEDIFF (datepart, startdate, enddate)
    datepart specifies the unit in which the difference is calculated and can be one of - year (abbr. yy, yyyy), quarter (qq, q), month (mm, m), dayofyear (dy, y), day (dd, d), week (wk, ww), hour (hh), minute (mi, n), second (ss, s), millisecond (ms)
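    For example, DATEDIFF counts boundaries crossed, not elapsed time:
    	SELECT DATEDIFF(day, '2024-01-01', '2024-03-01');	-- 60
    	SELECT DATEDIFF(month, '2024-01-15', '2024-02-01');	-- 1, a month boundary was crossed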

    LEN - returns the number of characters of a given string expression.
    	-- count the number of occurrences of 'Y' in a column
    	SELECT LEN(col) - LEN(REPLACE(col, 'Y', '')) FROM t1;

    LTRIM/RTRIM - return a character expression after removing leading/trailing blanks

    PATINDEX - returns the starting position of the first occurrence of a pattern in a string expression. Returns 0 if the pattern is not found. Allows wild card characters in the pattern
    	PATINDEX ('%pattern%', string_expression)
    Determine non-printable characters in a column:
    	select fn,  patindex('%[^ !-~]%' COLLATE Latin1_General_BIN,fn) as [Position],
    	substring(fn,patindex('%[^ !-~]%' COLLATE Latin1_General_BIN,fn),1) as [InvalidCharacter],
    	ascii(substring(fn,patindex('%[^ !-~]%' COLLATE Latin1_General_BIN,fn),1)) as [ASCIICode]
    	from  #vvk1
    	where patindex('%[^ !-~]%' COLLATE Latin1_General_BIN,fn) > 0;

    SUBSTRING - returns a portion of a character or binary/text string expression
     	SELECT x = SUBSTRING ('abcdef', 2, 3) 	-- 'bcd'

    UNICODE - returns the integer value, as defined by the Unicode standard, for the first character of the input expression
    	DECLARE @nstring nchar(12)
    	SET @nstring = N'Åkergatan 24'
    	SELECT UNICODE(@nstring), NCHAR(UNICODE(@nstring))
    
    	returns the result set:
    	------    - 
    	197       Å

    Aggregate functions

    AVG, COUNT, COUNT_BIG, MAX, MIN, SUM, VAR, VARP, STDEV, STDEVP

    Ranking functions

    RANK - Displays the rank value for each row in the resultset
    	SELECT Name, Salary, RANK() OVER(ORDER BY Salary DESC) as Ranking FROM Employees;
    DENSE_RANK - Similar to the RANK function without missing ranking values
    	SELECT Name, Dept, Salary, DENSE_RANK() OVER(PARTITION BY Dept ORDER BY Salary DESC) as DenseRanking FROM Employees;
    ROW_NUMBER - returns the sequential number of a row within a partition of a result set
    	SELECT ROW_NUMBER() OVER(ORDER BY name ASC) AS Row#, name, recovery_model_desc FROM sys.databases WHERE database_id < 5;
    NTILE - distributes the rows in an ordered partition into a specified number of groups
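    	SELECT name, NTILE(4) OVER(ORDER BY database_id) AS quartile FROM sys.databases;	-- distribute databases into 4 groups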

    System functions

    These scalar functions perform operations and return information about values, objects and settings in SQL Server.

    @@ERROR - returns the error number for the last T-SQL statement executed
    @@FETCH_STATUS - returns the status of the last cursor FETCH statement issued against any cursor currently opened by the connection
    @@IDENTITY - returns the last-inserted identity value
    @@ROWCOUNT - returns the number of rows affected by the last statement
    @@TRANCOUNT - number of BEGIN TRANSACTION statements that have occurred on the current connection
    NEWID() - Creates a unique value of type uniqueidentifier.

    System statistical functions

    These scalar functions return statistical information about the system

    @@CONNECTIONS - returns the number of attempted connections (successful or not) since the server was last started
    @@CPU_BUSY - the amount of cpu time(ticks) that SQL Server has spent in active operation since its start
    @@IDLE - time SQL server has been idle
    @@IO_BUSY - time spent on performing I/O
    @@TIMETICKS - number of microseconds per tick (computer-dependent)


    Variables

    A T-SQL local variable is an object that can hold a single data value of a specific type.

    The DECLARE statement initializes a Transact-SQL variable by
  • assigning a name - the name must have a single @ as the first character
  • assigning a system-supplied or user-defined data type and a length
  • setting the value to NULL
    	DECLARE @var1 int;	- declare a variable
    	
    	SET @var1 = 10;		- set variable to a value
    	
    	SELECT @var1 = MAX(emp_id) FROM hr.employee;	- set variable to the value of an expression
    table variables

    Variables can be declared to be of type table. table variables can be used in functions, stored procedures and batches.

    table variables that are used in stored procedures cause fewer stored procedure recompilations than when temporary tables are used. Transactions involving table variables last only for the duration of an update on the table variable and require less locking and logging resources.
    	declare @t1 table (rowid int identity(1,1), col1 char(2), col2 char(2));
    	insert into @t1 (col1, col2) select ovf, xfer from shipments where status = 'U';
    	declare @rows int, @rowid int, @ovfnbr char(2), @xfernbr char(2);
    	select @rows = count(1) from @t1;
    	while (@rows > 0)
    		begin
    			select top 1 @rowid = rowid, @ovfnbr = col1, @xfernbr = col2 from @t1;
    			/* perform row processing */
    			delete from @t1 where rowid = @rowid;
    			select @rows = count(1) from @t1;
    		end

    Stored procedures

    A user-defined procedure can be created in a user-defined database or in all system databases except the Resource database. The procedure can be developed in either T-SQL or as a reference to a Microsoft .NET Framework common runtime language (CLR) method.

    There are three ways of returning data from a procedure to a calling program
  • result sets - if a SELECT statement is included in the body of a stored procedure (not a SELECT .. INTO or INSERT .. SELECT), the rows specified by the SELECT statement will be sent directly to the client

  • output parameters - by specifying the OUTPUT keyword for a parameter in the procedure definition, the procedure can return the current value of the parameter to the calling program when the procedure exits. To save the value of the parameter in a variable that can be used in the calling program, the calling program must use the OUTPUT keyword when executing the procedure

  • return codes - a procedure can return an integer value called a return code to indicate the execution status of a procedure using the RETURN statement
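
    A sketch showing an output parameter together with a return code (the procedure name and the dept_cde column are hypothetical; hr.employee echoes the Variables section):
    	create proc dbo.usp_GetEmpCount @dept char(3), @cnt int output as
    	begin
    		select @cnt = count(*) from hr.employee where dept_cde = @dept;
    		if @cnt = 0 return 1;	-- return code: no employees found
    		return 0;	-- return code: success
    	end
    	go
    	
    	declare @n int, @rc int;
    	exec @rc = dbo.usp_GetEmpCount @dept = 'FIN', @cnt = @n output;	-- OUTPUT keyword required to receive the value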

    Sample procedure
    	create proc dbo.usp_kwi_ExtractCustRecords
    		@parm1 char(23),
    		@parm2 char(23)
    	as
    	set nocount on;
    
    	declare @telephoneno varchar(16), @emailaddress varchar(50);
    	declare @transsttime datetime, @transendtime datetime;
    
    	set @transsttime = cast(@parm1 as datetime);
    	set @transendtime = cast(@parm2 as datetime);
    
    	create table #tempcust (
    		telephoneno varchar(16),
    		emailaddress varchar(50) );
    
    	declare cursorcust cursor for
    		select telephone_no1, email_address
    		from customer cu
    		where cu.entry_date_time >= @transsttime
    		and cu.entry_date_time <= @transendtime;
    
    	open cursorcust;
    	fetch next from cursorcust into @telephoneno, @emailaddress;
    
    	while @@fetch_status = 0
    	begin
    		insert into #tempcust values (@telephoneno, @emailaddress);
    		fetch next from cursorcust into @telephoneno, @emailaddress;
    	end
    
    	select * from #tempcust;
    
    	close cursorcust;
    	deallocate cursorcust;
    	drop table #tempcust;
    	return;
    	go
    System stored procedures

    sp_lock - displays info about locks
    	exec sp_lock; -- displays all locks
    	exec sp_lock 53; - displays locks for a sessionid
    sp_monitor - displays statistics about SQL Server
    sp_send_dbmail - sends emails
    	EXEC msdb.dbo.sp_send_dbmail
    		@profile_name = 'WHI', @recipients = 'vkandasamy@paypal.com', @subject = 'OE Gen 4.5: New Makes Alert',
    		@body = 'Automated Message: New Makes have been received in the OE Gen 4.5 process';
    sp_spaceused - displays space usage details
    	EXEC sp_spaceused @updateusage = N'TRUE'; -- for current DB, returns current values
    	EXEC sp_spaceused N'dbo.vendor';   -- find space used by a table
    	EXEC sp_MSforeachtable @command1="EXEC sp_spaceused '?'"; -- find space used by all tables
    sp_who/sp_who2 - provide user, session and process information
    	EXEC sp_who;  -- displays statistics for all users/sessions
    	EXEC sp_who 'janetl';  -- show info on a single user login
    	EXEC sp_who '10';  - show info by sessionid
    Columns in sp_who output:
    spid - session id
    status - process status. The possible values are
    	dormant - SQL Server is resetting the session
    	running - the session is running one or more batches. When Multiple Active Result Sets (MARS) is enabled, a session can run multiple batches
    	background - the session is running a background task, such as deadlock detection
    	rollback - the session has a transaction rollback in process
    	pending - the session is waiting for a worker thread to become available
    	runnable - the session's task is in the runnable queue of a scheduler while waiting to get a time quantum
    	spinloop - the session's task is waiting for a spinlock to become free
    	suspended - the session is waiting for an event, such as I/O, to complete
    blk - session id of the blocking process if one exists, else 0
    cmd - command executing for the process

    Extended stored procedures

    Extended stored procedures allow developers to create their own external routines in a programming language such as C. They appear to users as normal stored procedures and are executed in the same way. Parameters can be passed to extended stored procedures and they can return results and return status.

    Extended stored procedures are dynamic-link libraries (DLLs) that SQL Server can dynamically load and execute. Extended stored procedures run directly in the address space of SQL Server and are programmed using the SQL Server Open Data Services API.

    An extended stored procedure works as follows
    1. When a client executes an extended stored procedure, the request is transmitted in tabular data stream (TDS) format from the client application through the Net-Libraries and Open Data Services to SQL Server
    2. SQL Server searches for the DLL associated with the extended stored procedure, and loads the DLL if it is not already loaded
    3. SQL Server calls the requested extended stored procedure (implemented as a function inside the DLL)
    4. The extended stored procedure passes result sets and return parameters back to the server by using the Open Data Services API


    User-defined functions

    A stored procedure returns only an int datatype value while a function can return any datatype.

    Calling a function
    	select @var1 = [dbo].[fnConvertBintoStr] (@var2,@var3);
    Scalar-valued function

    Scalar functions return a single data value of the type defined in the RETURNS clause. The return type can be any data type except text, ntext, image, cursor, and timestamp.
    	ALTER FUNCTION [dbo].[fnGetCountryForState]
    	(@InState char(2))
    	RETURNS char(2)
    	AS
    	BEGIN
    	declare @outCtryCde char(2);
    		if @InState in ('AB','BC','MB','NB','NL','NT', 'NS','NU','ON','PE','PQ','QC','SK','YT','NF')
    			set @outCtryCde = 'CA';
    		else
    			set @outCtryCde = 'US';
    		return @outCtryCde;
    	END
    Table-valued function

    Table-valued functions (TVFs) return a table data type.
    	CREATE FUNCTION [dbo].[fnGetLocationInfo]
    	(	
    		@RegNumber 	char(9),
    		@CompanyCde CHAR(3),
    		@OrigCde CHAR(2),
    		@OrigNbr CHAR(2)
    	)
    	RETURNS TABLE 
    	AS
    	RETURN 
    	(	SELECT 	rtrim(LCTN_ADDR1) + ' ' + rtrim(LCTN_ADDR2) as LCTN_ADDR, 
    			LCTN_CITY_NME,
    			lctn_st_cde = case lctn_st_cde
    				when 'PQ' then 'QC'
    				when 'NF' then 'NL'	
    				else lctn_st_cde end,	
    			LCTN_ZIP_CODE,
    			LCTN_COUNTRY_CDE FROM SH_LOCATION 
    			where REG_NBR =  @RegNumber and COMPANY_CDE = @CompanyCde
    				AND LCTN_TYP_CDE = @OrigCde and LCTN_TYP_NBR = @OrigNbr);




    SERVER ADMINISTRATION

    Installing SQL Server

    Editions

    SQL Server is available in Enterprise, Standard, Developer and Express editions. The Developer edition is free to use for non-production environments. The Express edition is free to use for small-scale applications and supports a maximum of 4 cores and a database size of 10GB.

    SQL Server Management Studio (SSMS) is a free tool that can be used to configure and manage SQL Server instances, deploy and monitor data-tier components used by applications and build queries and scripts.

    SQL Server Configuration Manager is a tool to manage the services associated with SQL Server, to configure the network protocols used and to manage the network connectivity configuration from client computers. SQL Server Configuration Manager is a Microsoft Management Console (MMC) snap-in that is available from the Start menu (All Programs -> SQL Server -> Configuration Tools) or can be added to any other MMC display.

    Default and named instances

    SQL Server supports multiple instances of SQL Server on a single server or processor, but only one instance can be the default instance. All others must be named instances. A computer can run multiple instances of SQL Server concurrently and each instance runs independently of other instances.

    While installing SQL Server, a name can be assigned to the instance enabling the installation of multiple instances on a single machine. Each named instance is identified by the network name of the computer plus the instance name, in the format computer_name\instance_name.

    If a name is not specified, a default instance is created which can be accessed by just using the computer_name. The default instance name is MSSQLSERVER and a client is not required to specify the name of the instance to make a connection.

    Default instances of SQL Server use TCP port 1433 by default. Named instances, however, dynamically assign an unused TCP port number the first time the instance is started. The SQL Server Browser service listens on UDP port 1434 and helps applications locate the port of a named instance.
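
    The TCP port in use by the current connection can be checked through the sys.dm_exec_connections dynamic management view (the columns are NULL over shared memory connections):
    	SELECT local_net_address, local_tcp_port FROM sys.dm_exec_connections WHERE session_id = @@SPID;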


    Managing services

    Database Engine service - the executable process that is the SQL Server Database Engine. The Database Engine can be the default instance (limit one per computer) or can be one of many named instances. The default instance is listed as SQL Server (MSSQLSERVER). Named instances are listed as SQL Server (instance_name). SQL Server Express is installed as SQL Server (SQLEXPRESS) by default.

    SQL Server Agent service - a Windows service that executes scheduled administrative tasks called jobs and alerts.

    SQL Server Browser service - a Windows service that listens for incoming requests for SQL Server resources and provides clients information about SQL Server instances installed on the computer. A single instance of the SQL Server Browser service is used for all instances installed on the computer.

    SQL Server Configuration Manager can be used to determine which instances are installed on the computer, start, pause, resume or stop the services (DB engine/Agent/Browser) and view/change service properties. Pausing prevents new users from connecting to the DB engine but already connected users can continue to work until their connections are closed.

    By default, only members of the local administrators group can start, stop, pause, resume or restart a service.

    The Database Engine can be stopped by using the SHUTDOWN statement.
    	SHUTDOWN;	- wait for currently running statements to finish and then stop the DB Engine
    	SHUTDOWN WITH NOWAIT;	- stop immediately

    Server configuration

    SQL Server resources can be managed and optimized through configuration options by using SSMS or the sp_configure system stored procedure.

    Commonly used options can be viewed/changed using SSMS by right-clicking the server in Object Explorer and choosing Properties. Some options are accessible through the SQL Server configuration manager. All options are accessible using sp_configure.

    sp_configure

    The sp_configure system procedure displays or changes global configuration settings for the current server. All users can display options and the ALTER SETTINGS server-level permission is required for changing options.

    RECONFIGURE updates the currently configured value (the config_value column in the sp_configure result set) of a configuration option changed with sp_configure. RECONFIGURE does not always update the currently running value (the run_value column) because some configuration options require a server stop and restart to update the currently running value.
    	sp_configure;	- display all options
    	
    	sp_configure 'show advanced options', 1;
    	RECONFIGURE;
    
    	sp_configure 'max server memory', 4096;	- set max memory to 4GB
    	RECONFIGURE;
    Server memory configuration

    SQL Server uses memory dynamically, but the amount of memory managed by the SQL Server Memory Manager for a SQL Server process used by an instance of SQL Server can be configured by two options - min server memory (default 0 MB) and max server memory (default 2048TB).
    	SELECT physical_memory_in_use_kb/1024 'physical_memory_in_use_MB',	- currently allocated memory
    	large_page_allocations_kb/1024 'large_page_allocations_MB', 
    	locked_page_allocations_kb/1024 'locked_page_allocations_MB',
    	virtual_address_space_reserved_kb/1024 'VAS_reserved_MB', 
    	virtual_address_space_committed_kb/1024 'VAS_committed_MB', 
    	virtual_address_space_available_kb/1024 'VAS_available_MB',
    	page_fault_count,
    	memory_utilization_percentage, 
    	process_physical_memory_low, 
    	process_virtual_memory_low
    	FROM sys.dm_os_process_memory;

    Collations in SQL Server

    Collations can be set at server, database, column, expression or identifier levels. The default server collation is determined during SQL Server setup based on the OS locale and it becomes the default collation of the system databases and all user databases.
    	SELECT CONVERT (varchar, SERVERPROPERTY('collation'));		- server collation
    	SELECT name, collation_name FROM sys.databases;			- database collation
    	SELECT name, collation_name FROM sys.columns WHERE name = N'colname';	- column collation
    	
    	SELECT * FROM fn_helpcollations();	- list of supported collations
    Collation options are specified by appending them to the collation name

    _CS/_CI - case sensitive/insensitive
    _AS/_AI - accent sensitive/insensitive

    The COLLATE clause defines the collation of a database or table column, or a collation cast operation when applied to a character string expression. The collation name can be either a Windows collation name or a SQL collation name. If not specified during database creation, the database is assigned the default collation of the instance of SQL Server. If not specified during table creation, the column is assigned the default collation of the database.
    	CREATE TABLE t1 (
    	LatinColumn nvarchar(15) COLLATE latin1_general_cs_as,
    	FrenchColumn nvarchar(15) COLLATE french_100_cs_as);
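    The COLLATE clause can also be applied to an expression; a minimal sketch of an accent-insensitive comparison:
    	SELECT CASE WHEN 'Café' = 'Cafe' COLLATE Latin1_General_CI_AI THEN 'equal' ELSE 'different' END;	-- returns 'equal'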

    Connection handling

    Network Protocols

    SQL Server uses any of these protocols for listening to connections - Shared memory, Named pipes, TCP/IP, VIA.

    The protocol must be enabled on both the client and server to work. The server can listen for requests on all enabled protocols at the same time. Client computers can pick one or try the protocols in the order listed in SQL Server Configuration Manager.

    Connecting to SQL Server over the Internet

    Users can connect to an instance of SQL Server over the internet using SQL Query Analyzer or a client application based on ODBC or DB-Library.

    To share data over the Internet, the client and server must be connected to the Internet. In addition, TCP/IP or Multiprotocol Net-Libraries must be used. If Multiprotocol Net-Library is used, it must be ensured that TCP/IP support is enabled. If the server is registered with Domain Name System (DNS), it can be connected to using its registered name.

    User connections

    The user connections option specifies the maximum number of simultaneous user connections that are allowed on an instance of SQL Server. The default is 0, which means that the maximum (32,767) user connections are allowed.
    	SELECT @@max_connections;
    
    	SELECT * FROM sys.dm_tran_locks WHERE resource_type = 'DATABASE' AND resource_database_id = DB_ID('dbname');	- current connections
    	SELECT * FROM sys.dm_exec_requests WHERE database_id = DB_ID('dbname');		- connections executing tasks

    DBCC Commands

    T-SQL provides DBCC statements that act as database console commands for SQL Server. DBCC commands take input parameters and return values. All DBCC command parameters can accept both Unicode and DBCS literals.

    DBCC commands are grouped into four categories

  • Maintenance

    DBCC SHRINKDATABASE - shrinks the size of all the data and log files in the specified database. DBCC SHRINKDATABASE operations can be stopped at any point in the process and any completed work is kept.
    	DBCC SHRINKDATABASE ( dbname|dbid|0  [, target_percent] [, NOTRUNCATE|TRUNCATEONLY] );
    			0 - current database is used
    			target_percent - percentage of free space to be left in the database file after shrinking
    			NOTRUNCATE - applicable only to data files, the free space at the end of file is not released to the OS
    			TRUNCATEONLY - releases all free space at the end of the file to the operating system
    DBCC SHRINKFILE - shrinks the size of the specified data or log file for the current database
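    A sketch of shrinking a single file (the logical file name db1_log is an assumption; actual names are listed in sys.database_files):
    	DBCC SHRINKFILE (db1_log, 512);	-- target size in MB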

    DBCC DBREINDEX - rebuilds one or more indexes for a table
    	DBCC DBREINDEX ("HumanResources.Employee", PK_Employee_BusinessEntityID,80);
    		80 is the fillfactor, the percentage of space on each index page for storing data
    DBCC UPDATEUSAGE - update usage statistics for database/table
    	DBCC UPDATEUSAGE (dbname) WITH NO_INFOMSGS;
  • Validation

    DBCC CHECKDB - Checks the logical and physical integrity of all the objects in the specified database.
    	DBCC CHECKDB('dbname');
    	DBCC CHECKDB('dbname', REPAIR_ALLOW_DATA_LOSS | REPAIR_FAST | REPAIR_REBUILD);	-- repair options require the database to be in single-user mode
    
    		REPAIR_ALLOW_DATA_LOSS - tries to repair all reported errors. These repairs can cause some data loss
    		REPAIR_FAST - maintains syntax for backward compatibility only. No repair actions are performed
    		REPAIR_REBUILD - performs repairs that have no possibility of data loss
  • Informational

    DBCC USEROPTIONS - returns the SET options active (set) for the current connection
    	DBCC USEROPTIONS
  • Miscellaneous

    DBCC HELP - returns syntax information for the specified DBCC command. Requires membership in the sysadmin fixed server role
    	DBCC HELP ('CHECKDB');
    	DBCC HELP ('?');	- returns all DBCC statements for which help is available

    Shrinking the tempdb

    The tempdb is a temporary workspace. Among other uses, SQL Server uses the tempdb for
  • Storage of explicitly created temporary tables
  • Worktables that hold intermediate results created during query processing and sorting
  • Materialized static cursors
  • Storage of version records when snapshot isolation levels or read-committed snapshot isolation levels are used

    By default, the tempdb database is configured to autogrow as needed and may unexpectedly grow over time to a size larger than desired. A simple restart of SQL Server resets the size of tempdb to its last configured size. The configured size is the last explicit size that was set by using a file size changing operation such as ALTER DATABASE with the MODIFY FILE option or the DBCC SHRINKFILE or DBCC SHRINKDATABASE statements.

    The tempdb database can be shrunk without restarting the server by using DBCC SHRINKDATABASE or DBCC SHRINKFILE.
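    A minimal sketch (tempdev and templog are the default logical names of the tempdb files):
    	USE tempdb;
    	DBCC SHRINKFILE (tempdev, 1024);	-- shrink the data file to 1GB
    	DBCC SHRINKFILE (templog, 256);	-- shrink the log file to 256MB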

    System catalog views

    Catalog views return information that is used by the SQL Server Database Engine and are the most general interface to the catalog metadata. All user-available catalog metadata is exposed through catalog views.

    Databases and files catalog views

    sys.databases - has one row per database in the SQL Server instance
    name - unique within a SQL Server instance
    database_id - unique within a SQL Server instance
    sys.database_files - per-database view containing one row per file of the database
    file_id - id of the file within the database
    type - 0(rows), 1(log), 2(filestream)
    name - logical name of the file
    physical_name - OS file name
    size - current size in 8KB pages
    Object catalog views

    sys.columns - contains a row for each column of objects with columns, such as tables and views
    object_id - id of the object to which the column belongs
    name - unique within the object
    column_id - unique within the object
    user_type_id - id of the type of the column
    collation_name - collation for character-based columns
    sys.indexes - contains a row per index or heap of a table, view, or table-valued function
    object_id - id of the object to which the index belongs
    name - unique within the object
    index_id - unique within the object
    type - 0(Heap), 1(Clustered), 2(Nonclustered), 3(XML), 4(Spatial)
    sys.objects - has a row for each user-defined object created within a database
    name - object name
    object_id - unique within a database
    schema_id - id of the schema that the object is contained in
    parent_object_id - id of the object to which this object belongs (0 if not a child object)
    type - U(User table), V(view), S(system table), IT(internal table), P(procedure)
    create_date
    modify_date
    sys.partitions - contains a row for each partition. All tables and indexes contain at least one partition
    partition_id - unique within a database
    object_id - ID of the object to which this partition belongs
    index_id - ID of the index within the object to which this partition belongs
    hobt_id - ID of the data heap or B-tree that contains the rows for this partition
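    For example, a sketch that lists the columns of all user tables by joining sys.objects and sys.columns:
    	SELECT o.name AS table_name, c.name AS column_name, c.column_id
    	FROM sys.objects o JOIN sys.columns c ON o.object_id = c.object_id
    	WHERE o.type = 'U'
    	ORDER BY o.name, c.column_id;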




    Monitoring the Server

    Dynamic management views

    Dynamic management views and functions return server state information that can be used to monitor the health of a server instance, diagnose problems and tune performance.

    sys.dm_db_index_physical_stats - Fragmentation details for indexes
    sys.dm_db_log_space_usage - Returns space usage information for the transaction log
    sys.dm_exec_procedure_stats - Find when stored procs were executed
    	SELECT TOP 1000 b.name, a.* From sys.dm_exec_procedure_stats a, sys.objects b
    	where  a.object_id = b.object_id and DB_NAME(a.database_id) = 'db1';
    sys.dm_exec_requests - all requests running on the server
    	select top 100 session_id, start_time, status, command, sql_handle, total_elapsed_time/1000 'elapsed_time_seconds'
    		from sys.dm_exec_requests order by total_elapsed_time desc;		- long running queries
    sys.dm_exec_connections - returns information about the connections established to the SQL Server instance
    session_id - session associated with this connection
    most_recent_sql_handle - sql handle of the last request executed on this connection
    sys.dm_exec_sessions - returns one row per authenticated session on SQL Server, showing information about all active user connections and internal tasks
    status - Running(currently running one or more requests), Sleeping(currently running no requests)
    reads/writes - number of reads/writes performed by requests in this session
    	select * from sys.dm_exec_sessions;
    sys.dm_exec_sql_text - Returns the text of the SQL batch identified by the specified sql_handle
    	select * from sys.dm_exec_sql_text (sql_handle);
    Finding SQL text from sp_id
    	declare    @spid int,   @stmt_start int,   @stmt_end int,   @sql_handle binary(20);
    	set @spid = 59; -- fill this in
    
    	select top 1
    	@sql_handle = sql_handle,
    	@stmt_start = case stmt_start when 0 then 0 else stmt_start / 2 end,
    	@stmt_end = case stmt_end when -1 then -1 else stmt_end / 2 end
    	from master.dbo.sysprocesses
    	where spid = @spid
    	order by ecid;
    
    	SELECT
    		SUBSTRING(	text,
    			COALESCE(NULLIF(@stmt_start, 0), 1),
    			CASE @stmt_end
    				WHEN -1 THEN DATALENGTH(text)
    				ELSE (@stmt_end - @stmt_start)
    			END
    				)
    	FROM ::fn_get_sql(@sql_handle);
    sys.dm_os_process_memory - information about memory allocations and the process address space
    	select * FROM sys.dm_os_process_memory;
    sys.dm_os_sys_info - returns information about resources available to and consumed by SQL Server
    sys.dm_tran_locks - reports information about locks

    Extended events

    Extended events is a lightweight performance monitoring system that uses minimal performance resources and provides GUIs to create, modify, display and analyze session data.

    Extended events can synchronously generate event data (and asynchronously process that data) which provides a flexible solution for event handling. Extended events also provides the following features:
  • a unified approach to handling events across the server system, while enabling users to isolate specific events for troubleshooting purposes
  • integration with, and support for existing ETW tools
  • a fully configurable event handling mechanism that is based on T-SQL
  • the ability to dynamically monitor active processes with minimal effect on those processes
  • a default system health session that collects system data that can be used to help troubleshoot performance issues

    A new event session can be created using SSMS by clicking Management -> Extended events -> Sessions and using the New Session dialog or the New Session Wizard.
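    A minimal T-SQL sketch of an event session that captures statements running longer than one second (the session name and file path are assumptions):
    	CREATE EVENT SESSION LongQueries ON SERVER
    	ADD EVENT sqlserver.sql_statement_completed
    		(ACTION (sqlserver.session_id, sqlserver.sql_text)
    		 WHERE duration > 1000000)	-- duration is in microseconds
    	ADD TARGET package0.event_file (SET filename = N'C:\temp\LongQueries.xel');
    	
    	ALTER EVENT SESSION LongQueries ON SERVER STATE = START;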

    The system_health session

    The system_health session is an Extended events session that is included by default with SQL Server. This session starts automatically when the Database Engine starts, runs without any noticeable performance effects and collects system data that can be used to help troubleshoot performance issues.

    The session collects information including
  • the sql_text and session_id for any sessions that encounter an error that has a severity >= 20 or a memory-related error
  • any deadlocks that are detected, including the deadlock graph
  • the callstack, sql_text and session_id for any sessions that have waited on latches for > 15 seconds or locks for > 30 seconds

    The session uses the ring buffer target and event file target to store the data which can be viewed using the Extended events UI in SSMS or using T-SQL.
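    For example, the ring buffer contents can be read with a query like this sketch:
    	SELECT CAST(t.target_data AS xml) AS session_data
    	FROM sys.dm_xe_sessions s
    	JOIN sys.dm_xe_session_targets t ON s.address = t.event_session_address
    	WHERE s.name = N'system_health' AND t.target_name = N'ring_buffer';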




    TRANSACTION/QUERY MANAGEMENT

    Isolation levels

    The SET TRANSACTION ISOLATION LEVEL statement controls the locking and row versioning behavior of T-SQL statements issued by a connection.
    	SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED/READ COMMITTED/REPEATABLE READ/SNAPSHOT/SERIALIZABLE;
    READ UNCOMMITTED - Transactions do not issue shared locks to prevent other transactions from modifying data read by the current transaction and are also not blocked by exclusive locks that would prevent reading rows that have been modified but not committed by other transactions.

    READ COMMITTED - this option is the SQL Server default and its behavior depends on the setting of the READ_COMMITTED_SNAPSHOT database option:
  • If READ_COMMITTED_SNAPSHOT is set to OFF (the default), shared locks are used to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed
  • If READ_COMMITTED_SNAPSHOT is set to ON, row versioning is used to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions

    REPEATABLE READ - shared locks are placed on all data read by the transaction and are held until the transaction completes, preventing other transactions from modifying the rows. Other transactions can insert new rows that match the search conditions of statements issued by the current transaction, which can result in phantom reads if a statement is retried

    SNAPSHOT - the statements in a transaction get a snapshot of the committed data as it existed at the start of the transaction. Except when a database is being recovered, SNAPSHOT transactions do not request locks when reading data and do not block other transactions from writing data. Transactions writing data do not block SNAPSHOT transactions from reading data

    SERIALIZABLE - the most restrictive of the isolation levels since it locks entire ranges of keys and holds them until the transaction completes. Statements cannot read data that has been modified but not yet committed by other transactions and other transactions cannot modify data read by the current transaction. Other transactions cannot insert new rows with key values that would fall in the range of keys read by the current transaction
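    The row versioning-based behaviors described above are controlled through database options; a minimal sketch (the database name db1 is an assumption):
    	ALTER DATABASE db1 SET READ_COMMITTED_SNAPSHOT ON;	-- READ COMMITTED statements use row versions
    	ALTER DATABASE db1 SET ALLOW_SNAPSHOT_ISOLATION ON;	-- allows transactions to request SNAPSHOT isolation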


    Locking

    SQL Server has multi-granular locking that allows different types of resources to be locked by a transaction. The Database Engine often has to acquire locks at multiple levels of granularity to fully protect a resource and this group of locks is called a lock hierarchy. For example, to fully protect a read of an index, SQL Server may have to acquire shared locks on rows and intent shared locks on the pages and table.

    The following are the resources that SQL Server can lock.
    Resource	Description
    RID		A row identifier used to lock a single row within a heap
    KEY		A row lock within an index used to protect key ranges in serializable transactions
    PAGE		An 8KB page in a database, such as data or index pages
    EXTENT		A contiguous group of eight pages
    HoBT		A heap or B-tree. A lock protecting a B-tree (index) or the heap data pages in a table that does not have a clustered index
    TABLE		The entire table, including all data and indexes
    FILE		A database file
    APPLICATION	An application-specified resource
    METADATA	Metadata locks
    ALLOCATION_UNIT	An allocation unit
    DATABASE	The entire database
    Locking modes

    Shared(S) - used for read operations(SELECT). No other transactions can modify the data while S locks exist on the resource which are released as soon as the read operation completes, unless the transaction isolation level is set to repeatable read or higher

    Update(U) - used on resources that can be updated. Prevents a common form of deadlock that occurs when multiple sessions are reading, locking and potentially updating resources later. Only one transaction can obtain a U lock to a resource at a time. If a transaction modifies a resource, the U lock is converted to an X lock

    Exclusive(X) - used for DML operations such as INSERT, UPDATE and DELETE. Ensures that multiple updates cannot be made to the same resource at the same time

    Intent - placed on a higher-level resource before a lock is placed on a lower-level resource. Intent locks serve two purposes
  • prevent other transactions from modifying the higher-level resource in a way that would invalidate the lock at the lower level
  • improve the efficiency of SQL Server in detecting lock conflicts at the higher level of granularity

    Bulk Update (BU) - used when bulk copying data into a table

    Schema - used when an operation dependent on the schema of a table is executing
    Schema modification (Sch-M) locks - used when a DDL operation or a DML operation like truncation is executing on a table
    Schema stability (Sch-S) locks - used when queries are compiling
    Key-range - protects the range of rows read by a query when using the serializable transaction isolation level


    Locking information

    sys.dm_tran_locks

    Each row in sys.dm_tran_locks represents a currently active request to the lock manager for a lock that has been granted or is waiting to be granted.
    	select * from sys.dm_tran_locks;
    Columns: resource_type - DATABASE, FILE, OBJECT, PAGE, KEY, EXTENT, RID, APPLICATION, METADATA, HOBT or ALLOCATION_UNIT
    resource_subtype - a subtype of the resource_type
    resource_database_id - id of the database
    resource_associated_entity_id - can be an object ID, Hobt ID or an Allocation Unit ID depending on the resource type
    resource_lock_partition - id of the lock partition for a partitioned lock resource, 0 for nonpartitioned resources
    request_mode - for granted requests, this is the granted mode; for waiting requests, this is the mode being requested
    request_type - the value is LOCK
    request_status - GRANTED, CONVERT, WAIT, LOW_PRIORITY_CONVERT, LOW_PRIORITY_WAIT or ABORT_BLOCKERS
    request_session_id - Session ID that currently owns this request
    request_owner_type - entity type that owns the request(TRANSACTION, SESSION, SHARED_TRANSACTION_WORKSPACE, EXCLUSIVE_TRANSACTION_WORKSPACE)
    	-- display current lock requests
    	SELECT
    	dtl.request_session_id AS SPID,
    	DB_NAME(dtl.resource_database_id) AS DBName,
    	dtl.resource_type AS LockedResource,
    	CASE
    	WHEN dtl.resource_type = 'OBJECT' THEN o.object_id
    	WHEN dtl.resource_type = 'KEY' THEN i.object_id
    	END AS LockedObjId,
    	CASE
    	WHEN dtl.resource_type = 'OBJECT' THEN o.name
    	WHEN dtl.resource_type = 'KEY' THEN i.name
    	END AS LockedObjName,
    	dtl.request_mode AS LockType,
    	dtl.request_status AS LockStatus,
    	es.program_name AS ProgramName,
    	es.original_login_name AS OrigLoginName,
    	st.text AS SQLText
    	FROM sys.dm_tran_locks dtl
    	left join sys.partitions p ON dtl.resource_associated_entity_id = p.partition_id AND dtl.resource_type = 'KEY'
    	left join sys.indexes i on p.object_id = i.object_id and p.index_id = i.index_id
    	left join sys.objects o ON dtl.resource_associated_entity_id = o.object_id AND dtl.resource_type = 'OBJECT'
    	join sys.dm_exec_sessions es ON dtl.request_session_id = es.session_id
    	JOIN sys.dm_exec_connections dc ON dc.session_id = es.session_id
    	CROSS APPLY sys.dm_exec_sql_text(dc.most_recent_sql_handle) AS st
    	WHERE resource_database_id = db_id()
    	ORDER BY dtl.request_session_id;
    Blocks and Deadlocks

    A process is referred to as blocked when it is waiting for a resource to be released by another process.

    Deadlocks can involve resources other than locks, such as worker threads, memory, parallel query execution resources and MARS (Multiple Active Result Sets) resources. SQL Server performs periodic deadlock detection through a lock monitor thread that periodically initiates a search through all of the tasks in an instance. When a deadlock is detected, one of the threads is chosen as the deadlock victim and terminated.

    Locking information can be viewed through Activity Monitor, SQL Server Profiler (Locks event) and sys.dm_tran_locks. Blocking processes can be killed using Activity Monitor.
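    A session can influence victim selection with SET DEADLOCK_PRIORITY, and a blocking session can be terminated with KILL; a sketch (the session id is hypothetical):
    	SET DEADLOCK_PRIORITY LOW;	-- make the current session the preferred deadlock victim
    	KILL 59;	-- terminate the blocking session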

    Deadlock Graphs

    SQL Server Profiler and SSMS use a deadlock wait-for graph to describe a deadlock. The deadlock wait-for graph contains process nodes, resource nodes and edges representing the relationships between the processes and the resources.
  • Process node - a thread that performs a task; e.g. an INSERT, UPDATE or DELETE
  • Resource node - a database object; e.g. a table, index or row
  • Edge - a relationship between a process and a resource. A request edge occurs when a process waits for a resource. An owner edge occurs when a process owns a resource. The lock mode (X, S etc.) is included in the edge description

    Deadlock graphs can be retrieved from the system_health session in SSMS by clicking Management > Extended Events > Sessions > system_health > package0.event_file and View Target Data

    Query processing

    The SQL Server Database Engine processes queries on local tables, partitioned tables and tables distributed across multiple servers. The Database Engine can process T-SQL statements using two distinct processing modes
  • Row mode execution - used with traditional RDBMS tables, where data is stored in row format. When a query is executed and accesses data in row store tables, the execution tree operators and child operators read each required row, across all the columns specified in the table schema. From each row that is read, SQL Server then retrieves the columns that are required for the result set as referenced by a SELECT statement, JOIN predicate or filter predicate
  • Batch mode execution - used to process multiple rows together (hence the term batch). Each column within a batch is stored as a vector in a separate area of memory, so batch mode processing is vector-based. Batch mode processing also uses algorithms that are optimized for the multi-core CPUs and increased memory throughput that are found on modern hardware

    Join operations

    SQL Server employs three types of join operations
  • Nested Loops join - If one join input is small and the other input is fairly large and indexed on its join columns, an index nested loops join is the fastest join operation because they require the least I/O and the fewest comparisons
  • Merge join - If the two join inputs are not small but are sorted on their join column, a merge join is the fastest join operation
  • Hash join - Hash joins can efficiently process large, unsorted, nonindexed inputs and are useful for intermediate results in complex queries

    Parallel Query Processing

    SQL Server can perform a query or index operation in parallel by using several operating system worker threads and provides parallel queries to optimize query execution and index operations in computers that have more than one CPU.

    SQL Server automatically detects the best degree of parallelism for each instance of a parallel query execution or index DDL operation. The max degree of parallelism (MAXDOP) option can be used to limit the number of processors to use in parallel plan execution. The MAXDOP option can be overridden for individual query and index operations by specifying the MAXDOP query hint or MAXDOP index option.

    Setting the option to 0 (default) enables SQL Server to use all available processors up to a maximum of 64 processors.
    	SELECT ProductID, OrderQty, SUM(LineTotal) AS Total FROM Sales.SalesOrderDetail
    	WHERE UnitPrice < $5.00  
    	GROUP BY ProductID, OrderQty ORDER BY ProductID, OrderQty
    	OPTION (MAXDOP 2);
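    The server-wide option itself can be changed through sp_configure; a sketch:
    	EXEC sp_configure 'show advanced options', 1;
    	RECONFIGURE;
    	EXEC sp_configure 'max degree of parallelism', 4;	-- limit parallel plans to 4 processors
    	RECONFIGURE;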
    Distributed queries

    Distributed queries access data from multiple heterogeneous data sources, which can be stored on either the same or different computers. SQL Server supports distributed queries by using OLE DB, the Microsoft specification of an API for universal data access.

    The OPENQUERY function executes a pass-through query on the specified linked server. OPENQUERY can also be referenced as the target table of an INSERT, UPDATE or DELETE statement, subject to the capabilities of the OLE DB provider.
    	select * from openquery(linkedserver, 'select * from t1');
    Linked servers

    Linked servers enable the Database Engine to execute distributed, heterogeneous queries against OLE DB data sources like another instance of SQL Server or another RDBMS. Linked servers can be configured by using SSMS or the sp_addlinkedserver procedure.

    A linked server definition specifies the following
  • An OLE DB provider - a DLL that manages and interacts with a specific data source. OLE DB providers exist for a variety of files and file formats including text files and spreadsheets
  • An OLE DB data source - identifies the specific database that can be accessed through OLE DB
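    A sketch of defining and listing a linked server (the server name RMTSRV is an assumption):
    	EXEC sp_addlinkedserver @server = N'RMTSRV', @srvproduct = N'SQL Server';
    	SELECT name, product, provider FROM sys.servers WHERE is_linked = 1;	-- list linked servers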


    Query optimization

    By default, the query optimizer updates statistics as necessary to improve the query plan. In some cases query performance can be improved by using UPDATE STATISTICS or the stored procedure sp_updatestats to update statistics more frequently than the default updates. Updating statistics ensures that queries compile with up-to-date statistics but it causes queries to recompile.
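    For example (a sketch; the table name is taken from the MAXDOP example above):
    	UPDATE STATISTICS Sales.SalesOrderDetail;	-- update all statistics on one table
    	EXEC sp_updatestats;	-- update out-of-date statistics on all tables in the current database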

    The query optimizer uses operators to build a query plan to create the result specified in the query or to perform the operation specified in the DML statement. The query plan is a tree consisting of physical operators and can be viewed by using the SET SHOWPLAN statements, the graphical execution plan options in SSMS or the SQL Server Profiler Showplan event classes.
    	set showplan_all on;	- returns execution information for each statement without executing it
    	set statistics profile on;	- displays profile information for the statement after the resultset
    	set statistics xml on;	- returns execution information in an XML document
    Operators

    Operators are classified as

    Logical operators - describe the relational algebraic operation used to process a statement

    Physical operators - implement the operation described by logical operators. Each physical operator is an object or routine that performs an operation (access columns or rows from a table, index or view, perform calculations, aggregations, data integrity checks or joins). Physical operators have costs associated with them.

    The physical operators initialize, collect data and close. They can answer the following three method calls:

    Init() - causes a physical operator to initialize itself and set up any required data structures. The physical operator may receive many Init() calls, though typically receives only one

    GetNext() - causes a physical operator to get the first, or subsequent row of data. The physical operator may receive zero or many GetNext() calls

    Close() - causes a physical operator to perform some clean-up operations and shut itself down. A physical operator only receives one Close() call

    The GetNext() method returns one row of data, and the number of times it is called appears as ActualRows in the Showplan output.

    The ActualRebinds and ActualRewinds counts that appear in Showplan xml output refer to the number of times that the Init() method is called. ActualRebinds = 1 and ActualRewinds = 0 unless the operator is on the inner side of a loop join, when the sum of the number of rebinds and rewinds equals the number of rows processed on the outer side of the join. A rebind means that one or more of the correlated parameters of the join changed and the inner side must be reevaluated. A rewind means that none of the parameters changed and the prior inner result set may be reused.

    List of operators

    Clustered Index Scan - scans the clustered index specified in the argument column. When an optional WHERE predicate is present, only those rows that satisfy the predicate are returned
    Clustered Index Seek - uses the seeking ability of indexes to retrieve rows from a clustered index. The argument contains the name of the clustered index being used and the SEEK predicate
    Eager spool - the eager spool operator builds its spool file in an 'eager' manner i.e when the spool's parent operator asks for the first row it consumes all rows from its input operator and stores them in a hidden temporary object stored in the tempdb database. If rebinding is needed, the spooled data is discarded and the spool object is rebuilt by rescanning the input
    Index Scan - scans the nonclustered index specified in the argument
    Index Seek - uses the seeking ability of indexes to retrieve rows from a nonclustered index
    Filter - scans the input, returning only those rows that satisfy the filter expression
    Key lookup - a bookmark lookup on a table with a clustered index, always accompanied by the nested loops operator
    Lazy Spool - the lazy spool operator builds its spool file in a 'lazy' manner i.e each time the spool's parent operator asks for a row, it gets a row from its input operator and stores it in the spool in the tempdb database, rather than consuming all rows at once.
    Merge Join - performs join operations. The argument contains a MERGE predicate if the operation is performing a one-to-many join, or a MANY-TO-MANY MERGE predicate if performing a many-to-many join and a list of columns used to perform the operation.
    Nested loops - nested loops joins perform a search on the inner table for each row of the outer table, typically using an index
    Result(Select) - data returned at the end of the plan
    Sort - sorts all incoming rows. The argument contains either a DISTINCT ORDER BY or an ORDER BY predicate with a list of the columns being sorted
    Table Scan - retrieves all rows from the table specified in the argument column




    IMPORTING/EXPORTING DATA

    SQL Server Integration Services (SSIS)

    SSIS is a platform for building data integration and data transformation solutions. SSIS can be used to create business solutions by loading/downloading files, loading data warehouses, cleansing and mining data and managing SQL Server objects and data. SSIS can extract and transform data from a variety of sources such as XML data files, flat files and relational data sources and load the data into one or more destinations.

    SSIS includes a rich set of built-in tasks and transformations, graphical tools for building packages and the Catalog database to store, run and manage packages.

    SSIS can be installed by the SQL Server setup wizard by choosing it from the components list. SSIS is not provided with the Express edition.

    SSIS packages

    A package is an organized collection of connections, control flow elements, data flow elements, event handlers, variables, parameters and configurations that are assembled using the graphical design tools provided by SSIS or built programmatically. The completed package can be saved to SQL Server, the SSIS Package Store or the file system.

    Contents of a package

  • Tasks and containers (control flow)

    A control flow consists of one or more tasks and containers that execute when the package runs. Precedence constraints are used to connect the tasks and containers and control the order and define the conditions for running the next task or container in the package control flow. A subset of tasks and containers can also be grouped and run repeatedly as a unit within the package control flow.

  • Data sources and destinations (data flow)

    A data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data and the paths that link sources, transformations and destinations. Before adding a data flow to a package, the package control flow must include a Data Flow task. The Data Flow task is the executable within the SSIS package that creates, orders and runs the data flow. A separate instance of the data flow engine is opened for each Data Flow task in a package.

  • Connection managers (connections)

    A package typically includes at least one connection manager. A connection manager is a link between a package and a data source that defines the connection string for accessing the data that the tasks, transformations and event handlers in the package use. SSIS includes connection types for data sources such as text and XML files, relational databases and Analysis Services databases and projects.

    Packages may also include additional objects like variables, event handlers, configurations, log providers and parameters.

    Creating a package

    In SQL Server Data Tools (SSDT), a new package can be created by using one of the following methods:
  • Using the package template included in SSIS
  • Using a custom template - to use custom packages as templates for new packages, they can be copied to the DataTransformationItems folder
  • Copy an existing package - if a new package is created by copying an existing package or by using a custom package as a template, the name and the GUID of the existing package are copied as well and should be updated to help differentiate it from the package from which it was copied. The GUID can be regenerated in the ID property and value of the Name property can be updated by using the Properties window in SSDT
  • Using a custom package that has been designated as a template
  • Running the SQL Server Import and Export Wizard - the Import and Export Wizard creates a complete package for a simple import or export, configures the connections, source and destination and adds any data transformations that are required. The package can optionally be saved to run later

    The packages created in SSDT using SSIS Designer are saved to the file system and to save them to SQL Server or to the package store, a copy of the package needs to be saved.

    Creating a new Integration Services project

    1. On the Windows Start menu, search for and select Visual Studio (SSDT)
    2. In Visual Studio, select File > New > Project and in the New Project dialog box, expand the Business Intelligence node under Installed and select Integration Services Project in the Templates pane
    3. In the Name box, change the default name. To use a folder that already exists, clear the Create directory for solution check box
    4. Accept the default location or browse to locate the folder to use. In the Project Location dialog box, select the folder and then Select Folder and select OK
    5. By default, an empty package titled Package.dtsx is created and added to the project under SSIS Packages
    6. In Solution Explorer, right-click Package.dtsx and rename the default package

    Creating a package in SSDT using the Package Template

    1. In SSDT, open the Integration Services project in which the package needs to be created
    2. In Solution Explorer, right-click the SSIS Packages folder and then click New SSIS Package
    3. Optionally, add control flow, data flow tasks, and event handlers to the package
    4. On the File menu, click Save Selected Items to save the new package


    Import and Export Wizard

    The SQL Server Import and Export Wizard uses SSIS to copy data from a source to a destination. The wizard creates an SSIS package in memory while moving through the pages of the wizard and specifying options. At the end, the SSIS package can be saved in SQL Server itself or as a file.

    The wizard can be started
  • from the Start menu
  • from the command prompt by running DTSWizard.exe
  • from SSMS
  • from Visual Studio with SQL Server Data Tools (SSDT)


    Bulk Import/Export

    bcp utility

    The bulk copy program utility (bcp) bulk copies data between a SQL Server instance and a data file in a user-specified format. It runs as a Windows OS command. Prefer the integrated security option (-T) over a SQL Server login, since the user name and password are sent as clear text with the latter.
    	bcp {table_name | view_name | "query"} {in data_file | out data_file | queryout data_file | format nul}
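    For example (a sketch; the table and file path are assumptions), exporting and re-importing a table in character mode over a trusted connection:
    	bcp AdventureWorks.Sales.Currency out "C:\temp\currency.dat" -c -T
    	bcp AdventureWorks.Sales.Currency in "C:\temp\currency.dat" -c -T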
    BULK INSERT statement

    Imports a data file into a table.
    	--import from the second row
    	bulk insert tblImageMap from 'C:\imagemap.txt'
    			with (firstrow=2, fieldterminator='\t', rowterminator = '\n');
    Loading XML data

    XML data can be transferred into SQL Server in several ways
  • If the data is in a text or image column in a SQL Server database, the table can be imported by using SSIS and changing the column type to XML by using the ALTER TABLE statement
  • the data can be bulk copied from another SQL Server database by using bcp out and then bulk inserting the data by using bcp in
  • client-side programming can be used to retrieve the XML that is generated at the server with FOR XML and write the XML into an XML column in a new table

    SQL Server stores XML data in Unicode (UTF-16) and XML data retrieved from the server comes out in UTF-16 encoding. If a different encoding is needed, the required conversion has to be performed.

    Bulk loading XML data

    XML data can be bulk loaded by using the bulk loading capabilities of SQL Server, such as bcp. OPENROWSET allows loading data into an XML column from files.

    Loading an XML file into a table
    	DECLARE @SQL nvarchar(1000), @ACES_XML XML;
    
    	SET @SQL =  'SELECT @a = CONVERT(xml, BulkColumn, 2) FROM OPENROWSET( BULK ''C:\temp\test.xml'', SINGLE_BLOB) AS TempXML; '
    	EXEC sp_Executesql  @sql , N'@a XML OUTPUT',@ACES_XML OUTPUT
    
    	SELECT  ref.value('(./PartType/@id)[1]','varchar(10)') parttypeID,
    		ref.value('(./MfrLabel)[1]','varchar(200)') MfrLabel,
    		ref.value('(./Part)[1]','varchar(20)') Part
    	 into dbo.Specialty_friction_20120919	 FROM @ACES_XML.nodes('parts/App') as R(ref);
    Loading an XML file into SQL Server using SQLXML 4.0 Bulk load utility
    	namespace XMLLoadSQL
    	{
    		class XMLLoadSQL
    		{
    			private static Boolean FileValid;
    			static String sAppPath = "C:\\SQLServer\\XMLBulkLoad";
    			static void Main(string[] args)
    			{
    				SQLXMLBULKLOADLib.SQLXMLBulkLoad objBL = new SQLXMLBULKLOADLib.SQLXMLBulkLoad();
    				objBL.ConnectionString = "provider=SQLOLEDB;data source=CSCINDAE747976;database=AEDB;integrated security=SSPI";
    				objBL.ErrorLogFile = "c:\\sqlserver\\XMLBulkLoad\\error.log";
    
    			   //Validate the data file prior to bulkload
    				String sOutput = ValidateFile("Nouns.xml", "", "NounsSchema.xml");
    				Console.WriteLine (sOutput);
    
    		if (FileValid == true) 
    		{
    	   // Check constraints and initiate transaction (if needed)
    	   // objBL.CheckConstraints = True
    	   // objBL.Transaction=True
    	   //Execute XML bulkload using file.
    		objBL.Execute(sAppPath + "\\" + "NounsSchema.xml", sAppPath + "\\" + "Nouns.xml");
    		objBL= null;
    		}
    			}
    		
    	private static String ValidateFile(String strXmlFile,String strUrn, String strXsdFile)
    	{
    	   // Create a schema cache and add SampleSchema.xml to it.
    	   MSXML2.XMLSchemaCache60 xs =  new MSXML2.XMLSchemaCache60();
    		 
    	   xs.add (strUrn, sAppPath + "\\" + strXsdFile);
    
    	   // Create an XML DOMDocument object.
    	   MSXML2.DOMDocument60 xd = new MSXML2.DOMDocument60();
    
    	   // Assign the schema cache to the DOM document, schemas collection.
    	   xd.schemas = xs;
    
    	   // Load XML document as DOM document.
    	   xd.async = false;
    	   xd.load (sAppPath + "\\" + strXmlFile);
    
    	   // Return validation results in message to the user.
    	   if (xd.parseError.errorCode != 0)
    			{FileValid = false ;
    			 return ("Validation failed on " +  strXmlFile + "=====================" +
    				 "Reason: " + xd.parseError.reason +  "Source: " + xd.parseError.srcText +
    				  "Line: " +  xd.parseError.line) ;             }
    		else
    			{FileValid = true;
    			 return ("Validation succeeded for " +
    				 strXmlFile  +  "======================" + "Contents to be bulkloaded" );        }
    		}
    		}
    	}
    
    	XML file:
    	<ROOT>
    	  <Nouns>
    		<Category>places</Category>
    		<arabic>tareeq</arabic>
    		<english>road</english>
    	  </Nouns>
    	  <Nouns>
    		<Category>places</Category>
    		<arabic>burj</arabic>
    		<english>tower</english>
    	  </Nouns>
    	  <Nouns>
    		<Category>places</Category>
    		<arabic>bahr</arabic>
    		<english>sea</english>
    	  </Nouns>
    	</ROOT>
    
    	XML Schema file:
    	<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    				xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
    	   <xsd:element name="ROOT" sql:is-constant="1" >
    		 <xsd:complexType>
    		   <xsd:sequence>
    			 <xsd:element name="Nouns" sql:relation="NounsXML" maxOccurs="unbounded">
    			   <xsd:complexType>
    				 <xsd:sequence>
    				   <xsd:element name="Category" type="xsd:string" />
    				   <xsd:element name="arabic" type="xsd:string" />
    				   <xsd:element name="english"  type="xsd:string" />
    				 </xsd:sequence>
    			   </xsd:complexType>
    			 </xsd:element>
    		   </xsd:sequence>
    		  </xsd:complexType>
    		 </xsd:element>
    	</xsd:schema>




    BACKUP/RECOVERY

    Three recovery models:
    Simple - No transaction log backups. Inactive portions of the log are freed after automatic checkpoints. Suitable for small, read-only databases, where loss of updates is not consequential. Preferable to run differential backups between full backups.
    Full - Log is backed up. Data is not removed from the log till it is backed up. Preferable to run frequent log backups and also run differential backups between full backups so number of log backups to restore is minimal.
    Bulk-logged - Reduces log space usage. Enhances performance during large-scale bulk operations.
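    The recovery model is set per database; a sketch (the database name db1 is an assumption):
    	ALTER DATABASE db1 SET RECOVERY FULL;	-- or SIMPLE | BULK_LOGGED
    	SELECT name, recovery_model_desc FROM sys.databases;	-- verify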

    Creating backup with timestamp
    	declare @bk_file varchar(100);
    	set @bk_file = 'R:\SQLBackup\dbname_' + convert(varchar(50),GetDate(),112) 
    		+ '_' + replace(convert(varchar(50),GetDate(),108),':','') + '.bak';
    	BACKUP DATABASE [dbname] TO  DISK = @bk_file WITH NOFORMAT, NOINIT,  NAME = N'DBNAME-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, 
    	COMPRESSION,  STATS = 10;

    Restoring a database from backup

    	RESTORE VERIFYONLY FROM DISK='C:\backup\db1.bak';		-- verifies if .bak file is valid
    	RESTORE HEADERONLY FROM DISK='C:\backup\db1.bak';		-- get info on .bak file contents
    	RESTORE FILELISTONLY FROM DISK='C:\backup\db1.bak';		-- list of db and log files contained in the backup
    	RESTORE DATABASE db1 FROM DISK='C:\backup\db1.bak';
    	
    	RESTORE DATABASE db1 FROM DISK='C:\Backup\db1.bak' WITH MOVE N'db1' TO N'C:\vincentk\db1.mdf', 
    			MOVE N'db1_log' TO N'C:\vincentk\db1_log.ldf';	-- move data/log files to a new location
    Full Database backups

    A full database backup backs up the whole database, including part of the transaction log so that the full database can be restored from the backup. Full database backups represent the database at the time the backup finished.
    	BACKUP DATABASE db1 TO DISK = 'Z:\Backups\db1.bak' WITH FORMAT, MEDIANAME = 'Z_Backups', NAME = 'Full Backup of db1';
    Differential backups

    A differential backup is based on a previous full data backup (base) and captures only the state of any extents (collections of eight physically contiguous pages) that have changed. Before restoring a differential backup, its base must be restored.

    Differential backups are especially useful if a subset of a database is modified more frequently than the rest of the database. Under the full recovery model, using differential backups can reduce the number of log backups that need to be restored.
    	BACKUP DATABASE db1 TO DISK = 'Z:\Backups\db1.bak' WITH DIFFERENTIAL;
    Transaction log backups

    Log backups are relevant only for full/bulk-logged recovery models.

    A continuous sequence of transaction log backups is called a log chain. It starts with a full backup and does not end until a new backup set is started. If one of the backups in the log chain is missing, the restore cannot proceed past that point. A tail-log backup is taken just before a restore; it captures all log records since the last log backup and ensures the log chain is intact.
    	BACKUP LOG db1 TO DISK = 'Z:\Backups\db1.bak';
    A typical backup strategy involves creating a full database backup occasionally (weekly) and optionally, creating a series of differential backups at a shorter interval (daily) and independent of the database backups, backing up the transaction log at frequent intervals.
    	BACKUP DATABASE AEDB TO DISK = 'C:\Vincent\AEDB.bak' WITH FORMAT, INIT				-- Full backup
    	BACKUP DATABASE AEDB TO DISK = 'C:\Vincent\AEDB.bak' WITH DIFFERENTIAL, NOFORMAT, NOINIT,		-- Differential backup
    		NAME = 'AEDB Diff bkup', COMPRESSION
    	BACKUP LOG AEDB TO DISK = N'C:\Vincent\AEDB.bak' WITH NOFORMAT, NOINIT, NAME = N'AEDB Trans Log  Backup',	-- Log backup
    		SKIP, NOREWIND, NOUNLOAD, STATS = 10
    	BACKUP LOG AEDB TO DISK = N'C:\Vincent\AEDB.bak' WITH  NO_TRUNCATE , NOFORMAT, NOINIT,		-- Tail log backup
    		NAME = N'AEDB Trans Log  Tail Backup', SKIP, NOREWIND, NOUNLOAD, NORECOVERY, STATS = 10
    
    	RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak'   ---> Restore of full bkup
    		WITH FILE = 1, NORECOVERY                           	 (NORECOVERY puts db in Restoring mode)   
    	RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak'   ---> Restore of last diff bkup
    		WITH FILE = 6, NORECOVERY
    	RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak'   ---> Restore of log bkup (file 7)
    		WITH FILE = 7, NORECOVERY
    	RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak'   ---> Restore of log bkup (file 8)
    		WITH FILE = 8, NORECOVERY
    	RESTORE DATABASE AEDB FROM DISK = 'C:\Vincent\AEDB.bak'   ---> Restore of Tail log bkup
    		WITH FILE = 9, RECOVERY
    


    Database checkpoints

    A checkpoint creates a known good point from which the SQL Server Database Engine can start applying changes contained in the log during recovery after a crash, by writing the dirty pages and transaction log information from memory to disk and recording information about the transaction log.

    The Database Engine supports several types of checkpoints
  • automatic - Issued automatically in the background to meet the upper time limit suggested by the recovery interval server configuration option
    	EXEC sp_configure 'recovery interval', 'minutes'
  • indirect - Issued in the background to meet a user-specified target recovery time for a given database
    	ALTER DATABASE … SET TARGET_RECOVERY_TIME = target_recovery_time { SECONDS | MINUTES }
  • manual - issued when a T-SQL CHECKPOINT command is executed
  • internal - issued by server operations such as backup and database-snapshot creation to guarantee that disk images match the current state of the log


    Troubleshooting/Recovery

    Error codes

    1205: Transaction (Process ID xx) was deadlocked on lock | communication buffer resources with another process - possible intra-query parallel deadlock (MAXDOP 1 may be considered)
    3156: File 'Db' cannot be restored to 'D:\..\DATA\Db.mdf'. Use WITH MOVE to identify a valid location for the file - The file path doesn't match the original machine
    3201: Cannot open backup device 'C:\db1.bak'. Operating system error 5 (Access is denied) - SQL Server service account needs permission on .bak file
    3241: The media family on device 'C:\db1.bak' is incorrectly formed. SQL Server cannot process this media family - Attempting to restore a backup created on a later version of SQL Server to an earlier version
    4866: Bulk Insert fails. Column is too long in the data file for row x, column y - Files generated in Unix have a LF at EOL while those in Windows have a CR and LF
    9002: The transaction log for the database is full - To find what is preventing log truncation, use the log_reuse_wait and log_reuse_wait_desc columns of the sys.databases catalog view




    USER MANAGEMENT/SECURITY

    Authentication

    SQL Server supports two authentication modes.

    Windows Authentication - referred to as integrated security because it is integrated with Windows. Users are authenticated using their local or domain Windows user account. Users who are already logged onto Windows do not have to log on separately to SQL Server; SQL Server validates the login through the Windows token which has been confirmed valid by the OS. This is known as a trusted connection.

    Mixed mode authentication - allows users to connect using either Windows authentication or a SQL Server login name and password.

    Windows authentication uses a series of encrypted messages to authenticate users. When SQL Server logins are used, login names and encrypted passwords are passed across the network, which makes them less secure.


    Access control

    Principals

    Principals are entities to whom permissions are granted for SQL server resources (server, database, database objects etc.).
    Windows level - Windows group, Windows domain login, Windows local login
    SQL Server level - fixed server role, SQL Server login
    Database level - fixed database role, database user, application role

    sa login

    The SQL Server sa login is a server-level principal. By default, it is created when an instance is installed and its default database is master. The sa login is a member of the sysadmin fixed server role, has all permissions on the server and cannot be limited. The sa login cannot be dropped, but it can be disabled so that no one can use it.

    dbo user and schema

    The dbo user is a special user principal in each database. All SQL Server administrators, members of the sysadmin fixed server role, sa login and owners of the database enter databases as the dbo user. The dbo user has all permissions in the database and cannot be limited or dropped. dbo stands for database owner, but the dbo user account is not the same as the db_owner fixed database role and the db_owner fixed database role is not the same as the user account recorded as the database owner.

    The dbo user owns the dbo schema. The dbo schema is the default schema for all users, unless some other schema is specified. The dbo schema cannot be dropped.

    Fixed server roles

    Server-level roles are server-wide in their permissions scope. Fixed server roles are provided for convenience and backward compatibility. Three main roles - sysadmin, securityadmin, serveradmin.

    sysadmin - members of the sysadmin fixed server role can perform any activity in the server. Before SQL Server 2008 R2, members of the local Windows built-in Administrators group were automatically added to the sysadmin fixed server role.

    serveradmin - members of the serveradmin role can change server-wide configuration options and shut down the server

    securityadmin - members manage logins and their properties. They can GRANT, DENY and REVOKE server-level permissions and also GRANT, DENY and REVOKE database-level permissions if they have access to a database and can reset passwords for SQL Server logins.

    Fixed database roles

    They are security principals that are used to give access to a specific database. The two main roles are db_owner and db_securityadmin.

    db_owner - members of the db_owner fixed database role can perform all configuration and maintenance activities on the database and can also drop the database.

    db_securityadmin - members can modify role membership and manage permissions

    public server and database role

    Every login belongs to the public fixed server role and every database user belongs to the public database role. When a login or user has not been granted or denied specific permissions on a securable, the login or user inherits the permissions granted to public on that securable. Database users cannot be removed from the public role.

    The public roles cannot be dropped but permissions can be revoked from them.

    Application roles

    An application role is a database principal that enables an application to run with its own, user-like permissions. Application roles are used to enable access to specific data only for those users who connect through a particular application. Unlike database roles, application roles contain no members and are inactive by default.

    Application roles work with both authentication modes and are enabled by using sp_setapprole, which requires a password. Since application roles are database-level principals, they can access other databases only through permissions granted in those databases to the guest user. Therefore, any database in which guest has been disabled is inaccessible to application roles in other databases.
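    A sketch of activating an application role on a connection (the role name and password are assumptions):
    	EXEC sp_setapprole 'approle1', 'AppR0lePwd!';	-- the connection now has only the role's permissions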

    Logins

    A login is a security principal and permissions can be granted to logins. The scope of a login is the whole Database Engine and to connect to a specific database on the instance of SQL Server, a login must be mapped to a database user.
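
    A minimal sketch of creating logins for both authentication modes, using hypothetical names:
    	CREATE LOGIN [DOMAIN\user1] FROM WINDOWS;	- Windows authentication login
    	CREATE LOGIN app_login WITH PASSWORD = 'P@ssw0rd123';	- SQL Server authentication login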

    Database users

    Access to objects within a database is controlled by database users. A database user can be mapped to a SQL Server login or a Windows user, or can be a contained database user that has a password and no login.
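
    A minimal sketch of creating database users, assuming a login named app_login exists:
    	CREATE USER user1 FOR LOGIN app_login;	- user mapped to a login
    	CREATE USER user2 WITH PASSWORD = 'P@ssw0rd123';	- contained database user (contained databases only)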

    Each database includes a guest user whose permissions are inherited by users who have access to the database, but who do not have a user account in the database. The guest user cannot be dropped, but it can be disabled by revoking its CONNECT permission within any database other than master or tempdb.
    	REVOKE CONNECT FROM GUEST;

    Securables

    Securables are the resources to which the SQL Server Database Engine authorization system regulates access. Access to securables is controlled by granting or denying permissions to principals (logins and users) or by adding logins and users to roles which have access. The three main scopes of securables are:
  • server
  • database
  • schema - a container that consists of a collection of securables. Each schema can be owned by any database principal. When a user is created, a default schema can be specified; otherwise it is dbo by default.

    Every instance of SQL Server has a number of fixed server roles and every database has fixed database roles. SQL Server logins and Windows logins/groups can be added to the fixed server roles to grant them permissions at the server scope.

    Schemas

    Schemas are collections of database securables. Default schemas created in a database are dbo, guest, information_schema and sys.
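
    User-defined schemas can be created to group securables; a minimal sketch with hypothetical names (CREATE SCHEMA must be the only statement in its batch):
    	CREATE SCHEMA sales AUTHORIZATION user1;	- schema owned by user1
    	CREATE TABLE sales.orders (id int, amount money);	- table created in the sales schema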


    Permissions

    Every securable has associated permissions that can be granted to a principal. Permissions in the Database Engine are managed at the server level, where they are assigned to logins and server roles, and at the database level, where they are assigned to database users and database roles.

    Server level - ALTER SETTINGS, ALTER ANY LOGIN, ALTER ANY DATABASE, ALTER TRACE, AUTHENTICATE SERVER, CONTROL SERVER, CREATE ENDPOINT, SHUTDOWN
    Database level - ALTER, ALTER ANY USER, AUTHENTICATE, BACKUP DATABASE, BACKUP LOG, CONTROL, DELETE, EXECUTE, INSERT, SELECT, SHOWPLAN

    Permissions can be manipulated by the T-SQL statements - GRANT, DENY and REVOKE. DENY prevents a principal from inheriting a permission through its group or role memberships.
    	GRANT INSERT on object::dbo.table1 to user1;
    	REVOKE DELETE on object::dbo.table1 from user2;
    	DENY EXECUTE on object::dbo.usp1 to user3;
    Information about permissions is visible in the sys.server_permissions and sys.database_permissions catalog views.
    	SELECT * FROM fn_builtin_permissions(default);	- all permissions
    	SELECT * FROM fn_builtin_permissions('schema');	- permissions applicable to a securable class
    	
    	SELECT * FROM fn_my_permissions('table1', 'object');	- permissions granted to the executing principal on an object
    	SELECT * FROM sys.database_permissions WHERE major_id = OBJECT_ID('table1');	- permissions for an object


    Services Security

    Various SQL Server services can be run under different security accounts to isolate each service. Additionally, each service should use a local or Windows account that has only the permissions needed to run that service.

    Services can use one of the accounts below:
  • local user account
  • local service account
  • local system account
  • network service account
  • domain user account - when using this, avoid using an Administrators account and assign the fewest privileges possible

    The service startup account can be changed through the SQL Server Configuration Manager.


    Encryption

    SQL Server encrypts data with a hierarchical encryption and key management infrastructure. Each layer encrypts the layer below it by using a combination of certificates, asymmetric keys and symmetric keys. Asymmetric keys and symmetric keys can be stored outside of SQL Server in an Extensible Key Management (EKM) module.

    Encryption mechanisms

  • T-SQL functions - Individual items can be encrypted as they are inserted or updated using T-SQL functions

  • Symmetric key encryption - A symmetric key is one key that is used for both encryption and decryption

  • Asymmetric key encryption - An asymmetric key is made up of a private key and the corresponding public key. Each key can decrypt data encrypted by the other

  • Digital Certificates - A public key certificate, usually just called a certificate, is a digitally-signed statement that binds the value of a public key to the identity of the person, device or service that holds the corresponding private key

  • Transparent Data encryption (TDE) - TDE is a special case of encryption using a symmetric key. TDE encrypts an entire database using that symmetric key called the database encryption key. The database encryption key is protected by other keys or certificates which are protected either by the database master key or by an asymmetric key stored in an EKM module

    SQL Server allows a choice from among several encryption algorithms, including DES, Triple DES, TRIPLE_DES_3KEY, RC2, RC4, 128-bit RC4, DESX, 128-bit AES, 192-bit AES and 256-bit AES. From SQL Server 2016 onward, all algorithms other than AES are deprecated.
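
    A minimal sketch of column-level encryption with a symmetric key protected by a certificate, using hypothetical key, certificate, table and column names:
    	CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'MasterP@ss123';	- database master key
    	CREATE CERTIFICATE cert1 WITH SUBJECT = 'Column encryption';	- certificate protected by the master key
    	CREATE SYMMETRIC KEY key1 WITH ALGORITHM = AES_256 ENCRYPTION BY CERTIFICATE cert1;
    	OPEN SYMMETRIC KEY key1 DECRYPTION BY CERTIFICATE cert1;
    	UPDATE dbo.table1 SET card_enc = ENCRYPTBYKEY(KEY_GUID('key1'), card_no);	- encrypt a column
    	SELECT CONVERT(varchar(20), DECRYPTBYKEY(card_enc)) FROM dbo.table1;	- decrypt
    	CLOSE SYMMETRIC KEY key1;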




    HIGH AVAILABILITY/SCALABILITY

    Failover Clustering - Failover clusters are made up of one or more server nodes and two or more shared disks. The application is installed as a cluster resource group and is hosted on one node at a time.

    Log Shipping - Allows automatically sending transaction log backups from a primary database on a primary server instance to one or more secondary databases on separate secondary server instances. The transaction log backups are applied to each of the secondary databases individually. An optional third server instance, known as the monitor server, records the history and status of backup and restore operations and, optionally, raises alerts.

    Database Mirroring - Two databases (principal and mirror) exist on two different SQL Server instances, both with the same data. A mirror is a dynamically updated copy of a source database that resides on a mirror server. Mirroring works only on databases using the full recovery model. Databases can be in one of these states - Synchronizing, Synchronized, Suspended, Pending Failover and Disconnected. The principal and mirror databases will always be in the same state, except for Pending Failover.

    Mirroring works at the level of the physical log record: every insert, update and delete operation performed on the principal database is redone on the mirror database by sending a stream of log records. SQL Server can automatically repair corrupt pages using database mirroring.
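
    A minimal sketch of establishing a mirroring session once endpoints and the initial restore are in place, with hypothetical server and database names:
    	ALTER DATABASE db1 SET PARTNER = 'TCP://principal.domain.com:5022';	- run on the mirror server after restoring db1 WITH NORECOVERY
    	ALTER DATABASE db1 SET PARTNER = 'TCP://mirror.domain.com:5022';	- then run on the principal server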


    Always On Availability Groups

    Always On availability groups (AG) is a high-availability and disaster-recovery solution that provides an enterprise-level alternative to database mirroring. Introduced in SQL Server 2012, AG maximizes the availability of a set of user databases for an enterprise by supporting a failover environment for a set of databases, known as availability databases, that fail over together. An AG supports a set of read-write primary databases and one to eight sets of corresponding secondary databases. Optionally, secondary databases can be made available for read-only access and/or some backup operations.

    An AG fails over at the level of an availability replica. Failovers are not caused by database issues such as a database becoming suspect due to a loss of a data file, deletion of a database, or corruption of a transaction log.

    Features of AG

  • Supports up to nine availability replicas. An availability replica is an instantiation of an availability group that is hosted by a specific instance of SQL Server and maintains a local copy of each availability database that belongs to the availability group. Each availability group supports one primary replica and up to eight secondary replicas

  • Supports asynchronous-commit and synchronous-commit availability modes

  • Supports an availability group listener for each AG. An AG listener is a server name to which clients can connect in order to access a database in a primary or secondary replica of an AG. AG listeners direct incoming connections to the primary replica or to a read-only secondary replica. The listener provides fast application failover after an availability group fails over

  • Supports automatic page repair for protection against page corruption

  • Provides an integrated set of tools to simplify deployment and management including T-SQL DDL statements and SSMS tools for creating and managing availability groups

    Deploying AG requires a Windows Server Failover Clustering (WSFC) cluster. To be enabled for AG, an instance of SQL Server must reside on a WSFC node, and the WSFC cluster and node must be online. Furthermore, each availability replica of a given AG must reside on a different node of the same WSFC cluster. The only exception is that while being migrated to another WSFC cluster, an AG can temporarily straddle two clusters.

    AG relies on the WSFC cluster to monitor and manage the current roles of the availability replicas that belong to a given AG and to determine how a failover event affects the availability replicas. A WSFC resource group is created for every AG created. The WSFC cluster monitors this resource group to evaluate the health of the primary replica.
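
    A minimal sketch of creating an availability group with T-SQL, assuming hypothetical node names, an existing database mirroring endpoint on port 5022 and a database db1 in the full recovery model:
    	CREATE AVAILABILITY GROUP ag1
    	FOR DATABASE db1
    	REPLICA ON 'NODE1' WITH (ENDPOINT_URL = 'TCP://node1.domain.com:5022',
    			AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, FAILOVER_MODE = AUTOMATIC),
    		'NODE2' WITH (ENDPOINT_URL = 'TCP://node2.domain.com:5022',
    			AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, FAILOVER_MODE = AUTOMATIC);
    	ALTER AVAILABILITY GROUP ag1 JOIN;	- run on each secondary replica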




    Always On Failover Cluster Instances

    Always On Failover Cluster Instances (FCI) leverages WSFC functionality to provide local high availability through redundancy at the server-instance level. An FCI is a single instance of SQL Server that is installed across WSFC nodes and possibly across multiple subnets. On the network, an FCI appears to be an instance of SQL Server running on a single computer, but the FCI provides failover from one WSFC node to another if the current node becomes unavailable.

    An FCI runs in a WSFC resource group with one or more WSFC nodes. When the FCI starts up, one of the nodes assumes ownership of the resource group and brings its SQL Server instance online. The resources owned by this node include:
  • Network name
  • IP address
  • Shared disks
  • SQL Server Database Engine service
  • SQL Server Agent service
  • SQL Server Analysis Services service, if installed
  • One file share resource, if the FILESTREAM feature is installed

    At any time, only the resource group owner (and no other node in the FCI) is running its respective SQL Server services in the resource group. When a failover occurs, either automatic or planned, the following sequence of events happens
  • Unless a hardware or system failure occurs, all dirty pages in the buffer cache are written to disk
  • All respective SQL Server services in the resource group are stopped on the active node
  • The resource group ownership is transferred to another node in the FCI
  • The new resource group owner starts its SQL Server services
  • Client application connection requests are automatically directed to the new active node using the same virtual network name (VNN)

    The FCI is online as long as its underlying WSFC cluster is in good quorum health (the majority of the quorum WSFC nodes are available as automatic failover targets). When the WSFC cluster loses its quorum, whether due to hardware, software, network failure, or improper quorum configuration, the entire WSFC cluster, along with the FCI, is brought offline. Manual intervention is then required in this unplanned failover scenario to reestablish quorum in the remaining available nodes in order to bring the WSFC cluster and FCI back online.

    Elements of FCI

    An FCI consists of a set of physical servers (nodes) that have similar hardware configurations and identical software configurations, including the OS version and patch level, and the SQL Server version, patch level, components and instance name. Identical software configuration is necessary to ensure that the FCI remains fully functional as it fails over between the nodes.

  • WSFC Resource Group - FCI runs in a WSFC resource group. Each node in the resource group maintains a synchronized copy of the configuration settings and check-pointed registry keys to ensure full functionality of the FCI after a failover, and only one of the nodes in the cluster owns the resource group at a time (the active node). The WSFC service manages the server cluster, quorum configuration, failover policy, and failover operations, as well as the VNN and virtual IP addresses for the FCI. In case of a hardware/OS/application/service failure or a planned upgrade, the resource group ownership is moved to another node in the FCI. The number of nodes that are supported in a WSFC resource group depends on the SQL Server edition. Also, the same WSFC cluster can run multiple FCIs (multiple resource groups), depending on hardware capacity, such as CPUs, memory, and number of disks

  • SQL Server Binaries - The product binaries are installed locally on each node of the FCI, a process similar to SQL Server stand-alone installations. However, during startup, the services are not started automatically; they are managed by WSFC

  • Storage - In contrast to an AG, an FCI must use shared storage between all nodes of the FCI for database and log storage. The shared storage can be in the form of WSFC cluster disks, disks on a SAN, Storage Spaces Direct (S2D), or SMB file shares. This way, all nodes in the FCI have the same view of instance data whenever a failover occurs. This does mean that the shared storage is a potential single point of failure, and the FCI depends on the underlying storage solution to ensure data protection.

  • Network Name - The VNN for the FCI provides a unified connection point for the FCI. This allows applications to connect to the VNN without the need to know the currently active node. When a failover occurs, the VNN is registered to the new active node after it starts, transparent to the client or application connecting to SQL Server

  • Virtual IPs - In the case of a multi-subnet FCI, a virtual IP address is assigned to each subnet in the FCI. During a failover, the VNN on the DNS server is updated to point to the virtual IP address for the respective subnet. Applications and clients can then connect to the FCI using the same VNN after a multi-subnet failover

    FCIs and Availability Groups

    A second layer of failover can be set up at the server-instance level by implementing an FCI together with the WSFC cluster. An availability replica can be hosted by either a standalone instance of SQL Server or an FCI. Only one FCI partner can host a replica for a given availability group. When an availability replica is running on an FCI, the possible owners list for the availability group will contain only the active FCI node.

    AG does not depend on any form of shared storage. However, if an FCI is used to host one or more availability replicas, each of those FCIs will require shared storage as per a standard FCI installation.
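
    The nodes of the hosting WSFC cluster and the node currently running the instance can be inspected from T-SQL; a minimal sketch:
    	SELECT * FROM sys.dm_os_cluster_nodes;	- nodes of the WSFC cluster hosting the FCI
    	SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS');	- node currently running the instance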

    Comparison of FCI and AG

    Regardless of the number of nodes in the FCI, an entire FCI hosts a single replica within an availability group. The following table describes the distinctions in concepts between nodes in an FCI and replicas within an availability group.
    					Nodes within an FCI				Replicas within an availability group
    					-------------------				-------------------------------------
    	Uses WSFC cluster		Yes						Yes
    	Protection level		Instance					Database
    	Storage type			Shared						Non-shared
    	Storage solutions		Direct attached, SAN, mount points, SMB		Depends on node type
    	Readable secondaries		No						Yes
    	Applicable failover		WSFC quorum, FCI-specific and			WSFC quorum and
    	  policy settings		availability group settings			availability group settings
    	Failed-over resources		Server, instance, and database			Database only




    TOOLS/UTILITIES

    SQL Server Reporting Services (SSRS)

    SSRS provides a set of tools and services that create, deploy and manage mobile and paginated reports.

    SSRS installation involves server components for storing report items, rendering reports and processing subscriptions and other report services. A report server database needs to be created on a SQL Server instance to complete the initial configuration of report services. A Windows service, named SQLServerReportingServices, is created as part of the installation.

    The report server instance can be accessed and administered through a web portal whose URL, by default, is https://[ComputerName]/reports.

    Report Builder

    Report Builder is a stand-alone report authoring environment that can be used to create paginated reports and publish them to SSRS. Report Builder isn't installed with SSRS and needs to be installed separately.

    Report Builder can be started from the Reporting Services web portal, which is opened by typing the report server URL in the browser. By default, the URL is https://<servername>/reports.

    Report data

    Reporting Services retrieves report data from data sources through a modular and extensible data layer that uses data processing extensions. To add data to a report, datasets are created that represent the result set from running a query command on a data source. There are two types of datasets - embedded and shared. An embedded dataset is defined in a report and used only by that report. A shared dataset is defined on the report server and can be used by multiple reports.


    SQL Server Analysis Services (SSAS)

    SSAS is an analytical data engine used in decision support and business analytics which provides enterprise-grade semantic data models for business reports and client applications such as Power BI, Excel, Reporting Services reports and other data visualization tools.

    A typical workflow includes creating a tabular or multidimensional data model project in Visual Studio, deploying the model as a database to a server instance, setting up recurring data processing and assigning permissions to allow data access by end-users. The semantic data model can be accessed by client applications supporting Analysis Services as a data source.


    SQL Server Agent

    SQL Server Agent is a Windows service that executes scheduled administrative tasks called jobs. SQL Server Agent uses SQL Server to store job information. Jobs contain one or more job steps and each step contains its own task like backing up a database.

    Jobs can be run on a schedule, in response to a specific event or on demand.

    SQL Server Agent Components

    SQL Server Agent uses the following components to define the tasks to perform, when to perform them, and how to report their success or failure.

    Jobs

    A job is a specified series of actions (steps) that can be run one or more times and monitored for success or failure. A job can run on one local server or on multiple remote servers. Jobs can be run according to one or more schedules, in response to one or more alerts, or by executing the sp_start_job stored procedure.

    Each job step runs in a specific security context. For job steps that use T-SQL, the EXECUTE AS statement can be used to set the security context. For other types of job steps, a proxy account can be used to set the security context.
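
    A minimal sketch of creating and scheduling a job with the msdb stored procedures, using hypothetical job, step and schedule names:
    	USE msdb;
    	EXEC dbo.sp_add_job @job_name = N'NightlyBackup';	- create the job
    	EXEC dbo.sp_add_jobstep @job_name = N'NightlyBackup', @step_name = N'Backup db1', @subsystem = N'TSQL', @command = N'BACKUP DATABASE db1 TO DISK = ''D:\backups\db1.bak''';
    	EXEC dbo.sp_add_schedule @schedule_name = N'Nightly1AM', @freq_type = 4, @freq_interval = 1, @active_start_time = 010000;	- daily at 01:00
    	EXEC dbo.sp_attach_schedule @job_name = N'NightlyBackup', @schedule_name = N'Nightly1AM';
    	EXEC dbo.sp_add_jobserver @job_name = N'NightlyBackup';	- target the local server
    	EXEC dbo.sp_start_job N'NightlyBackup';	- run the job on demand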

    Schedules

    A schedule specifies when a job runs. More than one job can run on the same schedule and more than one schedule can apply to the same job. A schedule can define the following conditions for the time when a job runs
  • whenever SQL Server Agent starts
  • whenever CPU utilization of the computer is at a level defined as idle
  • one time, at a specific date and time
  • on a recurring schedule

    Alerts

    An alert is an automatic response to a specific event. For example, an event can be a job that starts or system resources that reach a specific threshold.

    An alert can respond to SQL Server events, SQL Server performance conditions or Microsoft Windows Management Instrumentation (WMI) events and can notify one or more operators or run a job.

    Operators

    An operator defines contact information for an individual. SQL Server can notify operators of alerts through e-mail, pager (through e-mail) or net send.


    Resource Governor

    Resource Governor is used to manage SQL Server workload and system resource consumption. Resource Governor enables specifying limits on the amount of CPU, physical IO, and memory that incoming application requests can use.
    	ALTER RESOURCE GOVERNOR RECONFIGURE;	- enable Resource Governor and apply pending configuration changes
    	ALTER RESOURCE GOVERNOR DISABLE;	- disable Resource Governor
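
    Limits are defined through resource pools and workload groups, with a classifier function routing incoming sessions to the groups; a minimal sketch with hypothetical names:
    	CREATE RESOURCE POOL reportPool WITH (MAX_CPU_PERCENT = 20);	- cap CPU for the pool
    	CREATE WORKLOAD GROUP reportGroup USING reportPool;	- group mapped to the pool
    	ALTER RESOURCE GOVERNOR RECONFIGURE;	- apply the changes
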
    Data Collector

    Data Collector enables collection of datasets from multiple SQL Server instances. The collected data is stored in a Management Data Warehouse (MDW). There are three system data collection sets - disk usage, server activity and query statistics.



    Maintained by: VINCENT KANDASAMY, Database Architect/Administrator (kandasf@hotmail.com)
    Last updated: Mar 19, 13:30