Best Practices for Backing up Large Databases

In an ideal world, hard drives and other hardware never fail, software is never defective, users do not make mistakes, and hackers are never successful. However, we live in a less than perfect world and we should plan and prepare to handle adverse events.

In todays topic, I will focus on best practices for backing up large mission critical databases. Performing and maintaining good backups is one of the top priority for any DBA/Developer/Engineer working with databases.

RULE OF THUMB! BACKUP AND RESTORE IS NOT A HIGH AVAILABILITY FEATURE. YOU MUST PERFORM REGULAR BACKUPS OF YOUR DATABASES.

RESTORING a database from a backup is simply a repair feature and not an availability feature. If you are running a mission-critical system and if your database requires high availability, then please look into various H/A features available with SQL Server.

If you are running a large/mission-critical database system than you need your database to be available continuously or for extended periods of time with minimal down-time for maintenance tasks. Therefore, the duration of situations that require databases to be restored must be kept as short as possible.

Additionally, if your databases are large, requiring longer periods of time to perform backup and restore than you MUST look into some of the cool new features that SQL Server offers to increase the speed of backup and restore operations to minimize the effect on users during both backup and restore operations.

Here are some such features that would help you improve database backups:

Use Multiple Backup Devices Simultaneously

If you are performing backups/restore on a large database than use multiple backup devices simultaneously to allow backups to be written to all the devices at the same time. Using multiple backup devices in SQL Server, allows database backups to be written to all devices in parallel. One of the potential bottleneck in backup throughput is the backup device speed. Using multiple backup devices can increase throughput in proportion to the number of devices used. Similarly, the backup can be restored from multiple devices in parallel.

Use Mirrored Media Set

Use a mirrored media set. A total of four mirrors is possible per media set. With the mirrored media set, the backup operation writes to multiple groups of backup devices. Each group of backup devices makes up a single mirror in the mirrored media set. Each single mirror set must use the same quantity and type of physical backup devices, and must all have the same properties.

Use Snapshot Backups (Fastest Backup)

This is the fastest way to perform backups on databases. A snapshot backup is a specialized backup that is created almost instantaneously by using a split-mirror solution obtained from an independent hardware and software vendor. Snapshot backups minimize or eliminate the use of SQL Server resources to accomplish the backup. This is especially useful for moderate to very large databases in which availability is very important. Snapshot backups and restores can be performed sometimes in seconds with very little or zero effect on the server.

Use Low Priority Backup Compression

Backing up databases using the newly introduced backup compression feature, could increase CPU usage and any additional CPU consumed by the compression process can adversely impact concurrent operations. Therefore, when possible create a low priority compressed backup whose CPU usage is limited by Resource Governor to prevent any CPU contention.

Use Full, Differential & Log Backups

If the database recovery model is set to FULL, than use different combination of backups (FULL, DIFFERENTIAL, LOG). This will help you minimize the number of backups that need to be applied to bring the database to the point of failure.

Use File / Filegroup Backups

Use file and file group backups and T-log backups. These allow for only those files that contain the relevant data, instead of the whole database, to be backed up or restored.

Use different disk for Backups

Do not use the same physical disk that holds database files or Log files for backup purposes. Using the same physical disk not only affects the performance, but also may reduce the recover-ability of the plan.

Creating Sequential Numbers in SQL Server

There are many applications that require incremental numbers for maintaining unique rows in a table. For several years, we have been using IDENTITY() column as our primary option to generate incremental numbers for unique records or for Primary key values and it has been the first choice for developers. You’ve tried IDENTITY() and now try the new feature introduced in SQL Server 2012 and experience the difference.

SQL Server introduces a brand new schema bound object called SEQUENCE. Sequence generates numeric values based on the specification of a SEQUENCE object. You can generate numeric values in either ascending or descending order and they can be independent of tables unlike IDENTITY columns.

Here’s the syntax on how you can create a SEQUENCE using various options.

Syntax of SEQUENCE:

CREATE SEQUENCE [schema_name . ] sequence_name
[ AS [ built_in_integer_type | user-defined_integer_type ] ]
[ START WITH ]
[ INCREMENT BY ]
[ { MINVALUE [ ] } | { NO MINVALUE } ]
[ { MAXVALUE [ ] } | { NO MAXVALUE } ]
[ CYCLE | { NO CYCLE } ]
[ { CACHE [ ] } | { NO CACHE } ]
[ ; ]

Lets now take a look at an example on how to use SEQUENCE in SQL Server.

  1. Creating a sequence
  2. Create a sample Item table
  3. Use Sequence while inserting data
  4. Query the table
  5. Reset Sequence values
  6. Alter or Delete Sequence values

Step 1: In this step, we will create a new object called ItemIDSequence that will start with a value of 1 and increment its value by 1.

CREATE SEQUENCE ItemIDSequence
       START WITH 1
       INCREMENT BY 1;

Step 2: In this step, we will create a sample ITEM table with just two columns.

CREATE TABLE ITEM 
        ( 
          ITEMNO INT, 
          ITEMNAME VARCHAR(50)
        );

Step 3: In this step, we will insert data in the ITEM table using ItemIDSequence which will automatically insert and increment the values of ItemIDSequence by 1.

INSERT INTO ITEM VALUES (NEXT VALUE FOR ITEMIDSEQUENCE, 'MANGO');
INSERT INTO ITEM VALUES (NEXT VALUE FOR ITEMIDSEQUENCE, 'APPLE');
INSERT INTO ITEM VALUES (NEXT VALUE FOR ITEMIDSEQUENCE, 'BANANA');
INSERT INTO ITEM VALUES (NEXT VALUE FOR ITEMIDSEQUENCE, 'ORANGE');
INSERT INTO ITEM VALUES (NEXT VALUE FOR ITEMIDSEQUENCE, 'GRAPES');

Step 4: In this step, we will query the ITEM table to review the data in the ITEM table. Notice that the ITEMNO column in the ITEM table has sequential numbers that were auto-inserted using ItemIDSequence.

SELECT * FROM ITEM
 
--You will see the following result set.
 
ITEMNO	    ITEMNAME
====================
1	    MANGO
2	    APPLE
3	    BANANA
4	    ORANGE
5	    GRAPES

You can Alter, Delete or Query any sequence using the below examples:

--In this step, we will look at Altering, Deleting & Querying Sequences:
 
ALTER, DELETE, AND QUERY SEQUENCES
 
--To delete a sequence:
 
DROP SEQUENCE ItemIDSequence
 
--To alter a sequence:
 
ALTER SEQUENCE ItemIDSequence
  RESTART WITH 900
  INCREMENT BY 1
GO
 
--To query the next value of Sequence for a particular SEQUENCE object:
 
SELECT NEXT VALUE FOR ItemIDSequence
 
--To query all sequences available in a database, you can use the below query:
 
SELECT * FROM SYS.SEQUENCES

Important and Best Practices for Sequence:

  1. Sequence values are not automatically protected after insertion into a table. You should use Update Trigger on a table to prevent sequence values from being changed.
  2. Sequence does not automatically enforce uniqueness for sequence values. You should create unique index on the sequence column to enforce uniqueness.
  3. If you have created a sequence (Example: 1 through 100) and if the rows in the table grows beyond 100 rows, SQL Server would start assigning values 1 through 100 again.
  4. By default, if you do not specify data type for a sequence, BIGINT data type is used. (Note: you can create sequence of any integer data type)

Using MAXDOP to Speed Up Index Creation

Ever noticed that even though you have a beefy server, SQL Server index operations sometime takes time? There are many things that happen behind the scenes that could contribute to the slowness of index operations.

Some of the reasons include:

  1. Load on SQL Server
  2. Out of date statistics
  3. Server configuration setting for degree of parallelism (MAXDOP)
  4. Amount of available memory
  5. Amount of available resources, etc.

SQL Server can intelligently detect the load on the server and auto-adjusts the amount of resources that can be allocated to a process. In the case of Index operations; if SQL Server is busy, it will automatically adjust the Max Degree of Parallelism server configuration setting to accommodate the process load. This means it could either grant or limit the number of CPUs Index operations can use. In most of the cases this is the best practice and you should not change the default value.

However, SQL Server can also use the index hint called MAXDOP. With MAXDOP indexing hint, you can now control the number of processors / CPU’s that can be used for performing index operations. Using MAXDOP indexing option enables parallelism for Index operations, which means it can use multiple processors to fulfill a single query statement which can potentially improve the performance of index operations such as:

  1. Creating an index
  2. Altering an index
  3. Rebuilding an index
  4. Dropping a clustered index, etc.

When you use the MAXDOP hint for any index operations, it will override the server level Max Degree of Parallelism value for that specific query. (All other Index operations with no MAXDOP will continue to honor the server level max degree of parallelism settings)

MAXDOP hint supports three values when used with any indexing operations:

  1. MAXDOP = 0 This is the default server level setting. This instructs SQL Server to grant CPU’s based on the available resources and server load.
  2. MAXDOP = 1 This value disables use of parallelism and enables the operations to execute serially.
  3. MAXDOP = 2 through 64 This value indicates SQL Server to use the number of CPU’s specified in the value to process index operations.

Keep in mind that if your SQL Server has only 8 CPU’s and you specify more CPU’s (Ex: MAXDOP = 32) then SQL Server will default to the number of CPU’s available on the system.

Here’s an example of how you can write an Index creation statement that would use 6 CPUs out of 8 when creating an index:

Create NonClustered Index EmailAddress_IDX on
  Employee.Contact(EmailAddress)
With (MAXDOP=6);

Note: Please make sure to test this option in the test environment before implementing it in the production environment.

Pagination in SQL Server using Offset and Fetch

SQL Server offers new query hints that allow you to implement query paging solution. In the past, we’ve used TOP operator to return the top number of rows from a table, however, OFFSET & FETCH query clauses can give you more benefits than just the TOP operator.

Let’s assume you have 50,000 records in a table and you want to query 1200 rows starting from 25000. Traditionally you would use a cursor to implement this solution however with the introduction of OFFSET and FETCH its much easier.

OFFSET and FETCH when used together can return the exact set of data you are looking for.

OFFSET: Allows you to offset the record pointer to a specific row in a table

FETCH: Allows you to fetch/return number of rows you request in Fetch.

Lets take an example of querying rows from a table that has 50,000 records. Lets query 1200 records starting from 25,000 thousand record.

Step 1 – Create a table

CREATE TABLE ITEMS
(
  ITEM_ID INT IDENTITY(1,1),
  ITEM_NAME VARCHAR(50)
);

Step 2 – Populate the table with data

INSERT INTO ITEMS VALUES 
      ('MANGO'),
      ('APPLE'),
      ('BANANA'),
      ('KIWI'),
      ('PLUMS'),
      ('GRAPES'),
      ('WATERMELON'),
      ('HONEYDEW'),
      ('CHERRY'),
      ('STRAWBERRY');

Step 3 – Query the table with Offset and Fetch

--IN THIS QUERY, WE ARE OFFSETTING/SKIPPING THE ROWS BY 3 RECORDS AND RETURNING THE NEXT 5 ROWS.
SELECT * FROM ITEMS
  ORDER BY ITEM_ID
  OFFSET 3 ROWS
  FETCH NEXT 5 ROWS ONLY
GO
 
--EXECUTING THE ABOVE STATEMENT WILL RETURN THE BELOW RESULT SET AS REQUESTED:
 
ITEM_ID	  ITEM_NAME
4	       KIWI
5	       PLUMS
6	       GRAPES
7	       WATERMELON
8	       HONEYDEW
 
(5 row(s) affected)
 
--NOTE: YOU CAN ALSO USE VARIABLES WITH OFFSET AND FETCH CLAUSES.
 
DECLARE @OFFSET INT=3, @FETCH INT=5
SELECT * FROM ITEMS ORDER BY ITEM_ID
  OFFSET @OFFSET ROWS
  FETCH NEXT @FETCH ROWS ONLY
GO
 
--EXECUTING THE ABOVE COMMAND WILL RETURN THE SAME RESULTS AS THE USING CONSTANT VALUE WITH OFFSET AND FETCH.
 
ITEM_ID	  ITEM_NAME
4	       KIWI
5	       PLUMS
6	       GRAPES
7	       WATERMELON
8	       HONEYDEW
 
(5 row(s) affected)

Concatenating Strings in SQL Server using CONCAT

CONCAT() string function allows you to concatenate up to 255 string or variable values in to one single string. It requires a minimum of two input values when calling the function. CONCAT takes care of implicitly converting the data values to a string by following the data type conversion rules of SQL Server. This eliminates the need of explicit data conversions when concatenating two values.

Note: NULL values are implicitly converted to an empty string. If all the variables passed as parameters in CONCAT function are NULL, an empty string of type VARCHAR(1) is returned.

Contact function syntax:

SELECT CONCAT (String_Value1, String_Value2, String_Value3 [, String_ValueN])

Let’s look at an example using Concat string function: In this example, we will use three strings stored in different variables and then concatenate the strings using Concat function.

Note: Concat function can concatenate up to 254 values.

Test 1 – Concatenating with CONCAT function

Declare @a Varchar(100)='Microsoft SQL Server is now '
Declare @b int=32
Declare @c varchar(200)=' years old'
Select CONCAT (@a, @b, @c)
Go
 
--Above script returns the following value:
Microsoft SQL Server is now 32 Years old

Test 2 – Let’s take similar example but with data type conversion:

Declare @a Varchar(100)='Microsoft SQL Server is now '
Declare @b int=32
Declare @c varchar(200)=' years old'
 
Select @a+Convert(varchar,(@b))+@c 
 
--Please note the convert function usage above
 
Returns:
Microsoft SQL Server is now 32 Years old

In the above examples, observe the use of CONCAT() function. There’s no data conversion being performed in the 1st test. However, in the 2nd, we are using data conversion function to convert Integer value to a string.

The data returned by CONCAT function depends on the type of values passed. The below table shows the mapping of input and output types:

Input Value / TypeOutput Value / Type
SQL-CLR (TYPES & UDT)NVARCHAR(MAX)
NVARCHAR(MAX)NVARCHAR(MAX)
NVARCHAR(<=4000)NVARCHAR(<=4000)
VARBINARY(MAX)NVARCHAR(MAX)
ALL OTHER INPUT TYPESVARCHAR(<=8000) *Unless one of the parameters is an NVARCHAR of any length and in that case, the output value would be in NVARCHAR(MAX)

NOTE: CONCAT function only works with all the versions above SQL12. However, to make this work in older versions of SQL Server, you will need to create your own custom function that can take care of concatenating values and appropriately converting their data types.