Combining Binary Large Object (BLOBs) with relational databases like SQL Server has always been like combining oil and water. They just don’t mix very well. BLOB objects like pictures, videos, text documents and xml documents don’t fit within the standard relational data types. SQL Server originally offered support for BLOBs using the TEXT, NTEXT and IMAGE data types. While all these data types still exists today in SQL Server they have limitations and have been depreciated since SQL Server 2005. First, they were limited to 2GB in size which is far bigger that most other relational data type but unfortunately 2GB is not large enough to deal will the bigger BLOB objects. You can’t use these data types as local variables in stored procedures or T-SQL batches. They also can’t be as part of an index, or in WHERE, ORDER BY, COMPUTE, or GROUP BY clauses. In addition, from the development side you need to work with them differently than the traditional data types. Finally, SQL Server uses the Buffer Pool to process them impacting the overall performance of the system. These issues caused most application developers to store BLOB data in the NTFS file system.
You can watch a video summary of this information here:
The FILESTREAM data type is Microsoft’s answer to SQL Server BLOB storage. The FILESTREAM data type lets you combine the best of both worlds. It uses the higher performance native NTFS file system to store and access BLOB data which means that the size limitations are lifted to maximum file size supported by the NTFS file systems which is 16 TB. It removes the need to process BLOB data in the Buffer Pool. At the same time it provides you with full data consistency. The FILESTREAM data type stores pointers in each column that connect the relational data in the row to the FILESTREAM data stored in the file system. This enables that data relationships to be consistent and durable. For instance, if you backup a database that’s using the FILESTREAM data type all of the file system data will be backed up right along with your relational data.
Consideration for using the FILESTREAM Data Type
Like you might expect there are some considerations for using FILESTREAM data.
- Table with FILESTREAM columns must have a nonnull unique row ID
- FILESTREAM filegroups support multiple data containers
- FILESTREAM data containers cannot be nested
- FILESTREAM filegroups are supported with failover clustering but the filegroups must be on shared disk resources
- FILESTREAM filegroups are supported by AlwaysOn Availability Groups
- FILESTREAM filegroups can be on compressed volumes
- FILESTREAM filegroups and containers should reside on volumes other than the operating system, paging file, SQL Server database, SQL Server log, or tempdb
Enabling the FILESTREAM Data Type
FILESTREAM support is not enabled by default. It can be enabled during the installation process or to can be enabled after the installation. During installation of the SQL Server relational engine the SQL Server Installation Center will display a FILESTREAM tab that enables you to enable FILESTREAM support. To enable FILESTREAM support after SQL Server has been installed you can use SQL Server Configuration Manager. Open SQL Server Configuration Manager and right click your SQL Server relational database instance then select Properties from the context menu to open the Properties dialog that you can see in Figure 1.
Figure 1 – Enabling FILESTREAM support
On the FILESTREAM tab check the level of FILESTREAM support you require. The Enable FILESTREAM for Transact-SQL access enables basic FILESTREAM support. Enabling FILESTREAM for file I/O access lets users access the FILESTREAM data externally from the Windows OS. If you enable FILESTREAM for file I/O access then you need to supply a Windows share name. In Figure 1 the share name is MSSQLSERVER. If remote clients on networked systems need to access the FILESTREAM data that is stored on this share then check Allow remote clients to have streaming access to FILESTREAM data. After you enable FILESTREAM support you need to restart the SQL Server service.
The following T-SQL code shows how you can create a new database named MyFileStreamDB that provides support for the FILESTREAM data type.
CREATE DATABASE MyFileStreamDB ON PRIMARY ( NAME = MyFileStreamDB, FILENAME = 'C:\Temp\MyFileStreamDB.mdf'), FILEGROUP MyFileStreamDBFS CONTAINS FILESTREAM( NAME = MyFileStreamDBFS, FILENAME = 'C:\Temp\MyFileStreamDBFS') LOG ON ( NAME = MyFileStreamDBLOG, FILENAME = 'C:\Temp\MyFileStreamDBLOG.ldf') GO
Specifying the CONTAINS FILESTREAM keyword as part of the CREATE DATABASE statement enables FILESTREAM support for this database.
FILESTREAM storage is implemented as a VARBINARY(MAX) column where the data is stored in the file system. The sizes of the BLOBs are limited only by the volume size of the file system. The maximum NTFS file size is 16 TB. The following listing shows you how to create a table that uses FILESTREAM data.
CREATE TABLE [dbo].[MyFileStreamTable]( [FSID] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE, [FSDescription] VARCHAR(50), [FSBLOB] VARBINARY(MAX) FILESTREAM NULL)
The first column is the required unique nonnull ID. The FSBLOB column uses the VARBINARY(MAX) FILESTREAM data type to indicate that this column will be stored as FILESTREAM data.