An introduction to SQL Server Index

In SQL Server, using indexes is one of the most important performance-tuning techniques. In this post, we are going to introduce index and what it can do for SQL Server, as well as the best practices to help you design effective indexes to boost performance in a SQL Server database.

1. What is an index?

An index is an on-disk structure associated with a table or view that is used to retrieve data from the database very fast. An index contains keys built from one or more columns in the table or view. These keys are stored in a structure (B-tree) that enables SQL Server to find the row or rows associated with the key values quickly and efficiently.

2. How index works?

Here is the scenario of running a query with and without an index.

When a query is executed, the query optimizer in SQL Server reliably chooses the most efficient method from all available methods to execute the query. It might be scanning one or more indexes, or doing a table scan if the index doesn’t exist.

If the database doesn’t have an index, the query optimizer would scan the entire table to find the record you need. For a small-sized table, that’s just fine. But for a huge table with millions of rows, for example, a table scan generates many resource intensive operations and that’s gonna be terribly inefficient.
When you add an index, it scans the records from the index instead. Generally, scanning the index is much faster than scanning the table because an index contains very few columns data and the rows are normal in sorted order, so it can quickly identify the location and navigate to the records you need in the table.

You can relate to the way that the index in a dictionary helps you quickly find and navigate to the exact page that contains the info you needed.

It narrows the search field and reduces the workload
It speeds up the querying process
It reduces the system resource usage

Indexes are automatically created when PRIMARY KEY and UNIQUE constraints are defined on table columns.

Note: Updating a table with indexes takes more time than updating a table without because the indexes also need an update.

3. Types of indexes

The lack of indexes is a poorly designed database, but having an index is not enough. You have to design a good index strategy with a variety of indexes for the query optimizer to choose from to deliver the optimal performance.

There are many types of indexes and they can be added, modified, and dropped without affecting the database schema or application design. The following table lists the types of indexes available in SQL Server.

Index type	Description
Hash	With a hash index, data is accessed through an in-memory hash table. Hash indexes consume a fixed amount of memory, which is a function of the bucket count.
memory-optimized Nonclustered	For memory-optimized nonclustered indexes, memory consumption is a function of the row count and the size of the index key columns
Clustered	A clustered index sorts and stores the data rows of the table or view in order based on the clustered index key. The clustered index is implemented as a B-tree index structure that supports fast retrieval of the rows, based on their clustered index key values.
Nonclustered	A nonclustered index can be defined on a table or view with a clustered index or on a heap. Each index row in the nonclustered index contains the nonclustered key value and a row locator. This locator points to the data row in the clustered index or heap having the key value. The rows in the index are stored in the order of the index key values, but the data rows are not guaranteed to be in any particular order unless a clustered index is created on the table.
Unique	A unique index ensures that the index key contains no duplicate values and therefore every row in the table or view is in some way unique. Uniqueness can be a property of both clustered and nonclustered indexes.
Columnstore	An in-memory columnstore index stores and manages data by using column-based data storage and column-based query processing. Columnstore indexes work well for data warehousing workloads that primarily perform bulk loads and read-only queries. Use the columnstore index to achieve up to 10x query performance gains over traditional row-oriented storage, and up to 7x data compression over the uncompressed data size.
Index with included columns	A nonclustered index that is extended to include nonkey columns in addition to the key columns.
Index on computed columns	An index on a column that is derived from the value of one or more other columns, or certain deterministic inputs.
Filtered	An optimized nonclustered index, especially suited to cover queries that select from a well-defined subset of data. It uses a filter predicate to index a portion of rows in the table. A well-designed filtered index can improve query performance, reduce index maintenance costs, and reduce index storage costs compared with full-table indexes.
Spatial	A spatial index provides the ability to perform certain operations more efficiently on spatial objects (spatial data) in a column of the geometry data type. The spatial index reduces the number of objects on which relatively costly spatial operations need to be applied.
XML	A shredded, and persisted, representation of the XML binary large objects (BLOBs) in the xml data type column.
Full-text	A special type of token-based functional index that is built and maintained by the Microsoft Full-Text Engine for SQL Server. It provides efficient support for sophisticated word searches in character string data.

Out of the list above, the most important index types are clustered index and nonclustered index.

Clustered:

Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be stored in only one order. 
The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.

Nonclustered:

Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains the nonclustered index key values and each key value entry has a pointer to the data row that contains the key value. 
The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key. 
You can add nonkey columns to the leaf level of the nonclustered index to by-pass existing index key limits, and execute fully covered, indexed, queries.

Need a good GUI Client to work with MS SQL Server? TablePlus provides a modern, native tool with intuitive UI to manage multiple relational databases in cluding SQL Server, MySQL, PostgreSQL, SQLite, Oracle…