In my many conversations with customers at Microsoft events, I have found that people often confuse the terms ‘In Memory’ and ‘Memory-Optimized’, and many think they are one and the same. If you continue reading this blog, you will see that the two are related but can lead to very different performance and scalability.
To understand this, let us travel back a few years, to when OLTP databases were much larger than the memory available on the server. For example, your OLTP database could be 500 GB while your server had 128 GB of memory. We all know the familiar strategy of storing data in pages. SQL Server uses 8 KB pages and brings them in and out of memory as needed, using complex heuristics implemented in the Buffer Pool. When running a query, if the page containing the requested row(s) is not in memory, an explicit physical IO is done to bring it into memory. This hurts query performance. Today, you can buy a server-class machine for under $10k with 1 TB of physical memory, which can keep your full 500 GB database in memory and improve the performance of your workload by removing the bottleneck in the IO path. This is what I refer to as ‘your database is in memory’. However, the more important question to ask is ‘Is your database optimized for memory?’.
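To get a rough sense of whether your database is ‘in memory’ in this sense, you can compare the number of pages currently cached in the Buffer Pool with the size of the data files. The queries below are a minimal sketch; they assume you run them in the database of interest and have VIEW SERVER STATE permission. Note that the second query reports allocated file size, not used space, so treat the comparison as approximate.

```sql
-- Pages of the current database that are cached in the Buffer Pool right now.
-- Each page is 8 KB, so pages * 8 / 1024 gives megabytes.
SELECT
    DB_NAME(database_id) AS database_name,
    COUNT(*)             AS cached_pages,
    COUNT(*) * 8 / 1024  AS cached_mb
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID()
GROUP BY database_id;

-- Allocated size of the data files of the current database.
-- sys.database_files reports size in 8 KB pages.
SELECT SUM(size) * 8 / 1024 AS data_file_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';
```

If cached_mb stays close to the used portion of data_file_mb under a steady workload, most reads are being served without physical IO.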
Let us consider a simple query on the Employee table, which is fully in memory, including all its indexes:
SELECT NAME
FROM Employee
WHERE SSN = '123-44-4444'
Assuming you have a nonclustered index on the SSN column, SQL Server fetches this row by traversing the nonclustered index, starting from the root page, through multiple intermediate pages, and finally landing on the leaf page. If this index is not a covering index, SQL Server now needs to traverse the clustered index down to the data page to ultimately find the row, as shown in the picture below. Assuming that each index is 4 levels deep, SQL Server needs to touch eight pages, and for each page it has to take a shared latch on the page, search it to find the next pointer, and then release the latch. This is a lot of work given that the requested row is guaranteed to be in memory. I consider this the case where the database or the table is ‘in memory’ but is not optimized for ‘in memory’. If you think about it, this inefficiency occurs because the database and its tables are organized as pages, which was the right decision when databases were much larger than available memory but is not the right decision today.
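For concreteness, here is one way the disk-based Employee table in this example might be defined. This is only a sketch: the EmployeeID column, the data types, and the index names are assumptions made for illustration and are not taken from a real schema.

```sql
-- Hypothetical disk-based (page-organized) table for the example above.
CREATE TABLE dbo.Employee
(
    EmployeeID INT IDENTITY(1,1) NOT NULL,  -- assumed clustering key
    SSN        CHAR(11)          NOT NULL,
    NAME       NVARCHAR(100)     NOT NULL,
    CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED (EmployeeID)
);

-- Non-covering index: the leaf level holds SSN plus the clustering key,
-- so fetching NAME needs a second traversal through the clustered index.
CREATE NONCLUSTERED INDEX IX_Employee_SSN
    ON dbo.Employee (SSN);

-- Covering alternative: NAME is stored at the leaf level of the index,
-- so the query can be answered without touching the clustered index.
CREATE NONCLUSTERED INDEX IX_Employee_SSN_Covering
    ON dbo.Employee (SSN) INCLUDE (NAME);
```

With only IX_Employee_SSN in place and both indexes 4 levels deep, the lookup touches 4 + 4 = 8 pages. The covering index halves that to 4, but every page access still pays for a latch and a search within the page.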
The SQL Server In-Memory OLTP engine is designed for the case where the table(s) are guaranteed to be in memory, and it exploits this fact to deliver significantly higher performance. There are no pages, and indexes point to the data rows directly. For example, a hash index on the SSN column can be used to find a direct pointer to the requested row by first hashing the key and then following the pointer to the row, as shown in the picture below.
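A memory-optimized version of the table with a hash index on SSN could look roughly like the sketch below. It assumes SQL Server 2016 or later and a database that already has a MEMORY_OPTIMIZED_DATA filegroup. The table name Employee_InMem, the data types, and the bucket count are illustrative choices, not recommendations.

```sql
-- Hypothetical memory-optimized table with a hash index on SSN.
CREATE TABLE dbo.Employee_InMem
(
    SSN  CHAR(11)      NOT NULL
         PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    NAME NVARCHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- Equality lookup on the full key: the engine hashes the SSN value,
-- goes to the matching bucket, and follows the pointer to the row.
SELECT NAME
FROM dbo.Employee_InMem
WHERE SSN = '123-44-4444';
```

SQL Server rounds BUCKET_COUNT up to the next power of two. A hash index serves equality predicates on the full key like the one above; range predicates on a memory-optimized table call for a nonclustered (range) index instead.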
This is just one example to show that with the SQL Server In-Memory OLTP engine, your data is not only in memory but is also optimized for in-memory access. Please refer to https://blogs.technet.microsoft.com/dataplatforminsider/tag/in-memory/ for more information on the In-Memory OLTP engine.
Thanks,
Sunil Agarwal
SQL Server Tiger Team
Follow us on Twitter: @mssqltiger | Team Blog: Aka.ms/sqlserverteam