This SQL and NoSQL comparison is the next step after converting your SQL Server database to Couchbase. In the previous post, I copied AdventureWorks from SQL Server to Couchbase.
In this post, I’m going to show an ASP.NET Core application that uses SQL Server, and how that same application would use Couchbase. If you’d like to follow along, you can check out the SqlServerToCouchbase project on GitHub.
Unlike the previous post, I’m making no attempt at an “automatic” conversion of an application. Instead, think of this more as a SQL and NoSQL comparison at the application level.
ASP.NET SQL Server Applications
I created a very simple ASP.NET Core REST API style application. I used Entity Framework, but if you’re using Dapper, ADO.NET, NHibernate, etc, you should still be able to follow along.
Each endpoint returns JSON. I’ve also added Swashbuckle to the project, so you can issue requests right from your browser via OpenAPI.
ASP.NET Couchbase Server Application
The Couchbase version of the application returns the same data, because it’s using the same SQL Server AdventureWorks data.
In the application, I’m using the Couchbase .NET SDK and Couchbase Transactions libraries. (You could use Linq2Couchbase as a type of Entity Framework replacement).
Otherwise, the application is the same, providing a SQL and NoSQL comparison (and contrast). The endpoints are returning JSON, and Swashbuckle is installed.
There’s one controller in each sample. Let’s go through each endpoint in the controller and perform a SQL and NoSQL comparison.
SQL and NoSQL Comparison: Get by ID
Let’s start with the GetPersonByIdAsync
endpoint. Given a person ID, this endpoint returns the Person data for the given ID.
SQL Server
Here’s the SQL Server example using Entity Framework:
1 2 3 4 5 6 7 8 |
[HttpGet("/person/{personId}")] public async Task<IActionResult> GetPersonByIdAsync(int personId) { var person = await _context.Persons .SingleOrDefaultAsync(p => p.BusinessEntityID == personId); return Ok(person); } |
I also wrote another version of this method, called GetPersonByIdRawAsync
which uses a “raw” SQL query. This query is very similar to the one that Entity Framework (above) ultimately generates, and it’s similar to a Dapper approach.
1 2 3 4 5 6 7 8 9 |
[HttpGet("/personRaw/{personId}")] public async Task<IActionResult> GetPersonByIdRawAsync(int personId) { var person = await _context.Persons .FromSqlRaw(@"SELECT * FROM Person.Person WHERE BusinessEntityID = {0}", personId) .SingleOrDefaultAsync(); return Ok(person); } |
Note that either way, a SQL query is being executed.
With N1QL, we could query the data in Couchbase in a very similar way. Here’s the GetPersonByIdRawAsync
in the Couchbase project:
1 2 3 4 5 6 7 8 9 10 |
[HttpGet("/personRaw/{personId}")] public async Task<IActionResult> GetPersonByIdRawAsync(int personId) { var bucket = await _bucketProvider.GetBucketAsync(); var cluster = bucket.Cluster; var personResult = await cluster.QueryAsync<Person>(@" SELECT p.* FROM AdventureWorks2016.Person.Person p WHERE p.BusinessEntityID = $personId", new QueryOptions().Parameter("personId", personId)); return Ok(await personResult.Rows.SingleOrDefaultAsync()); } |
(There’s an extra step going from “bucket” to “cluster”. This could be skipped, but I use bucket elsewhere in the controller, so I left it in).
However, using a N1QL query involves some extra overhead (indexing, query parsing, etc). With Couchbase, if we already know the Person ID, we can skip a N1QL query and do a direct key/value (K/V) lookup.
Get by ID with K/V
The key is already known; it’s given as an argument. Instead of using SQL, let’s do a key/value lookup. I did this in an endpoint method called GetPersonByIdAsync
:
1 2 3 4 5 6 7 8 9 |
[HttpGet("/person/{personId}")] public async Task<IActionResult> GetPersonByIdAsync(int personId) { var bucket = await _bucketProvider.GetBucketAsync(); var scope = await bucket.ScopeAsync("Person"); var coll = await scope.CollectionAsync("Person"); var personDoc = await coll.GetAsync(personId.ToString()); return Ok(personDoc.ContentAs<Person>()); } |
Unlike SQL Server, Couchbase supports a variety of APIs to interact with data. In this case, the key/value lookup will be pulling the Person document directly from memory. There’s no need to parse a SQL query or use any indexing. Key/value lookups in Couchbase will often be measured in microseconds.
My advice: use the key/value lookup whenever you can.
Get an expanded entity by ID
Data can be complex and span multiple tables (or multiple documents in the case of Couchbase). Depending on which tools you are using, you may have some functionality that can load related entities.
For instance, with Entity Framework, you can use an Include
to pull in related entities, as shown in this GetPersonByIdExpandedAsync
example:
1 2 3 4 5 6 7 8 9 |
[HttpGet("/personExpanded/{personId}")] public async Task<IActionResult> GetPersonByIdExpandedAsync(int personId) { var person = await _context.Persons .Include(p => p.EmailAddresses) .SingleOrDefaultAsync(p => p.BusinessEntityID == personId); return Ok(person); } |
Behind the scenes, Entity Framework may generate a JOIN and/or multiple SELECT queries to make this happen.
This is where any O/RM (not just Entity Framework) can be dangerous. Make sure to use a tool like SQL Profiler to see what queries are actually being run.
Note
|
O/RMs can help, but in a SQL to NoSQL comparison, it’s important to remember that impedance mismatch is much less of a problem in the NoSQL world. |
For the Couchbase sample, I’m not using Entity Framework, but instead I can use the NEST syntax that’s part of the N1QL extensions on the SQL standard. Here’s how the Couchbase version of GetPersonByIdExpandedAsync
looks:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
[HttpGet("/personExpanded/{personId}")] public async Task<IActionResult> GetPersonByIdExpandedAsync(int personId) { var bucket = await _bucketProvider.GetBucketAsync(); var cluster = bucket.Cluster; var personResult = await cluster.QueryAsync<Person>(@" SELECT p.*, EmailAddresses FROM AdventureWorks2016.Person.Person p NEST AdventureWorks2016.Person.EmailAddress EmailAddresses ON EmailAddresses.BusinessEntityID = p.BusinessEntityID WHERE p.BusinessEntityID = $personId", new QueryOptions().Parameter("personId", personId)); return Ok(await personResult.Rows.SingleOrDefaultAsync()); } |
NEST is a kind of JOIN that puts the JOINed data into a nested JSON object. Instead of using an O/RM to map the data, this data can be directly serialized into C# objects.
Paging query
Let’s look at an example where we do NOT have a single key to look up a piece of data. Let’s look at a method that returns a “page” of results (perhaps to populate a UI grid or list).
Paging in SQL Server
Here’s the SQL Server version of GetPersonsPageAsync
:
1 2 3 4 5 6 7 8 9 10 11 12 |
[HttpGet("/persons/page/{pageNum}")] public async Task<IActionResult> GetPersonsPageAsync(int pageNum) { var pageSize = 10; var personPage = await _context.Persons .OrderBy(p => p.LastName) .Skip(pageNum * pageSize) .Take(pageSize) .Select(p => new { p.BusinessEntityID, p.FirstName, p.LastName }) .ToListAsync(); return Ok(personPage); } |
With Entity Framework, OrderBy
, Skip
, and Take
are typically used for paging. If we open up SQL Server Profiler, the SQL this generates looks something like this:
1 2 3 4 |
exec sp_executesql N'SELECT [p].[BusinessEntityID], [p].[FirstName], [p].[LastName] FROM [Person].[Person] AS [p] ORDER BY [p].[LastName] OFFSET @__p_0 ROWS FETCH NEXT @__p_0 ROWS ONLY',N'@__p_0 int',@__p_0=10 |
OFFSET …​ ROWS FETCH NEXT …​
is the syntax being used for paging here.
Paging in Couchbase
Paging syntax always varies between SQL implementations. Couchbase leans more towards Oracle/MySQL syntax in this regard. Here’s the Couchbase version of GetPersonsPageAsync
:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
[HttpGet("/persons/page/{pageNum}")] public async Task<IActionResult> GetPersonsPageAsync(int pageNum) { var pageSize = 10; var bucket = await _bucketProvider.GetBucketAsync(); var bucketName = bucket.Name; var cluster = bucket.Cluster; var personPage = await cluster.QueryAsync<Person>($@" SELECT p.LastName, p.BusinessEntityID, p.FirstName FROM `{bucketName}`.Person.Person p WHERE p.LastName IS NOT MISSING ORDER BY p.LastName LIMIT {pageSize} OFFSET {(pageNum * pageSize)} "); return Ok(await personPage.Rows.ToListAsync()); } |
In this case, LIMIT …​ OFFSET …​
is being used.
I also want to point out the WHERE p.LastName IS NOT MISSING
. Because Couchbase is a NoSQL database, the query engine cannot assume that LastName
will be in every document, even with ORDER BY p.LastName
. By adding this WHERE
clause, the query now knows which index to use. Without this, the query will take much longer to run.
Update with an ACID transaction
With the relational-style model that we’re using in both SQL Server and Couchbase for this example, ACID transactions will be important for both applications.
In these examples, there is a PersonUpdateApi
which will allow the user to update both a person’s name and their email address. Since this data is in two separate tables/rows (SQL Server) or two separate documents (Couchbase), we want this to be an all-or-nothing, atomic operation.
Note
|
An ID is specified for both (to simplify the API), since it’s possible (but rare in this dataset) for a person to have multiple email addresses. |
ACID with Entity Framework
Here’s an example of an ACID transaction using Entity Framework to update both a row of data in the Person table and a row of data in the EmailAddress table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
[HttpPut("/person")] public async Task<IActionResult> UpdatePurchaseOrderAsync(PersonUpdateApi personUpdateApi) { var transaction = await _context.Database.BeginTransactionAsync(); try { // find the person var person = await _context.Persons .Include(p => p.EmailAddresses) .SingleOrDefaultAsync(p => p.BusinessEntityID == personUpdateApi.PersonId); // update name person.FirstName = personUpdateApi.FirstName; person.LastName = personUpdateApi.LastName; // get the particular email address and update it // if the supplied ID is invalid, this will throw an exception var email = person.EmailAddresses.Single(e => e.EmailAddressID == personUpdateApi.EmailAddressId); email.EmailAddress = personUpdateApi.EmailAddress; await _context.SaveChangesAsync(); // commit transaction await transaction.CommitAsync(); return Ok($"Person {personUpdateApi.PersonId} name and email updated."); } catch (Exception ex) { await transaction.RollbackAsync(); return BadRequest("Something went wrong, transaction rolled back"); } } |
Note the four main parts of a transaction:
- Begin transaction (
_context.Database.BeginTransactionAsync();
) try
/catch
- Commit transaction (
await transaction.CommitAsync();
) - Rollback transaction in the
catch
(transaction.RollbackAsync();
)
This is an important feature where a SQL and NoSQL comparison has changed in recent years. With Couchbase, ACID transactions are now possible.
ACID with a Couchbase transaction
With Couchbase, the API is slightly different, but the same steps are all there:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
[HttpPut("/person")] public async Task<IActionResult> UpdatePurchaseOrderAsync(PersonUpdateApi personUpdateApi) { // setup bucket, cluster, and collections var bucket = await _bucketProvider.GetBucketAsync(); var scope = await bucket.ScopeAsync("Person"); var personColl = await scope.CollectionAsync("Person"); var emailColl = await scope.CollectionAsync("EmailAddress"); // create transaction var cluster = bucket.Cluster; var transaction = Transactions.Create(cluster, TransactionConfigBuilder.Create() .DurabilityLevel(DurabilityLevel.None) .Build()); try { await transaction.RunAsync(async (context) => { // update person and email documents // based on values passed in API object var personKey = personUpdateApi.PersonId.ToString(); var emailKey = personKey + "::" + personUpdateApi.EmailAddressId.ToString(); var person = await context.GetAsync(personColl, personKey); var email = await context.GetAsync(emailColl, emailKey); var personDoc = person.ContentAs<dynamic>(); var emailDoc = email.ContentAs<dynamic>(); personDoc.FirstName = personUpdateApi.FirstName; personDoc.LastName = personUpdateApi.LastName; emailDoc.EmailAddress = personUpdateApi.EmailAddress; await context.ReplaceAsync(person, personDoc); await context.ReplaceAsync(email, emailDoc); }); return Ok($"Person {personUpdateApi.PersonId} name and email updated."); } catch (Exception ex) { return BadRequest("Something went wrong, transaction rolled back."); } } |
The same main steps are:
- Begin transaction (
transaction.RunAsync( …​ )
) try
/catch
- Commit transaction (implied, but
context.CommitAsync()
could be used) - Rollback transaction (again, implied, but
context.RollbackAsync()
could be used).
In both cases, we have an ACID transaction. Unlike SQL Server, however, we can later optimize and consolidate the data in Couchbase to reduce the amount of ACID transactions that we need and increase performance.
Stored Procedures: a SQL and NoSQL comparison
Stored procedures are a sometimes controversial topic. Generally speaking, they can contain a lot of functionality and logic.
Stored Procedure in SQL Server
I’ve created a stored procedure called “ListSubcomponents” (you can view the full details on GitHub). With Entity Framework, you can use FromSqlRaw
to execute it and map the results to C# objects. I’ve created a pseudo-entity C# object called ListSubcomponents
that’s used just for this sproc:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
// sproc example - see ExampleStoredProcedure.sql [HttpGet("/getListSubcomponents/{listPriceMin}/{listPriceMax}")] public async Task<IActionResult> GetListSubcomponents(decimal listPriceMin, decimal listPriceMax) { var listPriceMinParam = new SqlParameter("@ListPriceMin", SqlDbType.Decimal) {Value = listPriceMin }; var listPriceMaxParam = new SqlParameter("@ListPriceMax", SqlDbType.Decimal) {Value = listPriceMax }; var result = await _context.ListSubcomponents .FromSqlRaw("EXECUTE dbo.ListSubcomponents @ListPriceMin, @ListPriceMax", listPriceMinParam, listPriceMaxParam) .ToListAsync(); return Ok(result); } |
The stored procedure has two parameters.
Couchbase User-defined function
Couchbase doesn’t have anything called a “stored procedure” (yet), but it does have something called a user-defined function (UDF) that can also contain complex logic when necessary.
I created a UDF called ListSubcomponents
(which you can also view on GitHub) which matches the functionality of the SQL Server sproc.
Here’s how to execute that UDF from ASP.NET:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
// sproc example - see ExampleStoredProcedure.sql [HttpGet("/getListSubcomponents/{listPriceMin}/{listPriceMax}")] public async Task<IActionResult> GetListSubcomponents(decimal listPriceMin, decimal listPriceMax) { var bucket = await _bucketProvider.GetBucketAsync(); var cluster = bucket.Cluster; var options = new QueryOptions(); options.Parameter("$listPriceMin", listPriceMin); options.Parameter("$listPriceMax", listPriceMax); var result = await cluster.QueryAsync<ListSubcomponent>( "SELECT l.* FROM ListSubcomponents($listPriceMin, $listPriceMax) l", options); return Ok(await result.Rows.ToListAsync()); } |
Invoking it in Couchbase with two parameters is very similar to using FromSqlRaw with Entity Framework.
Performance – SQL to NoSQL Comparison
Now that I’ve converted the app to use Couchbase, does the new version run at least as fast as the old SQL Server version?
It’s a complicated question to answer because:
- I haven’t made ANY optimizations to the data model. I’m still using the literal conversion of data from the previous post.
- Data access can vary wildly from use case to use case.
- Environments can vary wildly from person to person, company to company.
However, I wanted to do some ‘back of the envelope’ load testing as a sanity check.
I ran both applications on my local machine, and I used ngrok to expose them to the Internet. I then used loader.io (an excellent tool for load testing with concurrency). I then ran some quick performance tests against only the ‘paging’ endpoint. This is the endpoint I’m most concerned about for performance, and I also think it’s the most “apples to apples” SQL and NoSQL comparison among the endpoints.
Load testing SQL and NoSQL comparison
Here are the results of the SQL Server application:
And here are the results of the Couchbase Server application:
Interpreting the results of the SQL and NoSQL comparison load test
This is not meant to be a benchmark or data point saying “Couchbase is faster than SQL Server”.
It is only meant to be a sanity check.
If I’m not getting at least as good performance under load as I was before, maybe I’m doing something wrong. This is a crucial benefit to the proof of concept process. Even though Couchbase, especially Couchbase 7, is very relational-friendly, there are still differences and nuances between every database, and this process will help you identify the differences that matter most to you and your project.
If you are looking for more robust benchmarks, here are some resources that you can check out:
- Benchmark Reports from Altoros (3rd party)
- Cloud Benchmarks
- Couchbase Server “ShowFast” benchmarks
Conclusion
The SQL and NoSQL comparison and conversion of the application code, combined with some very basic load testing shows me that I can:
- Host a relational data model as-is, no modeling changes
- Convert ASP.NET endpoint(s) to use the Couchbase SDK
- Expect at least as good performance to start, with plenty of room to scale and improve, at a low risk.
Your use case may vary, but also remember that during this conversion, Couchbase gave us:
- Easy horizontal scalability
- High availability
- Built-in caching
- Schema flexibility (which is probably why you’re looking to use Couchbase in the first place).
Appendix
Here is a succinct guide to the SQL and NoSQL comparison that I made in the application.
SQL Server operation | Couchbase operation |
---|---|
Read/write one row/entity |
|
Read/write multiple rows/pages |
|
SELECT one entity with related entities |
N1QL query with NEST |
BeginTransaction |
|
Stored Procedure |
Reminders:
- Switch to key/value API when you can
- Use indexing, indexing plan visualization, and index advisor when writing N1QL
- Use an ACID transaction (only) when you need to
- Think about performance goals, and establish a way to test
Next steps
Check out Couchbase Server 7, currently in beta, today. It’s a free download. Try loading your relational data into it, converting a few endpoints, and see if the process works for you.