Couchbase Collections available with the upcoming version 7 brings Couchbase JSON database closer to that of other RDBMS. Collections will make it easier for customers who are looking to migrate their current RDBMS data model to Couchbase document model. This allows customers to benefit from Couchbase distributed shared-nothing database architecture, bringing high availability, and horizontal scalability. In addition to the flexibility of the data model, Couchbase N1QL is the only NoSQL database that provides N1QL, and extended SQL language, for querying its database.
What is a SCOPE
A bucket scope is equivalent to a schema in RDBMS. It is a logical container for Couchbase collections. Every Couchbase bucket contains a default scope. You can use these default containers directly, or define your own.
What is a COLLECTION
A collection is analogous to a table in RDBMS. Every scope has a default collection. While a collection can be used to store similar types of records, similar to RDBMS tables, there is no schema restriction on what you can store in a collection.
System keyspaces
Keyspaces | Description |
system:all_scopes | A list of all scopes, including system objects, such as the default scopes |
system:scopes | All available scopes except the system scopes |
system:all_collections | A list of all collections, including system objects, such as the default collections |
system:collections | All available collections except system collections |
Key benefits for N1QL Collections
From the N1QL language perspective, both the Data Definition Language(DDL) and Data Manipulation Language (DML) will be simpler as you no longer need a type field on every document.
Query will be simpler and more intuitive
Bucket model | Collection model | ||||
|
|
The collection model provides you with the equivalent of the relational table, bridging the similarity between the logical and the physical data model, that RDBMS users have accustomed to.
Simpler N1QL Join condition
Bucket model | Collection model | ||||
|
|
The need to use the type
field to restrict to a specific table of documents with the bucket model for the JOIN syntax is not longer required in the collection model.
Backward compatibility with Couchbase bucket model
Bucket model | Collection model | ||||
|
|
The bucket model is still supported as before. However, the bucket data is actually stored in the default scope and collection. Please note that the namespace prefix is required, and should have the value default:
for all references to the scope or collection.
N1QL Query with Collections
With the Collection model, the query engine needs to be aware of the full path of the collection name. This is because a collection name does not need to be unique in a bucket, but only in its own scope. A fully qualified collection name has the following format.
1 2 |
Format: namespace:bucket.scope.collection Example: namespace:`travel-sample`.booking.hotel |
However, you could reference a collection with its relative path by setting the query_context
Query Workbench UI allows you to set the query context by selecting the bucket and scope from the drop-down box.
The query context is also supported in SDKs, REST, and cbq.
Do you need to migrate to the Collections?
You do not need to migrate if you don’t want to.
- Data – All existing data remains in the same bucket. You can reference your documents using the bucket query syntax, or using the new default scope and collection.
- Queries – N1QL query syntax for the DDL and DML continue to support the bucket model.
- Indexes – The existing indexes will remain at the bucket level, and will continue to be available to all of your queries as before.
Migrating from Bucket to Collections
For this section, I will use the travel-sample
bucket as an example of to how to migrate from the bucket model to the collection model.
Data Migration
If your documents already have a field to identify their groups, then use that to create the collection. For the travel-sample, we can use the type
field as the collection name. In addition, I will also create an inventory scope for all the travel samples collectioms
1 2 3 4 5 6 7 8 9 10 11 |
CREATE SCOPE `travel-sample`.inventory; CREATE COLLECTION `travel-sample`.inventory.route; CREATE COLLECTION `travel-sample.inventory.landmark; CREATE COLLECTION `travel-sample`.inventory.airline; CREATE COLLECTION `travel-sample`.inventory.hotel; CREATE COLLECTION `travel-sample`.inventory.airport; |
In the above example, we have created an ‘inventory’ scope and added new collections in the same travel-sample
bucket.
Document Key
Document key needs to be unique. The document key that you currently have for your bucket should already be unique because they all exist in the same bucket. For that reason, the same document key should be adequate for use as a collection document key.
Copy the data
Here I use INSERT SELECT to copy the data from the bucket into the individual collection. I also use the existing META().id bucket key for the collection key.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
INSERT INTO `travel-sample`.inventory.landmark (KEY k, VALUE val) SELECT META().id k, t val FROM `travel-sample` t WHERE t.type='landmark' ; INSERT INTO `travel-sample`.inventory.airline (KEY k, VALUE val) SELECT META().id k, t val FROM `travel-sample` t WHERE t.type='airline' ; INSERT INTO `travel-sample`.inventory.hotel (KEY k, VALUE val) ELECT META().id k, t val FROM `travel-sample` t WHERE t.type='hotel' ; INSERT INTO `travel-sample`.inventory.airport (KEY k, VALUE val) SELECT META().id k, t val FROM `travel-sample` t WHERE t.type='airport' ; INSERT INTO `travel-sample`.inventory.route (KEY k, VALUE val) SELECT META().id k, t val FROM `travel-sample` t WHERE t.type='route' ; |
Index conversion
It is highly likely that you will need to modify your existing bucket indexes for them to be effective for the collection model. This section lists out the common patterns for bucket indexes, and recommends the steps to convert them to a collection based index.
Bucket Index with type
predicate – i.e. partial index
For a partial index with a specific type
field filter, you can simply re-create the new index on the specific collection for the type.
Bucket Model | Collection Model | ||||
|
|
Bucket Index without type predicate
A Global Secondary Index (GSI) index can be created for fields that may or may not exist in the document. For instance, an index may include the ‘icao’ field, but not every document may have the ‘icao’ field. For such an index, the indexer will only include the document that has the field. For this kind of index, you may need to be more specific, and create such an index for the collection where the field exists.
Bucket Model | Collection Model | ||||
|
|
Bucket index for common field
A bucket index can also be created without specifying the specific type, if the field exists in multiple document types. Consider the example where the field city
exists in multiple document types, i.e. airport, landmark and hotel. But the bucket model only has a single def_city
index, which can cover all three document types. For this index, you will now need to create a separate index for each collection.
Bucket Model | Collection Model | ||||
|
|
Query conversion
Since the underlying data model has changed from a shared bucket to an individual collection based. The queries that you have will also need to be modified. Furthermore, after you have modified the query, you will also need to ensure that the queries are using the new collection based indexes.
Bucket Model | Collection Model | ||||||
|
|