Many organizations looking to modernize their legacy databases are evaluating fully managed hosted DBaaS offerings that provide flexibility, performance and scale without compromising the trusted strengths of relational databases (RDBMS). With the introduction of Couchbase Capella DBaaS, customers can now benefit from exceptional price-performance to bring down their total cost of ownership (TCO).
However, ensuring that established data integration tools and processes can be used to move data into Couchbase Capella can be an important dependency in transitioning to cloud services. This is why we are excited to introduce the support of Couchbase 7.x in Talend Big Data Platform 8.0, covering both Couchbase Server and Capella.
Talend offers a range of data management capabilities, ranging from business process management to ETL and master data management. Talend’s new Big Data Platform 8.0 release contains an updated Couchbase connector, enabling the modernization and migration of business-critical applications.
Creating an ETL workflow for Couchbase Capella
This blog shows how to connect Talend with Couchbase Capella and perform a simple ETL workflow. Talend Big Data Platform will be leveraged to export relational data from a MySQL database into Couchbase using the JSON document format.
The workflow will read from a MySQL database table and create corresponding JSON documents in Couchbase Server. You can build upon this example and augment it to support more complex scenarios involving data transformations, complex mappings, etc.
The following steps have been tested on macOS but can be adapted for other systems.
Couchbase Capella offers a free 30-day trial and allows you to create your first cluster in a few clicks in just minutes. You can start your trial with a single click.
Just enter your nome, e-maile organization name. Once your email is verified, you can begin your trial. Simply select your preferred AWS region and click “Start Trial”.
The trial will create a single node Couchbase cluster in AWS. Once your cluster is created, we need to configure a data bucket for this example.
Navegue até Agrupamentos > Trial-Cluster > Baldes > Criar balde. We will name the bucket mundo.
Clique em Próximo to continue, accept the defaults and click Criar balde. Also, create the amostra de viagem bucket by navigating to Ferramentas > Importação and scroll down to Travel Sample and select Importação.
We will now need to configure access to the cluster from a remote workstation.
Navegue até Agrupamentos > Trial-Cluster > Conectar and scroll down to Acesso ao banco de dados > Manage Credentials, then +Create Database Credential.
Make sure to use the dropdown under Bucket Level Access and enter mundo, All Scopes, e Read/Write for the settings, then click Criar. Repeat for the amostra de viagem bucket.
Allowing local IP access to Capella
Next, we need to configure access to your local workstation as we will be running Talend Studio locally. Navigate to Conectar > Gerenciar IPs permitidos. From there, you can click + Adicionar meu IP.
The Couchbase Capella configuration is complete for now.
Install MySQL sample database
Next, we need to install and/or configure the MySQL server which is also running on the local workstation. I am using macOS and Homebrew to install additional packages.
To install MySQL via Homebrew, open a terminal and type: brew install mysql .
Download and import the mundo database:
- Baixar world.sql
- Start MySQL server, if not started already, and connect to server:
sudo mysql –u root –p
- Criar mundo database:
CREATE DATABASE world; - Load content from mundo database:
USE world;
SOURCE world.sql; - Verify the world database has been imported correctly:
SHOW TABLES; - The output should show three tables: City, País e CountryLanguage
Now it is time to install Talend Studio. You can download Talend Studio 8.0.1 from Talend Cloud. Navigate to your user and select Downloads:
Select Talend Studio and your platform from the dropdown and version 8.0.1.
Note: Talend recommends Java 11. For this exercise, I am using OpenJDK-11 installed via Homebrew with:
% brew install openjdk@11
Accept the license agreement once the Talend Studio is installed and provide a license file or connection to your Talend Cloud.
Install Couchbase connector for Talend Studio
After launching Talend Studio, you will need to install the Couchbase connector to proceed. Navigate through Help para o Feature Manager e procure por Couchbase.
After launching Talend Studio, navigate to Job Designs > Standard and right-click on New Job, enter the name MySQLToCouchbaseIntegration and click Acabamento.
This will create a panel where job artifacts can be dragged and dropped and should display the palette to the right. From the palette, drag and drop the tMyslInput widget to the left. Find the widget under Databases/MySQL.
From the palette, drag and drop tCouchbaseOutput widget (can be found under NoSQL Databases/Couchbase) to the right.
Double click on tMysqlInput on the panel; this will display the Component panel at the bottom.
Enter the MySQL database connection info.
- Clique no botão ... button, located to the right of Table Name, expand mundo e selecione o City entry and click on the OK botão.
- Remove the default query and replace with:
SELECT * FROM City
Below is an example screenshot of what it should look like:
Configure target schema
Clique no botão Guess schema button. This will bring up the Schema window; move and resize the window as appropriate.
Edit the schema with the following changes:
-
- Check Nullable for ID
- Check Nullable for Population
- Change ID type to String
Set DB types to the following and click on OK:
-
- INT for ID
- VARCHAR for fields Name, CountryCode, and District
- INT for Population
This screenshot shows all these schema settings:
Left-click on tMysqlInput and then right-click to select Row/Main and drag an arrow from the tMysqlInput component to tCouchbaseOutput. An arrow should be created between the two.
We will now navigate back to the Couchbase Capella console to get our connection endpoint. Navigate to Agrupamentos > Trial Cluster > Connections and copy the Rede de longa distância endpoint address:
Left-click on tCouchbaseOutput on the panel; this will display the Component panel at the bottom.
Enter Couchbase database connection info in the panel. Note that all fields must be surrounded by double quotes (i.e., “…”):
-
- O Bootstrap Nodes field is the value collected in the previous step preceded by couchbases://
- Provide credentials previously created in the Capella console for access.
- Set the Bucket name to world.
- Leave the default Document Type as JSON.
- Leave the field to use as ID as “ID”.
Clique no botão Executar panel to build the job and execute it. You can see progress in the Job window.
Once complete, the job will have added 4079 documents to the mundo balde:
You can access your Capella console directly or through the web control plane to verify they have loaded into the world bucket.
O Documentos view shows what is now loaded:
Talend component setup for Couchbase input
Next, we will use Couchbase as an input and run a N1QL query. This will allow you to interact with the Couchbase Scopes and Collections.
Navegue até Job Designs > Standard and right-click on New Job – enter the name CouchbaseInput and click Acabamento.
From the palette, drag and drop the tCouchbaseInput widget to the left. Find the widget under Databases NoSQL/Couchbase.
From the palette, drag and drop the tLogRow widget (found under Logs and Errors) to the right.
Double-click on tCouchbaseInput in the panel; this will display the Component panel at the bottom. Enter the Couchbase database connection info. Do Query Type dropdown, select N1QL and enter the query as follows:
1 2 |
SELECIONAR a.país DE padrão:`viagens-amostra`.inventário.companhia aérea a ONDE a.nome = "Excel Airways"; |
Below is an example screenshot of how it should look:
Left-click on tLogRow on the panel; this will display the Component panel at the bottom.
Change the field separator from “|” para “” and select Print Header.
Connect the two components.
Run the job. If successful, it should return the value Reino Unido.
Continue Talend & Couchbase learning
Here are some further resources to help you dig deeper:
-
-
Saiba mais sobre Talend Data Fabric.
-
Acesse o Talend Couchbase connector.
-
Comece a usar Couchbase Capella.
- For questions, reach out to us at partners@couchbase.com.
-