Many organizations looking to modernize their legacy databases are evaluating fully managed hosted DBaaS offerings that provide flexibility, performance and scale without compromising the trusted strengths of relational databases (RDBMS). With the introduction of Couchbase Capella DBaaS, customers can now benefit from exceptional price-performance to bring down their total cost of ownership (TCO).

However, ensuring that established data integration tools and processes can be used to move data into Couchbase Capella can be an important dependency in transitioning to cloud services. This is why we are excited to introduce the support of Couchbase 7.x in Talend Big Data Platform 8.0, covering both Couchbase Server and Capella.

Talend offers a range of data management capabilities, ranging from business process management to ETL and master data management. Talend’s new Big Data Platform 8.0 release contains an updated Couchbase connector, enabling the modernization and migration of business-critical applications.

Creating an ETL workflow for Couchbase Capella

This blog shows how to connect Talend with Couchbase Capella and perform a simple ETL workflow. Talend Big Data Platform will be leveraged to export relational data from a MySQL database into Couchbase using the JSON document format.

The workflow will read from a MySQL database table and create corresponding JSON documents in Couchbase Server. You can build upon this example and augment it to support more complex scenarios involving data transformations, complex mappings, etc.

The following steps have been tested on macOS but can be adapted for other systems.

Couchbase Capella offers a free 30-day trial and allows you to create your first cluster in a few clicks in just minutes. You can start your trial with a single click.

talend-couchbase-connector10

Just enter your nome, e-maile organization name. Once your email is verified, you can begin your trial. Simply select your preferred AWS region and click “Start Trial”.

Couchbase Capella signup

The trial will create a single node Couchbase cluster in AWS. Once your cluster is created, we need to configure a data bucket for this example.

Navegue até Agrupamentos > Trial-Cluster > Baldes > Criar balde.  We will name the bucket mundo.

Clique em Próximo to continue, accept the defaults and click Criar balde. Also, create the amostra de viagem bucket by navigating to Ferramentas > Importação and scroll down to Travel Sample and select Importação.

Import Couchbase sample data

We will now need to configure access to the cluster from a remote workstation.

Navegue até Agrupamentos > Trial-Cluster > Conectar and scroll down to Acesso ao banco de dados > Manage Credentials, then +Create Database Credential.

Create database credentials couchbase cloud

Make sure to use the dropdown under Bucket Level Access and enter mundo, All Scopes, e Read/Write for the settings, then click Criar. Repeat for the amostra de viagem bucket.

Allowing local IP access to Capella

Next, we need to configure access to your local workstation as we will be running Talend Studio locally. Navigate to Conectar > Gerenciar IPs permitidos. From there, you can click + Adicionar meu IP.

Add allowed IP for Couchbase cloud capella

The Couchbase Capella configuration is complete for now.

Install MySQL sample database

Next, we need to install and/or configure the MySQL server which is also running on the local workstation. I am using macOS and Homebrew to install additional packages.

To install MySQL via Homebrew, open a terminal and type: brew install mysql . 

Download and import the mundo database:

  • Baixar world.sql
  • Start MySQL server, if not started already, and connect to server:
    sudo mysql –u root –p
  • Criar mundo database:
    CREATE DATABASE world;
  • Load content from mundo database:
    USE world;
    SOURCE world.sql;
  • Verify the world database has been imported correctly:
    SHOW TABLES;
  • The output should show three tables: City, País e CountryLanguage

Now it is time to install Talend Studio. You can download Talend Studio 8.0.1 from Talend Cloud. Navigate to your user and select Downloads:

Install Talend Studio

Select Talend Studio and your platform from the dropdown and version 8.0.1.

Download Talend Studio from Talend Cloud

Note: Talend recommends Java 11. For this exercise, I am using OpenJDK-11 installed via Homebrew with:

% brew install openjdk@11

Accept the license agreement once the Talend Studio is installed and provide a license file or connection to your Talend Cloud.

Install Couchbase connector for Talend Studio

After launching Talend Studio, you will need to install the Couchbase connector to proceed. Navigate through Help para o Feature Manager e procure por Couchbase.

Talend feature manager with Couchbase

After launching Talend Studio, navigate to Job Designs > Standard and right-click on New Job, enter the name MySQLToCouchbaseIntegration and click Acabamento.

Create new job in Talend for Couchbase integration

This will create a panel where job artifacts can be dragged and dropped and should display the palette to the right. From the palette, drag and drop the tMyslInput widget to the left. Find the widget under Databases/MySQL.

couchbase input for talend operator

From the palette, drag and drop tCouchbaseOutput widget (can be found under NoSQL Databases/Couchbase) to the right.

Double click on tMysqlInput on the panel; this will display the Component panel at the bottom.

Enter the MySQL database connection info.

  • Clique no botão ... button, located to the right of Table Name, expand mundo e selecione o City entry and click on the OK botão.
  • Remove the default query and replace with:
    SELECT * FROM City

Below is an example screenshot of what it should look like:

Couchbase input to Talend

Configure target schema

Clique no botão Guess schema button. This will bring up the Schema window; move and resize the window as appropriate.

Edit the schema with the following changes:

    • Check Nullable for ID
    • Check Nullable for Population
    • Change ID type to String

Set DB types to the following and click on OK:

    • INT for ID
    • VARCHAR for fields Name, CountryCode, and District
    • INT for Population

This screenshot shows all these schema settings:

Couchbase schema in Talend loader

Left-click on tMysqlInput and then right-click to select Row/Main and drag an arrow from the tMysqlInput component to tCouchbaseOutput. An arrow should be created between the two.

tcouchbaseoutput in Talend

We will now navigate back to the Couchbase Capella console to get our connection endpoint. Navigate to Agrupamentos > Trial Cluster > Connections and copy the Rede de longa distância endpoint address:

Couchbase cloud network endpoint setup

Left-click on tCouchbaseOutput on the panel; this will display the Component panel at the bottom.

Enter Couchbase database connection info in the panel. Note that all fields must be surrounded by double quotes (i.e., “…”):

    • O Bootstrap Nodes field is the value collected in the previous step preceded by couchbases://
    • Provide credentials previously created in the Capella console for access.
    • Set the Bucket name to world.
    • Leave the default Document Type as JSON.
    • Leave the field to use as ID as “ID”.

talend-couchbase-connector16

Clique no botão Executar panel to build the job and execute it. You can see progress in the Job window.

talend-couchbase-connector3

Once complete, the job will have added 4079 documents to the mundo balde:

tcouchbaseoutput in Talend

You can access your Capella console directly or through the web control plane to verify they have loaded into the world bucket.

Viewing documents in Couchbase

O Documentos view shows what is now loaded:

JSON document list in Couchbase web console

Talend component setup for Couchbase input

Next, we will use Couchbase as an input and run a N1QL query. This will allow you to interact with the Couchbase Scopes and Collections.

Navegue até Job Designs > Standard and right-click on New Job – enter the name CouchbaseInput and click Acabamento.

Couchbase input job in Talend

From the palette, drag and drop the tCouchbaseInput widget to the left. Find the widget under Databases NoSQL/Couchbase.

talend-couchbase-connector9

From the palette, drag and drop the tLogRow widget (found under Logs and Errors) to the right.

talend-couchbase-connector18

Double-click on tCouchbaseInput in the panel; this will display the Component panel at the bottom. Enter the Couchbase database connection info. Do Query Type dropdown, select N1QL and enter the query as follows:

Below is an example screenshot of how it should look:

tcouchbaseinput component in Talend

Left-click on tLogRow on the panel; this will display the Component panel at the bottom.

Change the field separator from “|” para “” and select Print Header.

talend-couchbase-connector7

Connect the two components.

talend-couchbase-connector1

Run the job. If successful, it should return the value Reino Unido.

talend-couchbase-connector23

Continue Talend & Couchbase learning

Here are some further resources to help you dig deeper:

Autor

A ordem da postagem em relação a outras postagens. David Schexnayder, arquiteto de soluções de parceiros principais

Posição vertical a partir do topo para iniciar o corte como uma porcentagem da altura da imagem.