Operations Manual for LxDB


Table of Contents

1. Introduction
2. Installation
   2.1. Dependencies
   2.2. Unpacking
   2.3. Simple Configuration. The inventory file
   2.4. Completing the Bare Metal Installation
   2.5. Advanced Configuration
3. High Availability
   3.1. Configuring High Availability
   3.2. Implications
   3.3. Replication
4. Operations
   4.1. Start, Stop & Check your LeanXcale Cluster
   4.2. Monitoring
   4.3. LOGs & Alerts
   4.4. Backup & Recovery
   4.5. Data Partitioning & Distribution
   4.6. Authentication, SSL & Permissions
   4.7. System DB Tables & Views
   4.8. LXSYSMETA.TABLE_CHECKS
   4.9. LXSYSMETA.TRANSACTIONS
   4.10. LXSYSMETA.CONNECTIONS
   4.11. Cancelling Transactions
   4.12. Closing a Connection
5. The Admin Console
   5.1. Common functions
   5.2. Data Distribution & Replication Functions
   5.3. Advanced Functions
6. LeanXcale Tools
   6.1. lxClient
   6.2. lxBackup
   6.3. lxCSVLoad
   6.4. lxDataGenerator
   6.5. lxSplitFromCSV
   6.6. checkdb.py
7. Query Hints
   7.1. Enable hints
   7.2. Disable hints
   7.3. List hints
   7.4. Define hints
   7.5. Force Join Type
   7.6. Or to Union
   7.7. Parallel Hints
   7.8. Remove Hints
   7.9. Clean Query Plan Cache
8. Annex I. Advanced Configuration
   8.1. Inventory Configuration File
   8.2. Resource Allocation
   8.3. Filesystems
9. Annex II. Alerts
   9.1. Events from the Query Engine
   9.2. Events regarding Health Monitor
   9.3. Configuration Events
   9.4. Transaction Log Events


Chapter 1. Introduction

Lx-DB is an ultrascalable distributed database. It is built from several components that can scale out to cope with your workload. As with any database, you can get better performance from a bare metal installation, but Docker brings significant deployment advantages, so this document focuses on how to deploy Lx-DB using Docker and start working with it.

Because it is distributed, Lx-DB can be deployed in very different ways, and the most efficient deployment depends on the workload. Lx-DB provides some interesting elasticity capabilities, but they are not yet automated. The LeanXcale team is working on an Elastic Manager that can automatically adapt your deployment to the workload. Right now, for the sake of simplicity, the deployment is based on a single Docker image in which you can start just one component or a set of components.

This document assumes you have already pulled the Lx-DB Docker image and are ready to run it on your machines. Different instances of the Docker image running different components can be orchestrated through Kubernetes, Docker Compose, etc., but we assume you are familiar with those technologies, so the document focuses on the general Docker concepts.

To get into the details and make the most of the deployment, you should read the Architecture and Concepts documents and then go to the section Docker Deployment Details:

• LeanXcale Concepts
• Component Architecture


Chapter 2. Installation

2.1. Dependencies

In order for the installation to execute, the host and all target machines must meet some prerequisites that need to be ensured before the installation takes place. These are summarized as follows (a sketch of the machine preparation steps follows this list):

• The machines can use either an Ubuntu, CentOS, or RHEL Linux operating system.

• You need the following (very common) components to be installed:

- Ansible (>= 2.6)

- Java release 8

- Python 2 (>= 2.7)

- bash

- screen

- libpthread

- numactl (unless you do not want to take advantage of NUMA)

- nc (netcat), used to check that ports are bound

• All the machines should have the LeanXcale user created (USER) and the base folder for the install (BASEDIR). This folder must be owned by USER.

• All the machines should have SSH connectivity, with the right SSH keys set up for the user controlling the cluster (USER), so it can access any machine in the cluster without any further authentication. This way the scripts can deploy, configure, start, and stop components remotely without asking for a myriad of passwords for each action.
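The following is a minimal sketch of preparing one target machine; the user name, base folder, and target hostname are illustrative examples, not requirements of the installer:

# Run on each target machine (assumes sudo rights; names are illustrative)
sudo useradd -m lx                      # the LeanXcale user (USER)
sudo mkdir -p /ssd/lxdb                 # the base folder (BASEDIR)
sudo chown -R lx:lx /ssd/lxdb           # BASEDIR must be owned by USER

# Run once on the orchestrator machine to enable passwordless SSH
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
ssh-copy-id lx@target-machine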

2.2. Unpacking

The first step of the installation is choosing a machine from the cluster to be the master or orchestrator of the cluster (it is usually a good idea to have at least two machines that can play this role). Connect to that machine and set the BASEDIR environment variable, preferably in your .profile:

export BASEDIR=/ssd/lxdb


Now you can unpack the TAR installation package in the base directory of the master server and load the environment:

cd $BASEDIR
tar xvfz LeanXcale_v{Version}.tgz
source ./env.sh

Now the master server of the cluster is ready to be configured, so you can later deploy the installation to the rest of the servers.

2.3. Simple Configuration. The inventory file

To configure the cluster, the basic information you need to know is just the hostnames of the machines you want to deploy on. With that information you can configure the servers of the cluster in the inventory file:

vim conf/inventory

Below you can see an example of a simple configuration file:


[all:vars]
# BASEDIR for the installation is taken from the environment variable $BASEDIR
BASEDIR="{{ lookup('env','BASEDIR') }}"
# FOLDER to store BACKUPs in
BACKUPDIR=/ssd/leandata/backup/
# Login account for the cluster
USER="{{ lookup('env','USER') }}"
# If {{USER}} has SUDO permissions add SUDO=sudo. Otherwise you may need to
# create some folders and grant access to the USER for some actions to work
SUDO="sudo"
#SUDO="echo"

# If you want to use NUMA say yes
NUMA=no
USEIP=yes
HA=no
HAPROXY=yes

netiface=eth0
forcenkvds=1
forcememkvds=2G
forcencflm=1
forcecflm=1

[defaults]
# The following line tells the configuration script to get the values from the
# machine. All resources will be allocated to LX-DB
sizing mem=- sockets=- ncores=- nthreadscore=- fsdata=. fslog=.

# Metadata servers. Currently, you can configure only one
[meta]
bladeMETA sockets=- ansible_connection=local

# Datastore Servers. You can have multiple data store instances
[datastores]
bladeDATA1 sockets=0-1:0-1
bladeDATA2 sockets=0-1:0-1

2.3.1. Default Resource Allocation

The line starting with sizing defines the default resource allocation. It assumes that all machines in the cluster have the same hardware configuration; however, the parameters in the line can be defined per machine if you need to override the default sizing. The line has the following parameters:


• mem: Memory available in the machine to run the components (in GB). If not set, it defaults to taking all the memory in each machine and distributing it for LeanXcale.

• sockets: List of sockets in the machine to be used to run the components. Values like 0-2, 0,1,2, or 0,4 are allowed. On datastore machines there must be two socket lists: the socket list for KVDS and the socket list for QE, separated by a colon ':'. Again, if not set, all sockets in the machine will be used.

• ncores: The number of physical cores to be used in each machine to run Lx-DB components. Note that this counts physical cores, so hyperthreads should not be counted.

• nthreadscore: Number of hyperthreads per core.

• fslog: Folder where the transaction logging files are located. These are not the component logs, but the transaction logs that guarantee database durability. If SUDO is not granted, a folder with RW permissions for USER should have been created by the administrator.

• fsdata: Parent folder where the database stores data. If the section [DSFs] is available, then those folders will be used instead.

It is highly recommended to keep transaction redo logging on different disks from the database data.

An example of this configuration is the following:

...
sizing mem=128G sockets=0-1 ncores=12 nthreadscore=0 fslog=/txnlogs fsdata=/lxdata
...

This means: the machines have 128 GB available and two sockets (0 to 1); there are 12 physical cores in total and no hyperthreading. Redo logging will be written to the filesystem mounted at /txnlogs, and data will go to /lxdata.


2.3.2. Default Component Deployment

Components are typically classified into two classes:

• Metadata components:

- Zookeeper

- Configuration Manager

- Snapshot Server

- Commit Sequencer

- KiVi Metadata Server

• Datastore components:

- Query Engine

- Transactional Loggers

- KiVi Datastore Server

• Conflict Managers: these are not in either of the two classes but, for the sake of simplicity, are deployed by default with the metadata components, unless High Availability is set up.

Given this classification, the metadata components are deployed on the metadata server, while datastore components are distributed considering the number of datastore servers and their hardware capacity.

2.4. Completing the Bare Metal Installation

Once you have finished changing the configuration file, you can complete the installation. This last step builds your detailed configuration file, including the resource allocation, and deploys the installation to the rest of the servers of the cluster. From this point on, the cluster is installed and ready to be started.

2.5. Advanced Configuration

There are quite a few detailed parameters that can be set to fine-tune your installation. They are further discussed in Annex I. Advanced Configuration:

• Resource allocation


• Filesystems

• Detailed component deployment


Chapter 3. High Availability

LeanXcale can be set up as a high-availability cluster. High availability means that, in case of a single component failure or single machine failure, the system keeps working non-stop. You can further tune the configuration parameters so the system keeps working in case of a two-machine failure, or for even stronger high-availability requirements.

3.1. Configuring High Availability

Configuring High Availability is pretty simple; it just requires that you set the following configuration parameter in the inventory file:

HA=yes

3.2. Implications

The implications of configuring this parameter are:

• The install script will check that there are at least three machines configured for the cluster. There have to be at least two machines defined as metadata servers and two machines defined as datastore servers. One machine can act both as metadata and datastore, which is why there have to be at least three machines in the cluster.

The following could be a valid HA configuration:


[all:vars]
BASEDIR="{{ lookup('env','BASEDIR') }}"
BACKUPDIR=/ssd/leandata/backup/
USER="{{ lookup('env','USER') }}"

HA=yes
netiface=eth0

[defaults]
sizing mem=- sockets=- ncores=- nthreadscore=- fsdata=. fslog=.

[meta]
metaserver1
dataandmetaserver1

[datastores]
dataandmetaserver1
dataserver2

• Components will be installed as follows:

• Zookeeper: Zookeeper is the master for keeping the global configuration, health information, and arbitration. The install script will configure a Zookeeper cluster with three Zookeeper members replicated on different machines.

• Transaction Loggers: The loggers are components that log all transaction updates (called writesets) to persistent storage to guarantee the durability of transactions. The loggers will be configured and replicated in groups so that each group has a different logger component on a different machine. The logging information is therefore replicated so that it can be recovered in case of a crash.

• Conflict Managers: Conflict Managers are in charge of detecting write-write conflicts among concurrent transactions. They are highly available per se, because if one Conflict Manager fails, its conflict key buckets are transferred to another Conflict Manager. No special configuration is needed for High Availability, except guaranteeing that there are at least two Conflict Managers configured on two different machines.

• Commit Sequencer: The Commit Sequencer is in charge of distributing commit timestamps to the Local Transactional Managers. Two Commit Sequencers will be configured, but only one will act as master and the other as follower. The follower will take over if the master fails.

• Snapshot Server: The Snapshot Server provides the freshest coherent snapshot on which new transactions can be started. As with the Commit Sequencer, two Snapshot Servers will be started; one will act as master and the other as follower. The follower will take over if the master fails.

• Configuration Manager: The Configuration Manager handles system configuration and deployment information. It also monitors the other components. Two Configuration Managers will be started, and one will be the master.

• Query Engine: The Query Engine parses SQL queries and transforms them into a query plan, which derives a set of actions on the datastores. The query engines are usually configured in a cluster in which any of them can take part of the load, so High Availability has no special requirement on them.

• Datastores: There will usually be several datastores on different machines. The important point for datastores is replication.

3.3. Replication

So far, the High Availability configuration ensures the components are configured in such a way that no component failure or machine crash will cause the system to fail.

While high availability is about component availability, data replication is about data availability: if a machine or a disk crashes, there is another copy of the data, so you can keep working on that copy. Data replication can be enabled regardless of whether you also want high availability.

LeanXcale provides a fully synchronous replication solution, and you can configure as many replicas as you want. Basically, you can define a mirroring datastore configuration in the inventory file:

...
[datastores]
mirror-1:
  server1
  server2
mirror-2:
  server3
  server4

This doesn’t mean every server needs to be mirrored. You may also have unmirrored servers to keep data that doesn’t need to be replicated.

Besides, there are two kinds of replicas:


• Replicas to minimize the risk of losing data and to be able to recover service as soon as possible.

• Replicas to improve performance. These are usually small tables that are used really frequently (usually read), so the application can benefit from having several copies that can be read in parallel.


Chapter 4. Operations

The scripts in this section wrap Ansible playbooks. The playbooks are in $BASEDIR/admin/yml and coordinate scripts that run at each node. Before running any script, it is recommended that you load the environment variables, so all operations are preceded by:

> cd $BASEDIR
$BASEDIR> source env.sh

4.1. Start, Stop & Check your LeanXcale Cluster

• Start Cluster: Starting the cluster just requires running a script:

> cd $BASEDIR
$BASEDIR> source env.sh
$BASEDIR> ./admin/startcluster.sh

The script checks whether Zookeeper is up and running; if it is not running, it starts it. Then it tries to start all the components.

• Stop Cluster: The current stopcluster.sh script uses lxConsole by default to do a graceful stop. It allows the following modifiers, which cannot be combined:

- force: Tries to shut down all components gracefully. Then, after 10 seconds, kills all components.

- timeout=N: Tries to shut down all components gracefully but, for each component, if the stop procedure has not finished after N seconds, kills the component.

There is a subtle difference between the force and timeout parameters. force waits for all processes to do a complete stop and then kills any process that may have remained up. On the other hand, when using timeout, components are killed after N seconds even if the stop process is still trying to stop the component.

I usually use force to be sure no component keeps running if there is any problem in the stop chain, and timeout if a previous stop command got stuck for some time.


stopcluster never stops Zookeeper unless specifically requested.

• Check the Cluster: To check whether the components are really running and the ports are bound:

$BASEDIR/admin> ./checkcluster.sh

checkcluster doesn’t check directly whether Zookeeper is running, but it does check that the Zookeeper port is listening.

• Restart components: This performs the previous check and restarts all components that are found not to be running:

$BASEDIR/admin> ./checkandrestart.sh

• Start/Stop a certain component: The best way to start a certain component is usually to use checkandrestart as described previously. However, you can do it specifically by adding the component group to the start or stop script. (Note that currently this can be done only for one group at a time.)

# stop query engines 1 to 4 in all Datastores
$BASEDIR/admin> ./stopcluster.sh DS QE-1..4
# start zookeeper in Metadata server
$BASEDIR/admin> ./startcluster.sh MS ZK
# Try to stop CommitSequencer gracefully, but kill after 20 seconds
$BASEDIR/admin> ./stopcluster.sh timeout=20 MS CmS
# Start all Conflict Managers (you don't need to know how many there are) in MetaServer
$BASEDIR/admin> ./startcluster.sh MS CflM-0

All component groups configured for the cluster can be seen in $BASEDIR/scripts/env.sh, in the variable components. The order of parameters in the scripts is important.
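A quick, hypothetical way to list the configured groups (the exact layout of the variable in env.sh may differ):

$BASEDIR> grep 'components' scripts/env.sh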

• This is the recommended sequence to be sure that the system is completely stopped, including Zookeeper:

## First stop all the workload
$BASEDIR/admin> ./stopcluster.sh
$BASEDIR/admin> ./stopcluster.sh timeout=1
$BASEDIR/admin> ./stopcluster.sh MS ZK


4.2. Monitoring

LeanXcale monitoring is built on two open-source solutions: Prometheus and Grafana.

[Figure: architecture of the monitoring components]

The LeanXcale Monitoring and Alerting System consists of the following elements:

• Prometheus: Centralizes and stores metrics in a time-series database.

• Alert Manager: Triggers alarms based on information stored in Prometheus (it is also part of the Prometheus software).

• Exporters: Running on monitored hosts, they export metric sets to feed Prometheus:

- Mastermind exporter: Publishes metrics from the following Lx components: Snapshot Server, Commit Sequencer, and Configuration Manager.

- node-exporter: Comes by default with the Prometheus installation and exports many metrics about machine performance.

- metricsExporter.sh: Exports metrics about Lx components at two levels:

· the Lx component running Java code inside the JVM, and

· the Lx component processes at OS level (I/O, memory, ...).

In addition, it is possible to feed Prometheus with additional metrics. It only needs key-value files placed under the /tmp/scrap path. It is common to use these to export metrics resulting from executing specific processes. For example, this is the file generated in the TPCC benchmarking tests (a sketch of producing such a file follows the example):


tpccavglatency 101
tpmCs 28910
aborttpmCs 124
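A minimal sketch of producing such a file from a shell script; the file name and metric names are illustrative, and writing to a temporary file first avoids Prometheus reading a half-written file:

# write the metrics atomically into /tmp/scrap (names are examples)
echo "myjob_rows_processed 125000" > /tmp/scrap/.myjob.tmp
echo "myjob_elapsed_seconds 42" >> /tmp/scrap/.myjob.tmp
mv /tmp/scrap/.myjob.tmp /tmp/scrap/myjob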

Monitoring is started when the cluster is started, and the central components (Prometheus and Grafana) are deployed on the metadata server. To access the monitoring dashboard, open the following URL in your browser:

http://{Metadata server hostname}:3000/

[Figure: example of a working monitoring dashboard]


4.3. LOGs & Alerts

In this section we discuss log files: not the transactional log files needed to guarantee durability, but the information the components provide about their activity.

In LeanXcale all components log their activity to their own log files. The default logging level is INFO, which also includes WARNINGs and ERRORs. However, the log level can be increased or decreased according to the situation.

Log files are stored on each machine in the folder:

$BASEDIR/LX-DATA/logs

Besides each component's log, there is a central alert system. All components send messages considered relevant as alerts to the central system.

Alerts can be accessed through The Admin Console or through a web browser at the URL:

http://{Metadata server hostname or IP address}:9093

Annex II. Alerts provides detailed information on the alerts the system can show and their descriptions.


You can get the LOGs from all machines in the cluster with the following command:

> cd $BASEDIR
$BASEDIR> source env.sh
$BASEDIR> ./admin/getlogs.sh

The script gets the logs and stores them, creating one folder for each hostname in the cluster: $BASEDIR/tmp/{hostname of each of the nodes}.

You can also get a summary of the ERROR messages shown in every LOG file on each cluster node:

> cd $BASEDIR
$BASEDIR> source env.sh
$BASEDIR> ./admin/geterrors.sh

This looks for the first three and last three ERROR messages in each log file on each machine and then shows the summary.

4.4. Backup & Recovery

LeanXcale provides a hot backup mechanism that allows a full backup of your distributed database while guaranteeing consistent point-in-time recovery. The system remains fully available during the backup. To get point-in-time recovery, you need to back up and recover not only the database datastore files, but also the transaction log files that hold the transaction information from the backup period up to the point you want to recover.

This means you need to size your transaction log filesystems to keep the files needed for backup, rather than just to guarantee simple database persistence. This size depends on your workload, but it is quite simple to estimate with the first tests.

Currently the backup process compresses all backup information into one folder per machine, so you can later move all the backup data to the backup device available in your organization. The folder where compressed backup data is stored is defined in the inventory file by the parameter BACKUPDIR. We suggest you keep backups on some kind of central platform, according to a backup policy, and remove them from the local folders (a sketch of such a routine follows the listing below).

To perform the backup, run the administrator tool admin/hotbackupcluster.sh. This action creates a backup meta file in the backup destination folder and tars all the data (loggers and persisted data) and the relevant database status. All this information is named using the timestamp at which the hot backup started. This is how the backup folder will look:

$ ls -1 backup/
kivi_ds_data_dir_1-kvds.tar.gz
kivi_ms_data_dir.20190827-130652.tar.gz
kivi_ms_data_dir.tar.gz
meta.backup.20190827-130652
tm_logger_ltm_1.00000000.log.20190827-130652.tar.gz
zookeeper.20190827-130652.tar.gz
zookeeper.tar.gz
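A minimal sketch of such a routine; backuphost and /backups/lxdb are placeholders for your central backup platform, and the local path matches the BACKUPDIR set in the inventory example above:

# run the hot backup, ship it off-host, then clean the local folder
$BASEDIR> ./admin/hotbackupcluster.sh
$BASEDIR> rsync -av /ssd/leandata/backup/ backuphost:/backups/lxdb/$(date +%Y%m%d)/
$BASEDIR> rm -rf /ssd/leandata/backup/*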

4.4.1. Hot backup meta file

The backup process generates a backup meta file that includes all the information necessary to perform the recovery from a hot backup. This includes the last STS persisted in the data storage files and the transaction log status, along with the range of timestamps recorded.

kiviSts=88000
kvds[0].serviceIp=127.0.0.1,9992,kvds-XPS-LX-1
kvds.length=1
loggers[0].type=LgLTM
loggers[0].serviceIp=127.0.0.1,13422,logger_ltm-0
loggers[0][0].file=tm_logger_ltm_1/00000000.log
loggers[0][0].firstTs=997001
loggers[0][0].lastTs=10914004
loggers[0].length=1
loggers[1].type=LgCmS
loggers[1].serviceIp=127.0.0.1,13400,logger_cs-0
loggers[1].length=0
loggers.length=2

4.4.2. Recovery from a hot backup

To recover from a hot backup, just run the recovery with the hot backup option as follows:

admin/recovercluster.sh hotbackup

This action stops LeanXcale, finds a backup meta file in BACKUPDIR, and restarts LeanXcale with the related backup data. It may take some minutes, depending on the amount of non-persisted data from the loggers.


4.4.3. Cold Backup

Sometimes you need to back up a non-critical installation, or to move data from a preproduction environment to a development backup. To do a cold backup you need to stop LeanXcale first, but the backup process is faster and doesn't require any transaction log file, so recovery is also faster and easier.

To perform this kind of backup, run the administrator tool admin/backupcluster.sh, remembering that you need to stop LeanXcale first. This action saves all the data, metadata, and Zookeeper files in BACKUPDIR.

4.4.4. Cold Backup recovery

As in the case of hot backup, just execute the administrator tool admin/recovercluster.sh with no arguments. This stops LeanXcale, restores the backup from BACKUPDIR, and starts LeanXcale again.

4.4.5. Data Export

When you only want to move some data from one environment to another, for example between testing environments, you can perform a data backup with the tool LX-BIN/bin/lxBackup. This dumps all database data and metadata to the desired folder.

In case you are planning to change the LeanXcale version, or want to test LeanXcale against another database system, we suggest using the csv option for flexibility. This dumps data in CSV files so it can easily be moved to any other system.

To import the backup data, just use the --import option. Find below other useful options:

$ LX-BIN/bin/lxBackup --help
usage: cmd [options]
 -bs,--blocksize <arg>           Block size when reading from binary file.
                                 Default is 4096
 -csv,--csvformat                Import/Export in csv format
 -f,--folder <arg>               Folder to import/export data from/to. It
                                 is created when exporting. Mandatory when
                                 importing. Default is /tmp/lxbackup
 -h,--dsaddress <arg>            Datastore address. Not needed when
                                 connecting through Zookeeper. Default is
                                 localhost
 -help,--help                    show usage
 -i,--import                     Import action. Default is export
 -p,--dsport <arg>               Datastore port. Not needed when
                                 connecting through Zookeeper. Default is
                                 1529
 -sep,--recordseparator <arg>    Record separator. Default is '@\n'
 -skd,--skipdata                 Skip Importing/exporting table data
 -skp,--skipperms                Skip Importing/exporting permissions
 -sksq,--skipsequences           Skip importing/exporting sequence
                                 definitions
 -sktdef,--skiptabledefs         Skip Importing/exporting table definitions
 -skti,--skiptableindexes        Skip Importing/exporting table indexes
 -skts,--skiptablesplits         Skip Importing/exporting table regions
                                 definitions
 -tgz,--tarzip                   Compress and remove target folder
 -th,--datathreads <arg>         Number of threads to handle data. Default
                                 is 4
 -tl,--tablelist <arg>           JSON with the list of tables and limits to
                                 be imported/exported. Example:
                                 {"backupFileTableInfos": [{
                                 "tableName":"t1", "limits":[{"min": "3",
                                 "max": "10"},{"min": "10", "max":
                                 "20"}]}]}
 -ts,--timestamp <arg>           Fixed Timestamp. Not needed when
                                 connecting through Zookeeper
 -tsplit,--tablesplit            Export table data regions to different
                                 files. Overridden by tablelist file
 -zh,--zkaddress <arg>           ZooKeeper address
 -zp,--zkport <arg>              ZooKeeper port. Default is 2181
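An illustrative export/import round trip using the documented options; the Zookeeper host name metaserver1 is an example:

# export the whole database in CSV format, resolving datastores through Zookeeper
$ LX-BIN/bin/lxBackup -csv -f /tmp/lxbackup -zh metaserver1 -zp 2181
# import it later into a clean database
$ LX-BIN/bin/lxBackup -i -csv -f /tmp/lxbackup -zh metaserver1 -zp 2181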

4.5. Data Partitioning & Distribution

One key element of a distributed database is how data is distributed among the different datastores.

LeanXcale provides three ways of partitioning data:


• Key range partitioning: This is the most efficient way of partitioning if you know how to balance your data. You can use the admin console to partition your data and move it to the right datastore; see The Admin Console for further information on how to do it (and the console sketch after this list). If you have a CSV file with a significant sample of your data, there is also a tool that samples the file and recommends the split points for balanced key ranges according to the information in the CSV file. This tool is lxSplitFromCSV, and there is further information in the section describing the tools.

• Hashing: Hashing is a very simple partitioning scheme and has the advantage of balancing data very well across datastores. The system calculates a hash value from the key and distributes data according to the modulus of the hash value. The disadvantage of hashing is that, for scans, the system has to scan all datastores, because the data can be stored in any of them. Thus, scans require more resources than with key range partitioning.

• Bidimensional partitioning: Bidimensional partitioning is an automatic partitioning scheme that is normally set up on top of one of the previous schemes. You need to define a time-evolving parameter and a criterion that will cause the system to automatically partition your data to make the best use of resources.
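A minimal key range partitioning sketch, using the admin console commands described in Section 5.2; the table name, key range, and prompt format are illustrative:

> lxConsole
# split table ORDERS into 4 balanced key ranges between 0 and 1000000,
# distributing the resulting regions round-robin across the datastores
splitTableUniform ORDERS 0 1000000 4
# verify the resulting data regions
listRegions ORDERS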

4.6. Authentication, SSL & Permissions

There are two options to configure LeanXcale authentication:

• LDAP-based authentication: You can set up an LDAP server just for LeanXcale, but this is most interesting for integrating LeanXcale in your organization's LDAP (or Active Directory) and providing some kind of Single Sign-On, or at least the chance to use a common password within the organization.

• Open shared access: This is not to be used in production except for shared data. The access level can be set in the firewall based on IP rules, but all connecting users will be granted access and will be able to use a user schema. This is very easy to set up for development and testing environments.

Permissions are granted through roles.


Communications can be configured so that SSL is used for the connections between components, though it is recommended to run all the components behind a firewall and to use SSL for JDBC and the clients' connections.

4.7. System DB Tables & Views

LeanXcale's query engine provides a series of virtual tables that expose system information useful to the end user or the administrator. The following sections describe the system virtual tables available.

4.7.1. SYSMETA.TABLES

It shows all the tables in the database instance.

> select * from sysmeta.tables;
+----------+------------+-----------+-----------+---------+---------+-----------+----------+------------------------+---------------+
| tableCat | tableSchem | tableName | tableType | remarks | typeCat | typeSchem | typeName | selfReferencingColName | refGeneration |
+----------+------------+-----------+-----------+---------+---------+-----------+----------+------------------------+---------------+
|          | APP        | CUSTOMER  | TABLE     |         |         |           |          |                        |               |
|          | APP        | DISTRICT  | TABLE     |         |         |           |          |                        |               |
|          | APP        | HISTORY   | TABLE     |         |         |           |          |                        |               |
+----------+------------+-----------+-----------+---------+---------+-----------+----------+------------------------+---------------+

4.7.2. SYSMETA.COLUMNS

It shows all the column information related to the tables.


> select * from sysmeta.columns;
+----------+------------+-----------+------------+----------+----------+------------+--------------+---------------+--------------+----------+-------+
| tableCat | tableSchem | tableName | columnName | dataType | typeName | columnSize | bufferLength | decimalDigits | numPrecRadix | nullable | remar |
+----------+------------+-----------+------------+----------+----------+------------+--------------+---------------+--------------+----------+-------+
|          | APP        | CUSTOMER  | C_ID       | 4        | INTEGER  | -1         | null         | null          | 10           | 1        |       |
|          | APP        | CUSTOMER  | C_D_ID     | 4        | INTEGER  | -1         | null         | null          | 10           | 1        |       |
|          | APP        | CUSTOMER  | C_W_ID     | 4        | INTEGER  | -1         | null         | null          | 10           | 1        |       |
|          | APP        | CUSTOMER  | C_FIRST    | 12       | VARCHAR  | -1         | null         | null          | 10           | 1        |       |
|          | APP        | CUSTOMER  | C_MIDDLE   | 12       | VARCHAR  | -1         | null         | null          | 10           | 1        |       |
...

4.7.3. LXSYSMETA.PRIMARY_KEYS

It shows all the primary keys related to the tables in the system.

4.7.4. LXSYSMETA.FOREIGN_KEYS

It shows all the foreign keys related to the tables in the system.

4.7.5. LXSYSMETA.INDEXES

It shows all the indexes created for the tables in the system.


> select * from lxsysmeta.indexes;
+-----------+----------------+-------------+------+-----------------+------------+-----------+-------------+-------+-----------------+----------+------------+-----------+
| nonUnique | indexQualifier | indexName   | type | ordinalPosition | columnName | ascOrDesc | cardinality | pages | filterCondition | tableCat | tableSchem | tableName |
+-----------+----------------+-------------+------+-----------------+------------+-----------+-------------+-------+-----------------+----------+------------+-----------+
| false     | APP            | IX_ORDERS   | 2    | 2               | O_W_ID     | A         | 0           | 0     |                 | tpcc     | APP        | ORDERS    |
| false     | APP            | IX_ORDERS   | 2    | 1               | O_D_ID     | A         | 0           | 0     |                 | tpcc     | APP        | ORDERS    |
| false     | APP            | IX_ORDERS   | 2    | 3               | O_C_ID     | A         | 0           | 0     |                 | tpcc     | APP        | ORDERS    |
| false     | APP            | IX_CUSTOMER | 2    | 2               | C_W_ID     | A         | 0           | 0     |                 | tpcc     | APP        | CUSTOMER  |
| false     | APP            | IX_CUSTOMER | 2    | 1               | C_D_ID     | A         | 0           | 0     |                 | tpcc     | APP        | CUSTOMER  |
| false     | APP            | IX_CUSTOMER | 2    | 5               | C_LAST     | A         | 0           | 0     |                 | tpcc     | APP        | CUSTOMER  |
+-----------+----------------+-------------+------+-----------------+------------+-----------+-------------+-------+-----------------+----------+------------+-----------+

4.8. LXSYSMETA.TABLE_CHECKS

It shows all the table checks (NOT NULL, …), autoincremented columns, and geohashed fields. Geohash fields are fields indexed geographically for GIS applications.

4.9. LXSYSMETA.TRANSACTIONS

It shows all active transactions in the Query Engine, relating the information to the connection. The following fields are shown:

• txnId: The internal unique identifier of the transaction.

• state: The state of the transaction.

• startTs: The timestamp that controls the visibility of the transaction according to snapshot isolation principles.

• startTime: The time, in milliseconds from the epoch, when the transaction was started.

• commitTimeStamp: Usually -1, meaning that the transaction is not yet doing COMMIT. Since COMMIT is a short phase of a transaction, you will seldom see a transaction with a meaningful COMMIT timestamp.

• sessionId: The internal session identifier in the Query Engine.

• connectionId: Allows relating the transaction to its connection.

• uid: Identifier of the user who owns the session in which the transaction is running.

• connectionMode: Mode of the connection.

• numPendingScans: Number of SCANs the transaction has started that are not yet finished.

• numTotalScans: Total number of SCANs for the transaction.

• hasWrite: True if the transaction did any write operation (UPDATE, INSERT).

> select * from lxsysmeta.transactions;
+-------------+--------+-------------+-----------------+-----------------+-----------+--------------------------------------+-----+----------------+-----------------+---------------+-------+
| txnId       | state  | startTs     | startTime       | commitTimeStamp | sessionId | connectionId                         | uid | connectionMode | numPendingScans | numTotalScans | hasWr |
+-------------+--------+-------------+-----------------+-----------------+-----------+--------------------------------------+-----+----------------+-----------------+---------------+-------+
| 44011298001 | ACTIVE | 44011298000 | 526814361087000 | -1              | 1217      | 835e1f1a-cd0a-4766-9e53-182ed1bb39d2 |     | RUN            | 0               | 0             | false |
+-------------+--------+-------------+-----------------+-----------------+-----------+--------------------------------------+-----+----------------+-----------------+---------------+-------+

4.10. LXSYSMETA.CONNECTIONS

It shows information about all existing connections.


> select * from lxsysmeta.connections;
+--------------------------------------+--------------------------------------+-----------+-------+----------------+------------------------------------------------------------------------+
| connectionId                         | kiviConnectionId                     | isCurrent | dsuid | connectionMode |                                                                        |
+--------------------------------------+--------------------------------------+-----------+-------+----------------+------------------------------------------------------------------------+
| 0b7ef86b-924f-4a1b-bf28-2e0661cd47a2 | 835e1f1a-cd0a-4766-9e53-182ed1bb39d2 | true      | app   | RUN            | {parserFactory=com.leanxcale.calcite.ddl.SqlDdlParserImpl#FACTORY,sche |
+--------------------------------------+--------------------------------------+-----------+-------+----------------+------------------------------------------------------------------------+

4.11. Cancelling Transactions

There are some situations when you run the wrong query and realize it will take too long, because an index or a condition is missing, and you don't want to leave the transaction running and taking up resources.

In those situations it is possible to cancel the transaction from the tool lxClient [lxClient] using the command CANCEL TRANSACTION <transaction id>. This operation will roll back the connection the transaction belongs to if it is performed by the same connection's owner. It will cancel the transaction unless it is already committing or rolling back.
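An illustrative lxClient session, using the txnId shown in Section 4.9 (the identifier is an example):

> select txnId, state, numPendingScans from lxsysmeta.transactions;
> CANCEL TRANSACTION 44011298001;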

4.12. Closing a Connection

Every user can close their own existing connections with CLOSE CONNECTION <connection id>. Closing a connection means rolling back all its running transactions. In the case of long-running transactions, we recommend you cancel them first.
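An illustrative lxClient session, using the connectionId shown in Section 4.10 (the identifier is an example):

> select connectionId, isCurrent, dsuid from lxsysmeta.connections;
> CLOSE CONNECTION 0b7ef86b-924f-4a1b-bf28-2e0661cd47a2;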


Chapter 5. The Admin Console

The admin console can be started by running:

lxConsole

This is the main administration console for LXDB. The console allows the user to check and manage the configuration of an LXDB instance. Commands can be run through their full names or through the short numbers that appear between brackets.

These are the main functions you can use through the console:

5.1. Common functions

• [0] quit: Quits the current lxConsole.

• [1] help: Prints the list of available commands.

• [2] components: Shows a list of all the components and their acronyms.

• [3] running [component]: Shows a table with the currently running instances of each component. If the component argument is given, it prints the instances running for that component.

• [4] deads [component]: It is not common to have dead components but, in case of a failure or machine crash, this option shows components that were up, running, and registered and then were lost. If the component argument is provided, the command gives the dead instances for that specific kind of component.

• [6] halt [component [instanceId]]: Halts LeanXcale, a set of components, or a specific component. The command accepts a component, to kill all the running instances of that component. A second argument can be added to halt only one instance of the component.

• [8] trace [nLines]: Prints the last nLines traces of LeanXcale. This trace contains the registered, failed, recovered, and stopped instances.

• [15] listServers: Lists all available Datastore servers.

• [16] listRegions [table]: Lists all data regions (different partitions) of a table.

• [38] getAlerts: Shows the alerts stored at the LXDB Alert Manager.


5.2. Data Distribution & Replication Functions

• [17] moveRegion [table regionId DSsource DStarget]: Moves a data region of a table from one Datastore server to another.

• [18] splitTable [table splitKey DStarget]: Splits a table by the split key and moves the new region to another Datastore server.

• [19] splitTableUniform [table splitKeyMin splitKeyMax NumOfSplits]: Splits a table at NumOfSplits points between a min and max key range, and distributes (round robin) the partitions among all the available Datastore servers.

• [25] replicateRegion [table minKey DStarget]: Replicates a data region into another Datastore server, starting from the supplied minKey.

• [26] removeReplica [table minKey DStarget]: Removes a replicated data region from a Datastore server.

• [27] joinRegion [table joinKey]: Joins two different data regions at the specified joinKey.

• [28] increaseTable [table DStarget strategy min max]: Adds a new data region to a table. The strategy defines whether to just add it or to re-size all the existing data regions.

• [29] decreaseTable [table DStarget strategy]: Deletes a data region of a table. The strategy defines whether to just remove it or to re-size all the remaining partitions.

• [30] listServer [DStarget]: Lists all the data regions available on a given Datastore server.

• [31] cloneKVDS [DSsource DStarget]: Replicates all the data regions of a Datastore server into another.
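An illustrative rebalancing session combining the commands above; the table and server names are examples, and the prompt format may differ:

> lxConsole
# see which regions live on which Datastore servers
listServers
listRegions ORDERS
# move region 2 of ORDERS from kvds-1 to kvds-2, then replicate a hot region
moveRegion ORDERS 2 kvds-1 kvds-2
replicateRegion ORDERS 0 kvds-2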

5.3. Advanced Functions

• [5] buckets: The concept of a bucket relates to the way table keys are distributed among Conflict Managers to scale conflict management. The command gives the current assignment of buckets.

• [7] loggers: Shows how transaction logs are assigned to the Query Engine and the local transaction managers.

• [9] zkTree [path]: A tool to recursively print the Apache ZooKeeper tree from the path provided. If no path is provided, the default value is /leanxcale.

• [10] zksts: Prints the current persisted snapshot. This command also returns the current snapshot of the system.


• [12] markAsRecovered [component [instanceId]]: Marks a dead instance as recovered.

• [13] csDiscardLtm [instanceId]: Discards the selected LTM from the Commit Sequencer, disabling a client.

• [14] setLogger [component [instanceId]]: Manually assigns a transaction logger to an LTM.

• [20] startGraph: Shows the dependencies between components at start time.

• [21] kiviSts: Shows the latest STS persisted at the Datastore.

• [22] cgmState: Shows the operative state of the Configuration Manager component.

• [23] persistKiviSts: Persists the STS manually to the Datastore.

• [24] getEpoch: Shows the different system configurations over time.

• [32] setKvconPath [path]: Sets the path for the kvcon (direct datastore console) command.

• [33] kvcon [command]: Sends a command to the kvcon console.

• [34] kiviRecoveryStats: Shows the stats of the recovery process for each Datastore.

• [35] syncDS [DStarget]: Forces a sync of data to disk for the specified Datastore servers.

• [36] getProperty: Shows the LXDB startup properties.

• [37] getOsts: Shows the oldest STS of the transactions currently ongoing in LXDB.

• [39] kvmsEndpoint: Shows the current Datastore metadata server connection endpoint.


Chapter 6. LeanXcale Tools

The LeanXcale tools are stored in the $BASEDIR/LX-BIN/bin and $BASEDIR/utils directories.

6.1. lxClient

Though we recommend using an open-source JDBC SQL GUI such as SquirrelSQL or DBeaver, or a commercial one, LeanXcale installs a text console client so you can do some administration tasks or scripting over it. This client is lxClient, the main SQL terminal client for LXDB.

The SQL dialect implemented is ANSI SQL, the most standard one. The client allows the following parameters:

 -u <database url>                the JDBC URL to connect to
 -n <username>                    the username to connect as
 -p <password>                    the password to connect as
 -d <driver class>                the driver class to use
 -e <command>                     the command to execute
 -nn <nickname>                   nickname for the connection
 -ch <command handler>[,<command handler>]*
                                  a custom command handler to use
 -f <file>                        script file to execute (same as --run)
 -log <file>                      file to write output
 -ac <class name>                 application configuration class name
 -ph <class name>                 prompt handler class name
 --color=[true/false]             control whether color is used for display
 --colorScheme=[chester/dark/dracula/light/obsidian/solarized/vs2010]
                                  Syntax highlight schema
 --confirm=[true/false]           confirm before executing commands
                                  specified in confirmPattern
 --confirmPattern=[pattern]       pattern of commands to prompt confirmation
 --csvDelimiter=[delimiter]       Delimiter in csv outputFormat
 --csvQuoteCharacter=[char]       Quote character in csv outputFormat
 --escapeOutput=[true/false]      escape control symbols in output
 --showHeader=[true/false]        show column names in query results
 --headerInterval=ROWS            the interval between which headers are
                                  displayed


 --fastConnect=[true/false]       skip building table/column list for
                                  tab-completion
 --autoCommit=[true/false]        enable/disable automatic transaction commit
 --verbose=[true/false]           show verbose error messages and debug info
 --showTime=[true/false]          display execution time when verbose
 --showWarnings=[true/false]      display connection warnings
 --showNestedErrs=[true/false]    display nested errors
 --strictJdbc=[true/false]        use strict JDBC
 --nullValue=[string]             use string in place of NULL values
 --numberFormat=[pattern]         format numbers using DecimalFormat pattern
 --dateFormat=[pattern]           format dates using SimpleDateFormat pattern
 --timeFormat=[pattern]           format times using SimpleDateFormat pattern
 --timestampFormat=[pattern]      format timestamps using SimpleDateFormat
                                  pattern
 --force=[true/false]             continue running script even after errors
 --maxWidth=MAXWIDTH              the maximum width of the terminal
 --maxColumnWidth=MAXCOLWIDTH     the maximum width to use when displaying
                                  columns
 --maxHistoryFileRows=ROWS        the maximum number of history rows to
                                  store in history file
 --maxHistoryRows=ROWS            the maximum number of history rows to
                                  store in memory
 --mode=[emacs/vi]                the editing mode
 --silent=[true/false]            be more silent
 --autosave=[true/false]          automatically save preferences
 --outputformat=[table/vertical/csv/tsv/xmlattrs/xmlelements/json]
                                  format mode for result display
 --isolation=LEVEL                set the transaction isolation level
 --run=/path/to/file              run one script and then exit
 --historyfile=/path/to/file      use or create history file in specified
                                  path
 --useLineContinuation=[true/false]
                                  Use line continuation
 --incremental=[true/false]       display result rows immediately as they
                                  are fetched, yielding lower latency and
                                  memory usage at the price of extra
                                  display column padding
 --incrementalBufferRows integer  threshold at which to switch to
                                  incremental mode
 --help                           display this message


Once inside lxClient, you can type

!help

(the ! is important) to get the list of available console commands.
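For reference, a batch-style invocation could look like the following sketch. The URL, credentials and script path are illustrative; the user app and port 1529 match the defaults used elsewhere in this manual:

./lxClient -u jdbc:leanxcale://localhost:1529/adhoc -n app -p app \
    --outputformat=csv --showHeader=true -f /tmp/queries.sql

The -e flag can be used instead of -f to execute a single command.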

6.2. lxBackup

This is the tool you can use to export data. A data export may not give you a consistent

snapshot of the data, because it does not use a consistent start timestamp across all the tables

configured to be dumped.

It stores, in a temporary folder, the structure of all the database metadata as well as the data

itself.

The backup can be restored to a new (clean) database.


 -bs,--blocksize <arg>         Block size when reading from binary file. Default is 4096
 -csv,--csvformat              Import/export in csv format
 -f,--folder <arg>             Folder to import/export data from/to. It is created when
                               exporting. Mandatory when importing. Default is /tmp/lxbackup
 -h,--dsaddress <arg>          Datastore address. Not needed when connecting through
                               Zookeeper. Default is localhost
 -help,--help                  Show usage
 -i,--import                   Import action. Default is export
 -p,--dsport <arg>             Datastore port. Not needed when connecting through
                               Zookeeper. Default is 1529
 -sep,--recordseparator <arg>  Record separator. Default is '@\n'
 -skd,--skipdata               Skip importing/exporting table data
 -skp,--skipperms              Skip importing/exporting permissions
 -sksq,--skipsequences         Skip importing/exporting sequence definitions
 -sktdef,--skiptabledefs       Skip importing/exporting table definitions
 -skti,--skiptableindexes      Skip importing/exporting table indexes
 -skts,--skiptablesplits       Skip importing/exporting table regions definitions
 -tgz,--tarzip                 Compress and remove target folder
 -th,--datathreads <arg>       Number of threads to handle data. Default is 4
 -tl,--tablelist <arg>         JSON with the list of tables and limits to be
                               imported/exported. Example:
                               {"backupFileTableInfos": [{"tableName":"t1",
                               "limits":[{"min": "3", "max": "10"},
                               {"min": "10", "max": "20"}]}]}
 -ts,--timestamp <arg>         Fixed timestamp. Not needed when connecting through Zookeeper
 -tsplit,--tablesplit          Export table data regions to different files.
                               Overridden by tablelist file
 -zh,--zkaddress <arg>         ZooKeeper address
 -zp,--zkport <arg>            ZooKeeper port. Default is 2181
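As an illustration (the ZooKeeper address and folder are examples, not defaults), a full export followed by a restore into a new, clean database could look like this:

# Export all database metadata and data to a folder (created by the tool)
./lxBackup -zh 125.11.22.30 -zp 2181 -f /backup/lx-export

# Restore the export into a new (clean) database
./lxBackup -i -zh 125.11.22.30 -zp 2181 -f /backup/lx-export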


6.3. lxCSVLoad

CSV file loader. To increase performance, this tool loads data directly into the

datastores.

Tables must be defined before loading data into them.

 -c,--connection_string <arg>     KiviDB connection string. The proper format is
                                  kvmsHost!kvmsPort
 -t,--table <arg>                 Name of the table
 -f,--csv_file <arg>              Path to the csv file
 -y,--column_separator <arg>      [Column separator used on the csv file. Default '|']
 -z,--zookeeper_connection <arg>  [Zookeeper connection string. The proper format is
                                  host:port]
 -ts,--timestamp <arg>            [Value of timestamp in case there isn't a Zookeeper
                                  connection. Default value is 1]
 -v,--verbose                     Verbose output
 -th,--threads <arg>              [Number of writer threads. Default value is 5]
 -split,--split                   Read file by regions as a split would do. Don't use
                                  when fields have multiple lines
 -sz,--bufferSize <arg>           [File region size per iteration. To use along with
                                  the split option]
 -params,--params <arg>           [Path of the file with the parameters for the loader]

This can be used in the following way:

./lxCSVLoad -c 125.11.22.30\!44000 -z 125.11.22.30:2181 -t tpcc-APP-ORDER_LINE -f /lx/LX-DATA/CSVs/1-20/order_line-1_20.csv &

CSV files usually contain a huge amount of data. To improve performance in those scenarios,

it is recommended to split the file into smaller parts and load them in parallel


with several instances of the process, each one handling one file, but all of them loading data

into the same table.

It can be done in this way:

./lxCSVLoad -c 125.11.22.30\!44000 -z 125.11.22.30:2181 -t tpcc-APP-ORDER_LINE -f /lx/LX-DATA/CSVs/1-20/order_line-1_20.csv &
./lxCSVLoad -c 125.11.22.30\!44000 -z 125.11.22.30:2181 -t tpcc-APP-ORDER_LINE -f /lx/LX-DATA/CSVs/21-40/order_line-21_40.csv &
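If the data comes as a single large file, one way to produce the parts is the standard split utility. The chunk size and file names below are illustrative, and this approach is only safe when no field contains embedded newlines:

# Cut the source CSV into chunks of 1,000,000 lines each
split -l 1000000 order_line.csv /lx/LX-DATA/CSVs/order_line.part-

# Launch one loader per chunk, all of them writing into the same table
for f in /lx/LX-DATA/CSVs/order_line.part-*; do
    ./lxCSVLoad -c 125.11.22.30\!44000 -z 125.11.22.30:2181 -t tpcc-APP-ORDER_LINE -f "$f" &
done
wait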

6.4. lxDataGenerator

A random data generator used for testing purposes.

 -bs,--batchsize <arg>   Batch size. Default is 100
 -ct,--createtable       Creates the table too
 -d,--driver <arg>       Driver class name. Default com.leanxcale.client.Driver
 -dl,--delay <arg>       Next row milliseconds delay. Default is 0
 -f,--file <arg>         Generator info. Check DataGeneratorInfo. Example:
                         {"totalRows":"3", "fieldInfoList":[
                         {"position":"0", "generateInfo":{"isRowNumber":"true"}},
                         {"position":"1", "generateInfo":{"isRowNumberSummation":"true"}},
                         {"position":"2", "generateInfo":{"constantValue":"0"}}]}
 -fc,--forcekvconnect    Force to dial Kivi in case there is Kivi input, but forced
                         also to use one thread
 -h,--dsaddress <arg>    Datastore address. Not needed when connecting through
                         Zookeeper. Default is localhost
 -help,--help            Show usage
 -jdbc,--jdbc            Insert data into the table by jdbc. Zookeeper and DS
                         host/port params are ignored
 -p,--dsport <arg>       Datastore port. Not needed when connecting through
                         Zookeeper. Default is 1529


 -sep,--fieldseparator <arg>  Field separator. Default is '|'
 -t,--tablename <arg>         Table name. Dump in CSV format if not present
 -th,--datathreads <arg>      Number of threads to handle data. Default is 4
 -ts,--timestamp <arg>        Fixed timestamp. Not needed when connecting through Zookeeper
 -u,--url <arg>               The URL of the database to which to connect. Default is
                              jdbc:leanxcale://localhost:1529/adhoc;user=app
 -v,--verbose                 Show progress
 -zh,--zkaddress <arg>        ZooKeeper address
 -zp,--zkport <arg>           ZooKeeper port. Default is 2181

6.4.1. Data generator file

General Info

• totalRows: Number of rows to dump/insert.

• fieldInfoList: List of Field Info. All field values are calculated and can be used as a param

to generate other field infos. Take into account that values are calculated in order (the order

in the list, not the field position), so make sure that a field used as a param is calculated

before it is needed.

• tableColumns: List of table column names to generate data for. The generated data

follows the same order as this list. If none is specified, all the fields are columns.

• primaryKeys: List of table column names that compose the primary key

Field Info

• name: field name. Either name or nameList is mandatory

• nameList: list of field names when the field generates a list of values

• jSqlType: field type (java.sql.Types). Needed when inserting through JDBC and when

creating the table

• jSqlTypeList: list of field types when the field generates a list of values. Needed when

inserting through JDBC and when creating the table

• generateInfo: info to generate the field value(s)


Generate Info

• constantValue: generate the indicated constant

• isRowNumber: indicate to assign the row number to the field

• isRowNumberSummation: indicate to assign the row number summation to the field

• randomValue: list of constants/params to get the value from randomly

• randomStringLength: length of the string to be generated randomly. It can be combined

with randomValue to concatenate 'n' random values. For example: if randomValue = [0,1]

and randomStringLength = 3, it could generate 000 or 001 or …

• isTimestamp: indicate to assign the current timestamp in millis

• isNull: indicate to generate a null value

• composedString: list of constants/params to compose a string. For example, if it is

["prefix", $1, "suffix"] and the field at position 1 is a row number summation, it

will generate "prefix0suffix", "prefix1suffix", "prefix3suffix", …

• functionInfo: Use reflection to generate the value. Check below

• queryInfo: Get input data from a JDBC query. The query is executed at initialization and

each query row is used in each generated row.

• dynamicQueryInfo: Same as the previous one, but the query is executed for every generated

row, reading only the first query row. Before executing, field references are replaced with

their values. For example, with select a from t where b = $10 and a row number

field in position 10, it will execute select a from t where b = 0, select a

from t where b = 1, select a from t where b = 2, …

• csvInfo: Get input data from a csv file

• tableInfo: Get input data from a Kivi table through the Kivi API. Use it along with

--forcekvconnect if the target is not a Kivi table

Reflection Info

• className: for example java.util.Random

• method: for example nextInt. It invokes the constructor when null.

• instance: param to invoke the method from. If null, the method should be static or the

class should have a no-arguments constructor.

• argumentTypes: for primitive types use the same type name as in Java; otherwise use the

full class name. For example: ["int"]

• params: list of constants/params

Query Info

• query: query string

• url: the URL of the database to which to connect to execute the query. Default is the one

from the command line

• driver: Driver class name. Default is the one from the command line

• referenceChar: char to indicate a field reference in case you can’t use '$' (only for dynamic

queries)

CSV Info

• fileName: Csv file name.

• separator: Csv separator. Default is '|'.

Table Info

• databaseName: database name

• schemaName: schema name

• tableName: table name

• limits: list of limits/regions to scan

Limits/region

• min: Minimum region value. List of primary key values

• max: Maximum region value. List of primary key values

6.4.2. Example Configuration for the Data Generator

{  "totalRows": "10",  "tableColumns":["rowNumber","rowSummattion","nullField","tstamp","constantquery","dynamicquery"],  "fieldInfoList": [  {

38

Page 43: Operations Manual for LxDB · 2020. 2. 14. · # create some folders and grant access to the USER for some actions to work SUDO="sudo" #SUDO="echo" # If you want to use NUMA say yes

  "name": "rowNumber",  "jSqlType": "4",  "generateInfo": {  "isRowNumber": "true"  }  },  {  "name": "rowSummattion",  "jSqlType": "-5",  "generateInfo": {  "isRowNumberSummation": "true"  }  },  {  "name": "nullField",  "jSqlType": "4",  "generateInfo": {  "isNull": "true"  }  },  {  "name": "tstamp",  "jSqlType": "-5",  "generateInfo": {  "isTimestamp": "true"  }  },  {  "nameList": [  "constantquery"  ],  "jSqlTypeList": [  12  ],  "generateInfo": {  "queryInfo": {  "query": "select tableName from sysmeta.tables"  }  }  },  {  "nameList": [  "dynamicquery"  ],  "jSqlTypeList": [  12  ],  "generateInfo": {  "dynamicQueryInfo": {  "query": "select cast((TIMESTAMPADD(SECOND,cast($tstamp/1000 AS int), DATE '1970-01-01')) as date)",  "driver": "com.leanxcale.client.Driver",

39

Page 44: Operations Manual for LxDB · 2020. 2. 14. · # create some folders and grant access to the USER for some actions to work SUDO="sudo" #SUDO="echo" # If you want to use NUMA say yes

  "url": "jdbc:leanxcale://localhost:1529/adhoc;user=app"  }  }  },  {  "name": "randomString",  "jSqlType": "12",  "generateInfo": {  "randomStringLength": "12"  }  },  {  "name": "constantfloat",  "jSqlType": "6",  "generateInfo": {  "constantValue": "3.14159"  }  }  ]}

Generated data:

$ LX-BIN/bin/lxDataGenerator -th 1 -f ../datagenerator.json -jdbc
0|0||1568094848499|CONNECTIONS|2019-09-10|mbQKJKUDkD6m|3.14159
1|1||1568094848607|FOREIGN_KEYS|2019-09-10|cm8s72CJYTap|3.14159
2|3||1568094848642|PRIMARY_KEYS|2019-09-10|qktz8WKcpiMs|3.14159
3|6||1568094848665|TABLE_CHECKS|2019-09-10|Xz6d8iYhLhgC|3.14159
4|10||1568094848695|TRANSACTIONS|2019-09-10|1THfF7b1FHjt|3.14159
5|15||1568094848724|COLUMNS|2019-09-10|jcwkubzUnG3H|3.14159
6|21||1568094848753|TABLES|2019-09-10|yO8ni68nlvSB|3.14159

6.5. lxSplitFromCSV

Split a table based on the data distribution of a CSV file.

Usually the data distribution is not homogeneous with respect to the defined primary key.

Because of that, a uniform interval table partitioning is not the best approach to balance the

datastores' workload. With this tool you can scan the data distribution of CSV files

before loading them into LXDB, and split/distribute the table among all datastore servers.


$> ./lxSplitFromCSV
usage: LX table split
 -c,--delimiter <arg>  CSV char delimiter
 -d,--database <arg>   database name
 -f,--file <arg>       path to CSV file
 -h,--header           flag specifying whether CSV has header row
 -j,--jdbcurl <arg>    database JDBC URL
 -k,--kiviurl <arg>    Kivi meta server URL. The proper format is kvmsHost!kvmsPort
 -r                    Use reservoir sampling (slower, but more reliable with
                       multi-line columns)
 -s,--schema <arg>     schema name
 -t,--table <arg>      table name
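For example, to split the tpcc-APP-ORDER_LINE table used in the loader examples above according to the distribution found in a CSV file (the addresses and path are illustrative):

./lxSplitFromCSV -f /lx/LX-DATA/CSVs/order_line.csv -c '|' \
    -j "jdbc:leanxcale://125.11.22.30:1529/tpcc;user=app" \
    -k 125.11.22.30\!44000 -d tpcc -s APP -t ORDER_LINE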

6.6. checkdb.py

This is the main script to run a full check on the database. If this script does not report any error,

everything is working fine. Otherwise, it will report the issue and suggest some actions to

solve it.

$> python checkdb.py
Checking Zookeeper Service :: . .................... OK
Checking Lx Console Connects :: . .................... OK
Checking Processes in Console :: . .................... OK
Checking Component Ports are bound :: .............................. OK
Checking Recovery Process :: .................... OK
Checking KiVi Datastores :: .. .................... OK
Checking Snapshot is working :: .. .................... OK
Checking DB SQL Connection :: .................... OK

  The database seems to be running successfully


Chapter 7. Query Hints

As of now, hints are defined through table functions, which are used to enable, create and

remove hints.

Given that a hint is a way to help the planner build better execution plans, the lifecycle of a

hint should be the following:

1. Enable hints for the connection.

2. Create/list/remove hint

3. Refresh plan cache

4. Test query (which installs the new execution plan in the cache)

5. Go to 2 if the performance is not ok

6. Disable hints for the connection

Once the hints are set, they will be kept unless removed.
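Putting the lifecycle together, a session could look like the following sketch, using the table functions described in the rest of this chapter. The table, index and hint ID are illustrative:

-- 1. enable hints for this connection
select * from table(enableHints());
-- 2. create a hint
select * from table(forceAccess('tpcc-APP-CUSTOMER','CUSTOMER_IDX'));
-- 3. refresh the plan cache for the affected table
select * from table(cleanPlanCache('tpcc-APP-CUSTOMER'));
-- 4. test the query; this installs the new execution plan in the cache
select count(*) from CUSTOMER;
-- 5. if performance is not ok, inspect and remove hints, then create new ones
select * from table(listHints());
select * from table(removeHint(1));
-- 6. disable hints for the connection
select * from table(disableHints());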

7.1. Enable hints

Enabling the hints creates a context for the present connection where the defined hints are stored.

This can be done using the enableHints function as follows:

select * from table(enableHints())

The result should be a row with the following message:

Hints enabled for connection %s

7.2. Disable hints

Disabling the hints destroys the hints context for the present connection and removes all the hints

created. However, the execution plans that have been created by queries will remain in the

plan cache.

This can be done using the disableHints function as follows:

select * from table(disableHints())

The result should be a row with the following message:


Hints disabled for connection %s

7.3. List hints

Listing hints can be useful to know the hints defined for the present connection, and to retrieve

their IDs (which can be used to remove them).

This can be done using the listHints function as follows:

select * from table(listHints())

The result will be a table with a row for each defined hint and three columns: the ID, which

identifies the hint within the connection; the hint type; and a description.

7.4. Define hints

To define a hint, invoke the appropriate function for the hint you want.

The available hints are:

7.4.1. Force access

This hint forces the access to a table through a defined index. To use it, invoke the function as

follows:

select * from table(forceAccess('TABLE_NAME','INDEX_NAME'))

Where TABLE_NAME is the qualified name of the table (e.g. tpcc-APP-CUSTOMER) and

INDEX_NAME is the qualified name of the index. This function does not check the existence

of the index or the table, so if you force the access through a nonexistent index, the queries

that use this table will fail.

7.4.2. Disable Pushdown

This hint disables the pushdown of predicates (filter or aggregation programs) to the KiVi

datastores for a given table. This means that all the work will be done by the Query Engine

except for retrieving data from the persistent storage system.

To use it invoke the function as follows:

select * from table(disablePushdown('TABLE_NAME'))


Where TABLE_NAME is the qualified name of the table (e.g. tpcc-APP-CUSTOMER).

7.4.3. Fixed Join Order

This hint sets the order of the joins as they are written in the query. This hint works at query

level, so it will affect all the joins present in the query.

To use it, invoke the function as follows:

select * from table(fixJoinOrder())

7.5. Force Join Type

This hint allows you to force the kind of join to be used between two given tables.

To use it, invoke the function as follows:

select * from table(forceJoinType('JOIN_TYPE', 'TABLE_1', 'TABLE_2'))

Where JOIN_TYPE is one of the available types (CORRELATE, MERGE_JOIN,

NESTED_LOOP) and TABLE_1 and TABLE_2 are the unqualified names of the affected

tables (e.g. CUSTOMER or WAREHOUSE). This function does not check that the chosen join

type is available for the given pair of tables; if it is not, the query will fail.
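For example, to force a merge join between the two tables mentioned above:

select * from table(forceJoinType('MERGE_JOIN', 'CUSTOMER', 'WAREHOUSE'))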

7.6. Or to Union

This hint transforms OR conditions into UNION clauses. It accepts a boolean parameter

which indicates whether the UNION will be a UNION ALL or not.

To use it, invoke the function as follows:

select * from table(orToUnion(all))

Where all is true for a UNION ALL and false otherwise.
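As an illustration of the rewrite (the table and predicates are hypothetical), with orToUnion(true) a query of this shape:

select * from CUSTOMER where c_state = 'CA' or c_balance > 1000

is planned as if it had been written as:

select * from CUSTOMER where c_state = 'CA'
union all
select * from CUSTOMER where c_balance > 1000

With orToUnion(false) a plain UNION is used instead, which removes the duplicates that UNION ALL would produce for rows matching both conditions.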

7.7. Parallel Hints

The following hints are used to manage the parallel mode. In parallel mode, scans are

parallelized among workers that do a fraction of the scan depending on how data is distributed

among the KiVi datastores.


7.7.1. Enable Parallel Mode

This hint is used to enable the planner rules that transform a kivi scan into a parallel scan

where possible, and propose it as an alternative way to traverse a table (which will compete

with the rest of the alternatives in terms of cost).

To use it, invoke the function as follows:

select * from table(enableParallelMode())

7.7.2. Define Parallel Tables

This hint is used to define the set of tables over which the scans can be parallelized. This hint

accepts a Java regular expression as a table name, so you can use a regexp to cover several

tables or invoke this hint for each single one.

To use it, invoke the function as follows:

select * from table(enableParallelTable(REGEXP))

Where REGEXP is a Java regular expression which applies over the fully qualified table name

(e.g. 'tpcc-APP-.*', or '.*' for all the tables).

7.7.3. Enable Parallel on Aggregates that are pushed down

This hint enables the planner rules which allow transforming a kivi scan with an aggregation

program into a kivi parallel scan (with the same aggregation program). To be chosen, this

parallel scan will have to beat the rest of the alternatives in terms of cost.

To use it, invoke the function as follows:

select * from table(enableParallelAgg())
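A minimal sketch that enables parallel scans for all the TPC-C tables and also allows parallel aggregate pushdown (the regular expression is illustrative):

select * from table(enableHints());
select * from table(enableParallelMode());
select * from table(enableParallelTable('tpcc-APP-.*'));
select * from table(enableParallelAgg());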

7.8. Remove Hints

This function removes a hint for the present connection. However, if the hint has been used

to execute a query and an execution plan has been created and stored in the query plan cache,

removing the hint won’t remove the query plan, and the following queries will keep using it

for their execution.

To use it, invoke the function as follows:


select * from table(removeHint(HINT_ID))

Where HINT_ID (an integer, without quotes) is the identifier of the hint to be removed,

which can be obtained from the listHints() function.

7.9. Clean Query Plan Cache

This function removes all the query plans that have been cached for a given table (not only for

this connection but for every query in the server). It can be useful to ensure that a new query

plan will be calculated when a hint is added or removed.

To do this, invoke the function as follows:

select * from table(cleanPlanCache('TABLE_NAME'))

Where TABLE_NAME is the qualified name of the table (e.g. tpcc-APP-CUSTOMER) whose

query plans will be removed.


Chapter 8. Annex I. Advanced Configuration

8.1. Inventory Configuration File

A Lx-DB cluster is built from the following components. The acronyms in the left column are

used in the configuration files and in the scripts to manage the cluster.

Component    Description

MtM          Mastermind component, which groups the Configuration Manager, the
             Commit Sequencer Server and the Snapshot Server. You will usually
             start this component and none of the individual ones.

CflM-1..N    Conflict Managers from 1 to N. Use CflM-0 for all. If there is
             only one you can use CflM-1.

KVMS         KiVi Metadata Server.

KVDS-1..N    KiVi Datastore Servers from 1 to N. There will be N KVDS in each
             DataEngine node. Use KVDS-0 for all KVDS in each machine.

QE-1..N      Query Engines from 1 to N. There will be N QE in each DataEngine
             node. Use QE-0 for all QE in each machine.

LgCmS        Logger for the Commit Sequencer Server.

LgLTM-1..N   Loggers for the LTMs running with the Query Engines. Use LgLTM-0
             for all Loggers in each machine.

The main file for configuring a Lx-DB cluster is the inventory file. Below is an

example with comments explaining the main parameters to be set up:

[all:vars]
#BASEDIR for the installation is taken from the environment variable $BASEDIR
BASEDIR="{{ lookup('env','BASEDIR') }}"
#FOLDER to store BACKUPs in
BACKUPDIR=/ssd/leandata/backup/
#Login account for the cluster
USER="{{ lookup('env','USER') }}"
# If {{USER}} has SUDO permissions add SUDO=sudo. Otherwise you may need to
# create some folders and grant access to the USER for some actions to work
SUDO="sudo"
#SUDO="echo"

# If you want to use NUMA say yes
# NUMA=yes
# NUMA=no
NUMA=yes
#USEIP=yes to use IP addresses instead of hostnames to define the machines
USEIP=no

[defaults]
#This is a fake group for the default HW resources per machine
#This info is mandatory and you don't need to add any more info if all
#machines in the cluster have the same resources
#mem           in GB
#sockets       list of cores (0-2 / 0,1,2 / 0,4)
#ncores        number of cores available
#nthreadscore  number of hyperthreads per core
#fslog         folder where the log files are located (dot means default location)
#fsdata        folder where the storing data is located (dot means default location)
#Examples:
#sizing mem=128G sockets=0-1 ncores=12 nthreadscore=0 fslog=/ssd2/leandata/logs fsdata=/ssd/leandata/LX-DATA
#sizing mem=128G sockets=0-1 ncores=12 nthreadscore=0 fsdata=. fslog=.
sizing mem=- sockets=- ncores=- nthreadscore=- fsdata=. fslog=.

#Workload percentages between the KiVi direct API and the SQL interface.
#Defaults consider all workload will go through the QE/SQL interface.
#This is used to size the proportion of KVDS components over QE components
#workload direct=0 sql=100

#The following parameter - if it exists - will force the number of KVDS per DS machine:
#forcenkvds=8
#The following parameter - if it exists - will force the memory allocated for each KVDS:
#Memory must always be in Gigabytes
#forcememkvds=10G
#The following parameter - if it exists - will force the memory allocated for each QE:
#Memory must always be in Gigabytes
#forcememqe=20G

#If the following lines exist they will be used. If not, these will be the values:
#In a production environment, probably they shouldn't exist
DS components="KVDS LgLTM QE Mon"
MS components="ZK-1 LgSnS LgCmS MtM CflM KVMS Mon"
#Components:
# 1- Use the command line arguments
# 2- Use the specific info in the inventory
# 3- Use the variable in env.sh
# 4- Use the generic value in the inventory
# 5- Use the generic default value

#Metadata servers. Currently, you can configure only one
[meta]
blade121 ansible_connection=local

#Datastore servers. You can have multiple datastore instances
[datastores]
blade121 passive="SNS-2 MtM-2 KVMS-2" fslog=/ssd2/lxs ansible_connection=local
#Machines can have specific sizing parameters:
blade122 components="ZK-2 KVDS" mem=64G
blade123 addcomponents="ZK-3"

[DSFs]
#This section should only exist if there is a special configuration with a different
#filesystem per KVDS.
#List of file systems to be used for each KVDS storage. The list should be ordered in terms of NUMA
#These folders have to be already created before starting the system
#Folder names should be the same in all machines
/ssd/lxs2/kvdata_1
/ssd/lxs2/kvdata_2
/ssd/lxs2/kvdata_3
/ssd/lxs2/kvdata_4
/ssd/lxs2/kvdata_5
/ssd/lxs2/kvdata_6
/ssd/lxs2/kvdata_7
/ssd/lxs2/kvdata_8

The main properties to modify are:

• BASEDIR: folder where the LeanXcale database will be installed. If the environment

variable $BASEDIR is set, that value is taken by default.

• BACKUPDIR: folder to store backups in.

• USER: the user that will run the LeanXcale cluster administration scripts.


• SUDO: the accepted values are echo and sudo. If the user has sudo permissions this may

simplify the creation of folders when deploying from the master to other target machines.

• NUMA: specifies whether the deployment should try to configure components to take the

most advantage of NUMA. Lx-DB can exploit most of the NUMA capabilities.

Frequency scaling can also interfere with performance, but disabling it requires SUDO

permissions. The accepted values for this parameter are yes or no.

• USEIP: set it if you want to configure the machines with IP addresses instead of their

hostnames. The value used has to be the first IP returned by the command hostname -I.

8.2. Resource Allocation

• sizing: This line holds the default sizing parameters to configure the layout of Lx-DB

components. The line assumes that all machines in the cluster have the same HW

configuration. However, the parameters in the line could be defined for each machine if

you need to override the default sizing. The line has the following parameters:

- mem: Memory available in the machine to run the components (in GB)

- sockets: Lists of sockets in the machine to be used to run the components. Values like

"0-2", "0,1,2" or "0,4" are allowed. In DataStore machines, there should be two

socket list definitions: the socket list for KVDS and the socket list for QE.

Those lists must be separated by ":".

- ncores: This is the number of physical cores to be used in each machine to run Lx-DB

components. Note that this counts physical cores, so hyperthreads should not be

counted.

- nthreadscore: Number of hyperthreads per core

- fslog: Folder where the transaction logging files are located. These are not the

component logs, but the transaction log that guarantees database durability. If

SUDO is not granted, a folder with RW permissions for USER should have been

created by the administrator.

- fsdata: Parent folder where the database stores its data.

It is highly recommended to keep the transaction redo log on different disks from the

database data.

An example of this configuration is the following:


sizing mem=128G sockets=0-1 ncores=12 nthreadscore=0 fslog=. fsdata=.

[datastores]
bladeDS1 mem=- sockets=0-1:2,4

This means: the machines have 128GB available and 2 sockets (0 to 1); there are 12 physical

CPUs in total and there is no hyperthreading. Redo logging and data use the default

location. Even if fslog and/or fsdata are defined on another disk/filesystem, $BASEDIR/LX-DATA

will contain symbolic links so the files can be reached through a single folder

structure. bladeDS1, instead of using 128G, uses the full memory of the machine, with

sockets 0 to 1 for the KVDS and sockets 2 and 4 for the QE. This means socket 3 will not be

used by Lx-DB if NUMA is enabled.

However, the default sizing configuration is:

sizing mem=- sockets=- ncores=- nthreadscore=- fslog=. fsdata=.

With this input, the configuration script will take the information from each machine and

distribute it to all the components, so Lx-DB can take full advantage of all machine

resources.

Consider that the distribution of resources among components should be adapted to the

workload so that the most is made of the resources. Since there is no information about the

workload, the configuration script does a standard layout. For high-performance installations,

an analysis of the workload is recommended.

There are some parameters that can allow some control over the distribution of components.

Specifically, the following parameters are available:

workload direct=X sql=Y  X and Y are the respective % of the workload that go through
                         the direct KiVi API and SQL. X+Y should yield 100 (100%).
                         The higher the % of workload that goes through the direct
                         KiVi API, the higher the number of KVDS relative to the
                         number of QE. By default, the installation assumes all
                         workload goes through SQL (X=0, Y=100).

forcenkvds=N             If you know the number of KVDS needed per machine, you can
                         directly define it and override the configurator estimation.

forcememkvds=M           If you know how much memory each server has to allocate per
                         KVDS, you can define it through this parameter. M is the
                         number of Gigabytes.

forcememqe=M             If you know how much memory each server has to allocate per
                         QE, you can define it through this parameter. M is the
                         number of Gigabytes.
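For instance, to force 8 KVDS per datastore machine with 10 GB each and 20 GB per QE (the same illustrative values as in the commented inventory example above), set in the [defaults] section:

forcenkvds=8
forcememkvds=10G
forcememqe=20G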

The inventory configuration file also has the following parameters and sections:

• DS components: This parameter holds the default list of components for one DataStore

machine. Additional components can be specified for any of the machines.

• MS components: This parameter holds the default list of components for one

MetaStore machine.

• [meta]: This section contains a list of IPs or hostnames that will run as the metadata

server. Currently, this has to be only one instance.

• [datastores]: The section contains a list of IPs or hostnames that will run as

datastore servers.

8.3. Filesystems

Besides the fsdata parameter, you can define one folder or filesystem per KVDS by

configuring the [DSFs] section. This section provides a list of folders, one for each KiVi

datastore. The folders' root must exist and be the same on all datastore machines in the cluster.

It is possible to do a per-machine configuration, but it requires changing the inventory file on

each machine.


...
[DSFs]
#List of file systems to be used for each KVDS storage.
/cdba/fs1302/lxs/kvdata_1
/cdba/fs1302/lxs/kvdata_2
/cdba/fs1302/lxs/kvdata_3
/cdba/fs1302/lxs/kvdata_4
/cdba/fs1102/lxs/kvdata_5
/cdba/fs1102/lxs/kvdata_6
/cdba/fs1102/lxs/kvdata_7
/cdba/fs1102/lxs/kvdata_8
/cdba/fs1202/lxs/kvdata_9
/cdba/fs1202/lxs/kvdata_10
/cdba/fs1202/lxs/kvdata_11
/cdba/fs1202/lxs/kvdata_12


Chapter 9. Annex II. Alerts

This is a detailed list of the events that are centralized in the alert system:

9.1. Events from the Query Engine

Alert Identifier           Type           Description

START_SERVER               warning(info)  Indicates a server has been started. It shows the server address as IP:PORT
START_SERVER_LTM           warning(info)  Indicates the LTM is started and the QE is connected to a Zookeeper instance
START_SERVER_ERROR         critical       There has been an error while starting the server and/or the LTM
STOP_SERVER                warning(info)  Indicates the server has been stopped
SESSION_CONSISTENCY_ERROR  warning        Error when setting session consistency at the LTM
DIRTY_METADATA             warning        The metadata has been updated but the QE couldn’t read the new changes
CANNOT_PLAN                warning        The QE couldn’t come up with an execution plan for the query
FORCE_CLOSE_CONNECTION     warning(info)  A connection was explicitly closed by a user
FORCE_ROLLBACK             warning(info)  A connection was explicitly rolled back by a user
FORCED_ROLLBACK_FAILED     warning        Forced rollback failed
FORCE_CANCEL_TRANSACTION   warning(info)  Forced transaction cancellation. The transaction was not associated to any connection
KV_DIAL_BEFORE             critical       QE is not connected to the DS
KV_NOT_NOW                 warning        DS error: not now
KV_ABORT                   warning(info)  DS error: aborted by user
KV_ADDR                    warning        DS error: bad address
KV_ARG                     warning        DS error: bad argument
KV_BAD                     warning        DS error: corrupt
KV_AUTH                    warning        DS error: auth failed
KV_BUG                     warning        DS error: not implemented
KV_CHANGED                 warning        DS error: resource removed or format changed
KV_CLOSED                  warning        DS error: stream closed
KV_CTL                     warning        DS error: bad control request
KV_DISKIO                  critical       DS error: disk i/o error
KV_EOF                     warning        DS error: premature EOF
KV_FMT                     warning        DS error: bad format
KV_FULL                    critical       DS error: resource full
KV_HALT                    critical       DS error: system is halting
KV_INTR                    warning        DS error: interrupted
KV_IO                      critical       DS error: i/o error
KV_JAVA                    warning        DS error: java error
KV_LOW                     critical       DS error: low on resources
KV_LTM                     warning        DS error: LTM error
KV_REC                     warning        DS error: Rec error
KV_LOG                     warning        DS error: Log error
KV_MAN                     critical       DS error: please, read the manual
KV_NOAUTH                  warning        DS error: auth disabled
KV_NOBLKS                  critical       DS error: no more mem blocks
KV_NOBLOB                  warning(info)  DS error: no such blob. The QE might not be handling blobs correctly
KV_NOIDX                   warning(info)  DS error: no such index. The QE might not be handling indexes correctly
KV_NOMETA                  critical       DS error: no metadata
KV_NOREG                   warning        DS error: no such region
KV_NOTOP                   warning        DS error: no such tid operation
KV_NOSERVER                critical       DS error: no server
KV_NOTAVAIL                critical       DS error: not available
KV_NOTID                   warning        DS error: no such tid
KV_NOTUPLE                 warning        DS error: no such tuple
KV_NOTUPLES                warning        DS error: no tuples
KV_NOUSR                   warning(info)  DS error: no such user
KV_OUT                     critical       DS error: out of resources
KV_PERM                    warning(info)  DS error: permission denied
KV_PROTO                   warning        DS error: protocol error
KV_RDONLY                  warning        DS error: read only
KV_RECOVER                 warning        DS error: system is recovering
KV_RECOVERED               warning        DS error: system recovered
KV_SERVER                  critical       DS error: server
KV_TOOLARGE                warning        DS error: too large for me
KV_TOOMANY                 warning        DS error: too many for me
KV_TOUT                    warning        DS error: timed out
KV_REPL                    warning        DS error: replica error

9.2. Events regarding Health Monitor

Alert Identifier        Type      Description

COMPONENT_FAILURE       warning   Failure from the indicated service
TIMEOUT_TOBEREGISTERED  warning   Waiting to register the indicated service
CANNOT_REGISTER         critical  Cannot register the indicated service. Restart it

9.3. Configuration Events

Alert Identifier           Type      Description

COMPONENT_FAILURE          warning   The indicated service is down
HAPROXY_NO_CONNECTION      warning   No connection with the HA proxy while stopping
TIMEOUT_BUCKET_UNASSIGNED  critical  The bucket reconfiguration couldn’t be done
TIMEOUT_TOBEREGISTERED     warning   The indicated service still has dependencies to solve
CANNOT_REGISTER            critical
CONSOLE_SERVER_DOWN        warning   Couldn’t start the console server
RESTART_COMPONENT          warning   Couldn’t restart the indicated service
SETTING_EPOCH              warning   Waiting until STS > RSts
RECOVERY_FAILURE           warning   The indicated service has not been recovered
RECOVERY_TIMEOUT           warning   The indicated service is being recovered
HOTBACKUP                  warning   Pending recovery from hotbackup

9.4. Transaction Log Events

Alert Identifier          Type      Description

LOGDIR_ERROR              critical  Cannot create the folder, or it is not a folder
LOGDIR_ALLOCATOR_ERROR    critical  Error managing disk in logger
LOGGER_FILE_ERROR         warning   IO file error
LOGNET_CONNECTION_FAILED  critical  Cannot dial logger
LOGSRV_CONNECTION_ERROR   critical  Network error
KVDS_RECOVERY_FAILED      critical  The KVDS instance couldn’t be recovered from the logger
FLUSH_FAILED              critical  Unexpected exception flushing to logger
