Operations Manual for LxDB
Table of Contents
1. Introduction
 1.1. (overview)
2. Installation
 2.1. Dependencies
 2.2. Unpacking
 2.3. Simple Configuration. The inventory file
 2.4. Completing the Bare Metal Installation
 2.5. Advanced Configuration
3. High Availability
 3.1. Configuring High Availability
 3.2. Implications
 3.3. Replication
4. Operations
 4.1. Start, Stop & Check your LeanXcale Cluster
 4.2. Monitoring
 4.3. LOGs & Alerts
 4.4. Backup & Recovery
 4.5. Data Partitioning & Distribution
 4.6. Authentication, SSL & Permissions
 4.7. System DB Tables & Views
 4.8. LXSYSMETA.TABLE_CHECKS
 4.9. LXSYSMETA.TRANSACTIONS
 4.10. LXSYSMETA.CONNECTIONS
 4.11. Cancelling Transactions
 4.12. Closing a Connection
5. The Admin Console
 5.1. Common functions
 5.2. Data Distribution & Replication Functions
 5.3. Advanced Functions
6. LeanXcale Tools
 6.1. lxClient
 6.2. lxBackup
 6.3. lxCSVLoad
 6.4. lxDataGenerator
 6.5. lxSplitFromCSV
 6.6. checkdb.py
7. Query Hints
 7.1. Enable hints
 7.2. Disable hints
 7.3. List hints
 7.4. Define hints
 7.5. Force Join Type
 7.6. Or to Union
 7.7. Parallel Hints
 7.8. Remove Hints
 7.9. Clean Query Plan Cache
8. Annex I. Advanced Configuration
 8.1. Inventory Configuration File
 8.2. Resource Allocation
 8.3. Filesystems
9. Annex II. Alerts
 9.1. Events from the Query Engine
 9.2. Events regarding Health Monitor
 9.3. Configuration Events
 9.4. Transaction Log Events
Chapter 1. Introduction
Lx-DB is an ultrascalable distributed database. It is built from several components that can scale
out to cope with your workload. As with any database, you can get better performance from a
bare metal installation, but Docker has significant advantages when deploying a technology, so
this document is targeted at explaining how to deploy Lx-DB using Docker and start working
with it.
Being distributed, Lx-DB can be deployed in very different ways, and the most efficient
deployment depends on the workload. Lx-DB provides some interesting elasticity capabilities,
but these are not currently automated. The LeanXcale team is working on an Elastic Manager
that can automatically adapt your deployment to the workload. Right now, for the sake of
simplicity, the deployment is based on a single Docker image in which you can start just one
component or a set of components.
This document assumes you have already pulled the Lx-DB Docker image and you are ready to
run it on your machines. Different instances of the Docker image running different components
can be orchestrated through Kubernetes, Docker Compose, etc., but we assume you are
familiar with those technologies, so the document focuses on the general Docker concepts.
To get into the details and make the most of the deployment, you should read the
Architecture and Concepts documents and then go to the section Docker Deployment Details.
LeanXcale Concepts
Component Architecture
Chapter 2. Installation
2.1. Dependencies
For the installation to be able to execute, the host and all target machines must meet
some prerequisites that need to be ensured before the installation takes place. These are
summarized as follows:
• The machines must run an Ubuntu, CentOS, or RHEL Linux operating system.
• You need the following (very common) components to be installed:
- Ansible (>= 2.6)
- Java 8
- Python 2 (>= 2.7)
- bash
- screen
- libpthread
- numactl (needed only if you want to take advantage of NUMA)
- nc (netcat), used to check that ports are bound
• All the machines should have the LeanXcale user created (USER) and the base folder for
the installation (BASEDIR). This folder must be owned by USER.
• All the machines should have SSH connectivity, with the right SSH keys set for the user
controlling the cluster (USER) so it can access any machine in the cluster without any
further authentication mechanism. This way the scripts can deploy, configure, start and
stop components remotely without asking for a myriad of passwords for each action.
2.2. Unpacking
The first step of the installation is choosing a machine from the cluster to be the master or
orchestrator of the cluster (it is usually a good idea to have at least two machines that can
play this role). Connect to that machine and set the BASEDIR environment variable,
preferably in your .profile:
export BASEDIR=/ssd/lxdb
Now you can unpack the TAR installation package in the base directory on the master server
and load the environment:

cd $BASEDIR
tar xvfz LeanXcale_v{Version}.tgz
source ./env.sh
Now the master server of the cluster is ready to be configured so you can later deploy the
installation to the rest of the servers.
2.3. Simple Configuration. The inventory file
To configure the cluster, the basic information you need is just the hostnames of the
machines you want to deploy on.
With that information you can configure the servers of the cluster in the inventory file:

vim conf/inventory

Below is an example of a simple configuration file:
[all:vars]
# BASEDIR for the installation is taken from the environment variable $BASEDIR
BASEDIR="{{ lookup('env','BASEDIR') }}"
# FOLDER to store BACKUPs in
BACKUPDIR=/ssd/leandata/backup/
# Login account for the cluster
USER="{{ lookup('env','USER') }}"
# If {{USER}} has SUDO permissions add SUDO=sudo. Otherwise you may need to
# create some folders and grant access to the USER for some actions to work
SUDO="sudo"
#SUDO="echo"

# If you want to use NUMA say yes
NUMA=no
USEIP=yes
HA=no
HAPROXY=yes

netiface=eth0
forcenkvds=1
forcememkvds=2G
forcencflm=1
forcecflm=1

[defaults]
# The following line tells the configuration script to get the values from the
# machine. All resources will be allocated to LX-DB
sizing mem=- sockets=- ncores=- nthreadscore=- fsdata=. fslog=.

# Metadata servers. Currently, you can configure only one
[meta]
bladeMETA sockets=- ansible_connection=local

# Datastore Servers. You can have multiple data store instances
[datastores]
bladeDATA1 sockets=0-1:0-1
bladeDATA2 sockets=0-1:0-1
2.3.1. Default Resource Allocation
The line starting with sizing defines the default resource allocation. The line assumes that
all machines in the cluster have the same hardware configuration. However, the parameters in
the line can also be defined per machine if you need to override the default sizing (see the
example at the end of this section). The line has the following parameters:
• mem: Memory available in the machine to run the components (in GB). If not set, it
defaults to taking all the memory in each machine and distributing it for LeanXcale.
• sockets: List of sockets in the machine to be used to run the components. Values like 0-2
or 0,1,2 or 0,4 are allowed. On datastore machines there should be two socket list
definitions: the socket list for KVDS and the socket list for QE, separated by a colon ':'.
Again, if not set, all sockets in the machine will be used.
• ncores: The number of physical cores to be used in each machine to run Lx-DB
components. Note that this counts physical cores, so hyperthreads should not be included.
• nthreadscore: Number of hyperthreads per core
• fslog: Folder where the transaction logging files are located. These are not the component
logs, but the transaction logging that guarantees database durability. If SUDO is not
granted, a folder with RW permissions for USER should have been created by the
administrator.
• fsdata: Parent folder where the database stores its data. If the section [DSFs] is present,
those folders will be used instead.
It is highly recommended to keep transaction redo logging on disks separate from the
database data.
An example of this configuration is the following:
...
sizing mem=128G sockets=0-1 ncores=12 nthreadscore=0 fslog=/txnlogs fsdata=/lxdata
...
This means: the machines have 128GB available and 2 sockets (0 to 1), there are 12 physical
CPUs in total, and there is no hyperthreading.
Redo logging will then be written to the filesystem mounted at /txnlogs, and data will go to
/lxdata.
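If a particular machine differs from the rest, the same sizing parameters can be set on its host
line to override the defaults. A sketch (the host name and values here are illustrative, not from
a real deployment):

[datastores]
bladeDATA1 sockets=0-1:0-1 mem=64G ncores=8
bladeDATA2 sockets=0-1:0-1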
2.3.2. Default Component Deployment
Components are typically classified into 2 classes:
• Metadata components:
- Zookeeper
- Configuration Manager
- Snapshot Server
- Commit Sequencer
- KiVi Metadata Server
• Datastore components:
- Query Engine
- Transactional Loggers
- Kivi Datastore Server
• Conflict Managers: These are not in either of the two classes but, for the sake of simplicity,
they are by default deployed with the metadata components unless High Availability is set up.
Given this classification, the metadata components are deployed on the metadata server, while
the datastore components are distributed considering the number of datastore servers and their
hardware capacity.
2.4. Completing the Bare Metal Installation
Once you have finished changing the configuration file you can complete the installation, as
sketched below. This last step will build your detailed configuration file, including the
resource allocation, and deploy the installation to the rest of the servers of the cluster.
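For example, assuming the installer lives with the other admin scripts (the script name below is
an assumption; check the scripts shipped with your release for the exact name):

> cd $BASEDIR
$BASEDIR> source env.sh
# hypothetical name: builds the detailed configuration and deploys it to all servers
$BASEDIR> ./admin/installcluster.sh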
From this point the cluster is installed and set up to be started.
2.5. Advanced Configuration
There are quite a few detail parameters that can be set to fine tune your installation. Those
parameters are further discussed in Annex I. Advanced Configuration.
• Resource allocation
• Filesystems
• Detailed component deployment
Chapter 3. High Availability
LeanXcale can be set up as a high-availability cluster. High availability means that in case of a
single component failure or a single machine failure the system will keep working non-stop.
You can further tune the configuration parameters so the system keeps working in case of a
two-machine failure or even under stronger high-availability requirements.
3.1. Configuring High Availability
Configuring High Availability is pretty simple; it just requires that you set the following
configuration parameter in the inventory file:
HA=yes
3.2. Implications
The implications of configuring this parameter are:
• The install script will check that there are at least three machines configured for the cluster.
There have to be at least 2 machines defined as metadata servers and 2 machines defined as
datastore servers. One machine can act both as metadata and datastore, and that is the reason
there have to be at least 3 machines in the cluster.
The following configuration could be a valid HA configuration:
[all:vars]
BASEDIR="{{ lookup('env','BASEDIR') }}"
BACKUPDIR=/ssd/leandata/backup/
USER="{{ lookup('env','USER') }}"

HA=yes
netiface=eth0

[defaults]
sizing mem=- sockets=- ncores=- nthreadscore=- fsdata=. fslog=.

[meta]
metaserver1
dataandmetaserver1

[datastores]
dataandmetaserver1
dataserver2
• Components will be installed as follows:
• Zookeeper: Zookeeper is the master for keeping the global configuration, health
information and arbitration. The install script will configure a Zookeeper cluster with
three Zookeeper members replicated on different machines.
• Transaction Loggers: The loggers are components that log all transaction updates (called
writesets) to persistent storage to guarantee the durability of transactions. The loggers will
be configured and replicated in groups so that each replica of a logger runs on a different
machine. The logging information is therefore replicated and can be recovered in case of a
crash.
• Conflict Manager: Conflict Managers are in charge of detecting write-write conflicts
among concurrent transactions. They are highly available per se, because if one Conflict
Manager fails the conflict key buckets are transferred to another Conflict Manager. No
special configuration is needed for High Availability except guaranteeing that there are at
least 2 Conflict Managers configured on 2 different machines.
• Commit Sequencer: The Commit Sequencer is in charge of distributing commit timestamps
to the Local Transactional Managers. Two Commit Sequencers will be configured, but
only one will act as master and the other as follower. The follower will take over in
case the master fails.
• Snapshot Server: The Snapshot Server provides the freshest coherent snapshot on which
new transactions can be started. As in the case of the Commit Sequencer, 2 snapshot
servers will be started; one will act as master and the other as follower. The follower
will take over in case the master fails.
• Configuration Manager: The configuration manager handles system configuration and
deployment information. It also monitors the other components. Two configuration
managers will be started and one will be the master configuration manager.
• Query Engine: The Query Engine parses SQL queries and transforms them into a query
plan, which results in a set of actions on the datastores. The query engines are usually
configured in a cluster in which any of them can take part of the load, so high availability
imposes no special requirement on them.
• Datastores: There will usually be several datastores in different machines. The important
point in datastores is replication.
3.3. Replication
So far, the High Availability configuration ensures the components are configured in a way
that no component failure or machine crash will cause the system to fail.
While component high availability keeps the components running, data replication is about
data availability: if a machine or a disk crashes, there is another copy of the data so you can
keep working on that copy. Data replication can be enabled regardless of whether you also
want high availability.
LeanXcale provides a fully synchronous replication solution, and you can configure as many
replicas as you want. Basically, you can define a mirroring datastore configuration in the
inventory file:
...
[datastores]
mirror-1: server1 server2
mirror-2: server3 server4
This doesn’t mean every server needs to be mirrored. You may also have unmirrored servers
to keep data that doesn’t need to be replicated.
Besides, there are 2 kinds of replicas:
• Replicas to minimize the risk of losing data and to be able to recover service as soon as
possible.
• Replicas to improve performance. These are usually small tables that are read very
frequently, so the application can benefit from having several copies that can be read in
parallel.
Chapter 4. Operations
The scripts in this section wrap Ansible playbooks. The playbooks are in
$BASEDIR/admin/yml. The playbooks coordinate scripts which are run at each node level.
Before running a script it is recommended that you load the environment variables.
Therefore all operations are preceded by:

> cd $BASEDIR
$BASEDIR> source env.sh
4.1. Start, Stop & Check your LeanXcale Cluster
• Start Cluster: Starting the cluster just requires running one script:

> cd $BASEDIR
$BASEDIR> source env.sh
$BASEDIR> ./admin/startcluster.sh
The script will check whether Zookeeper is up and running. If it is not running it will start it.
Then it will try to start all the components.
• Stop Cluster: The current stopcluster.sh script uses lxConsole by default to do a graceful
stop. It allows the following modifiers, which cannot be combined:
- force: Tries to shut down all components gracefully. Then, after 10 seconds, kills all
components.
- timeout=N: Tries to shut down all components gracefully but, for each component,
kills it if its stop procedure has not finished after N seconds.
There is a subtle difference between the force and timeout parameters. force
waits for all processes to do a complete stop and, then, kills any process that
may have remained up. On the other hand, when using timeout,
components are killed after N seconds even if the stop process is still trying
to stop the component.
We usually use force to be sure no component keeps running if there is any
problem in the stop chain, and timeout if a previous stop command got stuck
for some time.
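For example (a sketch; force is assumed to be passed as the first argument, just like timeout
in the examples further below):

$BASEDIR/admin> ./stopcluster.sh force
$BASEDIR/admin> ./stopcluster.sh timeout=30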
stopcluster never stops zookeeper unless specifically requested
• Check the Cluster: To check whether the components are really running and their ports are
bound:
$BASEDIR/admin> ./checkcluster.sh
checkcluster doesn't check directly whether Zookeeper is running, but it does check that the
Zookeeper port is listening.
• Restart components: This will perform the previous check and restart all components that
are found not to be running.
$BASEDIR/admin> ./checkandrestart.sh
• Start/Stop a certain component: The best way to start a certain component is usually to use
checkandrestart as described previously. However, you can do so specifically by
adding the component group to the start or stop script. (Note that currently this can be
done for only one group at a time.)
# stop query engines 1 to 4 in all Datastores
$BASEDIR/admin> ./stopcluster.sh DS QE-1..4
# start zookeeper in Metadata server
$BASEDIR/admin> ./startcluster.sh MS ZK
# Try to stop CommitSequencer gracefully, but kill after 20 seconds
$BASEDIR/admin> ./stopcluster.sh timeout=20 MS CmS
# Start all Conflict Managers (you don't need to know how many there are) in MetaServer
$BASEDIR/admin> ./startcluster.sh MS CflM-0
All component groups configured for the cluster can be seen in $BASEDIR/scripts/env.sh, in
the variable components.
The order of parameters in the scripts is important.
• This is the recommended sequence to be sure that the system is completely stopped,
including Zookeeper:

## First stop all the workload
$BASEDIR/admin> ./stopcluster.sh
$BASEDIR/admin> ./stopcluster.sh timeout=1
$BASEDIR/admin> ./stopcluster.sh MS ZK
4.2. Monitoring
LeanXcale monitoring is built on two open-source solutions: Prometheus and Grafana.
The following image shows the architecture of components:
The LeanXcale Monitoring and Alerting System consists of the following elements:
• Prometheus: Centralizes and stores metrics in a time series database.
• Alert Manager: Triggers alarms based on the information stored in Prometheus (it is also
part of the Prometheus software).
• Exporters: Run on the monitored hosts and export metric sets to feed Prometheus:
- Mastermind exporter: Publishes metrics from the following Lx components: Snapshot
Server, Commit Sequencer and Config Manager.
- node-exporter: Comes by default with the Prometheus installation and exports a lot of
metrics about machine performance.
- metricsExporter.sh: Exports metrics about Lx components at two levels:
· The Lx component's Java code running inside the JVM.
· The Lx component's processes at OS level (I/O, memory, …)
In addition, it is possible to feed Prometheus with additional metrics. It only requires
key-value files placed in the /tmp/scrap path. It is common to use these to export metrics
resulting from executing specific processes. For example, this is the file generated in the
TPCC benchmarking tests:
tpccavglatency 101
tpmCs 28910
aborttpmCs 124
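As a sketch, any process can publish its own metric in the same way by dropping a key-value
file into that path (the file name below is an arbitrary choice; any naming requirements for
files in /tmp/scrap are an assumption to verify):

$ echo "batchjob_rows_loaded 150000" > /tmp/scrap/batchjob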
Monitoring is started when the cluster is started, and the central components (Prometheus and
Grafana) are deployed on the metadata server. In order to access the monitoring dashboard
you have to open the following URL in your browser:

http://{Metadata server hostname}:3000/

The following image shows an example of a working monitoring dashboard:
4.3. LOGs & Alerts
In this section we will be discussing log files: not the transactional log files needed to
guarantee durability, but the information provided by the components about their activity.
In LeanXcale all components log their activity to their own log files. The default logging level
is INFO, which also includes WARNINGs and ERRORs. However, the log level can be
increased or decreased according to the situation.
Log files are stored in each machine in the folder:
$BASEDIR/LX-DATA/logs
Besides each component's logging, there is a central alert system. All components send
messages considered relevant as alerts to the central system.
Alerts can be accessed through The Admin Console or through a web browser at the URL:

http://{Metadata server hostname or IP address}:9093
Annex II. Alerts provides detailed information on the alerts that the system can show and their
description.
You can get the LOGs from all machines in the cluster with the following command:
> cd $BASEDIR
$BASEDIR> source env.sh
$BASEDIR> ./admin/getlogs.sh
The script gets the logs and stores them, creating one folder for each of the hostnames of
the cluster: $BASEDIR/tmp/{hostname of each of the nodes}.
You can also get a summary of the ERROR messages shown in every log file on every cluster
node:
> cd $BASEDIR
$BASEDIR> source env.sh
$BASEDIR> ./admin/geterrors.sh
This looks for the first 3 and the latest 3 ERROR messages in each log file on each machine
and then shows the summary.
4.4. Backup & Recovery
LeanXcale provides a hot backup mechanism that allows a full backup of your distributed
database, guaranteeing a consistent point-in-time recovery. The system will be fully available
during the backup. To get point-in-time recovery you need to back up and recover not only
the database datastore files, but also the transaction log files that hold transaction information
during the backup period up to the point you want to recover.
This means you need to size your transaction log filesystems to keep the files needed for
backup rather than just to guarantee simple database persistence. This size will depend on
your workload, but it is quite simple to estimate with the first tests.
Currently the backup process compresses all backup information into one folder per machine
so you can later move all the backup data to the backup device available in your organization.
The folder where compressed backup data will be stored is defined in the inventory file, in the
parameter BACKUPDIR. We suggest you keep backups on some kind of central platform
according to a backup policy and remove them from the local folders.
To perform the backup run the administrator tool admin/hotbackupcluster.sh. This
action will create a backup meta file in the backup destination folder and tar all the
data (loggers and persisted data) and the relevant database status. All this information will be
named using the timestamp at which the hot backup started. This is what the backup folder
will look like:
$ ls -1 backup/
kivi_ds_data_dir_1-kvds.tar.gz
kivi_ms_data_dir.20190827-130652.tar.gz
kivi_ms_data_dir.tar.gz
meta.backup.20190827-130652
tm_logger_ltm_1.00000000.log.20190827-130652.tar.gz
zookeeper.20190827-130652.tar.gz
zookeeper.tar.gz
4.4.1. Hot backup meta file
The backup process will generate a backup meta file that includes all the information
necessary to perform the recovery from a hot backup. This information includes the last STS
persisted in the data storage files and the transaction logs' status, along with the range of
timestamps recorded.
kiviSts=88000
kvds[0].serviceIp=127.0.0.1,9992,kvds-XPS-LX-1
kvds.length=1
loggers[0].type=LgLTM
loggers[0].serviceIp=127.0.0.1,13422,logger_ltm-0
loggers[0][0].file=tm_logger_ltm_1/00000000.log
loggers[0][0].firstTs=997001
loggers[0][0].lastTs=10914004
loggers[0].length=1
loggers[1].type=LgCmS
loggers[1].serviceIp=127.0.0.1,13400,logger_cs-0
loggers[1].length=0
loggers.length=2
4.4.2. Recovery from a hot backup
To recover from a hot backup just run the recovery with the hot backup option as follows:
admin/recovercluster.sh hotbackup. This action will stop LeanXcale, find the
backup meta file in BACKUPDIR and restart LeanXcale with the related backup data. It
may take some minutes depending on the amount of non-persisted data from the loggers.
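For example:

> cd $BASEDIR
$BASEDIR> source env.sh
$BASEDIR> ./admin/recovercluster.sh hotbackup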
4.4.3. Cold Backup
Sometimes you need to do a backup of a non-critical installation, or to move data from a
preproduction environment to a development environment. To do a cold backup you need to
stop LeanXcale first, but the backup process will be faster and won't require any transaction
log file, so the recovery will also be faster and easier.
To perform this kind of backup run the administrator tool admin/backupcluster.sh,
but remember you will need to stop LeanXcale first. This action will save all the data,
metadata and Zookeeper files in BACKUPDIR.
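A complete cold-backup session therefore looks like this:

> cd $BASEDIR
$BASEDIR> source env.sh
## Stop the database first: cold backups require LeanXcale to be down
$BASEDIR> ./admin/stopcluster.sh
$BASEDIR> ./admin/backupcluster.sh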
4.4.4. Cold Backup recovery
As in the case of hot backup, just execute the administrator tool
admin/recovercluster.sh with no arguments. This will stop LeanXcale, restore the
backup from BACKUPDIR and start LeanXcale again.
4.4.5. Data Export
When you only want to move some data from one environment to another, for example in
testing environments, you can perform a data backup with the tool
LX-BIN/bin/lxBackup. This will dump all database data and metadata to the desired folder.
In case you are planning to change the LeanXcale version, or want to test LeanXcale against
another database system, we suggest using the csv option for flexibility. This will dump data
in CSV files so it can easily be moved to any other system.
For importing the backup data just use the --import option. Find other useful options
below:
$ LX-BIN/bin/lxBackup --help
usage: cmd [options]
 -bs,--blocksize <arg>          Block size when reading from binary file.
                                Default is 4096
 -csv,--csvformat               Import/Export in csv format
 -f,--folder <arg>              Folder to import/export data from/to. It is
                                created when exporting. Mandatory when
                                importing. Default is /tmp/lxbackup
 -h,--dsaddress <arg>           Datastore address. Not needed when
                                connecting through Zookeeper. Default is
                                localhost
 -help,--help                   show usage
 -i,--import                    Import action. Default is export
 -p,--dsport <arg>              Datastore port. Not needed when connecting
                                through Zookeeper. Default is 1529
 -sep,--recordseparator <arg>   Record separator. Default is '@\n'
 -skd,--skipdata                Skip importing/exporting table data
 -skp,--skipperms               Skip importing/exporting permissions
 -sksq,--skipsequences          Skip importing/exporting sequence definitions
 -sktdef,--skiptabledefs        Skip importing/exporting table definitions
 -skti,--skiptableindexes       Skip importing/exporting table indexes
 -skts,--skiptablesplits        Skip importing/exporting table regions
                                definitions
 -tgz,--tarzip                  Compress and remove target folder
 -th,--datathreads <arg>        Number of threads to handle data. Default is 4
 -tl,--tablelist <arg>          JSON with the list of tables and limits to be
                                imported/exported. Example:
                                {"backupFileTableInfos": [{ "tableName": "t1",
                                "limits": [{"min": "3", "max": "10"},
                                {"min": "10", "max": "20"}]}]}
 -ts,--timestamp <arg>          Fixed Timestamp. Not needed when connecting
                                through Zookeeper
 -tsplit,--tablesplit           Export table data regions to different files.
                                Overridden by tablelist file
 -zh,--zkaddress <arg>          ZooKeeper address
 -zp,--zkport <arg>             ZooKeeper port. Default is 2181
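For example, a CSV export followed by an import into a clean database could look like this
(the folder path and ZooKeeper host are illustrative):

$ LX-BIN/bin/lxBackup -csv -f /backups/lxdump -zh zkhost1 -zp 2181
$ LX-BIN/bin/lxBackup -i -csv -f /backups/lxdump -zh zkhost1 -zp 2181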
4.5. Data Partitioning & Distribution
One key element of a distributed database is how data is distributed among the different
datastores.
LeanXcale provides 3 ways of partitioning data:
• Key-range partitioning. This is the most efficient way of partitioning if you know how to
balance your data.
You can use the admin console to partition your data and move it to the right datastore. See
The Admin Console for further information on how to do it, and the session sketch after this
list.
If you have a CSV file with a significant sample of your data, there is also a tool that will
sample the file and recommend the split points for balanced key ranges according to the
information in the CSV file.
This tool is lxSplitFromCSV, and there is further information in the section describing the
tools.
• Hashing. Hashing is a very simple partitioning scheme and has the advantage of being
very good at balancing data across datastores. The system will calculate a hash value from
the key and it will distribute data according to the modulus of the hash value. The
disadvantage of hashing is that, for scans, the system has to do the scan in all datastores
because data can be stored in any of them. Thus, scans require more resources than with
key-range partitioning.
• Bidimensional partitioning: Bidimensional partitioning is an automatic partitioning
scheme that is normally set up on top of one of the previous partitioning schemes. You
need to define a time-evolving parameter and a criterion that will cause the system to
automatically partition your data to make the best use of resources.
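As a sketch of key-range partitioning from the admin console (the table name, key range and
number of splits are illustrative; see Chapter 5 for the command reference):

lxConsole> splitTableUniform tpcc-APP-ORDERS 1000 300000 8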
4.6. Authentication, SSL & Permissions
There are two options to configure LeanXcale authentication:
• LDAP-based authentication. You can set up an LDAP server just for LeanXcale, but this is
most interesting for integrating LeanXcale with your organization's LDAP (or Active
Directory), providing some kind of Single Sign-On or at least the chance to use a
common password within the organization.
• Open shared access. This is not to be used in production except for shared data. The access
level can be set in the firewall based on IP rules, but all users accessing will be granted
access and they will be able to use a user schema. This is very easy to set up for
development and testing environments.
Permissions are granted through roles.
Communications can be configured so SSL is used for connections between components,
though it is recommended to run all components behind a firewall and to use JDBC and the
clients' connections over SSL.
4.7. System DB Tables & Views
LeanXcale's query engine provides a series of virtual tables that expose system information
that can be useful for the end user or the administrator.
The following sections provide some information on the available system virtual tables.
4.7.1. SYSMETA.TABLES
It shows all the tables in the database instance.
> select * from sysmeta.tables;
+----------+------------+-----------+-----------+---------+---------+-----------+----------+------------------------+---------------+
| tableCat | tableSchem | tableName | tableType | remarks | typeCat | typeSchem | typeName | selfReferencingColName | refGeneration |
+----------+------------+-----------+-----------+---------+---------+-----------+----------+------------------------+---------------+
|          | APP        | CUSTOMER  | TABLE     |         |         |           |          |                        |               |
|          | APP        | DISTRICT  | TABLE     |         |         |           |          |                        |               |
|          | APP        | HISTORY   | TABLE     |         |         |           |          |                        |               |
+----------+------------+-----------+-----------+---------+---------+-----------+----------+------------------------+---------------+
4.7.2. SYSMETA.COLUMNS
It shows all the column information related to the tables:
> select * from sysmeta.columns;
+----------+------------+-----------+------------+----------+----------+------------+--------------+---------------+--------------+----------+-------+
| tableCat | tableSchem | tableName | columnName | dataType | typeName | columnSize | bufferLength | decimalDigits | numPrecRadix | nullable | remar |
+----------+------------+-----------+------------+----------+----------+------------+--------------+---------------+--------------+----------+-------+
|          | APP        | CUSTOMER  | C_ID       | 4        | INTEGER  | -1         | null         | null          | 10           | 1        |       |
|          | APP        | CUSTOMER  | C_D_ID     | 4        | INTEGER  | -1         | null         | null          | 10           | 1        |       |
|          | APP        | CUSTOMER  | C_W_ID     | 4        | INTEGER  | -1         | null         | null          | 10           | 1        |       |
|          | APP        | CUSTOMER  | C_FIRST    | 12       | VARCHAR  | -1         | null         | null          | 10           | 1        |       |
|          | APP        | CUSTOMER  | C_MIDDLE   | 12       | VARCHAR  | -1         | null         | null          | 10           | 1        |       |
...
4.7.3. LXSYSMETA.PRIMARY_KEYS
It shows all the primary keys related to the tables in the system.
4.7.4. LXSYSMETA.FOREIGN_KEYS
It shows all the foreign keys related to the tables in the system.
4.7.5. LXSYSMETA.INDEXES
It shows all the indexes created for the tables in the system.
> select * from lxsysmeta.indexes;
+-----------+----------------+-------------+------+-----------------+------------+-----------+-------------+-------+-----------------+----------+------------+-----------+
| nonUnique | indexQualifier | indexName   | type | ordinalPosition | columnName | ascOrDesc | cardinality | pages | filterCondition | tableCat | tableSchem | tableName |
+-----------+----------------+-------------+------+-----------------+------------+-----------+-------------+-------+-----------------+----------+------------+-----------+
| false     | APP            | IX_ORDERS   | 2    | 2               | O_W_ID     | A         | 0           | 0     |                 | tpcc     | APP        | ORDERS    |
| false     | APP            | IX_ORDERS   | 2    | 1               | O_D_ID     | A         | 0           | 0     |                 | tpcc     | APP        | ORDERS    |
| false     | APP            | IX_ORDERS   | 2    | 3               | O_C_ID     | A         | 0           | 0     |                 | tpcc     | APP        | ORDERS    |
| false     | APP            | IX_CUSTOMER | 2    | 2               | C_W_ID     | A         | 0           | 0     |                 | tpcc     | APP        | CUSTOMER  |
| false     | APP            | IX_CUSTOMER | 2    | 1               | C_D_ID     | A         | 0           | 0     |                 | tpcc     | APP        | CUSTOMER  |
| false     | APP            | IX_CUSTOMER | 2    | 5               | C_LAST     | A         | 0           | 0     |                 | tpcc     | APP        | CUSTOMER  |
+-----------+----------------+-------------+------+-----------------+------------+-----------+-------------+-------+-----------------+----------+------------+-----------+
4.8. LXSYSMETA.TABLE_CHECKS
It shows all the table checks (NOT NULL, …), autoincremented columns and geohashed
fields.
Geohash fields are fields indexed geographically for GIS applications.
4.9. LXSYSMETA.TRANSACTIONS
It shows all active transactions in the Query Engine, relating the information to the connection.
The following fields are shown:
• txnId: The internal unique identifier of the transaction
• state: The state of the transaction
• startTs: This is the timestamp that controls the visibility of the transaction according to
snapshot isolation principles.
• startTime: This is the time from the epoch in milliseconds when the transaction was
started.
• commitTimeStamp: It is usually -1, meaning that the transaction is not doing COMMIT
yet. Since COMMIT is a short phase in the transaction, you will seldom see a
transaction with a meaningful COMMIT timestamp.
• sessionId: This is the internal session identifier in the Query Engine
• connectionId: This allows relating the transaction to its connection.
• uid: Identifier of the user who owns the session in which the transaction is being done.
• connectionMode: Mode of the connection.
• numPendingScans: Number of SCANs the transaction started but that have not finished
• numTotalScans: Total number of SCANs for the transaction
• hasWrite: True if the transaction did any write operation (UPDATE, INSERT)
> select * from lxsysmeta.transactions;
+-------------+--------+-------------+-----------------+-----------------+-----------+--------------------------------------+-----+----------------+-----------------+---------------+-------+
| txnId       | state  | startTs     | startTime       | commitTimeStamp | sessionId | connectionId                         | uid | connectionMode | numPendingScans | numTotalScans | hasWr |
+-------------+--------+-------------+-----------------+-----------------+-----------+--------------------------------------+-----+----------------+-----------------+---------------+-------+
| 44011298001 | ACTIVE | 44011298000 | 526814361087000 | -1              | 1217      | 835e1f1a-cd0a-4766-9e53-182ed1bb39d2 |     | RUN            | 0               | 0             | false |
+-------------+--------+-------------+-----------------+-----------------+-----------+--------------------------------------+-----+----------------+-----------------+---------------+-------+
4.10. LXSYSMETA.CONNECTIONS
It shows information about all existing connections:
> select * from lxsysmeta.connections;
+--------------------------------------+--------------------------------------+-----------+-------+----------------+-------------------------------------------------------------------------+
| connectionId                         | kiviConnectionId                     | isCurrent | dsuid | connectionMode |                                                                         |
+--------------------------------------+--------------------------------------+-----------+-------+----------------+-------------------------------------------------------------------------+
| 0b7ef86b-924f-4a1b-bf28-2e0661cd47a2 | 835e1f1a-cd0a-4766-9e53-182ed1bb39d2 | true      | app   | RUN            | {parserFactory=com.leanxcale.calcite.ddl.SqlDdlParserImpl#FACTORY,sche   |
+--------------------------------------+--------------------------------------+-----------+-------+----------------+-------------------------------------------------------------------------+
4.11. Cancelling Transactions
There are situations when you run the wrong query and realize it will take too long
because there is a missing index or condition, and you don't want to leave the transaction
running and taking up resources.
In those situations, it is possible to cancel the transaction from the tool lxClient [lxClient]
using the command CANCEL TRANSACTION <transaction id>. This operation will
roll back the connection which the transaction belongs to if it is performed by the same
connection's owner. It will cancel the transaction unless it is already committing or rolling
back.
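For example, from lxClient you could locate a runaway transaction in
LXSYSMETA.TRANSACTIONS (see the previous sections) and cancel it by its txnId:

> select txnId, state, numPendingScans from lxsysmeta.transactions;
> CANCEL TRANSACTION 44011298001;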
4.12. Closing a Connection
Every user can close their own existing connection with CLOSE CONNECTION
<connection id>. Closing a connection means rolling back all its running transactions.
In the case of long-running transactions we recommend you cancel them first.
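A sketch of the sequence, reusing the connectionId shown in section 4.10 (whether the
identifier needs quoting is an assumption to verify):

> select connectionId, isCurrent from lxsysmeta.connections;
> CLOSE CONNECTION 0b7ef86b-924f-4a1b-bf28-2e0661cd47a2;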
Chapter 5. The Admin Console
The admin console can be started by running:

lxConsole

This is the main administration console for LXDB. The console allows the user to check
and manage the configuration of an LXDB instance. Commands can be run through their full
names or by the short numbers that appear between brackets.
These are the main functions available through the console:
5.1. Common functions
• [0] quit: quits the current lxConsole
• [1] help: prints the list of available commands
• [2] components: Shows a list of all the components and their acronyms
• [3] running [component]: Shows a table with the currently running instances of each
component. If the component argument is given, it prints the instances running for that
component
• [4] deads [component]: It is not common to have dead components but, in case of a
failure or machine crash, this option can show components that were up, running and
registered and then were lost. If the component argument is provided, the command lists
the dead instances of that specific kind of component.
• [6] halt [component [instanceId]]: Halts LeanXcale, a set of components or a specific
component instance. This command accepts a component argument to kill all the running
instances of a given component. In addition to the first argument, another argument can be
added to halt only one instance of a component.
• [8] trace [nLines]: Prints the last nLines trace entries of LeanXcale. This trace contains the
registered, failed, recovered and stopped instances.
• [15] listServers: List all available Datastore servers.
• [16] listRegions [table]: List all data regions (different partitions) of a table.
• [38] getAlerts: Show the alerts stored at the LXDB Alert Manager.
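A short session sketch (the prompt rendering and the chosen commands are illustrative):

lxConsole> components
lxConsole> running QE
lxConsole> deads
lxConsole> quit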
5.2. Data Distribution & Replication Functions
• [17] moveRegion [table regionId DSsource DStarget]: Move a data region of a table
from one Datastore server to another.
• [18] splitTable [table splitKey DStarget]: Split a table by the split key and move the new
region to another Datastore server.
• [19] splitTableUniform [table splitKeyMin splitKeyMax NumOfSplits]: Split a table in
NumOfSplits points between a min and max key range, and distribute (round robin) the
partitions among all the available Datastore servers.
• [25] replicateRegion [table minKey DStarget]: Replicate a data region into another
Datastore server, starting from the supplied minKey.
• [26] removeReplica [table minKey DStarget]: Remove a replicated data region from a
Datastore server.
• [27] joinRegion [table joinKey]: Join two different data regions, by the specified
joinKey.
• [28] increaseTable [table DStarget strategy min max]: Add a new data region to a table.
The strategy defines whether to just add the region or to re-size all the existing data regions.
• [29] decreaseTable [table DStarget strategy]: Delete a data region of a table. The strategy
defines whether to just remove it or to re-size all the remaining partitions.
• [30] listServer [DStarget]: List all the data regions available on a given Datastore server.
• [31] cloneKVDS [DSsource DStarget]: Replicate all the data regions of a Datastore
server into another.
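For instance, splitting a table and then moving one of the resulting regions to another
datastore could look like this (the table name, split key, region id and datastore identifiers are
all illustrative):

lxConsole> listRegions tpcc-APP-ORDERS
lxConsole> splitTable tpcc-APP-ORDERS 150000 kvds-2
lxConsole> moveRegion tpcc-APP-ORDERS 3 kvds-1 kvds-2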
5.3. Advanced Functions
• [5] buckets: The concept of bucket is related to the way table keys are distributed
among conflict managers to scale conflict management. The command gives the current
assignment of buckets.
• [7] loggers: Shows how transaction logs are assigned to Query Engine and local
transaction managers.
• [9] zkTree [path]: This is a tool to recursively print the Apache ZooKeeper tree from the
path provided. If path is not provided, the default value is /leanxcale
• [10] zksts: Prints the currently persisted snapshot. This command also returns the current
snapshot of the system.
• [12] markAsRecovered [component [instanceId]]: Marks a dead instance as recovered.
• [13] csDiscardLtm [instanceId]: Discard the selected LTM for the commit sequencer.
Disable a client.
• [14] setLogger [component [instanceId]]: Manually assign a transaction logger for an
LTM.
• [20] startGraph: Show the dependencies between components at start time.
• [21] kiviSts: Show the latest STS persisted at the Datastore.
• [22] cgmState: Show the Configuration Manager component operative state.
• [23] persistKiviSts: Persist STS manually to the Datastore.
• [24] getEpoch: Show the different system configurations over time.
• [32] setKvconPath [path]: Set the path for the kvcon (direct datastore console)
command.
• [33] kvcon [command]: Send a command to the kvcon console.
• [34] kiviRecoveryStats: Show the stats for the recovery process over each Datastore.
• [35] syncDS [DStarget]: Force a sync of data to disk for the specified Datastore servers.
• [36] getProperty: Show the LXDB startup properties.
• [37] getOsts: Show the oldest STS of the transactions currently ongoing in the LXDB.
• [39] kvmsEndpoint: Show the current Datastore metadata server connection endpoint.
Chapter 6. LeanXcale Tools
The LeanXcale tools are stored at $BASEDIR/LX-BIN/bin and $BASEDIR/utils
directories.
6.1. lxClient
Though we recommend you use a JDBC SQL GUI, either open source (like SquirrelSQL or
DBeaver) or commercial, LeanXcale installs a text console client so you can do some
administration tasks or scripting over it. This client is lxClient, the main SQL terminal client
for LXDB.
The SQL dialect implemented is ANSI SQL, which is the most standard one. The client
allows the following parameters:
-u <database url>                  the JDBC URL to connect to
-n <username>                      the username to connect as
-p <password>                      the password to connect as
-d <driver class>                  the driver class to use
-e <command>                       the command to execute
-nn <nickname>                     nickname for the connection
-ch <command handler>[,<command handler>]*
                                   a custom command handler to use
-f <file>                          script file to execute (same as --run)
-log <file>                        file to write output
-ac <class name>                   application configuration class name
-ph <class name>                   prompt handler class name
--color=[true/false]               control whether color is used for display
--colorScheme=[chester/dark/dracula/light/obsidian/solarized/vs2010]
                                   syntax highlight scheme
--confirm=[true/false]             confirm before executing commands specified
                                   in confirmPattern
--confirmPattern=[pattern]         pattern of commands to prompt confirmation
--csvDelimiter=[delimiter]         delimiter in csv outputFormat
--csvQuoteCharacter=[char]         quote character in csv outputFormat
--escapeOutput=[true/false]        escape control symbols in output
--showHeader=[true/false]          show column names in query results
--headerInterval=ROWS              the interval between which headers are displayed
--fastConnect=[true/false]         skip building table/column list for tab-completion
--autoCommit=[true/false]          enable/disable automatic transaction commit
--verbose=[true/false]             show verbose error messages and debug info
--showTime=[true/false]            display execution time when verbose
--showWarnings=[true/false]        display connection warnings
--showNestedErrs=[true/false]      display nested errors
--strictJdbc=[true/false]          use strict JDBC
--nullValue=[string]               use string in place of NULL values
--numberFormat=[pattern]           format numbers using DecimalFormat pattern
--dateFormat=[pattern]             format dates using SimpleDateFormat pattern
--timeFormat=[pattern]             format times using SimpleDateFormat pattern
--timestampFormat=[pattern]        format timestamps using SimpleDateFormat pattern
--force=[true/false]               continue running script even after errors
--maxWidth=MAXWIDTH                the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTH       the maximum width to use when displaying columns
--maxHistoryFileRows=ROWS          the maximum number of history rows to store
                                   in history file
--maxHistoryRows=ROWS              the maximum number of history rows to store
                                   in memory
--mode=[emacs/vi]                  the editing mode
--silent=[true/false]              be more silent
--autosave=[true/false]            automatically save preferences
--outputformat=[table/vertical/csv/tsv/xmlattrs/xmlelements/json]
                                   format mode for result display
--isolation=LEVEL                  set the transaction isolation level
--run=/path/to/file                run one script and then exit
--historyfile=/path/to/file        use or create history file in specified path
--useLineContinuation=[true/false] use line continuation
--incremental=[true/false]         display result rows immediately as they are
                                   fetched, yielding lower latency and memory
                                   usage at the price of extra display column
                                   padding
--incrementalBufferRows            integer threshold at which to switch to
                                   incremental mode
--help                             display this message
Once inside lxClient, just type

!help

(the ! is important) to get the list of the client's built-in commands.
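For example, to run a single query non-interactively (the JDBC URL follows the format used
by lxDataGenerator later in this chapter):

$ LX-BIN/bin/lxClient -u "jdbc:leanxcale://localhost:1529/adhoc;user=app" -e "select * from sysmeta.tables"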
6.2. lxBackup
This is the tool you can use to do a data export. A data export may not give you a consistent
snapshot of the data because it does not use a consistent start timestamp for all the tables you
configure to be dumped.
It will store all exported data in a temporary folder: the structure of all the database metadata,
and a backup of the data itself.
The backup can be restored into a new (clean) database.
-bs,--blocksize <arg>          Block size when reading from binary file.
                               Default is 4096
-csv,--csvformat               Import/Export in csv format
-f,--folder <arg>              Folder to import/export data from/to. It is
                               created when exporting. Mandatory when
                               importing. Default is /tmp/lxbackup
-h,--dsaddress <arg>           Datastore address. Not needed when
                               connecting through Zookeeper. Default is
                               localhost
-help,--help                   show usage
-i,--import                    Import action. Default is export
-p,--dsport <arg>              Datastore port. Not needed when connecting
                               through Zookeeper. Default is 1529
-sep,--recordseparator <arg>   Record separator. Default is '@\n'
-skd,--skipdata                Skip importing/exporting table data
-skp,--skipperms               Skip importing/exporting permissions
-sksq,--skipsequences          Skip importing/exporting sequence definitions
-sktdef,--skiptabledefs        Skip importing/exporting table definitions
-skti,--skiptableindexes       Skip importing/exporting table indexes
-skts,--skiptablesplits        Skip importing/exporting table regions
                               definitions
-tgz,--tarzip                  Compress and remove target folder
-th,--datathreads <arg>        Number of threads to handle data. Default is 4
-tl,--tablelist <arg>          JSON with the list of tables and limits to be
                               imported/exported. Example:
                               {"backupFileTableInfos": [{ "tableName": "t1",
                               "limits": [{"min": "3", "max": "10"},
                               {"min": "10", "max": "20"}]}]}
-ts,--timestamp <arg>          Fixed Timestamp. Not needed when connecting
                               through Zookeeper
-tsplit,--tablesplit           Export table data regions to different files.
                               Overridden by tablelist file
-zh,--zkaddress <arg>          ZooKeeper address
-zp,--zkport <arg>             ZooKeeper port. Default is 2181
6.3. lxCSVLoad
CSV file loader. In order to increase performance, this tool loads data directly into the
datastore.
Tables need to be defined before loading data into them.
-c,--connection_string <arg>     KiviDB connection string. The proper format
                                 is kvmsHost!kvmsPort.
-t,--table <arg>                 Name of the table
-f,--csv_file <arg>              Path to the csv file
-y,--column_separator <arg>      [Column separator used on the csv file.
                                 Default '|']
-z,--zookeeper_connection <arg>  [Zookeeper connection string. The proper
                                 format is host:port]
-ts,--timestamp <arg>            [Value of timestamp in case there isn't a
                                 Zookeeper connection. Default value is 1]
-v,--verbose                     Verbose output
-th,--threads <arg>              [Number of writer threads. Default value is 5]
-split,--split                   Read file by regions as a split would do.
                                 Don't use when fields have multiple lines
-sz,--bufferSize <arg>           [File region size per iteration. To use along
                                 with split option]
-params,--params <arg>           [Path of the file with the parameters for
                                 the loader]
This can be used in the following way:
./lxCSVLoad -c 125.11.22.30\!44000 -z 125.11.22.30:2181 -t tpcc-APP-ORDER_LINE -f /lx/LX-DATA/CSVs/1-20/order_line-1_20.csv &
CSV files usually come with a huge amount of data. To improve performance in those
scenarios, it is recommended to split the file into smaller parts and load them in parallel
with several instances of the process, each one with its own file, but all of them loading data
into the same table.
It can be done in this way:
./lxCSVLoad -c 125.11.22.30\!44000 -z 125.11.22.30:2181 -t tpcc-APP-ORDER_LINE -f /lx/LX-DATA/CSVs/1-20/order_line-1_20.csv &
./lxCSVLoad -c 125.11.22.30\!44000 -z 125.11.22.30:2181 -t tpcc-APP-ORDER_LINE -f /lx/LX-DATA/CSVs/21-40/order_line-21_40.csv &
6.4. lxDataGenerator
A random data generator used for testing purposes.
-bs,--batchsize <arg>        Batch size. Default is 100
-ct,--createtable            Creates the table too
-d,--driver <arg>            Driver class name. Default
                             com.leanxcale.client.Driver
-dl,--delay <arg>            Next row milliseconds delay. Default is 0
-f,--file <arg>              Generator info. Check DataGeneratorInfo. Example:
                             {"totalRows":"3", "fieldInfoList":[{"position":"0",
                             "generateInfo":{"isRowNumber":"true"}},
                             {"position":"1", "generateInfo":
                             {"isRowNumberSummation":"true"}},{"position":"2",
                             "generateInfo":{"constantValue":"0"}}]}
-fc,--forcekvconnect         Force to dial Kivi in case there is Kivi input,
                             but forced also to use one thread
-h,--dsaddress <arg>         Datastore address. Not needed when connecting
                             through Zookeeper. Default is localhost
-help,--help                 show usage
-jdbc,--jdbc                 Insert data into table by jdbc. Zookeeper and DS
                             host/ports params are ignored.
-p,--dsport <arg>            Datastore port. Not needed when connecting
                             through Zookeeper. Default is 1529
-sep,--fieldseparator <arg>  Field separator. Default is '|'
-t,--tablename <arg>         Table name. Dump in CSV format if not present
-th,--datathreads <arg>      Number of threads to handle data. Default is 4
-ts,--timestamp <arg>        Fixed Timestamp. Not needed when connecting
                             through Zookeeper
-u,--url <arg>               the URL of the database to which to connect.
                             Default is
                             jdbc:leanxcale://localhost:1529/adhoc;user=app
-v,--verbose                 Show progress
-zh,--zkaddress <arg>        ZooKeeper address
-zp,--zkport <arg>           ZooKeeper port. Default is 2181
6.4.1. Data generator file
General Info
• totalRows: Number of rows to dump/insert.
• fieldInfoList: List of Field Info. All field values are calculated and can be used as a
parameter to generate other field infos. Take into account that values are calculated in order
(order in the list, not field position), so make sure that a 'field param' is calculated before it
is needed.
• tableColumns: List of table column names to generate data for. The generated data will
follow the same order. All the fields are columns if none is specified.
• primaryKeys: List of table column names that compose the primary key
Field Info
• name: field name. Either name or nameList is mandatory
• nameList: list of field names when the field generates a list of values
• jSqlType: field type (java.sql.Types). Needed when inserting through JDBC and when
creating the table
• jSqlTypeList: list of field types when the field generates a list of values. Needed when
inserting through JDBC and when creating the table
• generateInfo: info to generate the field value(s)
Generate Info
• constantValue: generate the indicated constant
• isRowNumber: indicate to assign the row number to the field
• isRowNumberSummation: indicate to assign the row number summation to the field
• randomValue: list of constants/params to get the value from randomly
• randomStringLength: length of the string to be generated randomly. It can be combined with randomValue to concatenate 'n' random values. For example: if randomValue = [0,1] and randomStringLength = 3, it could generate 000 or 001 or … (see the sketch after this list)
• isTimestamp: indicate to assign the current timestamp in millis
• isNull: indicate to generate a null value
• composedString: list of constants/params to compose a string. For example, if it is ["prefix", $1, "suffix"] and the field with position 1 is a row number summation, it will generate "prefix0suffix", "prefix1suffix", "prefix3suffix", …
• functionInfo: Use reflection to generate the value. Check below
• queryInfo: Get input data from a Jdbc query. The query is executed at initialization and each query-row is used in each generated-row.
• dynamicQueryInfo: Same as the previous one, but the query is executed for every generated row, so only the first query-row is read. Before executing, the field references are replaced with their values. For example, with select a from t where b = $10 and a row number field in position 10, it will execute select a from t where b = 0, select a from t where b = 1, select a from t where b = 2, …
• csvInfo: Get input data from a csv file
• tableInfo: Get input data from a Kivi table through the Kivi API. Use it along with --forcekvconnect if the target is not a 'Kivi-table'
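As a sketch of how these options combine (the field names are illustrative and the JSON keys follow the descriptions above, so treat it as an assumption rather than a verified template), two fieldInfoList entries using randomValue with randomStringLength and composedString could look like:

{ "name": "randomBits", "jSqlType": "12",
  "generateInfo": { "randomValue": ["0", "1"], "randomStringLength": "3" } },
{ "name": "composed", "jSqlType": "12",
  "generateInfo": { "composedString": ["prefix", "$1", "suffix"] } }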
Reflection Info
• className: for example java.util.Random
• method: for example nextInt. It invokes the constructor when null.
• instance: param to invoke the method from. If null, the method should be static or the
class should have a no arguments constructor.
• argumentTypes: for primitive types use the same type name as in java, otherwise use the full class name. For example: ["int"]
• params: list of constants/params (a combined sketch follows)
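As a hedged example of functionInfo (the JSON keys are taken from the list above; the field name and the values are illustrative), a field generating a random integer below 100 through java.util.Random.nextInt could be configured as:

{ "name": "randomInt", "jSqlType": "4",
  "generateInfo": { "functionInfo": {
      "className": "java.util.Random",
      "method": "nextInt",
      "argumentTypes": ["int"],
      "params": ["100"] } } }

Since instance is omitted, the no-arguments constructor of java.util.Random is used, as described above.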
Query Info
• query: query string
• url: the URL of the database to connect to in order to execute the query. Default is the one from the command line
• driver: Driver class name. Default is the one from the command line
• referenceChar: char to indicate a field reference in case you can't use '$' (only for dynamic queries)
CSV Info
• fileName: Csv file name.
• separator: Csv separator. Default is '|'.
Table Info
• databaseName: database name
• schemaName: schema name
• tableName: table name
• limits: list of limits/regions to scan
Limits/region
• min: Minimum region value. List of primary key values
• max: Maximum region value. List of primary key values
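Under the same caveat (the key names come from the lists above; the database, schema, table and key values are illustrative), a tableInfo input scanning two regions could be sketched as:

{ "tableInfo": {
    "databaseName": "db",
    "schemaName": "APP",
    "tableName": "ORDER_LINE",
    "limits": [
      { "min": ["0"], "max": ["1000"] },
      { "min": ["1000"], "max": ["2000"] }
    ] } }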
6.4.2. Example Configuration for the Data Generator
{ "totalRows": "10", "tableColumns":["rowNumber","rowSummattion","nullField","tstamp","constantquery","dynamicquery"], "fieldInfoList": [ {
38
"name": "rowNumber", "jSqlType": "4", "generateInfo": { "isRowNumber": "true" } }, { "name": "rowSummattion", "jSqlType": "-5", "generateInfo": { "isRowNumberSummation": "true" } }, { "name": "nullField", "jSqlType": "4", "generateInfo": { "isNull": "true" } }, { "name": "tstamp", "jSqlType": "-5", "generateInfo": { "isTimestamp": "true" } }, { "nameList": [ "constantquery" ], "jSqlTypeList": [ 12 ], "generateInfo": { "queryInfo": { "query": "select tableName from sysmeta.tables" } } }, { "nameList": [ "dynamicquery" ], "jSqlTypeList": [ 12 ], "generateInfo": { "dynamicQueryInfo": { "query": "select cast((TIMESTAMPADD(SECOND,cast($tstamp/1000 AS int), DATE '1970-01-01')) as date)", "driver": "com.leanxcale.client.Driver",
39
"url": "jdbc:leanxcale://localhost:1529/adhoc;user=app" } } }, { "name": "randomString", "jSqlType": "12", "generateInfo": { "randomStringLength": "12" } }, { "name": "constantfloat", "jSqlType": "6", "generateInfo": { "constantValue": "3.14159" } } ]}
Generated data:
$ LX-BIN/bin/lxDataGenerator -th 1 -f ../datagenerator.json -jdbc
0|0||1568094848499|CONNECTIONS|2019-09-10|mbQKJKUDkD6m|3.14159
1|1||1568094848607|FOREIGN_KEYS|2019-09-10|cm8s72CJYTap|3.14159
2|3||1568094848642|PRIMARY_KEYS|2019-09-10|qktz8WKcpiMs|3.14159
3|6||1568094848665|TABLE_CHECKS|2019-09-10|Xz6d8iYhLhgC|3.14159
4|10||1568094848695|TRANSACTIONS|2019-09-10|1THfF7b1FHjt|3.14159
5|15||1568094848724|COLUMNS|2019-09-10|jcwkubzUnG3H|3.14159
6|21||1568094848753|TABLES|2019-09-10|yO8ni68nlvSB|3.14159
6.5. lxSplitFromCSV
Split a table based on the data distribution of a CSV file.
Usually the data distribution is not homogeneous with respect to the defined primary key. Because of that, a uniform interval table partitioning is not the best approach to balance the Datastores' workload. With this tool you can scan the data distribution of CSV files before loading them into LxDB and split / distribute the table among all Datastore servers.
$> ./lxSplitFromCSV
usage: LX table split
 -c,--delimiter <arg>   CSV char delimiter
 -d,--database <arg>    database name
 -f,--file <arg>        path to CSV file
 -h,--header            flag specifying whether CSV has header row
 -j,--jdbcurl <arg>     database JDBC URL
 -k,--kiviurl <arg>     Kivi meta server URL. The proper format is kvmsHost!kvmsPort
 -r                     Use reservoir sampling (slower, but more reliable with multi-line columns)
 -s,--schema <arg>      schema name
 -t,--table <arg>       table name
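For instance, a sample invocation could be the following (every value is illustrative and reuses the addresses from the loading examples above):

$> ./lxSplitFromCSV -d db -s APP -t ORDER_LINE -f /lx/LX-DATA/CSVs/order_line.csv -c '|' -j jdbc:leanxcale://125.11.22.30:1529/db -k 125.11.22.30\!44000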
6.6. checkdb.py
This is the main script to do a full check of the database. If this script does not report any error, everything is working fine. Otherwise, it will report the issue and suggest some actions to solve it.
$> python checkdb.py
Checking Zookeeper Service :: .................... OK
Checking Lx Console Connects :: .................... OK
Checking Processes in Console :: .................... OK
Checking Component Ports are bound :: .................... OK
Checking Recovery Process :: .................... OK
Checking KiVi Datastores :: .................... OK
Checking Snapshot is working :: .................... OK
Checking DB SQL Connection :: .................... OK
The database seems to be running successfully
Chapter 7. Query Hints
As of now, hints are defined through table functions, which must be used to enable, create and remove the hints.
Since a hint is a way to help the planner build better execution plans, the lifecycle of a hint should be the following:
1. Enable hints for the connection.
2. Create/list/remove hint
3. Refresh plan cache
4. Test query (which installs the new execution plan in the cache)
5. Go to step 2 if the performance is not OK
6. Disable hints for the connection
Once the hints are set, they will be kept unless removed.
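As a minimal end-to-end sketch of this lifecycle, using the functions described in the following sections (the table and index names are illustrative, not part of any shipped schema):

select * from table(enableHints())
select * from table(forceAccess('tpcc-APP-CUSTOMER','tpcc-APP-CUSTOMER_IDX'))
select * from table(cleanPlanCache('tpcc-APP-CUSTOMER'))
-- run and time the query under test here; repeat from forceAccess if needed
select * from table(listHints())
select * from table(removeHint(1))
select * from table(disableHints())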
7.1. Enable hints
Enabling hints creates a context for the present connection where the defined hints are stored.
This can be done using the enableHints function as follows:
select * from table(enableHints())
The result should be a row with the following message:
Hints enabled for connection %s
7.2. Disable hints
Disabling hints destroys the hints context for the present connection and removes all the hints created. However, the execution plans that have been created by the executed queries will remain in the plan cache.
This can be done using the disableHints function as follows:
select * from table(disableHints())
The result should be a row with the following message:
Hints disabled for connection %s
7.3. List hints
Listing hints can be useful to know the hints defined for the present connection and to retrieve their IDs (which can be used to remove them).
This can be done using the listHints function as follows:
select * from table(listHints())
The result will be a table with a row for each defined hint and three columns (the ID, which identifies the hint within the connection, the hint type and a description).
7.4. Define hints
To define hints you have to invoke the appropriate function for the hint you want to define.
The available hints are:
7.4.1. Force access
This hint forces the access to a table through a defined index. To use it, invoke the function as follows:
select * from table(forceAccess('TABLE_NAME','INDEX_NAME'))
Where TABLE_NAME is the qualified name of the table (i.e. tpcc-APP-CUSTOMER) and INDEX_NAME is the qualified name of the index. This function checks the existence of neither the index nor the table, so if you force the access through a nonexistent index, the queries that use this table will fail.
7.4.2. Disable Pushdown
This hint disables the pushdown of predicates (filter or aggregation programs) to the KiVi datastores for a given table. This means that all the work will be done by the Query Engine except for retrieving data from the persistent storage system.
To use it, invoke the function as follows:
select * from table(disablePushdown('TABLE_NAME'))
Where TABLE_NAME is the qualified name of the table (i.e. tpcc-APP-CUSTOMER).
7.4.3. Fixed Join Order
This hint sets the order of the joins as they are written in the query. It works at query level, so it will affect all the joins present in the query.
To use it, invoke the function as follows:
select * from table(fixJoinOrder())
7.5. Force Join Type
This hint allows forcing the kind of join to be used between two given tables.
To use it, invoke the function as follows:
select * from table(forceJoinType('JOIN_TYPE', 'TABLE_1', 'TABLE_2'))
Where JOIN_TYPE is one of the available types (CORRELATE, MERGE_JOIN, NESTED_LOOP) and TABLE_1 and TABLE_2 are the unqualified names of the affected tables (i.e. CUSTOMER or WAREHOUSE). This function does not check that the chosen join type is available for the given pair of tables; if it is not, the query will fail.
7.6. Or to Union
This hint transforms OR conditions into UNION clauses. It accepts a boolean parameter which indicates whether the UNION will be a UNION ALL or not.
To use it, invoke the function as follows:
select * from table(orToUnion(all))
Where all is true for a UNION ALL and false otherwise.
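As a hedged illustration of the effect (the column name is made up for the example; the actual rewrite is produced by the planner), after select * from table(orToUnion(true)) a query such as:

select * from CUSTOMER where C_STATE = 'CA' or C_STATE = 'NY'

can be planned as:

select * from CUSTOMER where C_STATE = 'CA'
union all
select * from CUSTOMER where C_STATE = 'NY'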
7.7. Parallel Hints
The following hints are used to manage the parallel mode. In parallel mode, scans are
parallelized among workers that do a fraction of the scan depending on how data is distributed
among the KiVi datastores.
7.7.1. Enable Parallel Mode
This hint is used to enable the planner rules that transform a KiVi scan into a parallel scan where possible, and propose it as an alternative way to traverse a table (which will compete with the rest of the alternatives in terms of cost).
To use it, invoke the function as follows:
select * from table(enableParallelMode())
7.7.2. Define Parallel Tables
This hint is used to define the set of tables over which scans can be parallelized. It accepts a Java regular expression as the table name, so you can use a regexp to cover several tables or invoke the hint once for each single table.
To use it, invoke the function as follows:
select * from table(enableParallelTable(REGEXP))
Where REGEXP is a Java regular expression which applies over the fully qualified table name (i.e. 'tpcc-APP-.*', or '.*' for all the tables).
7.7.3. Enable Parallel on Aggregates that are pushed down
This hint enables the planner rules which allow transforming a KiVi scan with an aggregation program into a KiVi parallel scan (with the same aggregation program). To be chosen, this parallel scan will have to beat the rest of the alternatives in terms of cost.
To use it, invoke the function as follows:
select * from table(enableParallelAgg())
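Putting the parallel hints together, a session could look like the following sketch (the regular expression is illustrative, and hints are assumed to be already enabled for the connection):

select * from table(enableParallelMode())
select * from table(enableParallelTable('tpcc-APP-.*'))
select * from table(enableParallelAgg())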
7.8. Remove Hints
This function allows removing a hint for the present connection. However, if the hint has been used to execute a query and an execution plan has been created and stored in the query plan cache, removing the hint won't remove the query plan, and the following queries will keep using it for their execution.
To use it, invoke the function as follows:
select * from table(removeHint(HINT_ID))
Where HINT_ID (an integer, without quotes) is the identifier of the hint to be removed, which can be obtained from the listHints() function.
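For example (assuming listHints() reports the hint to drop with ID 1):

select * from table(listHints())
select * from table(removeHint(1))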
7.9. Clean Query Plan Cache
This function removes all the query plans that have been cached for a given table (not only for
this connection but for every query in the server). It can be useful to ensure that a new query
plan will be calculated when a hint is added or removed.
To do this, invoke the function as follows:
select * from table(cleanPlanCache('TABLE_NAME'))
Where TABLE_NAME is the qualified name of the table (i.e. tpcc-APP-CUSTOMER) from
which the query plans will be removed.
Chapter 8. Annex I. Advanced Configuration
8.1. Inventory Configuration File
A Lx-DB cluster is built from the following components. The acronyms in the left column are
used in the configuration files and in the SCRIPTs to manage the cluster.
Component     Description
MtM           Mastermind component which groups the Configuration Manager, the Commit Sequencer Server and the Snapshot Server. You will usually start this component and none of the individual ones.
CflM-1..N     Conflict Managers from 1 to N. Use CflM-0 for all. If there is only one you can use CflM-1.
KVMS          KiVi Metadata Server.
KVDS-1..N     KiVi Datastore Servers from 1 to N. There will be N KVDS in each DataEngine node. Use KVDS-0 for all KVDS in each machine.
QE-1..N       Query Engines from 1 to N. There will be N QE in each DataEngine node. Use QE-0 for all QE in each machine.
LgCmS         Logger for the Commit Sequencer Server.
LgLTM-1..N    Loggers for the LTM running with the Query Engine. Use LgLTM-0 for all Loggers in each machine.
The main file for configuring a Lx-DB cluster is the inventory file. Below is an example with comments explaining the main parameters to be set up:
[all:vars]
# BASEDIR for the installation is taken from the environment variable $BASEDIR
BASEDIR="{{ lookup('env','BASEDIR') }}"
# FOLDER to store BACKUPs in
BACKUPDIR=/ssd/leandata/backup/
# Login account for the cluster
USER="{{ lookup('env','USER') }}"
# If {{USER}} has SUDO permissions add SUDO=sudo. Otherwise you may need to
# create some folders and grant access to the USER for some actions to work
SUDO="sudo"
#SUDO="echo"

# If you want to use NUMA say yes
# NUMA=yes
# NUMA=no
NUMA=yes
# USEIP=yes to use IP addresses instead of hostnames to define the machines
USEIP=no

[defaults]
# This is a fake group for the default HW resources per machine.
# This info is mandatory and you don't need to add any more info if all
# machines in the cluster have the same resources
#   mem           in GB
#   sockets       list of cores (0-2 / 0,1,2 / 0,4)
#   ncores        number of cores available
#   nthreadscore  number of hyperthreads per core
#   fslog         folder where the log files are located (dot means default location)
#   fsdata        folder where the storing data is located (dot means default location)
# Examples:
# sizing mem=128G sockets=0-1 ncores=12 nthreadscore=0 fslog=/ssd2/leandata/logs fsdata=/ssd/leandata/LX-DATA
# sizing mem=128G sockets=0-1 ncores=12 nthreadscore=0 fsdata=. fslog=.
sizing mem=- sockets=- ncores=- nthreadscore=- fsdata=. fslog=.

# Workload percentages between the KiVi direct API and the SQL interface.
# The default considers all workload will go through the QE/SQL interface.
# This is used to size the proportion of KVDS components over QE components
# workload direct=0 sql=100

# The following parameter - if it exists - will force the number of KVDS per DS machine:
# forcenkvds=8
# The following parameter - if it exists - will force the memory allocated for each KVDS:
# Memory must always be in Gigabytes
# forcememkvds=10G
# The following parameter - if it exists - will force the memory allocated for each QE:
# Memory must always be in Gigabytes
# forcememqe=20G

# If the following lines exist they will be used. If not, these will be the values:
# In a production environment they probably shouldn't exist
DS components="KVDS LgLTM QE Mon"
MS components="ZK-1 LgSnS LgCmS MtM CflM KVMS Mon"
# Components are resolved in this order:
# 1- Use the command line arguments
# 2- Use the specific info in the inventory
# 3- Use the variable in env.sh
# 4- Use the generic value in the inventory
# 5- Use the generic default value

# Metadata servers. Currently, you can configure only one
[meta]
blade121 ansible_connection=local

# Datastore servers. You can have multiple datastore instances
[datastores]
blade121 passive="SNS-2 MtM-2 KVMS-2" fslog=/ssd2/lxs ansible_connection=local
# Machines can have specific sizing parameters:
blade122 components="ZK-2 KVDS" mem=64G
blade123 addcomponents="ZK-3"

[DSFs]
# This section should only exist if there is a special configuration with a different
# filesystem per KVDS.
# List of file systems to be used for each KVDS storage. The list should be ordered in terms of NUMA.
# These folders have to be already created before starting the system.
# Folder names should be the same in all machines
/ssd/lxs2/kvdata_1
/ssd/lxs2/kvdata_2
/ssd/lxs2/kvdata_3
/ssd/lxs2/kvdata_4
/ssd/lxs2/kvdata_5
/ssd/lxs2/kvdata_6
/ssd/lxs2/kvdata_7
/ssd/lxs2/kvdata_8
The main properties to modify are:
• BASEDIR: folder where the LeanXcale database will be installed. If the environment variable $BASEDIR is set, the default action will take that value.
• BACKUPDIR: folder to store backups
• USER: specifying the user that will run the scripts of the LeanXcale’s cluster
administration.
• SUDO: the accepted values are echo and sudo. If the user has sudo permissions, this may simplify the creation of folders when deploying from the master to other target machines.
• NUMA: specifies whether the deployment should try to configure components to take the most advantage of NUMA. Lx-DB can exploit most NUMA capabilities. Frequency scaling can also interfere with performance, but disabling it requires SUDO permissions. The accepted values for this parameter are yes or no.
• USEIP: set to yes to configure the machines with IP addresses instead of their hostnames. The value used has to be the first IP printed when you execute the command hostname -I
8.2. Resource Allocation
• sizing: This line holds the default sizing parameters used to configure the layout of the Lx-DB components. The line assumes that all machines in the cluster have the same HW configuration. However, the parameters in the line can be redefined for each machine if you need to override the default sizing. The line has the following parameters:
- mem: Memory available in the machine to run the components (in GB)
- sockets: List of sockets in the machine to be used to run the components. Values like "0-2", "0,1,2" or "0,4" are allowed. In DataStore machines, there should be two socket list definitions: the socket list for KVDS and the socket list for QE. Those lists must be separated by ":".
- ncores: This is the number of physical cores to be used in each machine to run Lx-DB components. Note that these are physical cores, so hyperthreads should not be counted.
- nthreadscore: Number of hyperthreads per core
- fslog: Folder where the transaction logging files are located. These are not the component logs, but the transaction log that guarantees database durability. If SUDO is not granted, the folder, with RW permissions for USER, should have been created by the administrator.
- fsdata: Parent folder where the database stores its data.
It is highly recommended to place the transaction redo log on different disks from the database data.
An example of this configuration is the following:
sizing mem=128G sockets=0-1 ncores=12 nthreadscore=0 fslog=. fsdata=.

[datastores]
bladeDS1 mem=- sockets=0-1:2,4
This means: the machines have 128GB available and 2 sockets (0 and 1), there are 12 physical CPUs in total and there is no hyperthreading, and the redo logging and the data use the default location. Even if fslog and/or fsdata are defined on another disk/filesystem, $BASEDIR/LX-DATA will contain symbolic links that make it possible to reach the files through a single folder structure. For bladeDS1, instead of using 128G, the full memory of the machine is used, with sockets 0 to 1 for the KVDS and sockets 2 and 4 for the QE. This means socket 3 will not be used for Lx-DB if NUMA is enabled.
However, the default sizing configuration is:
sizing mem=- sockets=- ncores=- nthreadscore=- fslog=. fsdata=.
With this input, the configuration SCRIPT will take the information from each machine and distribute it to all the components, so Lx-DB can take full advantage of all machine resources.
Consider that the distribution of resources among components should be adapted to the workload so that most of the resources are used. Since there is no information about the workload, the configuration SCRIPT does a standard layout. For high performance installations, an analysis of the workload is recommended.
There are some parameters that can allow some control over the distribution of components.
Specifically, the following parameters are available:
workload direct=X sql=Y   X and Y are the respective % of the workload that goes through the direct KiVi API and SQL. X+Y should yield 100 (100%). The higher the % of workload that goes through the direct KiVi API, the higher the number of KVDS relative to the number of QE. By default, the installation assumes all workload goes through SQL (X=0, Y=100).
forcenkvds=N              If you know the number of KVDS needed per machine, you can directly define it and override the configurator estimation.
forcememkvds=M            If you know how much memory each server has to allocate per KVDS, you can define it through this parameter. M is the number of Gigabytes.
forcememqe=M              If you know how much memory each server has to allocate per QE, you can define it through this parameter. M is the number of Gigabytes.
The inventory configuration file also has the following parameters and sections:
• DS components: This parameter holds the default list of components for one DataStore
machine. Additional components can be specified for any of the machines.
• MS components: This parameter holds the default list of components for one
MetaStore machine.
• [meta]: The section contains a list of IPs or hostnames that will run as the metadata server. Currently, this has to be only one instance.
• [datastores]: The section contains a list of IPs or hostnames that will run as
datastore servers.
8.3. Filesystems
Besides the fsdata parameter, you can define one folder or filesystem per KVDS by configuring the [DSFs] section. This section provides a list of folders, one for each KiVi datastore. The folder roots must exist and be the same for all datastore machines in the cluster.
It is possible to do a per-machine configuration, but it requires changing the inventory file on each machine.
...
[DSFs]
# List of file systems to be used for each KVDS storage.
/cdba/fs1302/lxs/kvdata_1
/cdba/fs1302/lxs/kvdata_2
/cdba/fs1302/lxs/kvdata_3
/cdba/fs1302/lxs/kvdata_4
/cdba/fs1102/lxs/kvdata_5
/cdba/fs1102/lxs/kvdata_6
/cdba/fs1102/lxs/kvdata_7
/cdba/fs1102/lxs/kvdata_8
/cdba/fs1202/lxs/kvdata_9
/cdba/fs1202/lxs/kvdata_10
/cdba/fs1202/lxs/kvdata_11
/cdba/fs1202/lxs/kvdata_12
Chapter 9. Annex II. Alerts
This is a detailed list of the events that are centralized in the alert system:
9.1. Events from the Query Engine
Alert Identifier Type Description
START_SERVER warning(info) Indicates a server has been started. It shows the server address as IP:PORT
START_SERVER_LTM warning(info) Indicates the LTM is started and the QE is connected to a Zookeeper instance
START_SERVER_ERROR critical There has been an error while starting the server and/or the LTM
STOP_SERVER warning(info) Indicates the server has been stopped
SESSION_CONSISTENCY_ERROR warning Error when setting session consistency at the LTM
DIRTY_METADATA warning The metadata has been updated but the QE couldn't read the new changes
CANNOT_PLAN warning The QE couldn't come up with an execution plan for the query
FORCE_CLOSE_CONNECTION warning(info) A connection was explicitly closed by a user
FORCE_ROLLBACK warning(info) A connection was explicitly rolled back by a user
FORCED_ROLLBACK_FAILED warning Forced rollback failed
FORCE_CANCEL_TRANSACTION warning(info) Forced transaction cancellation. The transaction was not associated to any connection
KV_DIAL_BEFORE critical QE is not connected to the DS
KV_NOT_NOW warning DS error: not now
KV_ABORT warning(info) DS error: aborted by user
KV_ADDR warning DS error: bad address
KV_ARG warning DS error: bad argument
KV_BAD warning DS error: corrupt
KV_AUTH warning DS error: auth failed
KV_BUG warning DS error: not implemented
KV_CHANGED warning DS error: resource removed or format changed
KV_CLOSED warning DS error: stream closed
KV_CTL warning DS error: bad control request
KV_DISKIO critical DS error: disk i/o error
KV_EOF warning DS error: premature EOF
KV_FMT warning DS error: bad format
KV_FULL critical DS error: resource full
KV_HALT critical DS error: system is halting
KV_INTR warning DS error: interrupted
KV_IO critical DS error: i/o error
KV_JAVA warning DS error: java error
KV_LOW critical DS error: low on resources
KV_LTM warning DS error: LTM error
KV_REC warning DS error: Rec error
KV_LOG warning DS error: Log error
KV_MAN critical DS error: please, read the manual
KV_NOAUTH warning DS error: auth disabled
KV_NOBLKS critical DS error: no more mem blocks
KV_NOBLOB warning(info) DS error: no such blob. QE might not be handling blobs correctly
KV_NOIDX warning(info) DS error: no such index. QE might not be handling indexes correctly
KV_NOMETA critical DS error: no metadata
KV_NOREG warning DS error: no such region
KV_NOTOP warning DS error: no such tid operation
KV_NOSERVER critical DS error: no server
KV_NOTAVAIL critical DS error: not available
KV_NOTID warning DS error: no such tid
KV_NOTUPLE warning DS error: no such tuple
KV_NOTUPLES warning DS error: no tuples
KV_NOUSR warning(info) DS error: no such user
KV_OUT critical DS error: out of resources
KV_PERM warning(info) DS error: permission denied
KV_PROTO warning DS error: protocol error
KV_RDONLY warning DS error: read only
KV_RECOVER warning DS error: system is recovering
KV_RECOVERED warning DS error: system recovered
KV_SERVER critical DS error: server
KV_TOOLARGE warning DS error: too large for me
KV_TOOMANY warning DS error: too many for me
KV_TOUT warning DS error: timed out
KV_REPL warning DS error: replica error
9.2. Events regarding Health Monitor
Alert Identifier Type Description
COMPONENT_FAILURE warning Failure from the indicated service
TIMEOUT_TOBEREGISTERED warning Waiting to register the indicated service
CANNOT_REGISTER critical Cannot register the indicated service. Restart it
9.3. Configuration Events
Alert Identifier Type Description
COMPONENT_FAILURE warning The indicated service is down
HAPROXY_NO_CONNECTION warning No connection with the HA proxy while stopping
TIMEOUT_BUCKET_UNASSIGNED critical The bucket reconfiguration couldn't be done
TIMEOUT_TOBEREGISTERED warning The indicated service still has dependencies to solve
CANNOT_REGISTER critical
CONSOLE_SERVER_DOWN warning Couldn't start the console server
RESTART_COMPONENT warning Couldn't restart the indicated service
SETTING_EPOCH warning Waiting until STS > RSts
RECOVERY_FAILURE warning The indicated service has not been recovered
RECOVERY_TIMEOUT warning The indicated service is being recovered
HOTBACKUP warning Pending recovery from hotbackup
9.4. Transaction Log Events
Alert Identifier Type Description
LOGDIR_ERROR critical Cannot create the folder or it is not a folder
LOGDIR_ALLOCATOR_ERROR critical Error managing disk in logger
LOGGER_FILE_ERROR warning IO file error
LOGNET_CONNECTION_FAILED critical Cannot dial logger
LOGSRV_CONNECTION_ERROR critical Network error
KVDS_RECOVERY_FAILED critical The kvds instance couldn't be recovered from the logger
FLUSH_FAILED critical Unexpected exception flushing to logger