GBase 8a Implementation Guide: Performance Optimization

1. Hardware Configuration Recommendations

CPU

Ensure the BIOS settings are in non-power-saving mode so the CPU does not throttle. For servers using Intel CPUs that are not deployed in a multi-instance environment, it is recommended to disable NUMA zone reclaim by setting the vm.zone_reclaim_mode parameter to 0:

  • A. Modify the current system value: echo 0 > /proc/sys/vm/zone_reclaim_mode

  • B. Modify or add to the configuration file /etc/sysctl.conf: vm.zone_reclaim_mode = 0
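After editing /etc/sysctl.conf, the new value can be applied without a reboot by running, as root:

sysctl -p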

Memory

Configure memory usage with the PCT parameter (gbase_memory_pct_target), which determines the percentage of OS memory the gbased process may use. With swap disabled, reserve about 20% of memory for other programs. Configure the data heap, operator heap, and TEMP heap sizes as needed.
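A minimal sketch of the PCT setting, assuming the parameter lives in the instance configuration file (the exact file and section vary by version; consult the product manual):

# illustrative only: let gbased use up to 80% of OS memory
gbase_memory_pct_target = 80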

SWAP

For servers with small memory, configure SWAP as twice the physical memory. For servers with large memory, configure SWAP size between 64GB and 128GB.

Network

Use a 10 Gigabit network and pre-test the bandwidth between nodes.
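One common way to pre-test bandwidth between two nodes is iperf3 (any bandwidth tester works; the host name is illustrative):

iperf3 -s                # on the server node
iperf3 -c node2 -t 30    # on the client node: measure throughput to node2 for 30 seconds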

Hard Disk

Calculate disk I/O and test disk performance based on the number of hard disk blocks and RAID method. For multi-instance deployment, if there are enough disk blocks, consider using different disks for each instance (currently requires manual mapping).
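A rough sequential-write check with dd (a quick sketch; fio gives more realistic numbers, and the file path is illustrative):

dd if=/dev/zero of=/data1/ddtest bs=1M count=4096 oflag=direct   # bypass the page cache to measure the disk itself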

2. Operating System Configuration Recommendations

2.1. OS Parameter Adjustment Recommendations

GBase 8a Cluster automatically adjusts the required OS parameters during installation: version 8.6 does this via the InstallTar.py program, while version 9.5 requires running SetSysEnv.py before installation. If you need to modify or configure OS parameters, contact a GBase engineer or refer to the product manual; likewise, if parameters are changed while the cluster is running, refer to the product manual for adjustments.

In some OS versions (e.g., RedHat 6.7), raising the maximum process limit (nproc) requires changes not only to limits.conf but also to /etc/security/limits.d/90-nproc.conf. Add the following line to 90-nproc.conf:

*    soft    nproc    655360

2.2. Disk Scheduling Strategy Recommendations

Since databases are I/O-intensive, it is recommended to set the disk I/O scheduling strategy for data storage on GBase cluster nodes as follows:

  • For mechanical disks, use the deadline scheduling strategy. Change it for the running system with:
    echo deadline > /sys/block/$data_disk/queue/scheduler
    This change applies only to the current system configuration and must be reapplied after a reboot.

  • Alternatively, modify the /etc/default/grub file, find the GRUB_CMDLINE_LINUX line, add elevator=deadline transparent_hugepage=never within the quotes, then run as root:
    grub2-mkconfig -o /boot/grub2/grub.cfg
    This change is permanent, affecting the global setting at OS startup.

Note: For mechanical disks, the CentOS/RedHat 8.0 series uses mq-deadline. For SSDs, use the noop scheduling strategy.
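To check which scheduler a disk is currently using (the active one is shown in brackets; sda is an example device):

cat /sys/block/sda/queue/scheduler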

2.3. Cache Parameter Settings Recommendations

Set the OS to prefer reclaiming cache to avoid poor memory allocation performance when the cache is full:

  • Method 1: modify the current system values:
    echo 1024 > /proc/sys/vm/vfs_cache_pressure
    echo 8388608 > /proc/sys/vm/min_free_kbytes

  • Method 2: edit the /etc/sysctl.conf configuration file (apply with sysctl -p):
    vm.vfs_cache_pressure = 1024
    vm.min_free_kbytes = 8388608

The /proc/sys/vm/min_free_kbytes file specifies the minimum free memory (in KB) the Linux VM should retain; set it to about 1/12 of physical memory. The setting above is for a server with 96 GB of memory: 96 GB / 12 = 8 GB = 8388608 KB.
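A quick way to derive the value on any server, using /proc/meminfo:

total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)   # physical memory in KB
echo $(( total_kb / 12 ))                               # candidate min_free_kbytes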

2.4. Transparent Huge Page Management Settings Recommendations

GBase databases are not optimized for transparent huge page management, so disable this feature. As root, modify the /sys/kernel/mm/transparent_hugepage/enabled file with:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
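To verify, read the file back; the active value is shown in brackets and should read [never]:

cat /sys/kernel/mm/transparent_hugepage/enabled

Like the scheduler setting, this echo is reset on reboot; the transparent_hugepage=never grub parameter from section 2.2 makes it permanent.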

2.5. Maximum Task Number Limit Recommendations

For RedHat 7, SUSE 11, and later OS versions, modify the DefaultTasksMax parameter in the /etc/systemd/system.conf file to:
DefaultTasksMax=infinity
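For the change to take effect without a reboot, re-execute systemd as root (this rereads system.conf; a full reboot also works):

systemctl daemon-reexec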

2.6. File System Cache Settings Recommendations

By default, Linux can use up to 40% of available memory for the file system cache. When this threshold is exceeded, the file system flushes all cached content to disk, causing subsequent I/O requests to become synchronous. This can hurt I/O responsiveness and leave memory fully occupied, making the system unresponsive. Adjust the file system cache parameters to alleviate SQL task blocking, based on application requirements:

Adjust vm.dirty_ratio and vm.dirty_background_ratio as needed. Recommended settings:

sysctl -w vm.dirty_ratio=10
sysctl -w vm.dirty_background_ratio=5

To make these changes permanent, add the following to /etc/sysctl.conf and apply with sysctl -p:

vm.dirty_background_ratio = 5
vm.dirty_ratio = 10

3. Data Distribution Planning

The performance of the GBase 8a MPP cluster depends on the overall performance of each node. The data volume stored on each node significantly impacts cluster performance. To achieve optimal performance, all data nodes should store an equal amount of data. During the database table planning and definition phase, consider whether the table is a replication table or a distribution table, and set some columns on the distribution table as distribution columns for hash distribution.

For example, based on the data distribution characteristics, you can (see the DDL sketch after this list):

  • Store dictionary tables or dimension tables as replication tables across nodes, without sharding the data. Although this causes data redundancy, it allows local JOIN operations with fact tables, avoiding data movement between nodes.
  • Distribute fact tables (large tables) across different nodes using methods such as random distribution (rarely used), single-column hash distribution, or multi-column hash distribution. When SQL query conditions are met by only some nodes, the query optimizer can decide to execute SQL only on those nodes.
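A hedged DDL sketch of the two table kinds (the REPLICATED and DISTRIBUTED BY clauses follow common GBase 8a usage, but exact keywords vary by version; the table definitions are illustrative):

CREATE TABLE dim_region (            -- small dimension table
    region_id INT,
    region_name VARCHAR(64)
) REPLICATED;                        -- full copy on every node: JOINs stay local

CREATE TABLE fact_call (             -- large fact table
    phone_no BIGINT,                 -- high-cardinality column used in JOINs
    call_time DATETIME,
    duration INT
) DISTRIBUTED BY ('phone_no');       -- hash-distributed across nodes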

3.1. Distribution Column Selection Principles

  • Prioritize JOINs between large tables, making the columns used in JOIN conditions hash distribution columns to enable distributed execution of JOIN operations across nodes.
  • Consider GROUP BY, making the GROUP BY columns hash distribution columns for one-step aggregation.
  • Choose columns with a high number of unique values (high count(distinct)) as hash distribution columns to ensure even data distribution.
  • Frequently used columns in equality queries should be considered as hash distribution columns.

3.2. Notes

  • Hash distribution columns should be numeric or string types.
  • Columns used as hash distribution columns should not be updated (including fast update mode).
  • Ensure hash join equality columns have identical type definitions to avoid issues. For example, joins between char and varchar types can yield empty results due to padding differences, and joins between decimal and bigint types require type conversion, preventing optimal execution plans. Use the same type, such as bigint, for hash distribution columns (see the sketch below).
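A hedged illustration of the type pitfalls (table and column names are hypothetical):

-- risky: t1.k CHAR(10) joined to t2.k VARCHAR(10) -- padding differences can yield empty results
-- risky: t1.k DECIMAL(18,0) joined to t2.k BIGINT -- the implicit conversion blocks the optimal plan
-- safe:  define the join key as BIGINT on both sides
SELECT COUNT(*) FROM t1 JOIN t2 ON t1.k = t2.k;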

4. Data Sorting Optimization

Sorting data by a query column groups similar values in limited data blocks, reducing I/O and improving compression. This enhances the filtering effect of smart indexes, significantly boosting overall query performance. When possible, sort data by frequently used query columns.

For example, in the telecommunications industry, queries often use phone numbers. Sorting data by phone numbers within a specific time range allows smart indexes to enhance query performance.
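A hedged sketch of pre-sorting at load time (table names are hypothetical):

INSERT INTO call_detail
SELECT * FROM call_detail_stage
ORDER BY phone_no;   -- clusters each phone number into few DCs, sharpening smart-index filtering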

5. Compression Strategy Selection

In most applications, the performance bottleneck is disk I/O, so modern database designs aim to reduce disk I/O. Compression reduces I/O time and improves performance, and 8a is no exception: compression is a key performance technology. The 8a parallel executor handles decompression through upper-level parallel scheduling, largely hiding its cost. In many scenarios, especially those involving large data volumes, compressed data can outperform uncompressed data.

5.1. Compression Methods

Version 8.6:

gbase_compression_num_method=<num_method>
gbase_compression_str_method=<str_method>

Table-level compression: COMPRESS(<num_method>, <str_method>)
Column-level int compression options: 0, 1, 5
Column-level varchar compression options: 0, 3, 5
Table-level combined compression: 00, 13, 55
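A hedged 8.6-style example (the table definition is illustrative):

CREATE TABLE t_cdr (
    caller BIGINT,
    remark VARCHAR(200)
) COMPRESS(1, 3);   -- num_method=1 for numeric columns, str_method=3 for strings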

Version 9.5:

gbase_compress_method=<'method'>
gbase_compress_level=<level>

Table-level compression: COMPRESS(<'method'>, <level>)

method specifies the compression algorithm, with possible values as follows (case insensitive):

  • NoZip: no compression
  • HighZ: high compression ratio
  • RapidZ: fast compression
  • NewRapidZ
  • STDZ: level specifies the compression level, ranging from 0 to 9, where 1 offers the lowest compression ratio and the fastest speed, and 9 offers the highest compression ratio and the slowest speed
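A hedged 9.5-style example (the table definition is illustrative):

CREATE TABLE t_cdr (
    caller BIGINT,
    remark VARCHAR(200)
) COMPRESS('RapidZ', 0);   -- per the mapping below, equivalent to 8.6's COMPRESS(5,5)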

Compatibility Mapping between Version 8.6 and Version 9.5

The compression algorithms in version 9.5 are compatible with version 8.6 usage. When the gbase_compression_num_method and gbase_compression_str_method parameters coexist with the gbase_compress_method and gbase_compress_level parameters, the latter take precedence. The mapping is as follows:

| New Compression Algorithm (9.5) | Old Compression Algorithm (8.6) |
| --- | --- |
| gbase_compress_method='NoZip', gbase_compress_level=0 | gbase_compression_str_method=0, gbase_compression_num_method=0 |
| gbase_compress_method='RapidZ', gbase_compress_level=0 | gbase_compression_str_method=5, gbase_compression_num_method=5 |
| gbase_compress_method='HighZ', gbase_compress_level=0 | gbase_compression_str_method=3, gbase_compression_num_method=1 |
| COMPRESS('NoZip',0) | COMPRESS(0,0) |
| COMPRESS('RapidZ',0) | COMPRESS(5,5) |
| COMPRESS('HighZ',0) | COMPRESS(1,3) |

5.2. Selection Principles

The advantage of 13 compression is its high compression ratio, roughly twice that of 55 compression; however, its execution efficiency is average. If storage space is the priority and performance is not, 13 compression is recommended. Conversely, if storage space is not a concern and performance is critical, 55 compression is advisable.

6. Hash Index Selection

A hash index can typically improve the efficiency of equality lookups, especially in applications focused on precise single-table queries, such as concurrent call detail record queries in telecom services (particularly when sufficient memory is available).

In 8a, hash indexes are divided into Global Hash and Segment Hash, mainly differing in the scope of the column on which the hash index is created.

A Global hash index is created on the entire column data, whereas a segment hash index divides the entire column data into segments based on the specified dc number (key_DC_size) and creates an index on each segment. Segment hash indexes are recommended for easier space recovery when disk space is limited. In practice, Global Hash Index is more commonly used.

Example syntax for creating a hash index (key_DC_size applies to segment hash indexes):

CREATE INDEX idx_t_a ON t(a) key_DC_size = 1000 USING HASH GLOBAL;

Recommendation: the number of DCs (key_DC_size) for segment hash indexes should be comparable to the number of DCs hit in a single query scan, usually between 400 and 2000.

In 8a, a DELETE statement only marks rows for deletion; the data itself still exists on disk. In fast update mode, an UPDATE first deletes the original row and then inserts a new one. For marked-for-deletion data on disk, 8a provides the SHRINK SPACE SQL for manual removal, effectively freeing disk space. The SHRINK SPACE statement performs block-level and row-level recovery and simultaneously recovers index files. This approach significantly improves database management performance in large analytical databases.

Example of block-level recovery with shrink space:

ALTER TABLE t SHRINK SPACE FULL block_reuse_ratio=30;

This consolidates and recovers DC space where the valid data proportion is less than 30%, thoroughly clearing invalid data with deletion marks, and re-saving the DCs. Index files are also recovered and rebuilt as needed.

In practice, GBase 8a first filters using smart indexes; then, if an equality query condition column has a hash index, it uses the hash index, otherwise it performs a full DC scan. This behavior is visible in the trace log.

For real-time data loading scenarios, set a time window to load data into temporary tables without indexes, then insert this data into indexed target tables or create indexes on the temporary tables afterwards. This one-time index processing significantly reduces index maintenance costs.
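A hedged sketch of the time-window flow (statement forms and table names are illustrative):

-- within each load window:
-- 1. load into a staging table that has no indexes, so loading is fast
-- 2. move the batch into the indexed target table in one pass
INSERT INTO t_target SELECT * FROM t_stage;
TRUNCATE TABLE t_stage;   -- ready for the next window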

Notes

  • Indexes are a lossy optimization method, potentially impacting data loading and DML operation performance. Use based on specific needs.
  • Columns chosen for hash indexes should have few duplicate values to avoid severe hash collisions affecting performance.
  • Binary type columns are unsuitable for HASH indexes.
  • Only single columns can be specified when creating indexes; multi-column composite indexes are not allowed.

7. Kafka Consumer Tuning

7.1. Kafka Design

[Figure: Kafka consumer design]

Using Kafka consumer for incremental data synchronization is suitable for scenarios where the data source is a transactional database (e.g., Oracle, MySQL). It offers more convenience compared to batch loading.

Kafka consumer pulls messages from Kafka, parses operation types, table names, primary key names, column names, and data from the messages, and temporarily stores them in a local queue as intermediate results. Transaction threads retrieve data from the local queue, merge and batch them, and send DML statements to gnode for execution and final submission.

Kafka consumer primarily enhances performance through merging and batching. Merging resolves delete and insert operations in memory and is a prerequisite for batching. Batching means submitting as much data as possible in a single transaction to leverage GBase 8a's high throughput.
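A hedged illustration of what batching buys (values are made up): instead of executing three single-row statements, the consumer submits one multi-row statement in one transaction:

INSERT INTO t (id, v) VALUES (1,'a'), (2,'b'), (3,'c');   -- one round trip, one commit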

7.2. Factors Affecting Performance

Several factors affect the performance of consumer data synchronization, listed in order of severity:

  • Frequent delete and update operations on large tables (with billions of rows) in the source database. These operations remain slow in gnode even after merging and batching by the consumer.
  • A large number of tables. If tens of thousands of tables synchronize data through the consumer, the batching effect of the consumer is severely reduced.
  • A large number of columns in tables. Since 8a is a column-store database, the number of columns is linearly related to disk access times, leading to a linear decrease in performance.
  • Concurrent user operations. During consumer data synchronization, numerous user operations on GBase 8a compete for resources with the consumer.
