
Cong Li


GBase 8a MPP Cluster Performance Optimization

1. Load Balancing Strategy

GBase 8a MPP Cluster supports load balancing strategies at three levels:

1.1. When a client application connects to the cluster, the node with the lowest current load is automatically selected for the connection.

ADO.NET:

String _ConnString = "server=192.168.0.2;failover=true;iplist=192.168.0.3;192.168.0.4;gclusterid=g1";

C API:

Host="192.168.1.1; 192.168.1.2";

JDBC:

String URL="jdbc:gbase://192.168.1.56/test?user=gbase&password=******&failoverEnable=true&hostList=192.168.1.57,192.168.1.58&gcluster=gcl1";

ODBC:

DRIVER=GBase 8a MPP Cluster ODBC 8.3 Driver;UID=gbase;PWD=******;
SERVER={192.168.111.96; 192.168.5.212; 192.168.7.174; 192.168.7.173};
CONNECTION_BALANCE=1;GCLUSTER_ID=gcluster;
CHECK_INTERVAL=90;
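
As a minimal sketch of using the JDBC URL above (the driver class name com.gbase.jdbc.Driver is an assumption and may differ in your driver version):

import java.sql.Connection;
import java.sql.DriverManager;

public class BalancedConnectionDemo {
    public static void main(String[] args) throws Exception {
        // Driver class name is assumed; JDBC 4+ drivers usually self-register.
        Class.forName("com.gbase.jdbc.Driver");
        // Failover URL from the article: the driver picks the least-loaded node
        // among 192.168.1.56/.57/.58 at connect time.
        String url = "jdbc:gbase://192.168.1.56/test?user=gbase&password=******"
                + "&failoverEnable=true&hostList=192.168.1.57,192.168.1.58&gcluster=gcl1";
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected to: " + conn.getMetaData().getURL());
        }
    }
}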

1.2. For data distribution, a uniform distribution strategy is supported so that the data load is spread evenly across nodes.
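
For illustration only, a hash-distributed table lets each node hold a similar share of the rows. The DISTRIBUTED BY clause below is an assumption based on common GBase 8a DDL and is not taken from this article:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class UniformDistributionDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:gbase://192.168.1.56/test?user=gbase&password=******";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            // Assumed DDL: hash-distribute rows on order_id so every node
            // receives roughly the same amount of data.
            stmt.execute("CREATE TABLE orders (order_id BIGINT, amount DECIMAL(12,2)) "
                       + "DISTRIBUTED BY ('order_id')");
        }
    }
}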

1.3. For SQL execution, requests are decomposed and executed in parallel across all nodes, keeping the execution load balanced across the cluster.

2. Compression Strategy

In most applications, the performance bottleneck is disk I/O, so modern database designs focus on reducing it. Data compression cuts I/O time and improves performance, and GBase 8a MPP Cluster is no exception: compression is one of its key performance techniques. The parallel executor of GBase 8a MPP Cluster can drive decompression from the upper parallel scheduling layer, which significantly widens the range of workloads where decompression pays off. In many scenarios, especially those involving large data volumes, querying compressed data can be faster than querying uncompressed data.
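
For illustration only, compression is typically enabled when the table is created. The COMPRESS(5, 5) option below is an assumption based on GBase 8a documentation, not something shown in this article:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CompressedTableDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:gbase://192.168.1.56/test?user=gbase&password=******";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            // Assumed table option: COMPRESS(5, 5) compresses numeric and string
            // columns, trading CPU for less disk I/O on large scans.
            stmt.execute("CREATE TABLE sales (id BIGINT, note VARCHAR(200)) COMPRESS(5, 5)");
        }
    }
}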

3. Expansion and Contraction Optimization

When removing nodes from the cluster, the gnode configuration parameters are capped at the following maximum values:

MAX_PARALLEL_DEGREE = (PROCESS_COUNT > ((TOTAL_NODES_COUNT-1) // (NEW_NODE_COUNT)) ? PROCESS_COUNT / ((TOTAL_NODES_COUNT-1) // (NEW_NODE_COUNT)) : 1);

This prevents memory shortage errors caused by misconfigurations during downsizing.

RESULT_BUFF_COUNT = (Number of Retained Nodes / Number of Nodes Removed) * MAX_PARALLEL_DEGREE;

Where:

  • PROCESS_COUNT: Number of CPUs.
  • TOTAL_NODES_COUNT: Total number of nodes in the cluster.
  • NEW_NODE_COUNT: Number of nodes added or removed from the cluster.
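
As a worked sketch of the two formulas above (the inputs of 16 CPUs, 10 nodes, and 2 removed nodes are illustrative assumptions, and the "//" in the formula is treated as integer division):

public class DownsizeTuningDemo {
    public static void main(String[] args) {
        int processCount = 16;     // PROCESS_COUNT: CPUs per node
        int totalNodesCount = 10;  // TOTAL_NODES_COUNT: nodes before downsizing
        int removedNodeCount = 2;  // NEW_NODE_COUNT: nodes being removed

        // MAX_PARALLEL_DEGREE formula from the text, with integer division.
        int ratio = (totalNodesCount - 1) / removedNodeCount;                     // (10-1)/2 = 4
        int maxParallelDegree = processCount > ratio ? processCount / ratio : 1;  // 16/4 = 4

        // RESULT_BUFF_COUNT = (retained nodes / removed nodes) * MAX_PARALLEL_DEGREE
        int retainedNodes = totalNodesCount - removedNodeCount;                       // 10-2 = 8
        int resultBuffCount = (retainedNodes / removedNodeCount) * maxParallelDegree; // 4*4 = 16

        System.out.println("MAX_PARALLEL_DEGREE = " + maxParallelDegree);
        System.out.println("RESULT_BUFF_COUNT   = " + resultBuffCount);
    }
}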

The maximum memory configuration formula is:

RESULT_BUFF_COUNT * gbase_buffer_result + other heap memory configuration parameters (data heap, temp heap) < 80% of physical memory.

If parallelism is enabled:

TableParallel = Number of CPUs on the running node (default) or the explicitly set value.

The maximum memory configuration formula becomes:

TableParallel * gbase_buffer_result + other heap memory configuration parameters (data heap, temp heap) < 80% of physical memory.
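
A small sketch of checking the 80% cap from the two formulas above (all sizes below are illustrative assumptions):

public class MemoryCapCheckDemo {
    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        long physicalMemory = 64 * gib;              // assumed 64 GiB of RAM
        long gbaseBufferResult = 256 * 1024 * 1024;  // assumed gbase_buffer_result = 256 MiB
        long otherHeaps = 8 * gib;                   // assumed data heap + temp heap combined

        // Use RESULT_BUFF_COUNT here, or TableParallel when parallelism is enabled.
        int resultBuffCount = 16;
        long required = resultBuffCount * gbaseBufferResult + otherHeaps;

        // The configuration is acceptable only while it stays under 80% of physical memory.
        System.out.println("required = " + required / gib + " GiB, cap = "
                + (long) (physicalMemory * 0.8) / gib + " GiB, ok = "
                + (required < physicalMemory * 0.8));
    }
}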

When nodes are replaced using the expansion+contraction method, enabling the gcluster_different_distribution_optimize parameter keeps query performance across distributions from degrading, so performance remains stable throughout the replacement (a sketch of enabling it follows the value list below).

  • A value of 0 disables this feature.
  • A value of 1 enables it. The default is 0.
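
A sketch of turning the parameter on, assuming it can be set with SET GLOBAL from a coordinator connection (on some deployments it may instead need to be changed in the gcluster configuration file):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class DistributionOptimizeDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:gbase://192.168.1.56/test?user=gbase&password=******";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            // 1 = reuse a common distribution when data layouts match; 0 = default (off).
            stmt.execute("SET GLOBAL gcluster_different_distribution_optimize = 1");
        }
    }
}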

Explanation:

During the expansion+contraction node replacement process, the cluster maintains three distribution IDs: the original one, the one after contraction, and the one after expansion. If a query involves tables spread across different distribution IDs (some already expanded, others not), the query plan behaves as follows, depending on the parameter:

  • If disabled: data is pulled across the different distribution IDs (a table pull), moving it onto the same distribution ID for processing.
  • If enabled: the query plan inspects how the data is laid out across the different distribution IDs and, if the layouts match, executes the query against the same distribution ID, reducing table pulls and preserving query performance.

4. Asynchronous Active-Active Cluster Data Synchronization Optimization

This optimization applies only to data synchronization between two clusters, not internal primary-standby synchronization.

1) Synchronization Bandwidth Limiting:

The maximum bandwidth for data synchronization between two clusters can be specified.

  • The total bandwidth must be at least 1 MB/s.
  • The bandwidth per shard must be at least 10 KB/s.

The formula for bandwidth per shard is:

Bandwidth per shard = Total bandwidth / Table parallelism / Number of shards being synchronized.

The number of shards depends on the rsync_mode setting.
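
A worked sketch of the per-shard formula (the totals of 100 MB/s, table parallelism 4, and 5 shards are illustrative assumptions):

public class ShardBandwidthDemo {
    public static void main(String[] args) {
        double totalBandwidthKBps = 100 * 1024;  // 100 MB/s expressed in KB/s
        int tableParallelism = 4;
        int shardsBeingSynced = 5;               // actual count depends on rsync_mode

        double perShardKBps = totalBandwidthKBps / tableParallelism / shardsBeingSynced;
        // Both limits from the text: total >= 1 MB/s and per shard >= 10 KB/s.
        boolean withinLimits = totalBandwidthKBps >= 1024 && perShardKBps >= 10;

        System.out.printf("per-shard bandwidth = %.1f KB/s, within limits = %b%n",
                perShardKBps, withinLimits);
    }
}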

2) Compressed Transmission:

The metadata and DC data files that need to be synchronized between the two clusters undergo secondary compression using the zlib algorithm.

  • Data smaller than 50 bytes will not be compressed.
  • If compressed data is larger than the original, no secondary compression will be performed.
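
The two rules above can be sketched with Java's built-in zlib bindings; this illustrates the described logic and is not the sync tool's actual code:

import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

public class SecondaryCompressionDemo {
    static byte[] maybeCompress(byte[] data) {
        if (data.length < 50) {
            return data;  // below 50 bytes: not worth compressing
        }
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        byte[] compressed = out.toByteArray();
        // If compression made the payload bigger, keep the original bytes.
        return compressed.length < data.length ? compressed : data;
    }

    public static void main(String[] args) {
        byte[] payload = "metadata or DC data block, repeated repeated repeated repeated".getBytes();
        System.out.println(payload.length + " -> " + maybeCompress(payload).length + " bytes");
    }
}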

To enable this optimization, update the synctool.conf configuration file with:

BANDWIDTH_QOS_SWITCH=1
COMPRESSED_SWITCH=1

Both parameters default to 0, meaning the feature is disabled.

Additionally, when starting the sync tool, add the following parameter:

./gcluster_rsynctool.py --sync_network_bandwidth=<bandwidth_limit>

The unit is MB/s. The default value is 0, meaning unlimited bandwidth, and the valid range is 1~99999.

Performance:

  • Secondary compression should achieve at least a 70% compression ratio.
  • The transmission time with secondary compression should not exceed the uncompressed transmission time by more than 50%.

Compatibility:

Since the optimization modifies the synchronization protocol, clusters must use compatible versions of the sync_client and sync_server. If the optimization is disabled, there are no version restrictions.

5. Efficiency Improvement in Automatic Recovery of Partitioned Tables

The optimization includes:

  1. Filtering partitions based on SCN during dmlevent recovery to reduce the number of partitions to be synchronized.
  2. Network transmission optimization for dmlstorage by disabling the Nagle algorithm (see the sketch after this list).
  3. If lock acquisition fails during the commit phase after fevent synchronization, the failure is handed off to dmlevent, avoiding a full data rollback and re-synchronization.
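
Item 2 can be sketched as follows; this illustrates disabling Nagle on a TCP connection and is not the recovery tool's actual code (host and port are placeholders):

import java.net.Socket;

public class NoNagleDemo {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("192.168.1.57", 5258)) {
            socket.setTcpNoDelay(true);  // TCP_NODELAY: turn off Nagle batching
            socket.getOutputStream().write("small recovery message".getBytes());
        }
    }
}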

Required Parameters:

In synctool.conf, add:

CONCURR_SWITCH=1
FASTSYNC_SWITCH=1
NAGLE_SWITCH_OFF=0

In the gc_recover.cnf file, adjust:

dml_recover_parall_switch=1
dml_recover_parall_thread_num=4

That’s all for today’s content, thanks for reading!
