Common Causes of Performance Issues

This section explains the troubleshooting processes for common performance issues and potential solutions to these issues.

Parent topic: Managing Performance

Identifying Hardware and Segment Failures

The performance of LightDB-A Database depends on the hardware and IT infrastructure on which it runs. LightDB-A Database is comprised of several servers (hosts) acting together as one cohesive system (array); as a first step in diagnosing performance problems, ensure that all LightDB-A Database segments are online. LightDB-A Database’s performance will be as fast as the slowest host in the array. Problems with CPU utilization, memory management, I/O processing, or network load affect performance. Common hardware-related issues are:

  • Disk Failure – Although a single disk failure should not dramatically affect database performance if you are using RAID, disk resynchronization does consume resources on the host with failed disks. The gpcheckperf utility can help identify segment hosts that have disk I/O issues.
  • Host Failure – When a host is offline, the segments on that host are nonoperational. This means other hosts in the array must perform twice their usual workload because they are running the primary segments and multiple mirrors. If mirrors are not enabled, service is interrupted. Service is temporarily interrupted to recover failed segments. The gpstate utility helps identify failed segments.
  • Network Failure – Failure of a network interface card, a switch, or DNS server can bring down segments. If host names or IP addresses cannot be resolved within your LightDB-A array, these manifest themselves as interconnect errors in LightDB-A Database. The gpcheckperf utility helps identify segment hosts that have network issues.
  • Disk Capacity – Disk capacity on your segment hosts should never exceed 70 percent full. LightDB-A Database needs some free space for runtime processing. To reclaim disk space that deleted rows occupy, run VACUUM after loads or updates.The gp_toolkit administrative schema has many views for checking the size of distributed database objects.

    See the LightDB-A Database Reference Guide for information about checking database object sizes and disk space.

Managing Workload

A database system has a limited CPU capacity, memory, and disk I/O resources. When multiple workloads compete for access to these resources, database performance degrades. Resource management maximizes system throughput while meeting varied business requirements. LightDB-A Database provides resource queues and resource groups to help you manage these system resources.

Resource queues and resource groups limit resource usage and the total number of concurrent queries running in the particular queue or group. By assigning database roles to the appropriate queue or group, administrators can control concurrent user queries and prevent system overload. For more information about resource queues and resource groups, including selecting the appropriate scheme for your LightDB-A Database environment, see Managing Resources.

LightDB-A Database administrators should run maintenance workloads such as data loads and VACUUM ANALYZE operations after business hours. Do not compete with database users for system resources; perform administrative tasks at low-usage times.

Avoiding Contention

Contention arises when multiple users or workloads try to use the system in a conflicting way; for example, contention occurs when two transactions try to update a table simultaneously. A transaction that seeks a table-level or row-level lock will wait indefinitely for conflicting locks to be released. Applications should not hold transactions open for long periods of time, for example, while waiting for user input.

Maintaining Database Statistics

LightDB-A Database uses a cost-based query optimizer that relies on database statistics. Accurate statistics allow the query optimizer to better estimate the number of rows retrieved by a query to choose the most efficient query plan. Without database statistics, the query optimizer cannot estimate how many records will be returned. The optimizer does not assume it has sufficient memory to perform certain operations such as aggregations, so it takes the most conservative action and does these operations by reading and writing from disk. This is significantly slower than doing them in memory. ANALYZE collects statistics about the database that the query optimizer needs.

Note When running an SQL command with GPORCA, LightDB-A Database issues a warning if the command performance could be improved by collecting statistics on a column or set of columns referenced by the command. The warning is issued on the command line and information is added to the LightDB-A Database log file. For information about collecting statistics on table columns, see the ANALYZE command in the LightDB-A Database Reference Guide

Identifying Statistics Problems in Query Plans

Before you interpret a query plan for a query using EXPLAIN or EXPLAIN ANALYZE, familiarize yourself with the data to help identify possible statistics problems. Check the plan for the following indicators of inaccurate statistics:

  • Are the optimizer’s estimates close to reality? Run EXPLAIN ANALYZE and see if the number of rows the optimizer estimated is close to the number of rows the query operation returned.
  • Are selective predicates applied early in the plan? The most selective filters should be applied early in the plan so fewer rows move up the plan tree.
  • Is the optimizer choosing the best join order? When you have a query that joins multiple tables, make sure the optimizer chooses the most selective join order. Joins that eliminate the largest number of rows should be done earlier in the plan so fewer rows move up the plan tree.

See Query Profiling for more information about reading query plans.

Tuning Statistics Collection

The following configuration parameters control the amount of data sampled for statistics collection:

  • default_statistics_target

These parameters control statistics sampling at the system level. It is better to sample only increased statistics for columns used most frequently in query predicates. You can adjust statistics for a particular column using the command:

ALTER TABLE...SET STATISTICS

For example:

ALTER TABLE sales ALTER COLUMN region SET STATISTICS 50;

This is equivalent to changing default_statistics_target for a particular column. Subsequent ANALYZE operations will then gather more statistics data for that column and produce better query plans as a result.

Optimizing Data Distribution

When you create a table in LightDB-A Database, you must declare a distribution key that allows for even data distribution across all segments in the system. Because the segments work on a query in parallel, LightDB-A Database will always be as fast as the slowest segment. If the data is unbalanced, the segments that have more data will return their results slower and therefore slow down the entire system.

Optimizing Your Database Design

Many performance issues can be improved by database design. Examine your database design and consider the following:

  • Does the schema reflect the way the data is accessed?
  • Can larger tables be broken down into partitions?
  • Are you using the smallest data type possible to store column values?
  • Are columns used to join tables of the same datatype?
  • Are your indexes being used?

LightDB-A Database Maximum Limits

To help optimize database design, review the maximum limits that LightDB-A Database supports:

Dimension Limit
Database Size Unlimited
Table Size Unlimited, 128 TB per partition per segment
Row Size 1.6 TB (1600 columns * 1 GB)
Field Size 1 GB
Rows per Table 281474976710656 (2^48)
Columns per Table/View 1600
Indexes per Table Unlimited
Columns per Index 32
Table-level Constraints per Table Unlimited
Table Name Length 63 Bytes (Limited by name data type)

Dimensions listed as unlimited are not intrinsically limited by LightDB-A Database. However, they are limited in practice to available disk space and memory/swap space. Performance may degrade when these values are unusually large.

Note There is a maximum limit on the number of objects (tables, indexes, and views, but not rows) that may exist at one time. This limit is 4294967296 (2^32).