This extension for LightDB helps you to find out most resource-consuming activities in your LightDB databases.
Extension is based on statistic views of LightDB and contrib extension lt_stat_statements, lt_stat_activity and system_stats. It is written in pure pl/pgsql and doesn't need any external libraries or software, but LightDB database itself.
Historic repository will be created in your database by this extension. This repository will hold statistic "samples" for your lightdb clusters. Sample is taken by calling take_sample() function. LightDB also provide another lt_cron extension tool performing periodic take sample tasks.
Periodic samples can help you finding most resource intensive activities in the past. Suppose, you were reported performance degradation several hours ago. Resolving such issue, you can build a report between two samples bounding performance issue period to see load profile of your database.
You can take an explicit sample before running any batch processing, and after it will be done.
Any time you are taking a sample, pg_stat_statements_reset() will be called, ensuring you wont loose statements due to reaching lt_stat_statements.max. Also, report will contain section, informing you if captured statements count in any sample reaches 90% of lt_stat_statements.max.
lt_profile, installed in one cluster is able to collect statistics from other clusters, called servers. You just need to define some servers, providing names and connection strings and make sure connection can be established to all databases of all servers. Now you can track, for example, statistics on your standbys from master, or from any other server. Once extension is installed, a local server is automatically created - this is a server for cluster where lt_profile resides.
Extension consists of four parts:
Historic repository is a storage for sampling data. Repository is a set of extension tables.
Sample management engine is a set of functions, used for taking samples and support repository by removing obsolete sample data from it.
Report engine is a set of functions used for report generation based on data from historic repository.
Administrative functions allows you to create and manage servers and baselines.
Monitored servers (LightDB clusters) list.
Table F.4. servers
Columns
Column Type Description |
---|
Server id |
Server name |
Server description |
Server created timestamp, default now() |
Databases that should be excluded |
Enabled or not, default true |
Connstr for dblink |
Global setting of max_sample_age would be overrided by this value if set |
Last sample id, default 0 |
Sample window start timestamp |
Sample window duration |
Sample window interval |
Sample times list.
Table F.5. samples
Columns
Column Type Description |
---|
Server id |
Sample id |
Timestamp of this sample |
Server hardware infomation (collected by system_stats). The information includes the number of CPUs, threads, cores, sockets, memory, etc.
Table F.6. sample_server_hardware
Columns
Column Type Description |
---|
Server id |
Sample id |
Cpu architecture of the system |
Byte order of the architecture |
Cpu count |
Thread count per core |
Core count per socket |
Sockets of the system |
Vendor id |
Cpu family |
Model of the system |
Model name of the system |
Cpu frequency in mhz |
Level 1 data cache |
Level 1 instruction cache |
Level 2 cache |
Level 3 cache |
Memory in total |
Swap memory in total |
Cluster configuration infomation.
Table F.7. sample_cluster_instance
Columns
Column Type Description |
---|
Server id |
Sample id |
Hostname (reference servers.server_name) |
Cluster name (reference current_setting('cluster_name')) |
Role, Primary if pg_is_in_recovery() = false, otherwise standby |
Version of LightDB, for example: LightDB 13.8-23.1 |
Startup time |
Sample kernel settings.
Table F.8. sample_kernel_settings
Columns
Column Type Description |
---|
Server id |
Sample id |
Name |
Value |
Source file |
Sample cpu usage, the data is produced by analyzing statistics from /proc/stat.
Table F.9. sample_cpu_usage
Columns
Column Type Description |
---|
Server id |
Sample id |
User cpu |
Sys cpu |
Idle cpu |
Io wait |
Sample memory usage.
Table F.10. sample_db_memory_usage
Columns
Column Type Description |
---|
Server id |
Sample id |
Shared memory size in bytes |
Local memory size in bytes |
Sample settings.
Table F.11. sample_settings
Columns
Column Type Description |
---|
Server id |
First seen timestamp |
Scope of setting. Currently may be 1 for pg_settings and 2 for other adm functions (like version) |
Name |
Setting |
Reset val |
Boot val |
Unit |
Source file |
Source line |
Pending restart flag |
Sample timings for various function call of sample system.
Table F.12. sample_timings
Columns
Column Type Description |
---|
Server id |
Sample id |
Sample event |
Spent time for the related event on this sample |
Baselines setting.
Table F.13. baselines
Columns
Column Type Description |
---|
Server id |
Baseline sample id |
Baseline sample name |
Timestamp indicating the end of life for this baseline sample |
Baseline Sample list.
Table F.14. bl_samples
Columns
Column Type Description |
---|
Server id |
Sample id |
Baseline id |
Statement list.
Table F.15. stmt_list
Columns
Column Type Description |
---|
Server id |
Md5 of the query id |
Query is a sql statement in text format |
Sample statistic for database.
Table F.16. sample_stat_database
Columns
Column Type Description |
---|
Server id |
Sample id |
Database id |
Database name |
Transaction commit count |
Transaction rollback count |
Blocks read count |
Blocks hit count |
Tuple returned count |
Tuple fetched count |
Tuple inserted count |
Tuple updated count |
Tuple deleted count |
Conflicts in total, including tablespace, lock, snapshot, bufferpin, startup_deadlock |
Temporary files |
Temporary files size in bytes |
Deadlocks |
Block read time |
Block write time |
Timestamp of reset for the statistics. |
Database size |
Database delta size |
True for a template database |
Sample statistic for statements.
Table F.17. sample_statements
Columns
Column Type Description |
---|
Server id |
Sample id |
OID of user who executed the statement |
OID of database in which the statement was executed |
Internal hash code, computed from the statement's parse tree |
Md5 of query, query is the sql statement in text format |
Number of times the statement was planned (if lt_stat_statements.track_planning is enabled, otherwise zero) |
Total time spent planning the statement, in milliseconds (if lt_stat_statements.track_planning is enabled, otherwise zero) |
Minimum time spent planning the statement, in milliseconds (if lt_stat_statements.track_planning is enabled, otherwise zero) |
Maximum time spent planning the statement, in milliseconds (if lt_stat_statements.track_planning is enabled, otherwise zero) |
Mean time spent planning the statement, in milliseconds (if lt_stat_statements.track_planning is enabled, otherwise zero) |
Population standard deviation of time spent planning the statement, in milliseconds (if lt_stat_statements.track_planning is enabled, otherwise zero) |
Number of times the statement was executed |
Total time spent executing the statement, in milliseconds |
Minimum time spent executing the statement, in milliseconds |
Maximum time spent executing the statement, in milliseconds |
Mean time spent executing the statement, in milliseconds |
Population standard deviation of time spent executing the statement, in milliseconds |
Total number of rows retrieved or affected by the statement |
Total number of shared block cache hits by the statement |
Total number of shared blocks read by the statement |
Total number of shared blocks dirtied by the statement |
Total number of shared blocks written by the statement |
Total number of local block cache hits by the statement |
Total number of local blocks read by the statement |
Total number of local blocks dirtied by the statement |
Total number of local blocks written by the statement |
Total number of temp blocks read by the statement |
Total number of temp blocks written by the statement |
Total time the statement spent reading blocks, in milliseconds (if track_io_timing is enabled, otherwise zero) |
Total time the statement spent writing blocks, in milliseconds (if track_io_timing is enabled, otherwise zero) |
Total number of WAL records generated by the statement |
Total number of WAL full page images generated by the statement |
Total amount of WAL generated by the statement in bytes |
Sample statistic for statements in total, group by database.
Table F.18. sample_statements_total
Columns
Column Type Description |
---|
Server id |
OID of database in which the statement was executed |
Number of times the statement was planned (if lt_stat_statements.track_planning is enabled, otherwise zero) |
Total time spent planning the statement, in milliseconds (if lt_stat_statements.track_planning is enabled, otherwise zero) |
Number of times the statement was executed |
Total time spent executing the statement, in milliseconds |
Total number of rows retrieved or affected by the database |
Total number of shared block cache hits by the database |
Total number of shared blocks read by the database |
Total number of shared blocks dirtied by the database |
Total number of shared blocks written by the database |
Total number of local block cache hits by the database |
Total number of local blocks read by the database |
Total number of local blocks dirtied by the database |
Total number of local blocks written by the database |
Total number of temp blocks read by the database |
Total number of temp blocks written by the statement by the database |
Total time the statement spent reading blocks, in milliseconds (if track_io_timing is enabled, otherwise zero) by the database |
Total time the statement spent writing blocks, in milliseconds (if track_io_timing is enabled, otherwise zero) by the database |
Total number of WAL records generated by the database |
Total number of WAL full page images generated by the database |
Total amount of WAL generated by all statements in bytes by the database |
Total statements count by the database |
Sample statistic for wait events in total.
Table F.19. sample_wait_event_total
Columns
Column Type Description |
---|
Server id |
Sample id |
Backend process id related to this wait event |
Queryid related to the wait event |
Wait event type in text |
Wait event in text |
Total wait time of this kind of event since the last sample |
Backend state in text |
Tablespaces, captured in samples.
Table F.20. tablespaces_list
Columns
Column Type Description |
---|
Server id |
OID of the tablespace |
Name of the tablespace |
Path of the tablespace |
Sample statistic for tablespace. The data is from the call of function pg_tablespace_size(oid).
Table F.21. sample_stat_tablespaces
Columns
Column Type Description |
---|
Server id |
Sample id |
OID of tablespace |
Total disk space used in the tablespace, in bytes |
Delta disk space compared with the size that captured by last sample, in bytes |
Table list related to the server and database.
Table F.22. tables_list
Columns
Column Type Description |
---|
Server id |
OID of the database |
Relid of the table (reference pg_class.oid) |
Relkind of the table |
OID of the TOAST table associated with this table (reference pg_class.reltoastrelid) |
Schemaname of the table |
Name of the table (reference pg_class.relname) |
Sample statistic for tables.
Table F.23. sample_stat_tables
Columns
Column Type Description |
---|
Server id |
Sample id |
OID of the database |
Relid of the table (reference pg_class.oid) |
OID of tablespace (reference sample_stat_tablespaces.tablespaceid) |
Delta of sequence scan since last sample |
Delta of sequence tuple read since last sample |
Delta of index scan since last scan |
Delta of index tuple fetch since last sample |
Delta of inserted tuple count since last sample |
Delta of updated tuple count since last sample |
Delta of deleted tuple count since last sample |
Delta of hot updated tuple count since last sample |
Number of live tuple |
Number of dead tuple |
Number of modified since analyze |
Number of inserted since vacuum |
Timestamp of last vacuum |
Timestamp of last autovacuum |
Timestamp of last analyze |
Timestamp of last autoanalyze |
Delta vacuum count since last sample |
Delta autovacuum count since last sample |
Delta analyze count since last sample |
Delta auto analyze count since last sample |
Delta heap block read since last sample |
Delta heap block hit since last sample |
Delta index blocks read since last sample |
Delta toast blocks read since last sample |
Delta toast blocks hit since last sample |
Delta toast index blocks read since last sample |
Delta toast index blocks hit since last sample |
Toast table size |
Delta toast table size since last sample |
Sample statistic for tables, group by server,sample_id,datid,relkind,tablespaceid.
Table F.24. sample_stat_tables_total
Columns
Column Type Description |
---|
Server id |
Sample id |
OID of the database |
OID of tablespace |
Relkind of the table |
Total seq_scan since last sample |
Total seq_tup_read since last sample |
Total index scan since last sample |
Total index tuple fetched since last sample |
Total number of tuple inserted since last sample |
Total number of tuple updated since last sample |
Total number of tuple deleted |
Total number of hot tuple updated since last sample |
Sample id |
Total vacuum count since last sample |
Total auto vacuum count since last sample |
Total analyze count since last sample |
Total auto analyze count since last sample |
Total heap blocks read since last sample |
Total heap blocks hit since last sample |
Total index blocks read since last sample |
Total index blocks hit since last sample |
Total toast blocks read since last sample |
Total toast blocks hit since last sample |
Total toast index blocks read since last sample |
Total toast index blocks hit since last sample |
Delta disk space for the table since last sample |
Sample index list, including index names and schemas, captured in samples.
Table F.25. indexes_list
Columns
Column Type Description |
---|
Server id |
OID of database |
The OID of the pg_class entry for this index |
The OID of the pg_class entry for the table this index is for |
Schema of the table |
Name of the table |
Sample statistic for indexes.
Table F.26. samples
Columns
Column Type Description |
---|
Server id |
OID of database |
OID of tablespace |
Delta Index scan since last sample |
Delta index tuple read since last sample |
Delta index tuple fetch since last sample |
Delta index blocks read since last sample |
Delta Index blocks hit since last sample |
Current disk space for the index |
Delta disk space of the index since last sample |
If true, this is a unique index |
Sample statistic index, group by server_id,sample_id,datid,tablespaceid.
Table F.27. sample_stat_indexes_total
Columns
Column Type Description |
---|
Server id |
Sample id |
OID of the database |
OID of tablespace |
Total index scan since last sample |
Total index tuple read since last sample |
Total index tuple fetch since last sample |
Total index blocks read since last sample |
Total index blocks hit since last sample |
Total delta disk space for the index since last sample |
Sample function list.
Table F.28. funcs_list
Columns
Column Type Description |
---|
Server id |
OID of database |
OID of the function (reference pg_proc.oid) |
Schemaname of the function (reference pg_namespace.schemaname) |
Function name (reference pg_proc.proname) |
Function arguments |
Sample statistic for user functions.
Table F.29. sample_stat_user_functions
Columns
Column Type Description |
---|
Server id |
Sample id |
OID of the database |
OID of the function (reference pg_proc.oid) |
Delta number of calls since last sample |
Delta total time since last sample |
Delta self time since last sample |
True if the data type of the function is trigger type |
Sample statistic for user functions, group by server_id,sample_id,datid,trg_fn.
Table F.30. samples
Columns
Column Type Description |
---|
Server id |
Sample id |
OID of database |
Total number of calls since last sample |
Total function time since last sample |
True if the data type of the function is trigger type |
Sample statistic for clsuter.
Table F.31. sample_stat_cluster
Columns
Column Type Description |
---|
Server id |
Sample id |
Delta checkpoints timed since last sample |
Delta requested checkpoints since last sample |
Delta accumulated checkpoint write time since last sample, in milliseconds |
Delta accumulated checkpoint sync time since last sample, in milliseconds |
Delta buffer written checkpoints since last sample |
Delta number of dirty buffer written since last sample |
Delta number of max written since last sample (count 1 once number of written >= bgwriter_lru_maxpages) |
Delta number of buffer written requested by backend since last sample |
Delta number of fsync done by backend since last sample |
Delta number of buffer alloced since last sample |
Last timestamp that reset pg stat |
Delta wal size since last sample (wal_size calculated by pg_wal_lsn_diff) |
Sample statistic for archive actions including success and failure count and timestamp.
Table F.32. sample_stat_archiver
Columns
Column Type Description |
---|
Server id |
Sample id |
Last archive succeed count |
Last archive wal file |
Last archive timestamp |
Last archive failed count |
Last archive failed wal file |
Last archive failed timestamp |
Timestamp of last reset for archive stat |
Sample statistic for tables failures in calculating disk space.
Table F.33. sample_stat_tables_failures
Columns
Column Type Description |
---|
Server id |
Sample id |
OID of database |
OID of the table (reference pg_class.oid) |
True indicates a failure in calculating the disk space for the table |
True indicates a failure in calculating the disk space for the toast table |
Sample statistic for indexes failures.
Table F.34. sample_stat_indexes_failures
Columns
Column Type Description |
---|
Server id |
Sample id |
OID of the database |
OID of index (reference pg_index.indexrelid) |
True indicates a failure in calculating the disk space for the index |
This table is utilized to pair the id and display text of each backend type. The backend type could be autovacuum launcher, autovacuum worker, client backend, background worker, background writer etc. Once a new backend type was added in lightdb, We have to add it in this table also. Available since LightDB 23.2 with lt_profile 0.3.7.
Table F.35. lt_stat_backend_type_id_text
Columns
Column Type Description |
---|
Id of the backend type |
Display text for the backend type |
This table is utilized to pair the id and display text of each cost type. The cost type could be DB Time, DB Wait, DB CPU. Available since LightDB 23.2 with lt_profile 0.3.7.
Table F.36. lt_stat_cost_type_id_text
Columns
Column Type Description |
---|
Id of the cost type |
Display text for the cost type |
This table is utilized to pair the id and display text of each wait event type. Available since LightDB 23.2 with lt_profile 0.3.7 .
Table F.37. lt_stat_wait_evt_id_text
Columns
Column Type Description |
---|
Id of the wait event type |
Display text for the wait event type |
This table is utilized to pair the id and display text of each wait event type. Available since LightDB 23.2 with lt_profile 0.3.7.
Table F.38. lt_stat_wait_evt_sub_id_text
Columns
Column Type Description |
---|
Id of the wait event (reference lt_stat_wait_evt_id_text.evt_id) |
Id of wait event type |
Display text for the wait event |
Sample statistic of accumulated cost time for DB Time, DB Wait. New sample would be inserted into this table by interval of 1 second. Available since LightDB 23.2 with lt_profile 0.3.7.
Table F.39. lt_stat_cost_1s
Columns
Column Type Description |
---|
Id of backend type of this sample |
Id of cost type of this sample |
Id of wait event type of this sample |
Id of wait event of this sample |
Total spent time captured on this sample for the event, in microsecond |
Sample statistic for DB Wait related cost time. Accumulated in 1 minute by default, the archive action would be triggered in function call "collect_activity_profile" which would be scheduled every single minute by a default setting. Available since LightDB 23.2 with lt_profile 0.3.7.
Table F.40. lt_stat_cost_60s
Columns
Column Type Description |
---|
Id of backend type of this sample |
Id of cost type of this sample |
Id of wait event type of this sample |
Id of wait event of this sample |
Total accumulated cost time captured on this archive cycle for the event, in microsecond |
Sample statistic for DB Wait related cost . Accumulated in 10 minutes by default, the archive action would be triggered in function call "take_sample" which would be scheduled every 10 minutes by a default setting . The rows in this table would be CASCADE deleted by deleting rows from samples table. Available since LightDB 23.2 with lt_profile 0.3.7.
Table F.41. lt_stat_cost_600s
Columns
Column Type Description |
---|
Server id |
Sample id |
Id of backend type of this sample |
Id of cost type of this sample |
Id of wait event type of this sample |
Id of wait event of this sample |
Total accumulated cost time captured on this sample for the event, in microsecond |
Although lt_profile is usually installed in the target cluster, it also can collect performance data from other clusters. Hence, we have prerequisites for lt_profile database, and for servers.
lt_profile extension depends on extensions plpgsql, dblink and ltfce.
Consider setting following Statistics Collector parameters:
track_activities = on track_counts = on track_io_timing = on track_functions = all/pl
To obtain statement statistics in reports, then database, mentioned in server connection string must have lt_stat_statements extension configured. Set lt_stat_statements parameters to meet your needs (see LightDB documentation):
lt_stat_statements.max - low setting for this parameter may cause some statements statistics to be wiped out before sample is taken. Report will warn you if your lt_stat_statements.max is seems to be undersized.
lt_stat_statements.track = 'top' - all value will affect accuracy of %Total fields for statements-related sections of a report.
The most easy way is to install everything in public schema of a database:
lightdb@postgres=# CREATE EXTENSION dblink; lightdb@postgres=# CREATE EXTENSION lt_stat_statements; lightdb@postgres=# CREATE EXTENSION lt_stat_activity; lightdb@postgres=# CREATE EXTENSION system_stats; lightdb@postgres=# CREATE SCHEMA IF NOT EXISTS lt_catalog; lightdb@postgres=# CREATE EXTENSION ltfce SCHEMA lt_catalog; lightdb@postgres=# CREATE EXTENSION lt_profile;
If you want to install lt_profile in other schema, just create it, and install extension in that schema:
lightdb@postgres=# CREATE EXTENSION dblink; lightdb@postgres=# CREATE EXTENSION lt_stat_statements; lightdb@postgres=# CREATE EXTENSION lt_stat_activity; lightdb@postgres=# CREATE EXTENSION system_stats; lightdb@postgres=# CREATE SCHEMA IF NOT EXISTS lt_catalog; lightdb@postgres=# CREATE EXTENSION ltfce SCHEMA lt_catalog; lightdb@postgres=# CREATE SCHEMA profile; lightdb@postgres=# CREATE EXTENSION lt_profile SCHEMA profile;
All objects will be created in schema, defined by SCHEMA clause. Installation in dedicated schema is the recommended way - the extension will create its own tables, views, sequences and functions. It is a good idea to keep them separate. If you don't want to specify schema qualifier when using module, consider changing search_path setting.
Using lt_profile with a superuser privileges does not have any issues, but if you want to avoid using superuser permissions, here is the guide:
Create an unprivileged user:
create role profile_usr login password 'pwd';
Create a schema for lt_profile installation:
create schema profile authorization profile_usr;
Grant usage permission on schema, where dblink extension resides:
grant usage on schema public to profile_usr;
Create the extension using profile_usr account:
lightdb@postgres=> create extension lt_profile schema profile;
Create a user for lt_profile to connect:
create role profile_mon login password 'pwd_mon';
Make sure this user have permissions to connect to any database in a cluster (by default, it is), and lt_hba.conf will permit such connection from lt_profile database host. Also, we need pg_read_all_stats privilege, and execute privilege on pg_stat_statements_reset:
grant pg_read_all_stats to profile_mon; grant execute on function pg_stat_statements_reset TO profile_mon;
Non-superusers can't establish connection without a password. Best way to provide password is using a password file.
N.B. lt_profile will connect to all databases on a server, thus password file must use a wildcard as a database name.
As an insecure way, you can provide a password in connection string:
select server_connstr('local','dbname=postgres port=5432 user=profile_mon password=pwd_mon');
You can define extension parameters in lightdb.conf. Default values:
lt_profile.topn = 10 - Number of top objects (statements, relations, etc.), to be reported in each sorted report table. Also, this parameter affects size of a sample - the more objects you want to appear in your report, the more objects we need to keep in a sample.
lt_profile.max_sample_age = 7 - Retention time of samples in days. Samples, aged lt_profile.max_sample_age days and more will be automatically deleted on next take_sample() call.
lt_profile.track_sample_timings = on - when this parameter is on, lt_profile will track detailed sample taking timings.
Once installed, extension will create one enabled local server - this is for cluster, where extension is installed.
Servers management functions:
create_server(server name, server_connstr text, server_enabled boolean = TRUE, max_sample_age integer = NULL, description text = NULL) - creates a new server description Function arguments:
server - server name (must be unique)
server_connstr - server connection string
enabled - server enable flag. When is set, server will be included in the common take_sample() call
max_sample_age - server sample retention parameter overrides global lt_profile.max_sample_age setting for this server
description - server description text. Will be included in reports
drop_server(server name) Drops a server and all its samples.
enable_server(server name) Includes server in common take_sample() call.
disable_server(server name) Excludes server from common take_sample() call.
rename_server(server name, new_name name) Renames a server.
set_server_max_sample_age(server name, max_sample_age integer) Set new retention period (in days) for a server. max_sample_age is integer value. To reset a server max_sample_age setting set it to NULL.
set_server_db_exclude(server name, exclude_db name[]) Set exclude databases list for a server. Used in cases, when you unable to connect to some databases in cluster (for example in Amazon RDS instances).
set_server_connstr(server name, new_connstr text) Set new connection string for a server.
set_server_description(server name, description text) Set new server description.
show_servers() Display existing servers.
Server creation example:
SELECT server_new('omega','host=name_or_ip dbname=postgres port=5432');
LightDB relation size functions may take considerable amount of time to collect sizes of all relations in a database. Also those functions require AccessExclusiveLock on a relation. However daily relation sizes collection may be quite enough for you. lt_profile can skip relation sizes collection while taking samples guided by server size collection policy. Policy is defined as a daily window when relation size collection is permitted, and a minimal gap between two samples with relation sizes collected. Thus when size collection policy is defined sample taking function will collect relation sizes only when sample is taken in a window and previous sample with sizes is older then gap. Function set_server_size_sampling defines this policy:
set_server_size_sampling(server name, window_start time with time zone = NULL, window_duration interval hour to second = NULL, sample_interval interval day to minute = NULL)
server - server name
window_start - size collection window start time
window_duration - size collection window duration
sample_interval - minimum time gap between two samples with relation sizes collected Size collection policy is defined only when all three parameters are set.
Example:
SELECT set_server_size_sampling('local','23:00+03',interval '2 hour',interval '8 hour');
Function show_servers_size_sampling show defined sizes collection policy for all servers:
lightdb@postgres=# SELECT * FROM show_servers_size_sampling(); server_name | window_start | window_end | window_duration | sample_interval -------------+--------------+-------------+-----------------+----------------- local | 23:00:00+03 | 01:00:00+03 | 02:00:00 | 08:00:00
When you build a report between samples either of which lacks relation sizes data then relation growth sections will be excluded from report. However, with_growth parameter of report generation functions will expand report bounds to nearest samples with relation sizes data collected. Relation sizes is needed for calculating sequentially scanned volume for tables and explicit vacuum load for indexes. When rare relation sizes collection is used, corresponding report sections data is based on linear interpolation.
Every sample contains statistic information about database workload since previous sample.
take_sample() Function take_sample() will collect a sample for all enabled servers. Server samples will be taken serially one by one. Function returns a table:
server name, result text, elapsed interval
Where:
server is a server name
result is a result of taken sample. It can be 'OK' if sample was taken successively, and will contain error text in case of exception
elapsed is a time elapsed taking a sample for server Such return makes it easy to control samples creation using SQL query.
take_sample_subset([sets_cnt integer], [current_set integer]) Due to serial samples processing take_sample() function can take considerable amount of time. Function take_sample_subset() will take samples for a subset of enabled servers. It is convenient for parallel samples collection. sets_cnt is number of servers subsets. current_set is a subset to process, taking values between 0 and sets_cnt - 1 inclusive. Function returns a table:
server name, result text, elapsed interval
Where:
server is a server name
result is a result of taken sample. It can be 'OK' if sample was taken successively, and will contain error text in case of exception
elapsed is a time elapsed taking a sample for server
take_sample(server name [, skip_sizes boolean]) Will collect a sample for specified server. Use it, for example, when you want to use different sampling frequencies, or if you want to take explicit sample on a server.
server - name of a server to take sample
skip_sizes - override server relation size collection policy. Policy applies only when skip_size argument is omitted or set to null. false value of skip_sizes argument cause take sample with relation sizes, while true will cause skipping relation sizes collection during a sample.
show_samples([server name,] [days integer]) Returns a table, containing existing samples of a server (local server assumed if server is omitted) for days last days (all existing samples if omitted):
sample integer, sample_time timestamp (0) with time zone, dbstats_reset timestamp (0) with time zone, clustats_reset timestamp (0) with time zone, archstats_reset timestamp (0) with time zone
Where:
sample is a sample identifier
sample_time is a time when this sample was taken
dbstats_reset, clustats_reset and archstats_reset is usual null, but will contain pg_stat_database, pg_stat_bgwriter and pg_stat_archiver statistics reset timestamp if it was happend since previous sample Sample-collecting functions are also supports the server repository - it will delete obsolete samples and baselines with respect to retention policy.
You must create at least 2 samples to be able to build your first report between 1st and 2nd samples. Samples for all enabled servers are taken by calling take_sample() function. There is no need in frequent samples creation - usual essential frequency is one or two samples per hour. You can use cron-like tool to schedule samples creation. Example with 30 min period:
*/30 * * * * ltsql -c 'SELECT take_sample()' > /dev/null 2>&1
However, such call has no error checking on take_sample() function results. Consider using more smart usage of take_sample() function, providing results to your monitoring system.
Function will return 'OK' for all servers with successfully taken samples, and show error text for failed servers:
select * from take_sample(); server | result | elapsed -----------+-----------------------------------------------------------------------------+------------- ok_node | OK | 00:00:00.48 fail_node | could not establish connection +| 00:00:00 | SQL statement "SELECT dblink_connect('server_connection',server_connstr)" +| | PL/pgSQL function take_sample(integer) line 69 at PERFORM +| | PL/pgSQL function take_sample_subset(integer,integer) line 27 at assignment+| | SQL function "take_sample" statement 1 +| | FATAL: database "nodb" does not exist | (2 rows)
We can't store sample data forever, thus we have a retention policy. You can define retentions on three levels:
Setting parameter lt_profile.max_sample_age in lightdb.conf file. This is a common retention, it is effective if none of others is defined.
Define server max_sample_age setting while creating a server, or using set_server_max_sample_age() function for existing server. This setting overrides global lt_profile.max_sample_age setting for a specific server.
Create a baseline (see below). Baseline will override retention period for included samples with highest priority.
Use show_samples() function to get list of existing samples in the repository. This function will show detected statistics reset times.
lt_profile will collect detailed sample taking timing statistics when parameter lt_profile.track_sample_timings is on. Results can be obtained from v_sample_timings view. v_sample_timings fields description:
server_name - sampled server name
sample_id - sample identifier
sample_time - when the sample was taken
event - sample taking stage
time_spent - amount of time spent in the event
Event descriptions:
total - Taking the sample (all stages).
connect - Making dblink connection to the server.
get server environment - Getting server GUC parameters, available extensions, etc.
collect database stats - Querying the pg_stat_database view for statistics on databases.
calculate database stats - Calculating differential statistics on databases since the previous sample.
collect tablespace stats - Querying the pg_tablespace view for statistics on tablespaces.
collect statement stats - Collecting statistics on statements using the lt_stat_statements extension.
query pg_stat_bgwriter - Collecting cluster statistics using the pg_stat_bgwriter view.
query pg_stat_archiver - Collecting cluster statistics using the pg_stat_archiver view.
collect object stats - Collecting statistics on database objects. Includes following events:
db:dbname collect tables stats - Collecting statistics on tables for the dbname database.
db:dbname collect indexes stats - Collecting statistics on indexes for the dbname database.
db:dbname collect functions stats - Collecting statistics on functions for the dbname database.
maintain repository - Executing support routines.
calculate tablespace stats - Calculating differential statistics on tablespaces.
calculate object stats - Calculating differential statistics on database objects. Includes following events:
calculate tables stats - Calculating differential statistics on tables of all databases.
calculate indexes stats - Calculating differential statistics on indexes of all databases.
calculate functions stats - Calculating differential statistics on functions of all databases.
calculate cluster stats - Calculating cluster differential statistics.
calculate archiver stats - Calculating archiever differential statistics.
delete obsolete samples - Deleting obsolete baselines and samples.
Baseline is a named sample sequence, having its own retention setting. Baseline can be used in report-building functions as a sample interval. Undefined baseline retention means infinite retention. You can use baselines to save information about database workload on certain time period. For example, you may want to save samples, gathered during load testing, or during regular load on your system for further reference. Baseline management functions:
create_baseline([server name,] baseline_name varchar(25), start_id integer, end_id integer [, days integer]) - create a baseline
server - server name. local server is assumed if omitted
name - baseline name. Each baseline must have unique name within a server.
start_id, end_id - first and last samples, included in baseline.
days - baseline retention time. Defined in integer days since now(). This parameter may be omitted (or set to null), meaning infinite retention.
create_baseline([server name,] baseline_name varchar(25), time_range tstzrange [, days integer]) - create a baseline
server - server name. local server is assumed if omitted
name - baseline name. Each baseline must have unique name within a server.
time_range - baseline time interval. Baseline will include all available samples, overlapping this interval.
days - baseline retention time. Defined in integer days since now(). This parameter may be omitted (or be set to null), meaning infinite retention.
drop_baseline([server name,] name varchar(25)) - drop a baseline
server - server name. local server is assumed if omitted
name - baseline name to drop. Dropping a baseline does not mean immediate drop of all its samples, they are just excluded from baseline, thus is not more covered with baseline retention.
keep_baseline([server name,] name varchar(25) [, days integer]) - change retention of baselines
server - server name. local server is assumed if omitted
name - baseline name. This parameter may be omitted (or be set to null) to change retention of all existing baselines.
days - retention time of a baseline in days since now(). Also, may be omitted (or set to null) to set infinite retention.
show_baselines([server name]) - displays existing baselines. Call this function to get information about existing baselines (names, sample intervals, and retention times)
server - server name. local server is assumed if omitted
lightdb@postgres=# SELECT * FROM baseline_show('local');
Collected samples can be exported from instance of lt_profile extension and than loaded into another one. This feature is useful when you want to move servers from one instance to another, or when you need to send collected data to your support team.
Data is exported as a regular table by function export_data(). You can use any method to export this table from your database. For example, you can use copy command of ltsql to obtain single .csv file:
lightdb@postgres=# \copy (select * from export_data()) to 'export.csv'
By default export_data() function will export all samples of all configured servers. However you can limit export to only one server, and further limit sample range too:
export_data([server name, [min_sample_id integer,] [max_sample_id integer]] [, obfuscate_queries boolean]) - export collected data
server is a server name. All servers is assumed if omitted
min_sample_id and max_sample_id - export bounding sample identifiers (inclusive). Null value of min_sample_id bound cause export of all samples till max_sample_id, and null value of max_sample_id cause export of all samples since min_sample_id.
obfuscate_queries - use this parameter only when you want to hide query texts - they will be exported as MD5 hash.
Data can be imported from local table only, thus previously exported data is need to be loaded first. In our case with copy command:
lightdb@postgres=# CREATE TABLE import (section_id bigint, row_data json); CREATE TABLE lightdb@postgres=# \copy import from 'export.csv' COPY 6437
Now we can perform data import, providing this table to import_data() function:
lightdb@postgres=# SELECT * FROM import_data('import');
Although server descriptions is also imported, your local lt_profile servers with matching names will prohibit import operations. Consider temporarily rename those servers. If you'll need to import new data for previously imported servers, they will be matched by system identifier, so fell free to rename imported sever as you wish. All servers are imported in disabled state. import_data() function takes only the imported table:
import_data(data regclass)
data - table containing exported data This function returns number of rows actually loaded in extension tables. After successful import operation you can drop import table.
Reports are generated in HTML markup by reporting functions, which containing statistical information about instance workload during report interval.
get_pwr([server name,] start_id integer, end_id integer [, description text [, with_growth boolean]]) - generate PWR report by sample identifiers
get_pwr([server name,] time_range tstzrange [, description text [, with_growth boolean]]) - generate PWR report for the shortest sample interval, covering provided time_range.
get_pwr([server name], baseline varchar(25) [, description text [, with_growth boolean]]) - generate PWR report, using baseline as samples interval
get_pwr_latest([server name]) - generate PWR report for two latest samples Function arguments:
server - server name. local server is assumed if omitted
start_id - interval begin sample identifier
end_id - interval end sample identifier
time_range - time range (tstzrange type)
baseline - a baseline name
with_growth - a flag, requesting interval expansion to nearest bounds with relation growth data available. Default value is false
description - a text memo, it will be included in report as a report description
get_psh([server name,] start_id integer, end_id integer [, description text [, with_growth boolean]]) - generate PSH report by sample identifiers
get_psh([server name,] time_range tstzrange [, description text [, with_growth boolean]]) - generate PSH report for the shortest sample interval, covering provided time_range.
get_psh([server name], baseline varchar(25) [, description text [, with_growth boolean]]) - generate PSH report, using baseline as samples interval
get_psh_latest([server name]) - generate PSH report for two latest samples Function arguments:
server - server name. local server is assumed if omitted
start_id - interval begin sample identifier
end_id - interval end sample identifier
time_range - time range (tstzrange type)
baseline - a baseline name
with_growth - a flag, requesting interval expansion to nearest bounds with relation growth data available. Default value is false
description - a text memo, it will be included in report as a report description
Report generation example:
$ ltsql -Aqtc "SELECT get_pwr(480,482)" -o report_480_482.html
For any other server, use it's name:
$ ltsql -Aqtc "SELECT get_pwr('omega',12,14)" -o report_omega_12_14.html
Report generation using time ranges:
ltsql -Aqtc "select get_psh(tstzrange('2020-05-13 11:51:35+03','2020-05-13 11:52:18+03'))" -o report_range.html
Also, time ranges is useful for generating periodic reports. Let's build last 24-hour report:
ltsql -Aqtc "select get_psh(tstzrange(now() - interval '1 day',now()))" -o last24h_report.html
Now you can view report file using any web browser.
Report tables and their columns are described in this section.
Contains cpu usage statistics such as user cpu, system cpu, idle cpu and iowait at the begin and end of report.
%User - user cpu percent
%System - system cpu percent
%Idle - idle cpu percent
%IO Wait - io wait cpu percent
Contains memory usage statistics of LightDB at the begin and end of report.
Host memory (MB) - total physical memory of the host
Shared memory use (MB) - shared memory used in LightDB, which is used by all processes of a LightDB server
Local memory use (MB) - local memory used in LightDB, which is allocated by each backend process for its own use
Host memory used for shared memory+local memory (%) - shared memory and local memory as a percentage of total host memory
Contains per-database statistics during report interval, based on pg_stat_database view.
Database - database name
Transactions - database transaction statistics
Commits - number of committed transactions (xact_commit)
Rollbacks - number of rolled back transactions (xact_rollback)
Deadlocks - number of deadlocks detected (deadlocks)
Block statistics - database blocks read and hit statistics
Hit(%) - buffer cache hit ratio
Read - number of disk blocks read in this database (blks_read)
Hit - number of times disk blocks were found already in the buffer cache (blks_hit)
Tuples - tuples statistics section
Returned - number of returned tuples (tup_returned)
Fetched - number of fetched tuples (tup_fetched)
Inserted - inserted tuples count (tup_inserted)
Updated - number of updated tuples (tup_updated)
Deleted - number of deleted tuples (tup_deleted)
Temp files - temporary files statistics
Size - total amount of data written to temporary files by queries in this database (temp_bytes)
Files - number of temporary files created by queries in this database (temp_files)
Size - database size at the end of report interval (pg_database_size())
Growth - database growth during report interval (pg_database_size() difference)
Contains per-database aggregated total statistics of pg_stat_statements data (if lt_stat_statements extension was available during report interval)
Database - database name
Calls - total count of all statements executions (sum of calls)
Time (s) - time spent in seconds
Plan - time spent planning (sum of total_plan_time) - available since lt_stat_statements 1.8
Exec - time spent executing (sum of total_time or total_exec_time)
Read - time spent reading blocks (sum of blk_read_time)
Write - time spent writing blocks (sum of blk_write_time)
Trigger - time spent executing trigger functions
Fetched blocks - total blocks fetched from disk and buffer cache
Shared - total fetched shared blocks count (sum of shared_blks_read + shared_blks_hit)
Local - total fetched local blocks count (sum of local_blks_read + local_blks_hit)
Dirtied blocks - total blocks dirtied in database
Shared - total number of shared blocks dirtied in the database (sum of shared_blks_dirtied)
Local - total number of local blocks dirtied in the database (sum of local_blks_dirtied)
Temp blocks - blocks used for operations (like joins and sorts)
Read - blocks read (sum of temp_blks_read)
Write - blocks written (sum of temp_blks_written)
Local blocks - blocks used for temporary tables
Read - blocks read (sum of local_blks_read)
Write - blocks written (sum of local_blks_written)
Statements - total count of captured statements
WAL size - total amount of WAL generated by statements (sum of wal_bytes)
This table contains data from pg_stat_bgwriter view
Scheduled checkpoints - total number of checkpoints, completed on schedule due to checkpoint_timeout parameter (checkpoints_timed field)
Requested checkpoints - total number of other checkpoints: due to values of max_wal_size, archive_timeout and CHECKPOINT commands (checkpoints_req field)
Checkpoint write time (s) - total time spent writing checkpoints in seconds (checkpoint_write_time field)
Checkpoint sync time (s) - total time spent syncing checkpoints in seconds (checkpoint_sync_time field)
Checkpoints buffers written - total number of buffers, written by checkpointer (buffers_checkpoint field)
Background buffers written - total number of buffers, written by background writer process (buffers_clean field)
Backend buffers written - total number of buffers, written by backends (buffers_backend field)
Backend fsync count - total number of backend fsync calls (buffers_backend_fsync field)
Bgwriter interrupts (too many buffers) - total count of background writer interrupts due to reaching value of the bgwriter_lru_maxpages parameter.
Number of buffers allocated - total count of buffers allocated (buffers_alloc field)
WAL generated - total amount of WAL generated (based on pg_current_wal_lsn())
WAL segments archived - archived WAL segments count (based on archived_count of pg_stat_archiver view)
WAL segments archive failed - WAL segment archive failures count (based on failed_count of pg_stat_archiver view)
This table contains information about tablespaces sizes and growth:
Tablespace - tablespace name
Path - tablespace path
Size - tablespace size as it was at time of last sample in report interval
Growth - tablespace growth during report interval
This report section contains tables of top wait events statistics during report interval sorted by wait time.
IO - Various file read/write activities during the query of sql statements.
LW Lock - LW lock for a variable or LW lock acquire.
WALSync - Fsync XLOG file.
WALWrite - Write Pages to XLOG file.
LockManager - Used for the predicate locking target and lock shared hash tables to reduce contention.
WALInsert - LW lock, Acquire a WAL insertion lock, for inserting to WAL.
DataFileExtend - Add a block to the specified relation.
DataFileRead - Read the specified block from a relation.
DataFileTruncate - Truncate relation to specified number of blocks.
BufFileRead - Load some data into buffer.
ProcArray - LW lock for memory access across multi-process, such as ProcGlobal, ShmemVariableCache, MyPgXact, procArray.
RelationMapRead - load data from the shared or local map file.
SLRUSync - Physical write of a page from a buffer slot.
SLRUWrite - Physical flush of a page from a buffer slot after writing.
RegisterSyncRequest - italic">RegisterSyncRequest.
XidGen - LW lock, Set and return the next transaction ID.
XactSLRU - LW lock, for xlog, clog related memory data.
This table contains top lt_profile.topn wait event types sorted by wait time.
Wait Event Type - Type of wait event, each type may contain several wait events.
Wait Time(s) - Total wait time for all wait events of this type.
%Total Wait Time - Total wait time of this type as a percentage of total wait time for all types of events in a cluster.
This table contains top lt_profile.topn wait events sorted by wait time.
Wait Event - The specific event that process waiting for.
Wait Event Type - Type of this event.
Wait Time(s) - Total wait time of all this event.
%Total Wait Time - Total wait time of all this event as a percentage of total wait time for all types of events in a cluster.
This report section contains tables of top statements during report interval sorted by several important statistics. Data is captured from pg_stat_statements view if it was available at the time of samples.
This table contains top lt_profile.topn statements sorted by elapsed time total_plan_time + total_exec_time of pg_stat_statements view.
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
%Total - total time of this statement as a percentage of total time of all statements in a cluster
Time (s) - time spent in this statement (in seconds)
Elapsed - total time, spent in this statement (total_plan_time + total_exec_time)
Plan - time spent in planning this statement (total_plan_time field)
Exec - time spent executing this query (total_exec_time field)
I/O time (s):
Read - time spent reading blocks (blk_read_time field)
Write - time spent writing blocks (blk_write_time field)
Plans - number of times the statement was planned (plans field)
Executions - number of times the statement was executed (calls field)
Top lt_profile.topn statements sorted by total_plan_time field of pg_stat_statements view
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
Plan elapsed (s) - time spent in planning this statement (total_plan_time field)
%Elapsed - plan time of this statement as a percentage of statement elapsed time
Plan times (ms) - detailed plan time statistics of this statement (in milliseconds)
Mean - mean time spent planning this statement (mean_plan_time field)
Min - minimum time spent planning this statement (min_plan_time field)
Max - maximum time spent planning this statement (max_plan_time field)
StdErr - population standard deviation of time spent planning this statement (stddev_plan_time field)
Plans - number of times this statement was planned (plans field)
Executions - number of times this statement was executed (calls field)
Top lt_profile.topn statements sorted by total_time (or total_exec_time) field of pg_stat_statements view
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
Exec (s) - time spent executing this statement (total_exec_time field)
%Elapsed - execution time of this statement as a percentage of statement elapsed time
%Total - exec time of this statement as a percentage of total elapsed time of all statements in a cluster
I/O time (s):
Read - time spent reading blocks (blk_read_time field)
Write - time spent writing blocks (blk_write_time field)
Rows - number of rows retrieved or affected by the statement (rows field)
Execution times (ms) - detailed execution time statistics of this statement (in milliseconds)
Mean - mean time spent executing this statement (mean_exec_time field)
Min - minimum time spent executing this statement (min_exec_time field)
Max - maximum time spent executing this statement (max_exec_time field)
StdErr - population standard deviation of time spent executing this statement (stddev_exec_time field)
Executions - number of times this statement was executed (calls field)
Top lt_profile.topn statements sorted by calls field of pg_stat_statements view
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
Executions - count of statement executions (calls field)
%Total - calls of this statement as a percentage of total calls of all statements in a cluster
Rows - number of rows retrieved or affected by the statement (rows field)
Mean(ms) - mean time spent in the statement, in milliseconds (mean_time or mean_exec_time field)
Min(ms) - minimum time spent in the statement, in milliseconds (min_time or min_exec_time field)
Max(ms) - maximum time spent in the statement, in milliseconds (max_time or max_exec_time field)
StdErr(ms) - population standard deviation of time spent in the statement, in milliseconds (stddev_time or stddev_exec_time field)
Elapsed(s) - amount of time spent executing this query, in seconds (total_time or total_exec_time field)
Top lt_profile.topn statements sorted by read and write time (blk_read_time + blk_write_time)
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
IO(s) - amount of time spent on reading and writing (I/O time) by this statement in seconds (blk_read_time + blk_write_time)
Read(s) - amount of time spent on reading by this statement in seconds (blk_read_time)
Write(s) - amount of time spent on writing by this statement in seconds (blk_write_time)
%Total - I/O time of this statement as a percentage of total I/O time for all statements in a cluster
Reads - number of blocks read by this statement divided in three sub-columns:
Shared - shared reads (shared_blks_read field)
Local - local reads (local_blks_read field)
Temp - temp reads (temp_blks_read field)
Writes - number of blocks written by this statement divided in three sub-columns:
Shared - shared writes (shared_blks_written field)
Local - local writes (local_blks_written field)
Temp - temp writes (temp_blks_written field)
Elapsed(s) - amount of time spent executing this query, in seconds (total_time or total_exec_time field)
Executions - number of executions for this statement (calls field)
Top lt_profile.topn statements sorted by read and hit blocks, helping to detect the most data processing statements.
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
Fetched - number of fetched blocks (expression: shared_blks_hit + shared_blks_read)
%Total - blocks fetched for this statement as a percentage of total blocks fetched for all statements in a cluster
Hits(%) - percentage of blocks got from buffers within all blocks got
Elapsed(s) - amount of time spent in this statement, in seconds (total_time or total_exec_time+total_plan_time field)
Rows - number of rows retrieved or affected by the statement (rows field)
Executions - number of executions for this statement (calls field)
Top lt_profile.topn statements sorted by shared reads, helping to detect most read intensive statements.
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
Reads - number of shared read blocks for this statement (shared_blks_read field)
%Total - shared reads for this statement as a percentage of total shared reads for all statements in a cluster
Hits(%) - percentage of blocks got from buffers within all blocks got
Elapsed(s) - amount of time spent in this statement, in seconds (total_time or total_exec_time+total_plan_time field)
Rows - number of rows retrieved or affected by the statement (rows field)
Executions - number of executions for this statement (calls field)
Top lt_profile.topn statements sorted by shared dirtied buffer count, helping to detect most data changing statements.
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
Dirtied - number of shared blocks dirtied by this statement (shared_blks_dirtied field)
%Total - shared blocks dirtied by this statement as a percentage of total shared blocks dirtied by all statements in a cluster
Hits(%) - percentage of blocks got from buffers within all blocks got
Elapsed(s) - amount of time spent in this statement, in seconds (total_time or total_exec_time+total_plan_time field)
Rows - number of rows retrieved or affected by the statement (rows field)
Executions - number of executions for this statement (calls field)
Top lt_profile.topn statements, which had to perform writes sorted by written blocks count.
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
Written - number of blocks written by this statement (shared_blks_written field)
%Total - number of blocks written by this statement as a percentage of total blocks written by all statements in a cluster
%BackendW - number of blocks written by this statement as a percentage of all blocks written in a cluster by backends (buffers_backend field of pg_stat_bgwriter view)
Hits(%) - percentage of blocks got from buffers within all blocks got
Elapsed(s) - amount of time spent in this statement, in seconds (total_time or total_exec_time+total_plan_time field)
Rows - number of rows retrieved or affected by the statement (rows field)
Executions - number of executions for this statement (calls field)
Top lt_profile.topn statements, sorted by WAL generated (available since lt_stat_statements v1.8)
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
WAL - amount of WAL generated by the statement (wal_bytes field)
%Total - amount of WAL generated by the statement as a percentage of total WAL generated in cluster (pg_current_wal_lsn() increment)
Dirtied - number of shared blocks dirtied by this statement (shared_blks_dirtied field)
WAL full page images - total number of WAL full page images generated by the statement ( wal_fpi field)
WAL records - total amount of WAL bytes generated by the statement (wal_bytes field)
Top lt_profile.topn statements sorted by temp I/O, calculated as the sum of temp_blks_read, temp_blks_written, local_blks_read and local_blks_written fields
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Database - Statement database name (derived from dbid field)
Local fetched - number of retrieved local blocks (expression: local_blks_hit + local_blks_read)
Hits(%) - percentage of local blocks got from temp buffers within all local blocks got
Local blocks - I/O statistics of blocks used in temporary tables
Write - number of written local blocks (local_blks_written)
%Total - local_blks_written of this statement as a percentage of total local_blks_written for all statements in a cluster
Read - number of read local blocks (local_blks_read)
%Total - local_blks_read of this statement as a percentage of total local_blks_read for all statements in a cluster
Temp blocks - I/O statistics of blocks used in operations (like sorts and joins)
Write - number of written temp blocks (temp_blks_written)
%Total - temp_blks_written of this statement as a percentage of total temp_blks_written for all statements in a cluster
Read - number of read local blocks (temp_blks_read)
%Total - temp_blks_read of this statement as a percentage of total temp_blks_read for all statements in a cluster
Elapsed(s) - amount of time spent in this statement, in seconds (total_time or total_exec_time+total_plan_time field)
Rows - number of rows retrieved or affected by the statement (rows field)
Executions - number of executions for this statement (calls field)
Query texts of all statements mentioned in report. You can use Query ID link in any statistic table to get there and see query text.
This section of report contains top database objects, using statistics from Statistics Collector views.
Top database tables sorted by estimated volume, read by sequential scans. Based on pg_stat_all_tables view. Here you can search for tables, possibly lacks some index on it.
Database - database name of the table
Tablespace - tablespace name, where the table is located
Schema - schema name of the table
Table - table name
~SeqBytes - estimated volume, read by sequential scans. Calculated as a sum of relation size multiplied by seq_scan for all samples of a report.
SeqScan - number of sequential scans performed on the table (seq_scan field)
IndexScan - number of index scans initiated on this table (idx_scan field)
IndexFetch - number of live rows fetched by index scans (idx_tup_fetch field)
Inserted - number of rows inserted (n_tup_ins field)
Updated - number of rows updated (including HOT) (n_tup_upd field)
Deleted - number of rows deleted (n_tup_del field)
HOT Updated - number of rows HOT updated (n_tup_hot_upd field)
Fetched block is a block being processed from disk (read), or from shared buffers (hit). Tables in this list are sorted by sum of fetched blocks for table relation, its indexes, TOAST of a table (if exists), and TOAST index (if exists). This section can focus your attention on tables with excessive blocks processing. Based on data of pg_statio_all_tables view.
Database - database name of the table
Tablespace - tablespace name, where the table is located
Schema - schema name of the table
Table - table name
Heap - statistics for relation blocks fetched (heap_blks_read + heap_blks_hit)
Index - statistics for all relation indexes blocks fetched (idx_blks_read + idx_blks_hit)
TOAST - statistics for TOAST-table blocks fetched (toast_blks_read + toast_blks_hit)
TOAST-Index - statistics for TOAST index blocks fetched (tidx_blks_read + tidx_blks_hit)
Each statistic field in this table is divided in two columns:
Blocks - number of blocks fetched for relation heap, index, TOAST or TOAST index
%Total - blocks fetched for relation heap, index, TOAST or TOAST index as a percentage of all blocks fetched in a whole cluster
Top tables sorted by block reads. Tables in this list are sorted by sum of block reads for table, its indexes, TOAST of a table (if exists), and TOAST index (if exists). This section can focus your attention on tables with excessive blocks reading. Based on data of pg_statio_all_tables view.
Database - database name of the table
Tablespace - tablespace name, where the table is located
Schema - schema name of the table
Table - table name
Heap - statistics for relation block reads (heap_blks_read)
Index - statistics for all relation indexes blocks reads (idx_blks_read)
TOAST - statistics for TOAST-table block reads (toast_blks_read)
TOAST-Index - statistics for TOAST index block reads (tidx_blks_read)
Each read statistic in this table is divided in two columns:
Blocks - number of block reads for relation heap, index, TOAST or TOAST index
%Total - block reads for relation heap, index, TOAST or TOAST index as a percentage of all block reads in a whole cluster
Top tables sorted by amount of DML-affected rows, i.e. sum of n_tup_ins, n_tup_upd and n_tup_del (including TOAST tables).
Database - database name of the table
Tablespace - tablespace name, where the table is located
Schema - schema name of the table
Table - table name
Inserted - number of rows inserted (n_tup_ins field)
Updated - number of rows updated (including HOT) (n_tup_upd field)
Deleted - number of rows deleted (n_tup_del field)
HOT Updated - number of rows HOT updated (n_tup_hot_upd field)
SeqScan - number of sequential scans performed on the table (seq_scan field)
SeqFetch - number of live rows fetched by sequential scans (seq_tup_read field)
IndexScan - number of index scans initiated on this table (idx_scan field)
IndexFetch - number of live rows fetched by index scans (idx_tup_fetch field)
Top tables sorted by amount of operations, causing autovacuum load, i.e. sum of n_tup_upd and n_tup_del (including TOAST tables). Consider fine-tune of vacuum-related parameters based on provided vacuum and analyze run statistics.
Database - database name of the table
Tablespace - tablespace name, where the table is located
Schema - schema name of the table
Table - table name
Updated - number of rows updated (including HOT) (n_tup_upd field)
HOT Updated - number of rows HOT updated (n_tup_hot_upd field)
Deleted - number of rows deleted (n_tup_del field)
Vacuum - number of times this table has been manually vacuumed (not counting VACUUM FULL) (vacuum_count field)
AutoVacuum - number of times this table has been vacuumed by the autovacuum daemon (autovacuum_count field)
Analyze - number of times this table has been manually analyzed (analyze_count field)
AutoAnalyze - number of times this table has been analyzed by the autovacuum daemon (autoanalyze_count field)
Top tables sorted by growth
Database - database name of the table
Tablespace - tablespace name, where the table is located
Schema - schema name of the table
Table - table name
Size - table size, as it was at the moment of last sample in report interval
Growth - table growth
Inserted - number of rows inserted (n_tup_ins field)
Updated - number of rows updated (including HOT) (n_tup_upd field)
Deleted - number of rows deleted (n_tup_del field)
HOT Updated - number of rows HOT updated (n_tup_hot_upd field)
Fetched block is a block being processed from disk (read), or from shared buffers (hit). Based on data of pg_statio_all_indexes view.
Database - database name of the index
Tablespace - tablespace name, where the index is located
Schema - schema name of the index
Table - table name
Index - index name
Scans - number of scans, performed on index (idx_scan field)
Blocks - blocks fetched from this index (idx_blks_read + idx_blks_hit)
%Total - blocks fetched for this index as a percentage of all blocks fetched in a whole cluster
Top indexes sorted by block reads. Based on data of pg_statio_all_indexes view.
Database - database name of the index
Tablespace - tablespace name, where the index is located
Schema - schema name of the index
Table - table name
Index - index name
Scans - number of scans, performed on index (idx_scan field)
Block Reads - number of disk blocks read from this index (idx_blks_read)
%Total - block reads from this index as a percentage of all block reads in a whole cluster
Hits(%) - percentage of index blocks got from buffer cache within all index blocks fetched for this index
Top indexes sorted by growth
Database - database name of the index
Tablespace - tablespace name, where the index is located
Schema - schema name of the index
Table - table name
Index - index name
Index - index statistics
Size - index size, as it was at the moment of last sample in report interval
Growth - index growth during report interval
Table - underlying table statistics
Inserted - number of rows inserted into underlying table (n_tup_ins field)
Updated - number of rows updated in underlying table (without HOT) (n_tup_upd - n_tup_hot_upd)
Deleted - number of rows deleted from underlying table (n_tup_del field)
Non-scanned indexes during report interval sorted by DML operations on underlying tables, causing index support. Constraint indexes are excluded.
Database - database name of the index
Tablespace - tablespace name, where the index is located
Schema - schema name of the index
Table - table name
Index - index name
Index - index statistics
Size - index size, as it was at the moment of last sample in report interval
Growth - index growth during report interval
Table - underlying table statistics
Inserted - number of rows inserted into underlying table (n_tup_ins field)
Updated - number of rows updated in underlying table (without HOT) (n_tup_upd - n_tup_hot_upd)
Deleted - number of rows deleted from underlying table (n_tup_del field)
This report section contains top functions in cluster, based on pg_stat_user_functions view.
Top functions sorted by time elapsed.
Database - database name of the function
Schema - schema name of the index
Function - function name
Executions - number of times this function has been called (calls field)
Time (s) - function timing statistics in seconds
Total - total time spent in this function and all other functions called by it (total_time field)
Self - total time spent in this function itself, not including other functions called by it (self_time field)
Mean - mean time of single function execution
Mean self - mean self time of single function execution
Top functions sorted by executions count.
Database - database name of the function
Schema - schema name of the index
Function - function name
Executions - number of times this function has been called (calls field)
Time (s) - function timing statistics in seconds
Total - total time spent in this function and all other functions called by it (total_time field)
Self - total time spent in this function itself, not including other functions called by it (self_time field)
Mean - mean time of single function execution
Mean self - mean self time of single function execution
Top trigger functions sorted by time elapsed.
Database - database name of the function
Schema - schema name of the index
Function - function name
Executions - number of times this function has been called (calls field)
Time (s) - function timing statistics in seconds
Total - total time spent in this function and all other functions called by it (total_time field)
Self - total time spent in this function itself, not including other functions called by it (self_time field)
Mean - mean time of single function execution
Mean self - mean self time of single function execution
Top tables sorted by vacuums (manual and automatic) processed
Database - database name of the table
Tablespace - tablespace name, where the table is located
Schema - schema name of the table
Table - table name
Vacuum count - number of times this table has been manually vacuumed (not counting VACUUM FULL) (vacuum_count field)
Autovacuum count - number of times this table has been vacuumed by the autovacuum daemon (autovacuum_count field)
Inserted - number of rows inserted (n_tup_ins field)
Updated - number of rows updated (including HOT) (n_tup_upd field)
Deleted - number of rows deleted (n_tup_del field)
HOT Updated - number of rows HOT updated (n_tup_hot_upd field)
Top tables sorted by analyze run (manual and automatic) count
Database - database name of the table
Tablespace - tablespace name, where the table is located
Schema - schema name of the table
Table - table name
Analyze count - number of times this table has been manually analyzed (analyze_count field)
Autoanalyze count - number of times this table has been analyzed by the autovacuum daemon (autoanalyze_count field)
Inserted - number of rows inserted (n_tup_ins field)
Updated - number of rows updated (including HOT) (n_tup_upd field)
Deleted - number of rows deleted (n_tup_del field)
HOT Updated - number of rows HOT updated (n_tup_hot_upd field)
This table provides estimation of implicit vacuum load caused by table indexes. Here is top indexes sorted by count of vacuums performed on underlying table multiplied by index size.
Database - database name of the index
Tablespace - tablespace name, where the index is located
Schema - schema name of the index
Table - table name
Index - index name
~Vacuum bytes - vacuum load estimation calculated as (vacuum_count + autovacuum_count) * index_size
Vacuum count - number of times this table has been manually vacuumed (not counting VACUUM FULL) (vacuum_count field)
Autovacuum count - number of times this table has been vacuumed by the autovacuum daemon (autovacuum_count field)
Index size - average index size during report interval
Relsize - average relation size during report interval
This section contains modified tables with last vacuum run. Statistics is valid for last sample in report interval. Based on pg_stat_all_tables view.
Top tables, sized 5 MB and more, sorted by dead tuples ratio.
Database - database name of the table
Schema - schema name of the table
Table - table name
Live - estimated number of live rows (n_live_tup)
Dead - estimated number of dead rows (n_dead_tup)
%Dead - dead rows of the table as a percentage of all rows in the table
Last AutoVacuum - last time when this table was vacuumed by the autovacuum daemon (last_autovacuum)
Size - table size, as it was at the moment of last report sample.
This section contains modified tables with last vacuum run. Statistics is valid for last sample in report interval. Based on pg_stat_all_tables view.
Top tables, sized 5 MB and more, sorted by modified tuples ratio.
Database - database name of the table
Schema - schema name of the table
Table - table name
Live - estimated number of live rows (n_live_tup)
Dead - estimated number of dead rows (n_dead_tup)
Mod - estimated number of rows modified since this table was last analyzed (n_mod_since_analyze)
%Mod - modified rows of the table as a percentage of all rows in the table
Last AutoAnalyze - last time when this table was analyzed by the autovacuum daemon
Size - table size, as it was at the moment of last report sample.
This section of a report contains LightDB GUC parameters, and values of functions version(), pg_postmaster_start_time(), pg_conf_load_time() and field system_identifier of pg_control_system() function during report interval.
Setting - name of a parameter
reset_val - reset_val field of pg_settings view. Bold font is used to show settings, changed during report interval.
Unit - setting unit
Source - configuration file, where this setting was defined, line number after semicolon.
Notes - This field will contain timestamp of a sample, when this value was observed first time during report interval.
This section of a report contains several important linux kernel parameters.
Setting - name of a parameter
Value - value of a parameter.
Source - configuration file, where this setting was defined.
LightDB collects execution statistics after execution is complete. If single execution of a statement lasts for several samples, it will affect statistics of only last sample (when it was completed). And you can't get statistics on still running statements. Also, maintenance processes like vacuum and checkpointer will update statistics only on completion.
Resetting any LightDB statistics may affect accuracy of a next sample.
Exclusive locks on relations conflicts with calculating relation size. Sample won't collect relation sizes of relations with AccessExclusiveLock held by any session. However, a session can aquire AccessExclusiveLock on relation during sample processing. To workaround this problem, lock_timeout is set to 3s, so if take_sample() function will be unable to acquire a lock for 3 seconds, it will fail, and no sample will be generated.
Report tables and their columns are described in this section.
This table contains top lt_profile.topn user wait event sorted by wait time.
Wait Event - The specific event that process waiting for.
Wait Event Type - Type of this event.
Wait Time(s) - Total wait time of all this event.
%Total Wait Time - Total wait time of all this event as a percentage of total wait time for all types of events in a cluster.
This table contains top lt_profile.topn background wait event sorted by wait time.
Wait Event - The specific event that process waiting for.
Wait Event Type - Type of this event.
Wait Time(s) - Total wait time of all this event.
%Total Wait Time - Total wait time of all this event as a percentage of total wait time for all types of events in a cluster.
Top wait events sorted by wait time for the top statements sorted by total_exec_time field of pg_stat_statements view.
Query ID - Query identifier as a hash of database, user and query text. Compatible with ltcenter utility. Native pg_stat_statements field qieryid in hexadecimal notation is shown in square brackets.
SQL Text - First 100 characters of the complete query.
Exec (s) - time spent executing this statement (total_exec_time field).
%Total - exec time of this statement as a percentage of total elapsed time of all statements in a cluster
Wait Event - The specific event that process waiting for.
Wait Time(s) - Total wait time of the event in this query.
Query texts of all statements mentioned in report. You can use Query ID link in any statistic table to get there and see query text.