DEV Community

Dmitry Romanoff
Dmitry Romanoff

Posted on

Automating PostgreSQL Table Monitoring with Bash

In the world of database management, monitoring the state of your databases is crucial for maintaining performance and ensuring data integrity. In this article, we'll explore a Bash script that automates the process of checking PostgreSQL database tables and their respective row counts. The script iterates through databases, collects relevant information, and populates a target table with this data. Let’s break down how it works and how you can leverage it for your PostgreSQL environments.

Why Monitor Database Tables?

Monitoring database tables serves several purposes:

  • Performance Optimization: Understanding row counts can help identify tables that may require optimization, such as indexing or partitioning.
  • Data Integrity: Regularly checking tables ensures that data is consistent and helps detect anomalies.
  • Capacity Planning: Knowing the growth of your tables aids in forecasting resource needs and scaling your infrastructure effectively.

Overview of the Script

This script is designed to connect to a PostgreSQL instance, retrieve databases that match a specific pattern, and gather statistics about their tables. The results are then saved in a designated reporting table. Here’s how the script is structured:

Key Components

  • Input Parameters: The script accepts various parameters including hostname, port, database pattern, offset, and limit.
  • Database Querying: It uses SQL queries to fetch the list of databases and their table statistics.
  • Reporting: The script generates SQL insert statements for populating a reporting table, ensuring that duplicates are avoided.

Script Code

#!/bin/bash

##########################################
#
# Check PostgreSQL DB tables
# and their row nums
#
# This scripts iterates databases and 
# populates some target db table with
# info about tables and their row nums
#
# Date: 09-Jun-2022
#
# Author: Dmitry
#
##########################################

helpFunction()
{
   echo ""
   echo "Usage: $0 -h hostname -p port -x db_pattern -a offset -b limit"
   echo -e "\t-h Postgres hostname"
   echo -e "\t-p Postgers port"
   echo -e "\t-x Postgres db pattern"
   echo -e "\t-a Report offset starting from 0"
   echo -e "\t-b Report limit"
   echo -e " "
   echo -e "Example how to run: $0 -h localhost -p 5432 -x % -a 0 -b 10 "
   echo -e " "
   exit 1 # Exit script after printing help
}

trim() {
    local var="$*"
    # remove leading whitespace characters
    var="${var#"${var%%[![:space:]]*}"}"
    # remove trailing whitespace characters
    var="${var%"${var##*[![:space:]]}"}"
    printf '%s' "$var"
}

while getopts "h:p:x:a:b:" opt
do
   case "$opt" in
      h ) inpHost="$OPTARG" ;;
      p ) inpPort="$OPTARG" ;;
      x ) inpDBPattern="$OPTARG" ;;
      a ) inpOffset="$OPTARG" ;;
      b ) inpLimit="$OPTARG" ;;
      ? ) helpFunction ;; # Print helpFunction in case parameter is non-existent
   esac
done

# Print helpFunction in case parameters are empty
if [ -z "$inpHost" ] || [ -z "$inpPort" ] || [ -z "$inpDBPattern" ] || [ -z "$inpOffset" ] || [ -z "$inpLimit" ] 
then
   echo "Some or all of the parameters are empty";
   helpFunction
fi

echo " " 
echo "Input parameters:" 
echo "---------------- " 
echo "inpHost=$inpHost" 
echo "inpPort=$inpPort" 
echo "inpDBPattern=$inpDBPattern"
echo "inpOffset=$inpOffset"
echo "inpLimit=$inpLimit"
echo "---------------- "

export dbs_to_process_lst="/tmp/dbs_to_process.tmp"
export dbs_not_processed_yet="/tmp/dbs_not_processed_yet.tmp"
export dbs_already_populated="/tmp/dbs_already_populated.tmp"

export report_name="/tmp/report_check_db_tables_and_their_row_nums.tmp"
export report_name_2="/tmp/report_check_db_tables_and_their_row_nums_processed.sql"

num_of_dbs_to_process=$(psql -h $inpHost -p $inpPort -t -c "select count(M.*) from (select datname from pg_database where datname like '${inpDBPattern}' order by 1 offset ${inpOffset} limit ${inpLimit}) M " postgres)
num_of_dbs_to_process=`trim $num_of_dbs_to_process`

export PGPASSWORD='password2'
psql -h $inpHost -p $inpPort -U postgres -d postgres -qtX << EOF > ${dbs_to_process_lst}
select M.datname from (select datname from pg_database where datname like '${inpDBPattern}' order by 1 offset ${inpOffset} limit ${inpLimit}) M
EOF

echo " "
echo " Number of DBs in the DB instance ($inpHost $inpPort): ${num_of_dbs_to_process}"
echo " "

export the_yyyymmdd=$(date '+%Y%m%d')
export hh24miss=$(date '+%H%M%S')

export PGPASSWORD='password2'
psql -h localhost -p 5432 -U my_report_db -d my_report_db -qtX << EOF > ${dbs_already_populated}
select distinct db from stat_for_db_tables_and_their_rownums where db_port='$inpPort' order by 1
EOF

grep -F -x -v -f ${dbs_already_populated} ${dbs_to_process_lst} > ${dbs_not_processed_yet}

n_of_dbs_not_processed_yet=`cat ${dbs_not_processed_yet} | wc -l`
n_of_dbs_not_processed_yet=`trim $n_of_dbs_not_processed_yet`

echo " How many DBs from the DB instance ($inpHost $inpPort) remain to be populated: $n_of_dbs_not_processed_yet"

cat ${dbs_not_processed_yet} | while read line || [[ -n $line ]];
do

DBs_to_process=$line

echo " "

export db=$DBs_to_process

echo "db: $db"

export PGPASSWORD='password1'
psql -h localhost -p $inpPort -U $db -d $db -qtX << EOF > ${report_name}
SELECT 
      '$the_yyyymmdd', 
      '$hh24miss', 
      '$inpHost', 
      '$inpPort', 
      '$db', 
      pgClass.relname, 
      to_char(pgClass.reltuples, '999999999999999999') row_nums
FROM pg_class pgClass
INNER JOIN pg_namespace pgNamespace ON (pgNamespace.oid = pgClass.relnamespace)
WHERE pgNamespace.nspname NOT IN ('pg_catalog', 'information_schema') AND pgClass.relkind='r'
ORDER BY 7 DESC;
EOF

echo "insert into stat_for_db_tables_and_their_rownums values " > ${report_name_2}
cat ${report_name} | awk -F "|" ' { if (length($1)>0) printf("(trim(\x27%s\x27), trim(\x27%s\x27), trim(\x27%s\x27), trim(\x27%s\x27), trim(\x27%s\x27), trim(\x27%s\x27), %s),\n", $1, $2, $3, $4, $5, $6, $7); } ' >> ${report_name_2}
cat ${report_name} | awk -F "|" ' NR==1 { if (length($1)>0) printf("(trim(\x27%s\x27), trim(\x27%s\x27), trim(\x27%s\x27), trim(\x27%s\x27), trim(\x27%s\x27), trim(\x27%s\x27), %s)\n", $1, $2, $3, $4, $5, $6, $7); } ' >> ${report_name_2}
echo " on conflict do nothing;" >> ${report_name_2}

n_of_recs_in_the_sql_file=`cat ${report_name_2} | wc -l`

if [ "$n_of_recs_in_the_sql_file" -gt "2" ]; then
export PGPASSWORD='password2'
psql -h localhost -p 5432 -U my_report_db -d my_report_db -f ${report_name_2}
fi

done

echo " "
echo "End"
echo " "
Enter fullscreen mode Exit fullscreen mode

Script Breakdown

  • Help Function: Displays usage information and guides users on how to run the script.
  • Input Handling: Utilizes getopts to handle command-line arguments, ensuring required parameters are provided.
  • Database Processing:
  • It counts and lists databases matching the specified pattern.
  • Filters out databases that have already been processed.
  • Data Extraction: For each database, it fetches the table names and row counts using a SQL query against the pg_class and pg_namespace system catalogs.
  • Report Generation: Formats the output into SQL insert statements, preparing them for execution against a reporting database.
  • Execution of SQL Statements: Finally, it executes the generated SQL to update the reporting table.

Example Usage

You can run the script with the following command, specifying your PostgreSQL connection details:

./monitor_db_tables.sh -h localhost -p 5432 -x '%' -a 0 -b 10
Enter fullscreen mode Exit fullscreen mode

-h: Hostname of the PostgreSQL server
-p: Port number
-x: Pattern for database names (e.g., % for all databases)
-a: Offset for pagination
-b: Limit of databases to process

Conclusion

This Bash script provides a straightforward solution for monitoring PostgreSQL tables and their row counts, helping database administrators keep track of their data. By automating this process, you can focus on other critical tasks while ensuring your databases are performing optimally.

Feel free to customize the script to fit your specific requirements, such as changing the reporting structure or adding additional metrics.

Top comments (0)