DEV Community

Dmitry Romanoff
Dmitry Romanoff

Posted on

Identifying Unused Indexes in PostgreSQL: A Bash Script Solution

Indexes are essential for optimizing query performance in relational databases, but not all indexes are created equal. Some may remain unused, consuming valuable disk space and potentially slowing down write operations. This article explores a Bash script designed to help PostgreSQL administrators identify and report unused indexes across multiple databases in a PostgreSQL instance.

The Importance of Monitoring Index Usage

Unused indexes can be a silent drain on your database's performance. They take up space, add overhead during data modification operations, and can complicate maintenance tasks. Regularly auditing your indexes can lead to better performance and more efficient use of resources.

Overview of the Script

The script, check_unused_indexes_for_PostgreSQL_db_instance.sh, connects to a PostgreSQL instance, iterates through its databases, and identifies indexes that have never been used. It produces a comprehensive report detailing these unused indexes, including their size.

Key Features:

  • Connects to any PostgreSQL instance using specified host and port.
  • Iterates through all user-defined databases, excluding system databases.
  • Runs a SQL query to check for unused indexes.
  • Generates a detailed output report, both in the terminal and as a trace file.

Getting Started

Prerequisites

Before running the script, ensure you have:

  • Access to a PostgreSQL instance with appropriate permissions.
  • psql command-line tool installed.

Usage

The script can be executed with the following command:

./check_unused_indexes_for_PostgreSQL_db_instance.sh -h <db_host> -p <db_port>
Enter fullscreen mode Exit fullscreen mode

For example:

./check_unused_indexes_for_PostgreSQL_db_instance.sh -h localhost -p 5432 > ./unused_indexes_$(date +'%Y%m%d_%H%M%S').trc
Enter fullscreen mode Exit fullscreen mode

Script

#!/bin/bash

##################################################################################################################################
#
# Name: check_unused_indexes_for_PostgreSQL_db_instance.sh
#
# Description: This script iterates all the DBs on the given PostgreSQL DB instance
#              and gets the detailed list of the unused DB indexes.
#
# Author: Dmitry
#
# Date: 02-Nov-2020
#
# Usage Example:
#
#     ./check_unused_indexes_for_PostgreSQL_db_instance.sh -h <db_host> -p <db_port>
#
####################################################################################################################################

current_date_time="$(date +'%Y%m%d_%H%M%S')" 

helpFunction()
{
   echo ""
   echo "Usage: $0 -h hostname -p port"
   echo -e "\t-h Postgres hostname"
   echo -e "\t-p Postgers port"
   echo -e " "
   echo -e "Example how to run: ./$0 -h localhost -p 5432 > ./unused_indexes_$current_date_time.trc"
   echo -e " "
   exit 1 # Exit script after printing help
}

while getopts "h:p:" opt
do
   case "$opt" in
      h ) inpHost="$OPTARG" ;;
      p ) inpPort="$OPTARG" ;;
      ? ) helpFunction ;; # Print helpFunction in case parameter is non-existent
   esac
done

# Print helpFunction in case parameters are empty
if [ -z "$inpHost" ] || [ -z "$inpPort" ]
then
   echo "Some or all of the parameters are empty";
   helpFunction
fi

report_name="Check unused indexes for PostgreSQL DB instance"

output_dir="./output_dir"

if [ ! -d $output_dir ]; then
  echo "The directory $output_dir should exist"
  exit 1
fi

output_trace_name="check_unused_indexes_for_pg_db_instance_${current_date_time}_port_${inpPort}.trc"

output_full_name="${output_dir}/${output_trace_name}"

touch ${output_full_name}

echo "------------------------------------------------------------------" 
echo "Report: ${report_name} " 
echo "------------------------------------------------------------------" 
echo "Postgres DB Instance: port [${inpPort}]" 
echo "Check Unused Indexes for pg DB instance"
echo "Timestamp: [${current_date_time}]" 
echo "------------------------------------------------------------------" 

echo " " >  ${output_full_name}
echo "------------------------------------------------------------------" >> ${output_full_name}
echo "Report: ${report_name} " >> ${output_full_name}
echo "------------------------------------------------------------------" >> ${output_full_name}
echo "Postgres DB Instance: port [${inpPort}]" >> ${output_full_name}
echo "Check Unused Indexes for pg DB instance" >> ${output_full_name}
echo "Timestamp: [${current_date_time}]" >> ${output_full_name}
echo "------------------------------------------------------------------" >> ${output_full_name}

echo " " 
echo "Input parameters:" 
echo "---------------- " 
echo "inpHost=$inpHost" 
echo "inpPort=$inpPort" 
echo "---------------- "

echo " " >> ${output_full_name}
echo "Input parameters:" >> ${output_full_name}
echo "---------------- " >> ${output_full_name}
echo "inpHost=$inpHost" >> ${output_full_name}
echo "inpPort=$inpPort" >> ${output_full_name}
echo "---------------- " >> ${output_full_name}

query_check_unused_DB_indexes="
SELECT s.schemaname,
    s.relname AS tablename,
    s.indexrelname AS indexname,
    pg_relation_size(s.indexrelid) AS index_size_bytes,
    round(pg_relation_size(s.indexrelid)/1024) AS index_size_Kbytes,
    round(pg_relation_size(s.indexrelid)/1024/1024) AS index_size_Mbytes
FROM pg_catalog.pg_stat_user_indexes s
  JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0   -- has never been scanned
 AND 0 <>ALL (i.indkey) -- no index column is an expression
 AND NOT i.indisunique  -- is not a UNIQUE index
 AND NOT EXISTS     -- does not enforce a constraint
     (SELECT 1 FROM pg_catalog.pg_constraint c
     WHERE c.conindid = s.indexrelid)
-- and s.relname='my_table' -- in case you want examine the unused indexes for the specific table
ORDER BY pg_relation_size(s.indexrelid) DESC;
"

query_DBs="
select M.* 
from 
  (select datname 
  from pg_database 
  where datname not in ('postgres', 'template0', 'template1')
  order by 1) M
"

query_num_of_DBs="
select count(1) 
from ($query_DBs) M
"

num_of_dbs_to_process=$(psql -h $inpHost -p $inpPort -t -c "$query_num_of_DBs" postgres)

echo " "
echo "------------------------------------------------------------------"
echo " Number of databases: ${num_of_dbs_to_process}"
echo "------------------------------------------------------------------"

echo " " >> ${output_full_name}
echo "------------------------------------------------------------------" >> ${output_full_name}
echo " Number of databases: ${num_of_dbs_to_process}" >> ${output_full_name}
echo "------------------------------------------------------------------" >> ${output_full_name}

idx=$(($idx + 1))

for DBs_to_process in $(psql -h $inpHost -p $inpPort -t -c "$query_DBs" postgres); do
echo " " 
echo "#${idx}:"
echo "${DBs_to_process}" 

echo " " >> ${output_full_name}
echo "#${idx}:" >> ${output_full_name}
echo "${DBs_to_process}" >> ${output_full_name}

echo " " 
echo " " >> ${output_full_name}

psql -h $inpHost -p $inpPort -U postgres -d $DBs_to_process -qtX << EOF >> ${output_full_name}

\x

$query_check_unused_DB_indexes

EOF


#### for terminal output

psql -h $inpHost -p $inpPort -U postgres -d $DBs_to_process -qtX << EOF 

\x

$query_check_unused_DB_indexes

EOF


echo "................................................................."

echo "................................................................." >> ${output_full_name}

idx=$(($idx + 1))

done

echo " "
echo " " >> ${output_full_name}

echo "                      --- The End ---                            " 
echo "                      --- The End ---                            " >> ${output_full_name}

echo " " 
echo " " >> ${output_full_name}

echo "Trace File: ${output_full_name}"
echo " "
Enter fullscreen mode Exit fullscreen mode

Script Breakdown

Let’s dive deeper into the key sections of the script.

Parameter Handling

The script uses getopts to handle command-line arguments for the database host and port. It provides a help function to guide users on proper usage:

helpFunction()
{
   echo "Usage: $0 -h hostname -p port"
   exit 1
}
Enter fullscreen mode Exit fullscreen mode

Database Querying

The script constructs a SQL query to find unused indexes:

SELECT s.schemaname,
    s.relname AS tablename,
    s.indexrelname AS indexname,
    pg_relation_size(s.indexrelid) AS index_size_bytes
FROM pg_catalog.pg_stat_user_indexes s
  JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0
 AND NOT i.indisunique
 AND NOT EXISTS
     (SELECT 1 FROM pg_catalog.pg_constraint c
     WHERE c.conindid = s.indexrelid)
ORDER BY pg_relation_size(s.indexrelid) DESC;
Enter fullscreen mode Exit fullscreen mode

This query retrieves details about indexes that have never been scanned, are not unique, and do not enforce constraints, making them candidates for removal.

Generating Reports

The script produces both console output and a trace file that records the findings:

touch ${output_full_name}
echo "------------------------------------------------------------------" > ${output_full_name}
echo "Report: ${report_name}" >> ${output_full_name}
Enter fullscreen mode Exit fullscreen mode

Running the Script

Once executed, the script outputs a summary of the databases processed and lists any unused indexes it finds. The final report helps database administrators make informed decisions about which indexes can be safely dropped.

Conclusion

Regularly auditing indexes in your PostgreSQL databases is crucial for maintaining optimal performance. The check_unused_indexes_for_PostgreSQL_db_instance.sh script provides an automated way to identify unused indexes, allowing you to streamline your database environment effectively.

For further enhancement, consider scheduling this script to run periodically or integrating it into your database maintenance routine. By keeping your indexes in check, you’ll ensure your PostgreSQL databases run smoothly and efficiently.

Whether you're managing a single database or multiple instances, keeping track of your indexes will lead to better performance and resource management.

Top comments (0)