PostgreSQL 技术之家: 事先判断加索引是否有效的插件：HypoPG

1. 简介
2. 安装HypoPG
3. 使用

1. 简介

HypoPG是一个可以创建“虚拟/虚假”索引的插件，让你不用建立真实的索引，就可以让你事先判断出在一些字段上建索引是否有效的插件。

通常加索引是一个比较有效的优化方法，但是对于一些复杂的SQL，在表上加索引有没有效果是难以很快判断的，需要更多的一些知识，如需要了解数据分布、输入条件、列的唯一值等等情况来判断，而使用HypoPG则不需要这么复杂的判断了，通过HypoPG创建虚拟索引，然后看复杂SQL的执行计划就可以判断出建索引是否有效了。所以对于没有DBA的公司这个插件可以帮助开发人员快速确定是否该加索引，对于有DBA的公司也可以节省DBA的时间。

HypoPG的源代码放在github上：

https://github.com/HypoPG/hypopg

官方文档：

https://hypopg.readthedocs.io/en/latest/

我们下面就会详细演示HypoPG的使用。

2. 安装HypoPG

如果你的PostgreSQL是使用yum安装的官方的PostgreSQL版本，则可以直接用yum安装HypoPG这个插件：

yum install hypopg

如果从源码安装，则：

先使用git把源码从github上拉下来：

git clone https://github.com/dalibo/hypopg.git

查看版本标签：

[root@pg01 hypopg]# git tag
0.0.1
0.0.2
0.0.3
0.0.4
0.0.5
1.0.0
1.1.0

选择一个版本，通常选择最新版本，这里选择1.1.0:

git checkout 1.1.0

然后编译安装即可：

make
make install

3. 使用

在数据库创建插件：

CREATE EXTENSION hypopg;

建测试表：

postgres=# CREATE TABLE hypotest AS SELECT seq as id, ('osdba ' || seq)::text as t FROM generate_series(1,100000) as seq;
SELECT 100000

查看没有索引时全表扫描的代价：

postgres=# explain select * from hypotest where id=1228;
                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on hypotest  (cost=0.00..1791.00 rows=1 width=15)
   Filter: (id = 1228)
(2 rows)

建虚拟索引：

postgres=# \timing
Timing is on.
postgres=# SELECT hypopg_create_index('CREATE INDEX ON hypotest (id)');
       hypopg_create_index
----------------------------------
 (16635,<16635>btree_hypotest_id)
(1 row)
Time: 0.446 ms
postgres=#

因为虚拟索引是一个假的，建立时即不占用CPU，也会实际占用磁盘空间，故从上面可以看到很快就把虚拟索引建好了。

这时我们再用explain来查看执行计划：

postgres=# EXPLAIN SELECT * FROM hypotest WHERE id = 1228;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Index Scan using <16635>btree_hypotest_id on hypotest  (cost=0.04..8.06 rows=1 width=15)
   Index Cond: (id = 1228)
(2 rows)
Time: 0.350 ms

发现执行计划变成了走索引：

Index Scan using <16635>btree_hypotest_id on hypotest (cost=0.04..8.06 rows=1 width=15)

如果我们真实执行此条SQL（在explain中加analyze）：

postgres=# EXPLAIN ANALYZE SELECT * FROM hypotest WHERE id = 1228;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Seq Scan on hypotest  (cost=0.00..1791.00 rows=1 width=15) (actual time=0.145..6.661 rows=1 loops=1)
   Filter: (id = 1228)
   Rows Removed by Filter: 99999
 Planning time: 0.037 ms
 Execution time: 6.684 ms
(5 rows)
Time: 7.021 ms

从上面可以看到真实的执行还是走全表扫描的执行计划，这就是HypoPG的作用，HypoPG的虚拟索引只是影响非真实执行的explain的结果，而不会改变SQL执行时的真实执行计划。

HypgPG还提供了一些函数用于查看、删除虚拟索引等操作：

查看有哪些虚拟索引：

postgres=# SELECT * FROM hypopg_list_indexes();
 indexrelid |        indexname         | nspname | relname  | amname
------------+--------------------------+---------+----------+--------
      16635 | <16635>btree_hypotest_id | public  | hypotest | btree
(1 row)
Time: 3.310 ms

估算此真正建立后会有多大：

postgres=# SELECT indexname, pg_size_pretty(hypopg_relation_size(indexrelid)) FROM hypopg_list_indexes();
        indexname         | pg_size_pretty
--------------------------+----------------
 <16636>btree_hypotest_id | 2544 kB
(1 row)
Time: 1.029 ms

获得建索引的DDL语句：

postgres=# select hypopg_get_indexdef(16636);
               hypopg_get_indexdef
--------------------------------------------------
 CREATE INDEX ON public.hypotest USING btree (id)
(1 row)
Time: 0.294 ms

删除虚拟索引：

postgres=# select hypopg_drop_index(16635);
 hypopg_drop_index
-------------------
 t
(1 row)
Time: 0.409 ms
postgres=# SELECT * FROM hypopg_list_indexes();
 indexrelid | indexname | nspname | relname | amname
------------+-----------+---------+---------+--------
(0 rows)
Time: 0.601 ms

也可以调用hypopg_reset()函数把所有的虚拟索引都删除掉：

postgres=# select hypopg_reset();
 hypopg_reset
--------------
(1 row)

事先判断加索引是否有效的插件：HypoPG

1. 简介

2. 安装HypoPG

3. 使用

用户登录