PostgreSQL Tech Home (PostgreSQL 技术之家): Writing stored procedures in pl/lua for a 10x performance gain

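1. Why use pl/lua

pl/lua is an extension that lets you write PostgreSQL stored procedures and functions in Lua. Using Lua has the following benefits:

It can greatly improve stored procedure performance, especially for loops.
In PostgreSQL, arrays and JSON values are immutable: appending an element to an array or JSON object means copying the source object to build a new one, which is expensive. Lua tables are modified in place and do not have this problem.
Lua's syntax is more flexible.

In other words, when a stored procedure or function has to do compute-intensive work, plpgsql is relatively slow, while pl/lua was more than 10 times faster in the tests below. That is a considerable gain, so pllua is worth considering for such stored procedures.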

2. A look at performance first

2.1 Loop efficiency

We create two loop functions, one in plpgsql and one in pllua:

create or replace function f_pl01(cnt int) returns int language plpgsql as $$
declare
i int;
begin
 i:=0;
 LOOP
     i = i + 1;
     EXIT WHEN i >= cnt;
 END LOOP;
 return i;
end;
$$;

create function f_lua01(cnt int) returns int language pllua as $$
local i=0
while( i < cnt ) do
   i = i+1
end
return i
$$;

Run them and compare the execution times:

postgres=# \timing
Timing is on.
postgres=# select f_pl01(10000000);
  f_pl01
----------
 10000000
(1 row)
Time: 6482.846 ms (00:06.483)
postgres=# select f_lua01(10000000);
 f_lua01
----------
 10000000
(1 row)
Time: 556.831 ms

Looping 10 million times takes about 6.5 seconds in plpgsql but only 557 ms in pllua, nearly 12 times faster.

For comparison, here is the same function in plpython:

create or replace function f_py01(cnt int) returns int language plpython3u as $$
    i = 0
    while i < cnt:
        i = i + 1
    return i
$$;

Its execution time:

postgres=# select f_py01(10000000);
  f_py01
----------
 10000000
(1 row)
Time: 1008.750 ms (00:01.009)

Python takes about 1 second, which is also much faster than plpgsql.

2.2 Efficiency of appending elements to an array

We create two more functions:

create or replace function f_pl02(cnt int) returns int language plpgsql as $$
declare
i int;
myarr text[];
s1 text;
begin
 i:=0;
 s1 :=  lpad('', 2048, 'helloosdba');
 LOOP
     myarr := array_append(myarr, s1);
     i = i + 1;
     EXIT WHEN i >= cnt;
 END LOOP;
 return array_length(myarr, 1);
end;
$$;

create or replace function f_lua02(cnt int) returns int language pllua as $$
local i=0
local myarr = {}
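-- note: this string is 20480 characters long, while f_pl02 builds a 2048-character string with lpad
local s1 = string.rep('helloosdba', 2048)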
while( i < cnt ) do
   i = i+1
   myarr[i] = s1
end
return #myarr
$$;

postgres=# select f_pl02(100000);
 f_pl02
--------
 100000
(1 row)
Time: 756.772 ms
postgres=# select f_lua02(100000);
 f_lua02
---------
  100000
(1 row)
Time: 10.731 ms

That is a difference of more than 70 times. In plpgsql, every change to the array copies the source array, and as the array grows each copy takes longer and longer.
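
The growth in cost described above can be illustrated with a small model outside the database. The following Python sketch does not reproduce PostgreSQL internals; it only counts element copies under the assumption that every append to an immutable array copies all existing elements, and compares that with an in-place append. Both function names are hypothetical.

```python
def copies_with_immutable_append(n):
    """Count element copies when every append rebuilds the whole array."""
    copied = 0
    length = 0
    for _ in range(n):
        copied += length  # all existing elements are copied into the new array
        length += 1       # then the new element is added
    return copied


def copies_with_in_place_append(n):
    """Model an in-place append (Lua table, Python list): nothing is copied."""
    # Simplification: occasional reallocation as the container grows is ignored.
    return 0


if __name__ == "__main__":
    for n in (1000, 10000, 100000):
        print(n, copies_with_immutable_append(n), copies_with_in_place_append(n))
```

Under this model the number of copies grows with the square of the element count, which is why the gap widens as the array gets larger.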

With plpython:

create or replace function f_py02(cnt int) returns int language plpython3u as $$
    i = 0
    myarr = []
    s1 = 'helloosdba'*2048
    while i < cnt:
        i = i+1
        myarr.append(s1)
    return len(myarr)
$$;

Execution time:

postgres=# select f_py02(100000);
 f_py02
--------
 100000
(1 row)
Time: 23.459 ms

plpython is also fairly fast, though it takes about twice as long as pllua.
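
The ratios quoted in this section can be rechecked from the reported timings. This is a minimal Python sketch; the numbers are copied from the psql output above, and the helper name slowdown_vs is hypothetical.

```python
# Timings in milliseconds, copied from the psql output in sections 2.1 and 2.2.
loop_ms = {"plpgsql": 6482.846, "pllua": 556.831, "plpython3u": 1008.750}
append_ms = {"plpgsql": 756.772, "pllua": 10.731, "plpython3u": 23.459}


def slowdown_vs(baseline, timings):
    """Return how many times slower each language is than the baseline."""
    base = timings[baseline]
    return {lang: round(ms / base, 1) for lang, ms in timings.items()}


if __name__ == "__main__":
    print("loop:", slowdown_vs("pllua", loop_ms))
    print("array append:", slowdown_vs("pllua", append_ms))
```

These are single runs on one machine, so the ratios are best read as rough indications rather than precise measurements.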

3. Installation

Before installing pllua, install Lua 5.3.5. Download the source from http://www.lua.org/download.html and unpack it before building:

cd /usr/src
curl -R -O http://www.lua.org/ftp/lua-5.3.5.tar.gz
tar xvf lua-5.3.5.tar.gz

Lua's default build options do not include "-fPIC", which makes the later pllua build fail with:

/usr/bin/ld: /usr/local/lib/liblua.a(loadlib.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status

Go to the lua-5.3.5/src directory and edit the Makefile to add -fPIC to CFLAGS:

CFLAGS= -fPIC -O2 -Wall -Wextra -DLUA_COMPAT_5_2 $(SYSCFLAGS) $(MYCFLAGS)
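
If the edit has to be repeated on several machines, it can be scripted. The following Python sketch is one possible way to do it, not part of the Lua or pllua tooling: the helper add_fpic is hypothetical, and it assumes the Makefile has a line starting with "CFLAGS=" as shown above.

```python
import re


def add_fpic(makefile_text):
    """Return makefile_text with -fPIC added to the CFLAGS= line if missing."""
    def patch(match):
        line = match.group(0)
        if "-fPIC" in line.split():
            return line  # already patched, leave the line unchanged
        return re.sub(r"^CFLAGS=\s*", "CFLAGS= -fPIC ", line)

    # Only lines that start with "CFLAGS=" are touched (not MYCFLAGS= etc.).
    return re.sub(r"^CFLAGS=.*$", patch, makefile_text, flags=re.MULTILINE)


if __name__ == "__main__":
    sample = (
        "CC= gcc -std=gnu99\n"
        "CFLAGS= -O2 -Wall -Wextra -DLUA_COMPAT_5_2 $(SYSCFLAGS) $(MYCFLAGS)\n"
    )
    print(add_fpic(sample))
```

To apply it, read the text of src/Makefile, pass it through add_fpic, and write the result back. Running it a second time leaves the file unchanged.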

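Then return to the top-level lua-5.3.5 directory and build and install Lua. The Lua 5.3 build expects a platform target, so linux is assumed here:

cd ..
make linux install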

Build and install pllua:

cd /usr/src
git clone https://github.com/pllua/pllua-ng.git
cd pllua-ng
make PG_CONFIG=/usr/pgsql-11/bin/pg_config
make PG_CONFIG=/usr/pgsql-11/bin/pg_config install

The pllua extension can then be loaded in the database:

create extension pllua;

4. pl/lua resources

Documentation: https://pllua.github.io/pllua-ng/
Source code: https://github.com/pllua/pllua-ng
