Skip to content

An extension for PostgreSQL that allows embedded Haskell code.

License

Notifications You must be signed in to change notification settings

ed-o-saurus/PLHaskell

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

99 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PL/Haskell

This project is a "procedural language" extension of PostgreSQL allowing the execution of code in Haskell within SQL code. Despite the name, Haskell is, of course, not a procedural language as it is a functional language. However, "procedural language" is the term that PostgreSQL uses to describe languages that can be embedded in its SQL code.

The extension allows users, even unprivileged ones, to write, install, and run functions written in Haskell.

Copyright

Copyright (C) 2023 Edward F. Behn, Jr.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Contact

I wrote the project mostly for my own amusement and as a challenge. If you have any questions, comments, or issues, or if you find the project interesting or useful, please don't hesitate to contact me at [email protected].

Bugs and Features

Bug reports and feature requests should be submitted in the Github issues page.

Prerequisites

This document assumes that the reader is familiar with PostgreSQL (specifically procedural languages) and with the Haskell language.

Package Repositories

The easiest way to install the project is to use the RPM package repository for Fedora or the apt package repository for Ubuntu.

Fedora

Add the repository:

$> wget --quiet -O- https://ed-o-saurus.github.io/repos/plhaskell/fedora/plhaskell.repo | sudo tee /etc/yum.repos.d/plhaskell.repo > /dev/null

Update the repository information:

$> sudo dnf update

Install the package:

$> sudo dnf install plhaskell

Ubuntu

Add the signing key to the apt keys:

$> wget --quiet -O- https://ed-o-saurus.github.io/keys/A9DD4516.asc | gpg --dearmor | sudo tee /usr/share/keyrings/plhaskell-keyring.gpg > /dev/null

Add the repository:

$> echo deb \[signed-by=/usr/share/keyrings/plhaskell-keyring.gpg\] https://ed-o-saurus.github.io/repos/plhaskell/ubuntu/$(lsb_release -cs)/apt-repo stable main | sudo tee /etc/apt/sources.list.d/plhaskell.list > /dev/null

Update the repository information:

$> sudo apt update

Install the package:

$> sudo apt install plhaskell

Build and Install

This extension is intended to be built and installed in a Linux environment. It has only been tested on the x86-64 architecture.

Red Hat Package Management

The following should be done as a non-root user.

Install git, rpmdevtools, make, checkpolicy, policycoreutils, postgresql-server-devel, ghc-compiler, ghc-bytestring-devel, ghc-text-devel, and ghc-hint-devel:

$> sudo dnf install git rpmdevtools make selinux-policy-devel postgresql-server-devel ghc-compiler ghc-bytestring-devel ghc-text-devel ghc-hint-devel

Create the directories necessary to build the .rpm:

$> rpmdev-setuptree

Download the project code:

$> git clone https://github.com/ed-o-saurus/PLHaskell

Copy the .spec file to the ~/rpmbuild/SPECS directory:

$> cp PLHaskell/spec/plhaskell.spec ~/rpmbuild/SPECS

Copy the source to the ~/rpmbuild/SOURCES directory:

$> tar --exclude-vcs -czf ~/rpmbuild/SOURCES/PLHaskell.tar.gz PLHaskell

Build the .rpm package:

$> rpmbuild -bb ~/rpmbuild/SPECS/plhaskell.spec

Install the .rpm package written to the ~/rpmbuild/RPMS/<arch> directory on the target machine.

Debian Package Management

The following should be done as a non-root user.

Install git, devscripts, debhelper-compat, postgresql-server-dev-all ghc, libghc-hint-dev:

$> sudo apt install git devscripts debhelper-compat postgresql-server-dev-all ghc libghc-hint-dev

Download the project code:

$> git clone https://github.com/ed-o-saurus/PLHaskell

Build the .deb package:

$> cd PLHaskell

$> debuild --no-tgz-check

Install the .deb package written to the parent directory on the target machine.

Other

The alternative to building an .rpm or .deb package is to build and install the project manually.

The following are needed to build and install PL/Haskell:

  • PostgreSQL server
  • PostgreSQL development files
  • Glasgow Haskell Compiler (GHC)
  • The GHC Hint development module
  • libpq-devel

Build

Ensure that pg_config is available in the path. Run make to build all the files needed for the extension.

Install

To install, run sudo make install.

On each database that you wish to have the extension, run the SQL command CREATE EXTENSION plhaskell;.

Uninstall

To uninstall, run sudo make uninstall.

Security Enhanced Linux

Systems that use Security Enhanced Linux (SELinux) may encounter problems running the extension. This manifests as the ability to create functions and the inability to call them. Appropriate policies must be implemented. Full details are beyond the scope of this document.

A policy file designed to accommodate Red Hat based systems can be built by running make SELINUX=1. Upon installation, this file is saved to /usr/share/selinux/packages/plhaskell.pp. It is the user's responsibility to install the policy.

Testing

A test battery is provided in the tests.sql file in the root directory. It can be run with the command

psql -f tests.sql

If successful, it will terminate with the notice All tests passed.

Trust

PL/Haskell is a "trusted" language. The PostgreSQL manual explains:

The optional key word TRUSTED specifies that the language does not grant access to data that the user would not otherwise have. Trusted languages are designed for ordinary database users (those without superuser privilege) and allows them to safely create functions and procedures.

As such, unprivileged users are permitted to write and execute functions without the possibility that they will be able to access information or resources that are not allowed to access. This is accomplished by enforcing Haskell's strong type system.

Usage

To install the language in a database, run the SQL command CREATE EXTENSION plhaskell;.

Functions in PL/Haskell are created the same manner as other PostgreSQL functions. The code must be valid Haskell. It must import the PGm monad type from the PGutils module.

The code must contain a function with the same name as the PostgreSQL function. Each of its arguments must be of the type Maybe arg where arg is the Haskell type as determined by the PostgreSQL type as indicated by the following table.

PostgreSQL Type Module Haskell Type
bytea Data.ByteString ByteString
text Data.Text Text
char Prelude Char
bool Prelude Bool
smallint Data.Int Int16
integer Data.Int Int32
bigint Data.Int Int64
real Prelude Float
float Prelude Double

The function must return type PGm (Maybe result) where result is the appropriate Haskell type as determined by the return type of function.

In addition, functions can use composite types as arguments or return values provided that the composite types consist of elements that are listed in the table above or are themselves composite types. Composite values are represented as Haskell tuples.

Returning Sets

Functions can return sets of values by returning type PGm [Maybe result] where result is the appropriate Haskell type as determined by the return type of function.

Reporting Messages and Raising Error

To report a message or raise an error, use the function report :: ErrorLevel -> Text -> PGm (). ErrorLevel is any of the following:

Level
exception
warning
notice
info
log
debug1
debug2
debug3
debug4
debug5

In addition, the function raiseError :: Text -> a stops execution and raises an error.

Queries

Queries can be performed from inside PL/Haskell with the function query :: Text -> [QueryParam] -> PGm QueryResults. The first argument is the query itself. The second argument is a list of values to be substituted into the query using the expressions $n. This should be used rather than constructing a query string directly to prevent SQL injection attacks.

Queries in stable and immutable functions are executed in read-only mode, while volatile functions execute queries in read-write mode.

Any base type that can be passed to PL/Haskell functions can be used as a parameter by using the appropriate constructor for the type QueryParam. The constructors are the following:

QueryParam
QueryParamByteA (Maybe ByteString)
QueryParamText (Maybe Text)
QueryParamChar (Maybe Char)
QueryParamBool (Maybe Bool)
QueryParamInt2 (Maybe Int16)
QueryParamInt4 (Maybe Int32)
QueryParamInt8 (Maybe Int64)
QueryParamFloat4 (Maybe Float)
QueryParamFloat8 (Maybe Double)

The constructors for QueryResults are the following:

QueryResults
SelectResults Word64 [Text] [[QueryResultValue]]
SelectIntoResults Word64
InsertResults Word64
DeleteResults Word64
UpdateResults Word64
InsertReturningResults Word64 [Text] [[QueryResultValue]]
DeleteReturningResults Word64 [Text] [[QueryResultValue]]
UpdateReturningResults Word64 [Text] [[QueryResultValue]]
UtilityResults Word64
RewrittenResults Word64

The constructor indicates the type of query run. The Word64 field is the number of rows processed. The [Text] field is the names of the columns returned and the [[QueryResultValue]] is the data returned.

The constructors for QueryResultValue are the following:

QueryResultValue
QueryResultValueByteA (Maybe ByteString)
QueryResultValueText (Maybe Text)
QueryResultValueChar (Maybe Char)
QueryResultValueBool (Maybe Bool)
QueryResultValueInt2 (Maybe Int16)
QueryResultValueInt4 (Maybe Int32)
QueryResultValueInt8 (Maybe Int64)
QueryResultValueFloat4 (Maybe Float)
QueryResultValueFloat8 (Maybe Double)
QueryResultValueComposite (Maybe [QueryResultValue])

Maximum Memory

By default, the maximum memory that can be used by the Haskell runtime system is 128 MB. This can be changed by setting the plhaskell.max_memory variable in the postgresql.conf file.

Examples

Addition

CREATE FUNCTION add(int, int) RETURNS int IMMUTABLE AS
$$
    import PGutils (PGm)
    import Data.Int (Int32)

    add :: Maybe Int32 -> Maybe Int32 -> PGm (Maybe Int32)

    add Nothing Nothing = return Nothing
    add (Just a) (Just b) = return (Just (a+b))
    add a Nothing = return a
    add Nothing b = return b
$$
LANGUAGE plhaskell;

Fibonacci Sequence

CREATE FUNCTION fibonacci(int) RETURNS int IMMUTABLE AS
$$
    import PGutils (PGm)
    import Data.Int (Int32)

    fibonacci' :: Int32 -> Int32
    fibonacci' 0 = 0
    fibonacci' 1 = 1
    fibonacci' n = fibonacci' (n-2) + fibonacci' (n-1)

    fibonacci :: Maybe Int32 -> PGm (Maybe Int32)
    fibonacci Nothing = return Nothing
    fibonacci (Just n) = return (Just (fibonacci' n))
$$
LANGUAGE plhaskell;

Primes

This section shows how to return a set of results. The functions listed produce lists of prime numbers using the Sieve of Eratosthenes.

Composite type

The function in this section returns a set of composite results.

CREATE TYPE n_p AS (n int, p int);

CREATE FUNCTION primes(int) RETURNS SETOF n_p IMMUTABLE AS
$$
    import PGutils (PGm, raiseError)
    import Data.Int (Int32)

    sieve :: [Int32] -> [Int32]
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
    sieve [] = []

    primes :: Maybe Int32 -> PGm [Maybe (Maybe Int32, Maybe Int32)]
    primes Nothing = raiseError "Invalid Null"
    primes (Just n) = return (map Just (zip [Just i | i <- [1..n]] (map Just (sieve [2..]))))
$$
LANGUAGE plhaskell;

By running the query

SELECT *
FROM primes(10)

The following is produced

n p
1 2
2 3
3 5
4 7
5 11
6 13
7 17
8 19
9 23
10 29

Infinite list

The following function returns a infinite list of prime numbers

CREATE FUNCTION primes() RETURNS SETOF int IMMUTABLE AS
$$
    import PGutils (PGm)
    import Data.Int (Int32)

    sieve :: [Int32] -> [Int32]
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

    primes :: PGm [Maybe Int32]
    primes = return (map Just (sieve [2..]))
$$
LANGUAGE plhaskell;

To generate the first twenty-five prime numbers, run:

SELECT primes()
LIMIT 25
primes
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
59
61
67
71
73
79
83
89
97

Message

The following demonstrates how to show a notice from within a function.

CREATE FUNCTION forty_two() RETURNS int IMMUTABLE AS
$$
    import PGutils (PGm, report, notice)
    import Data.Int (Int32)

    forty_two :: PGm (Maybe Int32)
    forty_two = do
        report notice "Don't Panic"
        return (Just 42)
$$
LANGUAGE plhaskell;

Queries

The following deletes all elements of table t and returns the number of rows removed.

CREATE FUNCTION remove_all() RETURNS bigint VOLATILE AS
$$
    import PGutils (PGm, query, QueryResults (DeleteResults))
    import Data.Int (Int64)

    remove_all :: PGm (Maybe Int64)
    remove_all = do
        DeleteResults processed <- query "DELETE FROM t" []
        return (Just (fromIntegral processed))
$$
LANGUAGE plhaskell;

The following returns the last names of students with the passed first name.

DROP FUNCTION last_names(text);

CREATE FUNCTION last_names(text) RETURNS SETOF text IMMUTABLE AS
$$
    import PGutils (PGm, query, QueryResults (SelectResults), QueryParam (QueryParamText), QueryResultValue (QueryResultValueText))
    import Data.Text (Text)

    extract_text :: [QueryResultValue] -> Maybe Text
    extract_text [QueryResultValueText name] = name

    last_names :: Maybe Text -> PGm [Maybe Text]
    last_names first_name = do
        SelectResults _processed _header results <- query "SELECT last_name FROM students WHERE first_name = $1" [QueryParamText first_name]
        return (map extract_text results)
$$
LANGUAGE plhaskell;