Tool for dumping/exporting your postgres database's table data with custom masking rules.
pgfaker [options] <config>
Currently available options are
Options:
-o --output <outFile> STDOUT or name of the output sql file (default: "dump.sql")
-v --verbose <mode> verbosity - verbose | info | silent
-h, --help display help for command
You can also use this with npx
$ npx pgfaker config.js --output=STDOUT
-
--output
: Use it to either write to file or stdout.Examples:
$ npx pgfaker config.js --output=STDOUT | psql database_url $ npx pgfaker config.js --output=dump.sql
Defualt:
dump.sql
-
--verbose
: This is used to control the verbosity level of logging frompgfaker
silent
: No output at all, only runtime errors.info
: Basic information during the dump process such as current table, start and finish prompts etc.verbose
: All possible information during the dump process. This includes, everything frominfo
plus columns that were skipped, transformed and more.
Default:
verbose
The config.js
(or any js file) will contain the configuration for masking the data.
export const configuration = {
connectionUrl: "postgresql:https://USER:PASSWORD@HOST:PORT/DATABASE",
columns: {
columnName: (value) => /* do something */ return value
},
tables: {
tableName: {
columnName: (value) => /* do something */ return value
}
},
options:{
skip:{
tableName: 'output' | 'mask'
}
}
defaultTransformer: (value) => value,
};
-
connectionUrl
: The connection to your db -
columns
: The columns to replace values of. If a column doesn't exist it is ignored (currently) -
tables
: Use this to specify table specific column transformers. These transformers will take precedence overcolumns
transformers.export const configuration = { connectionUrl: 'postgresql:https://USER:PASSWORD@HOST:PORT/DATABASE', columns: { description: (value) => 'This will not be used', }, tables: { product: { description: (value) => 'This will be used', }, }, };
For
product
table,description
column will getThis will be used
, and all other tables havingdescription
column will getThis will not be used
-
defaultTransformer
: if this is specified, then all of the columns not mentioned in thecolumns
andtables
will get this transformer. If this isn't specified, all of the undefined columns are left unchanged.You can go ahead and use something like faker or your own custom maskers for complex use cases.
-
options
: Additional options for the parser-
skip
: This property defines which table to either exclude from being masked or from being written to the output. This option accepts either 'mask' or 'output'-
mask
options: { skip: { blogs: 'mask'; } }
will make sure that the dumped sql doesn't have data masked for blogs table. This will override any other masking rules specified for this table.
-
output
options: { skip: { users: 'output'; } }
will make sure that the dumped sql doesn't have any queries related to users table. This will override any other masking rules specified for this table.
-
-
If you need more flexibility with masking, such as conditional masking then you can use middleware.
Middleware in pgfaker allows you to access the record that is about to get transformed. You can modify the records here, or avoid the masking that is about to happen. The api looks like this
export const configuration = {
connectionUrl: 'postgresql:https://USER:PASSWORD@HOST:PORT/DATABASE',
columns: {
description: (value) => 'This will not be used',
},
tables: {
users: [checkIfAdmin, maskingRules],
},
};
Where checkIfAdmin
is your middleware function and maskingRules
is your regular object of transformers
-
The
checkIfAdmin
function will receive the column values as first parameter and column names as the secondsconst checkIfAdmin = (columnValues, columnNames) => { adminColumnIndex = columnNames.indexOf('admin'); isAdmin = columnValues[adminColumnIndex]; if (isAdmin === '1') { return null; } return columnValues; };
Returning null (or any falsy value) from this function will result in any masking defined (read next point) for it to be avoided.
-
The
maskingRules
is your regular object of transformers. This can be used if you want to regular way of defining transformers along with the middleware. Note that thecolumns
transformers are also used hereconst maskingRules = { users: { flag: (value) => 'off', }, };
Please refer and raise the issues on github.
I was in search of a anonymizer for postgres and came across pg-anonymizer
. This was not suitable for my requirement as it was lacking a flexible api and additional options, so I ended up creating my own version of anonymizer with inspiration from the package.