Encryptor config #
This config defines the proactive security controls of AcraServer. AcraServer can load encryptor_config from several types of storage: HashiCorp Consul KV storage and the filesystem.
The storage type can be specified via the --encryptor_config_storage_type=<filesystem|consul> flag.
To use the filesystem as the storage type, specify the --encryptor_config=<path> CLI parameter or the encryptor_config key in the YAML configuration file for AcraServer. For HashiCorp Consul configuration, refer to the Consul configuration block.
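As a minimal sketch, filesystem storage could be selected in the AcraServer YAML configuration file as follows. The encryptor_config key is the one named above; the encryptor_config_storage_type key is an assumption that mirrors the CLI flag name, and the path is a placeholder.

```yaml
# AcraServer configuration file (sketch, not a complete config).
# Assumption: this YAML key mirrors the --encryptor_config_storage_type flag.
encryptor_config_storage_type: filesystem
# Placeholder path to the encryptor config file.
encryptor_config: /path/to/encryptor_config.yaml
```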
The following security controls can be configured:
- Transparent encryption
- Searchable encryption
- Masking
- Tokenization
Here is a full example of a configuration file with all options:
database_settings:
  mysql:
    # [optional] [default=false]
    case_sensitive_table_identifiers: true
defaults:
  # [optional] [default=acrablock]
  crypto_envelope: "<acrablock|acrastruct>"
  # [optional] [default=false]
  reencrypting_to_acrablocks: false
schemas:
  - table: example_table
    columns:
      - id
      - data_column
    encrypted:
      - column: data_column
        ##
        # General options
        ##
        # [optional] [conflicts_with=zone_id]
        client_id: "<string>"
        # [optional] [conflicts_with=client_id]
        zone_id: "<string>"
        # [optional]
        crypto_envelope: "<acrablock|acrastruct>"
        # [optional] [default=false]
        reencrypting_to_acrablocks: <false|true>
        # [optional] [conflicts_with=token_type|tokenized|consistent_tokenization]
        data_type: "<str|bytes|int32|int64>"
        # [optional] [conflicts_with=data_type|token_type|tokenized|consistent_tokenization]
        data_type_db_identifier: <uint>
        # [optional] [required_with=<data_type|data_type_db_identifier>]
        response_on_fail: "<ciphertext|default_value|error>"
        # [optional] [required_with=<data_type|data_type_db_identifier> and response_on_fail=default_value]
        # may be a string literal or a valid int32/int64 value
        default_data_value: "<string value>"
        ##
        # Tokenization
        ##
        # [optional] [required_with=tokenized (before 0.93.0)]
        token_type: "<email|str|bytes|int32|int64>"
        # [optional] [deprecated since 0.93.0] [default=false] [required_with=token_type]
        tokenized: true
        # [optional] [default=false] [required_with=token_type]
        consistent_tokenization: true
        ##
        # Masking
        ##
        # [optional] [required_with=plaintext_length | plaintext_side]
        masking: "xxxx"
        # [optional] [required_with=masking | plaintext_side]
        plaintext_length: 9
        # [optional] [required_with=masking | plaintext_length]
        plaintext_side: "<right|left>"
        ##
        # Searching
        ##
        # [optional] [default=false]
        searchable: true
The encryption configuration file has three top-level sections: database_settings, defaults, and schemas.
database_settings section #
This section tells Acra some details about database configuration so that they can properly interact.
mysql #
Contains MySQL/MariaDB-specific settings.
case_sensitive_table_identifiers #
Required: false
Type: bool
When set to false (default), Acra converts table names from SQL queries to lowercase before looking them up in schemas.
When set to true, Acra uses table names from SQL queries “as is”. A mismatch with the table name in the schemas list results in no encryption/decryption being performed, even if there is a similar table name that differs only in the case of some characters.
This option should match the database behavior so that Acra interprets SQL queries the same way the database does.
In short, in MySQL the case sensitivity of table names usually depends on the case sensitivity of the filesystem where data is stored, but it can also be affected by a configuration option.
Read more in the Identifier Case Sensitivity section of the MySQL docs.
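A minimal sketch of a case-sensitive setup, assuming a MySQL server that itself treats table names as case-sensitive. The table and column names are hypothetical.

```yaml
database_settings:
  mysql:
    # Compare table names from queries with schemas "as is".
    case_sensitive_table_identifiers: true
schemas:
  # Matches `... FROM Users`, but not `... FROM users`.
  - table: Users
    columns:
      - id
      - email
    encrypted:
      - column: email
```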
postgresql #
Reserved for future use. It is intended to hold PostgreSQL-specific settings and is currently empty.
defaults section #
The defaults section lets you specify common parameters for the whole configuration file (all tables and columns) so that they need not be repeated for each column. This section supports the crypto_envelope and reencrypting_to_acrablocks options.
defaults:
  crypto_envelope: "<acrablock|acrastruct>" # [optional] [default=acrablock]
  reencrypting_to_acrablocks: true # [optional] [default=true]
crypto_envelope #
Required: false
Type: string
Default value: acrablock
Description: selects which cryptographic algorithm and crypto envelope to use to encrypt fields. Don’t change it if you don’t know what it means. Supports: acrablock (default), acrastruct.
Read more about crypto envelopes, their purpose and difference.
reencrypting_to_acrablocks #
Required: false
Type: boolean
Default value: false
Description: turns on re-encryption of AcraStructs generated on the application side into AcraBlocks before passing them to the database. It replaces less performant AcraStructs with more performant AcraBlocks.
If the application generates AcraStructs, writes once and reads often, we suggest setting reencrypting_to_acrablocks: true.
This makes write operations slower, because AcraStructs generated on the application side are decrypted and then encrypted into AcraBlocks on the AcraServer side. But it significantly improves the performance of read operations, because decrypting the KEK with symmetric cryptography is faster than with asymmetric cryptography.
If the application doesn’t generate AcraStructs and relies only on transparent encryption on the AcraServer side, we suggest turning it off. AcraServer will then not try to recognize an AcraStruct on every AcraBlock matching failure, which slightly improves the performance of write operations.
Read more about crypto envelopes, their purpose and difference.
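For illustration, a defaults section for the write-once, read-often case described above might look like this. It is a sketch that uses only the two options documented in this section.

```yaml
defaults:
  crypto_envelope: acrablock
  # The application generates AcraStructs, writes once and reads often,
  # so AcraServer re-encrypts them into AcraBlocks on write.
  reencrypting_to_acrablocks: true
```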
schemas section #
This section defines the table schema and how AcraServer should process each column. It is an array of items:
schemas:
  - table: <string>
    columns: <array of strings>
    encrypted: <array of structured items>
  - table: <string>
    columns: <array of strings>
    encrypted: <array of structured items>
table section #
Required: true
Type: string
Description: the name of the table whose columns will be processed (encrypted, tokenized, etc.).
Keep in mind that AcraServer 0.93.0 and earlier processes the table name as a case-sensitive identifier. Thus, “emails” and “Emails” are processed as different tables.
columns section #
Required: false
Type: array of strings
Description: defines all of the table’s columns in the order in which they were declared in the database. AcraServer works as a fully transparent proxy and doesn’t have access to the database’s metadata with the tables' schemas.
It’s important to declare the columns in the correct order, as this is the only way to process queries like SELECT * FROM table1 or INSERT INTO table1 VALUES (...). For the first query, the correct order tells AcraServer how many and which columns to expect from the database, that is, how to expand the * value. For the second query, it makes it possible to support an INSERT command without explicitly declared columns, where the section between the table name table1 and VALUES is omitted, unlike INSERT INTO table1 (column1, column2) VALUES(<value1>, <value2>), which declares the columns and their order explicitly.
Without declared columns AcraServer will not support these kinds of queries.
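As a sketch, assume a hypothetical table1 created with the columns id, column1 and column2, in that order. Declaring them in the same order lets AcraServer expand SELECT * FROM table1 into three columns and map the third value of INSERT INTO table1 VALUES (...) to column2.

```yaml
schemas:
  - table: table1
    # Same order as in the table definition in the database.
    columns:
      - id
      - column1
      - column2
    encrypted:
      - column: column2
```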
encrypted #
This section declares security controls like encryption, searchable encryption, masking, and tokenization on a per-column basis for the specified table.
schemas:
  - table: <...>
    columns:
      - data_column
    encrypted:
      - column: data_column
        # General options
        client_id: "<string>" # [optional] [conflicts_with=zone_id]
        zone_id: "<string>" # [optional] [conflicts_with=client_id]
        crypto_envelope: "<acrablock|acrastruct>" # [optional]
        data_type: "<str|bytes|int32|int64>" # [optional] [conflicts_with=token_type|tokenized|consistent_tokenization]
        data_type_db_identifier: "<uint>" # [optional] [conflicts_with=data_type|token_type|tokenized|consistent_tokenization]
        response_on_fail: "<ciphertext|default_value|error>" # [optional] [required_with=<data_type|data_type_db_identifier>]
        default_data_value: "<string value>" # [optional] [required_with=<data_type|data_type_db_identifier>] may be string literal or valid int32/int64 yaml values
        # Tokenization
        token_type: "<int64|int32|str|bytes|email>" # [optional]
        tokenized: true # [deprecated] [default=false] [required_with=token_type]
        consistent_tokenization: true # [optional] [default=false] [required_with=token_type]
        # Masking
        masking: "xxxx" # [optional] [required_with=plaintext_length | plaintext_side]
        plaintext_length: 9 # [optional] [required_with=masking | plaintext_side]
        plaintext_side: "<right|left>" # [optional] [required_with=masking | plaintext_length]
        # Searching
        searchable: true # [optional] [default=false]
Required: true
Type: array of items
Description: defines which security controls to apply to the column’s data. Several options are possible: transparent encryption, masking, searchable encryption, tokenization.
This section allows configuring several groups of settings:
- common
- encryption
- searchable encryption
- masking
- tokenization
Options from the common group can be used together with options from other groups. Options from different non-common groups cannot be used together: for example, options for masking cannot be combined with options for tokenization or searchable encryption, and vice versa.
Some options belong to several groups (for example, data_type can be used for masking, encryption, and searchable encryption, but not for tokenization).
One column can be configured only with options from one group plus options from common.
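The grouping rule can be sketched as follows: each column entry combines options from one group with options from the common group. The table name, column names and ClientID value are hypothetical.

```yaml
schemas:
  - table: customers
    columns:
      - id
      - notes
      - ssn
      - card_number
      - email
    encrypted:
      # encryption group + common (client_id)
      - column: notes
        client_id: "client_a"
        data_type: str
      # searchable encryption group
      - column: ssn
        searchable: true
      # masking group + common (crypto_envelope)
      - column: card_number
        crypto_envelope: acrablock
        masking: "-XXXX-XXXX-XXXX"
        plaintext_length: 4
        plaintext_side: left
      # tokenization group
      - column: email
        token_type: email
        consistent_tokenization: true
```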
column #
Required: true
Type: string
Group: common
Description: the name of the column whose data should be processed. By default, AcraServer encrypts the column’s data using the crypto_envelope value from the defaults section as the envelope and the ClientID from TLS certificates or the --client_id CLI parameter.
If options for masking or tokenization are specified, they override the default encryption behaviour and are applied instead.
The default crypto_envelope also applies to masking.
Keep in mind that until 0.93.0 AcraServer processes the column name as a case-sensitive identifier.
client_id #
Required: false
Type: string
Group: common
Description: specifies which ClientID keys to use for encrypting/masking/tokenizing the column’s data. AcraServer will use encryption keys related to the specified ClientID. This option overrides the ClientID from the TLS certificate (default behavior) or the --client_id CLI parameter.
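A short sketch of the override, with hypothetical column names and a hypothetical ClientID value:

```yaml
# ...
encrypted:
  # Uses keys of the ClientID taken from the TLS certificate or --client_id.
  - column: comment
  # Always uses keys of the ClientID "analytics", whichever client is connected.
  - column: report
    client_id: "analytics"
```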
zone_id #
Zones are deprecated since 0.94.0, will be removed in 0.95.0.
Required: false
Type: string
Group: common
Description: specifies which static ZoneID keys to use for encrypting/masking/tokenizing the column’s data. AcraServer will use encryption keys related to the specified ZoneID even if --zonemode_enable=false is passed. This option overrides the ClientID from the TLS certificate or the --client_id CLI parameter.
crypto_envelope #
Required: false
Type: string
Values: acrablock, acrastruct
Group: common (doesn’t work for tokenization)
Description: configures which crypto envelope to use for transparent encryption or masking. It overrides the same option in the defaults section.
reencrypting_to_acrablocks #
Required: false
Type: boolean
Group: encryption, searchable encryption
Description: turns on re-encryption of AcraStructs generated on the application side into AcraBlocks before passing them to the database. It overrides the same option from the defaults section.
searchable #
Required: false
Type: boolean
Group: searchable encryption
Description: turns on searchable encryption, with the ability to run simple queries over encrypted data. The final ciphertext will store more data, with an additional hash value at the beginning. Searchable encryption supports the acrablock and acrastruct crypto envelopes. This option cannot be used together with masking- or tokenization-related options.
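A minimal sketch of a searchable column (the column name is hypothetical). Only searchable and the common crypto_envelope option are set; no masking or tokenization options appear on the same column.

```yaml
# ...
encrypted:
  - column: email
    searchable: true
    crypto_envelope: acrablock
```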
data_type #
Required: false
Type: string
Values: str, int64, int32, bytes
Group: encryption, searchable encryption, masking (int32, int64 not supported for masking)
Description: configures how AcraServer will replace the real type of data stored in the database with the application’s type. Encrypted fields are stored as blobs (binary data), but the application usually needs to work with integers or strings rather than blobs.
AcraServer allows changing the data type at the database protocol level. AcraServer will encode decrypted data to a type suitable for the application: Text, Integer, Varchar, etc.
data_type_db_identifier and data_type are interchangeable options.
How AcraServer maps types from the configuration file to DB-specific types:
Data type | PostgreSQL | MySQL |
---|---|---|
str | text (oid=25) | MYSQL_TYPE_STRING (0xfe) |
int32 | integer (oid=23) | MYSQL_TYPE_LONG (0x03) |
int64 | bigint (oid=20) | MYSQL_TYPE_LONGLONG (0x08) |
bytes | bytea (oid=17) | MYSQL_TYPE_BLOB (0xfc) |
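For example, a column holding encrypted integers could be declared as in the sketch below (the column name is hypothetical). According to the table above, decrypted values would then be presented to the application as integer in PostgreSQL or MYSQL_TYPE_LONG in MySQL.

```yaml
# ...
encrypted:
  - column: age
    data_type: int32
    # Return a DB-specific error if the value cannot be decrypted or decoded.
    response_on_fail: error
```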
data_type_db_identifier #
Required: false
Type: uint32
Values: supported DB type identifiers
Group: encryption, searchable encryption, masking (integer IDs not supported for masking)
Description: configures how AcraServer will replace the real type of data stored in the database with the application’s type. Because data is stored as blobs, AcraServer allows changing the type at the DB protocol level. After that, binary data will look like Text/Integer/Binary data types to the application. data_type_db_identifier and data_type are interchangeable options.
With data_type_db_identifier, AcraServer uses the identifier from the configuration file as the DB-specific type.
Currently, AcraServer supports the DB identifiers listed in the table for the data_type option.
Note: Acra Enterprise Edition supports a larger set of DB identifiers of concrete types for PostgreSQL and MySQL.
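A sketch for PostgreSQL using the identifier from the data_type table: 25 is the oid of text, so this entry is meant to be equivalent to data_type: str. The column name is hypothetical.

```yaml
# ...
encrypted:
  - column: full_name
    # PostgreSQL oid for text (see the data_type mapping table).
    data_type_db_identifier: 25
    response_on_fail: ciphertext
```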
response_on_fail #
Required: false
Type: ciphertext, default_value, error
Depends on: data_type or data_type_db_identifier
Group: encryption, searchable encryption, masking
Description: specifies which action should be performed when an operation fails (decryption error, wrong data type, etc.).
ciphertext means that the raw (possibly encrypted) data is returned to the client.
default_value requires default_data_value and specifies that a default value is returned instead.
error produces a DB-specific error, which can be handled on the client side. The message will look like encoding error in "<column_name>".
If not specified, the default value for response_on_fail is ciphertext, unless default_data_value is defined. In that case, response_on_fail implicitly becomes default_value:
# ...
encrypted:
  - column: implicit_ciphertext
    data_type: str
    # implicitly:
    # response_on_fail: ciphertext
  - column: implicit_default_value
    data_type: str
    default_data_value: some_string
    # default_data_value is defined so implicitly:
    # response_on_fail: default_value
There is one issue related to using the error option with transactions in PostgreSQL. See the Debugging and troubleshooting page for more details.
default_data_value #
Required: false
Type: string, integer, base64
Depends on: <data_type|data_type_db_identifier>, response_on_fail
Group: encryption, searchable encryption, masking (int32, int64 not supported for masking)
Description: configures the default value returned if data cannot be decrypted. Requires response_on_fail: default_value. The type of the value depends on data_type or data_type_db_identifier.
If response_on_fail is not defined and default_data_value is, then response_on_fail is implicitly set to default_value.
For the int32 and int64 types, the value should be an integer YAML literal, for example 123, -321, 0. The value should be within the proper range for the type: int32 has range [-2^31, 2^31-1], int64 has range [-2^63, 2^63-1].
Type str accepts string literals such as "string value".
Type bytes accepts string literals with base64 values. For example, to set the binary array {0, 1, 2, 3}, encode it to base64: "AAECAw==". AcraServer will decode base64 values and pass them as binary arrays according to the database’s protocol.
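The three kinds of literals can be sketched side by side. Column names and the chosen default values are hypothetical, except for "AAECAw==", which is the base64 example from the text above.

```yaml
# ...
encrypted:
  - column: age
    data_type: int32
    response_on_fail: default_value
    # Integer YAML literal within the int32 range.
    default_data_value: 0
  - column: name
    data_type: str
    response_on_fail: default_value
    default_data_value: "unknown"
  - column: raw_payload
    data_type: bytes
    response_on_fail: default_value
    # base64 encoding of the bytes {0, 1, 2, 3}.
    default_data_value: "AAECAw=="
```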
tokenized (deprecated since 0.93.0) #
Required: false
Type: boolean
Group: tokenization
Description: turns on tokenization for the column. Since Acra 0.93.0 this flag is deprecated. It’s enough to specify a token_type value for a column to enable tokenization for that column with the specified data type.
Read more about tokenization.
token_type #
Required: false
Type: string
Values: str, email, int64, int32, bytes, int32_string, int64_string
Depends on: nothing (since 0.93.0), previously tokenized (before 0.93.0)
Group: tokenization
Description: configures the type of tokens. Setting a data type enables tokenization for the column; Acra will create tokens of the specified data type. Before Acra 0.93.0, token_type worked only if the tokenized option was enabled. Since 0.93.0, token_type doesn’t require the tokenized flag at all, which simplifies the config file.
AcraServer intercepts values in queries from the application to the database, encrypts them, and stores them in a separate token database. It then generates a new random value to replace the encrypted one and passes it to the database. In the opposite direction, it intercepts the tokenized value, finds it in the token database, decrypts the source value, and returns it to the application if the application has permission for this value. Otherwise, AcraServer returns the tokenized value.
str - generates a new random string with the same length as the source value. To generate the string, Acra uses the following characters: [a-zA-Z0-9].
bytes - generates a new random string with the same length as the source value.
email - generates a new random email with the same length as the source value.
Acra randomly selects one of the predefined top-level domains (TLDs), generates a random string, and composes a new random email.
For example, suppose the input email is john.snow@got.com. This email is 17 characters long. AcraServer randomly chooses a TLD, for example .org. It then calculates the remaining length for the rest of the email: 17 - len('.org') = 13.
It generates a random 13-character string: Zii7ydUhDPXzV. Finally, it places the @ character in the middle of the string: Zii7yd@hDPXzV.org.
So Zii7yd@hDPXzV.org is what will eventually be stored in the database. When the owner of the data queries this value from the database, Acra will replace it with the source value john.snow@got.com. If someone else queries this data (with a different ClientID/ZoneID), Acra will return the tokenized value as is: Zii7yd@hDPXzV.org.
int64 - generates a new random 64-bit value.
int32 - generates a new random 32-bit value.
int64_string - generates a new random 64-bit value and passes it as a string literal.
int32_string - generates a new random 32-bit value and passes it as a string literal.
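As a sketch, token types are chosen per column. The column names below are hypothetical, and no tokenized flag is set, which matches the behaviour described for 0.93.0 and later.

```yaml
# ...
encrypted:
  # Token is a random email of the same length as the source value.
  - column: email
    token_type: email
  # Token is a random [a-zA-Z0-9] string of the same length.
  - column: phone
    token_type: str
  # Token is a random 64-bit value.
  - column: account_number
    token_type: int64
```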
consistent_tokenization #
Required: false
Type: boolean
Depends on: token_type (since 0.93.0), previously tokenized (before 0.93.0).
Group: tokenization
Description: turns on consistent tokenization. Consistent tokenization means that the same plaintext will ALWAYS result in the same token. When disabled, the same plaintext will result in different tokens.
By default (consistent_tokenization: false), Acra generates a new value for every input value. For example, if the application sends the query INSERT INTO table1 (age) values (25), Acra will generate a new random value (if token_type is int32 or int64) and replace 25 with it. If consistent_tokenization: true, Acra will generate one random output value per input value. For 25 it will generate, say, 76982 and reuse it every time for the input value 25.
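The difference can be sketched with two hypothetical columns of the same token type:

```yaml
# ...
encrypted:
  # Default: every inserted value gets a new token, even for repeated inputs.
  - column: age
    token_type: int32
  # Consistent: the same input value always maps to the same token.
  - column: department_id
    token_type: int32
    consistent_tokenization: true
```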
masking #
Required: false
Type: string
Depends on: plaintext_length, plaintext_side
Group: masking
Description: turns on masking and specifies the pattern that will replace the ciphertext. For example, the value 1234-5678-9123-4567, masked with plaintext_length: 4, plaintext_side: left and masking: "-XXXX-XXXX-XXXX", will look like 1234<ciphertext> in the database, where <ciphertext> is an AcraStruct or AcraBlock. If the user has access to the data, they will get 1234-5678-9123-4567 after unmasking.
If the user doesn’t have permission, they will get the masked value 1234-XXXX-XXXX-XXXX.
Read more about masking.
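The card number example above corresponds to a column entry like the following sketch (the column name is hypothetical):

```yaml
# ...
encrypted:
  - column: card_number
    # Shown instead of the ciphertext to users without access.
    masking: "-XXXX-XXXX-XXXX"
    # The first 4 characters stay in plaintext.
    plaintext_length: 4
    plaintext_side: left
```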
plaintext_length #
Required: false
Type: integer
Depends on: masking, plaintext_side
Group: masking
Description: configures the length of the plaintext that will be left untouched and not encrypted.
If plaintext_length >= len(data), then the whole data will be encrypted without leaving any plaintext.
plaintext_side #
Required: false
Type: string
Depends on: masking, plaintext_length
Group: masking
Description: configures which side of the plaintext will be left untouched, according to plaintext_length.
left - value will be stored as <plaintext[:plaintext_length]><ciphertext>, for example “helloXXXX”.
right - value will be stored as <ciphertext[:plaintext_length]><plaintext>, for example “XXXXl.com”.
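A sketch of the right-side case, matching the “XXXXl.com” example: the last five characters stay in plaintext and the rest is replaced by the pattern. The column name is hypothetical.

```yaml
# ...
encrypted:
  - column: email
    masking: "XXXX"
    # Keep the last 5 characters, e.g. "l.com", unencrypted.
    plaintext_length: 5
    plaintext_side: right
```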
Matrix of options compatibility #
Here is a matrix of all options supported in the encrypted section, showing which of them can be used together and which cannot.
— | client_id | zone_id | crypto_envelope | data_type[bytes] | data_type[str] | data_type[int32] | data_type[int64] | response_on_fail[ciphertext] | response_on_fail[default_value] | response_on_fail[error] | default_data_value | token_type | tokenized | consistent_tokenization | masking | plaintext_length | plaintext_side |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
client_id | + | - | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + |
zone_id | - | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + | + |
crypto_envelope | + | + | + | + | + | + | + | + | + | + | + | - | - | - | + | + | + |
data_type[bytes] | + | + | + | + | - | - | - | + | + | + | + | - | - | - | + | + | + |
data_type[str] | + | + | + | - | + | - | - | + | + | + | + | - | - | - | + | + | + |
data_type[int32] | + | + | + | - | - | + | - | + | + | + | + | - | - | - | - | - | - |
data_type[int64] | + | + | + | - | - | - | + | + | + | + | + | - | - | - | - | - | - |
response_on_fail[ciphertext] | + | + | + | + | + | + | + | + | - | - | - | - | - | - | - | - | - |
response_on_fail[default_value] | + | + | + | + | + | + | + | - | + | - | + | - | - | - | - | - | - |
response_on_fail[error] | + | + | + | + | + | + | + | - | - | + | - | - | - | - | - | - | - |
default_data_value | + | + | + | + | + | + | + | - | + | - | + | - | - | - | - | - | - |
token_type | + | + | - | - | - | - | - | - | - | - | - | + | + | + | - | - | - |
tokenized | + | + | - | - | - | - | - | - | - | - | - | + | + | + | - | - | - |
consistent_tokenization | + | + | - | - | - | - | - | - | - | - | - | + | + | + | - | - | - |
masking | + | + | + | + | + | - | - | - | - | - | - | - | - | - | + | + | + |
plaintext_length | + | + | + | + | + | - | - | - | - | - | - | - | - | - | + | + | + |
plaintext_side | + | + | + | + | + | - | - | - | - | - | - | - | - | - | + | + | + |
Table/column matching logic #
This section describes how AcraServer extracts table/column names from SQL queries to match them with values inside encryptor config. The behavior is slightly different depending on the database type, SQL query itself, and possibly some database configuration options.
Case-insensitive below means the identifier is converted to lowercase before comparing with values from encryptor config. Thus, encryptor config should contain lowercase version of column/table name for matching to work.
Case-sensitive means identifiers are compared with values from encryptor config “as is”. Encryptor config should contain exactly the same identifier as in database schema (of course, excluding quotes if there are any). Even the smallest mismatch will result in Acra skipping the table/column and forwarding values without encryption/decryption.
PostgreSQL #
- Raw identifiers are case-insensitive: table1, Table1, TABLE1 are all the same and need table1 in encryptor config; the same applies to columns.
- Identifiers wrapped in double quotes are case-sensitive: table1 and "table1" need table1 in encryptor config, while "Table1" and "TABLE1" need Table1 and TABLE1 respectively; the same applies to columns.
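The PostgreSQL rules can be sketched with two hypothetical tables, one referenced with raw identifiers and one referenced only with quoted, mixed-case identifiers:

```yaml
schemas:
  # Matches table1, Table1, TABLE1 and "table1" in queries.
  - table: table1
    columns:
      - id
      - data
    encrypted:
      - column: data
  # Matches only the quoted identifier "Table1".
  - table: Table1
    columns:
      - id
      - Data
    encrypted:
      # Matches only the quoted column "Data".
      - column: Data
```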
MySQL #
- Column identifiers are always case-insensitive: userID, userid, `userID` are all the same and need userid in encryptor config.
- Table names are case-insensitive by default; this can be changed with the case_sensitive_table_identifiers option mentioned above.
Backquotes do not affect case sensitivity.
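For MySQL with the default case-insensitive table matching, a sketch with hypothetical names would keep everything lowercase in the encryptor config:

```yaml
database_settings:
  mysql:
    case_sensitive_table_identifiers: false
schemas:
  # Users, users and `USERS` in queries all match this entry.
  - table: users
    columns:
      - id
      - userid
    encrypted:
      # userID, userid and `userID` in queries all match this entry.
      - column: userid
```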