Import users from files
This type of pipeline imports users from a file into the workspace's data warehouse. It is available only for source file storage connections.
Create pipeline
Create a source pipeline that imports users from a file.
Request
-
name
string Required
The pipeline's name.
Must be a non-empty string with a maximum of 60 characters. -
connection
int Required
The ID of the connection from which to read the users. It must be a source file storage connection.
-
target
string Required
The entity on which the pipeline operates. It must be "User" in order to create a pipeline that imports users.
Possible values: "User". -
enabled
boolean
Indicates if the pipeline is enabled once created.
-
format
string Required
The file format. It corresponds to the code of a file connector.
Possible values: "csv", "excel", "parquet" or "json". -
path
string Required
The file path relative to the root path defined in the file storage connection. Refer to the file storage connector documentation for details on the specific format.
Must be a non-empty string with a maximum of 1024 characters. -
sheet
string Conditionally Required
The sheet name. It can only be used with the "excel" format, where it is required.
When provided, it must have a length between 1 and 31 characters, must not start or end with a single quote ('), and cannot contain any of the following characters: *, /, :, ?, [, \, and ]. -
compression
string
The compression format of the file. It is empty if the file is not compressed.
Note that an Excel file is inherently compressed, so no compression format needs to be specified unless the file has been further compressed.
Possible values: "", "Zip", "Gzip" or "Snappy". -
formatSettings
nullable json
The specific settings of the pipeline, which vary based on the file connector specified in the format field. Please refer to the page that documents the settings for each connector type.
-
filter
nullable object
The filter applied to the users in the file. If it's not null, only the users that match the filter will be included.
See the filters documentation for more details.
-
filter.logical
string Required
The logical operator that combines the filter's conditions.
Possible values: "and" or "or". -
filter.conditions
array of object Required
The filter's conditions.
-
property
string Required
The name or path of the property. If the property has a json type, it can include a JSON path. -
operator
string Required
The condition's operator. The allowed values depend on the property's type.
Possible values: "is", "is not", "is less than", "is less than or equal to", "is greater than", "is greater than or equal to", "is between", "is not between", "contains", "does not contain", "is one of", "is not one of", "starts with", "ends with", "is before", "is on or before", "is after", "is on or after", "is true", "is false", "is empty", "is not empty", "is null", "is not null", "exists" or "does not exist". -
values
array of string
The values the operator applies to, if any. These depend on both the operator and the property's type, including whether they're present and how many there are.
-
-
-
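To illustrate how a filter narrows the imported users, here is a minimal, hypothetical sketch of the matching logic. The server-side evaluation is not documented here; the `matches` helper and the two operators shown ("is" and "is one of") are assumptions for illustration only.

```python
# Hypothetical sketch: evaluate a pipeline filter against file rows.
# Only "is" and "is one of" are implemented; the real operator set is larger.

def matches(filter_, row):
    """Return True if the row satisfies the filter."""
    results = []
    for cond in filter_["conditions"]:
        value = row.get(cond["property"])
        op = cond["operator"]
        if op == "is":
            results.append(value == cond["values"][0])
        elif op == "is one of":
            results.append(value in cond["values"])
        else:
            raise NotImplementedError(op)
    # "and" requires every condition to hold; "or" requires at least one.
    return all(results) if filter_["logical"] == "and" else any(results)

f = {"logical": "and",
     "conditions": [{"property": "country", "operator": "is", "values": ["US"]}]}
print(matches(f, {"country": "US", "email": "a@example.com"}))  # True
print(matches(f, {"country": "FR", "email": "b@example.com"}))  # False
```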
userIDColumn
string Required
The column in the file that uniquely identifies each user in the connection. It serves as the single, unique identifier for each user record, ensuring that each user can be distinctly referenced.
Only columns with types corresponding to the following Krenalis types can be used as an identity: string, int, uuid, and json.
Must be a non-empty string with a maximum of 1024 characters. -
updatedAtColumn
string
The column that stores the date when a user record was last updated. It tracks the most recent modification made to the user's data, helping to identify when changes occurred.
The value of this column is used for incremental imports, where only records that have been modified since the last import need to be processed.
Only columns with types corresponding to the following Krenalis types can be used as the update time: string, datetime, date, and json.
It cannot be longer than 1024 characters. -
updatedAtFormat
string Conditionally Required
The format of the value in the update time column. It can be set to "ISO8601" if the column value follows the ISO 8601 format. If format is "excel", it can also be set to "Excel". Otherwise, it should follow a format accepted by the Python strftime function.
This field is only required if updatedAtColumn is provided, is not empty, and has the type string or json.
Must be a non-empty string with a maximum of 64 characters. -
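For example, a custom strftime-style format can be supplied when the update time column does not follow ISO 8601. A quick way to check a format string before creating the pipeline (the sample value and format below are made up for illustration):

```python
from datetime import datetime

# Hypothetical sample value from an update time column, and the
# strftime/strptime format string you would pass as updatedAtFormat.
sample = "03/31/2025 14:05"
fmt = "%m/%d/%Y %H:%M"

# strptime raises ValueError if the format does not match the value.
parsed = datetime.strptime(sample, fmt)
print(parsed.isoformat())  # 2025-03-31T14:05:00
```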
incremental
boolean
Determines whether users are imported incrementally:
true: only users whose update time is equal to or later than the update time of the last imported user are imported.
false: all users are imported again, regardless of their update time. This is the default value.
If set to true, a column for the update time must be specified (i.e., updatedAtColumn is not null). -
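The incremental rule above can be sketched as a simple selection over the file's rows. This is an illustration of the documented behavior, not the actual import implementation:

```python
from datetime import datetime

# Hypothetical sketch: keep only rows whose update time is equal to or later
# than the last imported user's update time.
last_imported = datetime(2025, 3, 1)

rows = [
    {"email": "a@example.com", "updated_at": datetime(2025, 2, 27)},
    {"email": "b@example.com", "updated_at": datetime(2025, 3, 1)},
    {"email": "c@example.com", "updated_at": datetime(2025, 3, 4)},
]

to_import = [r for r in rows if r["updated_at"] >= last_imported]
print([r["email"] for r in to_import])  # ['b@example.com', 'c@example.com']
```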
transformation
object Conditionally Required
The mapping or function responsible for transforming file users into user identities linked to the pipeline. Once the identity resolution process is complete, the user identities associated with all pipelines are merged into unified users.
One of either a mapping or a function must be provided, but not both. The one that is not provided can be either missing or set to null.
-
transformation.function
nullable object Conditionally Required
The transformation function. A JavaScript or Python function that, given a user in the file, returns an identity.
-
transformation.function.source
string Required
The source code of the JavaScript or Python function.
Must be a non-empty string with a maximum of 50000 characters. -
transformation.function.language
string Required
The language of the function.
Possible values: "JavaScript" or "Python". -
transformation.function.preserveJSON
boolean
Specifies whether JSON values are passed to and returned from the function as strings, keeping their original format without any encoding or decoding.
-
transformation.function.inPaths
array of string Required
The paths of the properties that will be passed to the function. At least one path must be present.
-
transformation.function.outPaths
array of string Required
The paths of the properties that may be returned by the function. At least one path must be present.
-
-
-
inSchema
schema Required
The schema for the properties used in the filter, the identity column, the update time column, and the input properties for the transformation.
When importing users from files, this should be a subset of the file schema.
-
outSchema
schema Required
The schema for the output properties of the transformation.
When importing users from files, this should be a subset of the profile schema.
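As an illustration, here is a minimal Python transformation function consistent with the inPaths/outPaths shown in the example request below. The exact runtime contract beyond "given a user, return an identity" is an assumption; the normalization logic is purely illustrative:

```python
# Hypothetical transformation function: receives a user dict containing the
# properties listed in inPaths and returns an identity dict whose keys are a
# subset of outPaths.
def transform(user: dict) -> dict:
    return {
        "email": user["email"].strip().lower(),  # normalize the identity column
        "first_name": user.get("firstName"),
        "last_name": user.get("lastName"),
    }

print(transform({"email": " Jane@Example.com ", "firstName": "Jane", "lastName": "Doe"}))
# {'email': 'jane@example.com', 'first_name': 'Jane', 'last_name': 'Doe'}
```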
Response
-
id
intThe ID of the pipeline.
curl https://example.com/v1/pipelines \
  -H "Authorization: Bearer api_xxxxxxx" \
  --json '{
    "name": "Newsletter Subscribers",
    "connection": 230527183,
    "target": "User",
    "enabled": true,
    "format": "excel",
    "path": "subscribers.xlsx",
    "sheet": "Sheet1",
    "formatSettings": {
      "HasColumnNames": true
    },
    "filter": {
      "logical": "and",
      "conditions": [
        {
          "property": "country",
          "operator": "is",
          "values": ["US"]
        }
      ]
    },
    "userIDColumn": "email",
    "updatedAtColumn": "updated_at",
    "updatedAtFormat": "ISO8601",
    "incremental": true,
    "transformation": {
      "function": {
        "source": "def transform(user: dict) -> dict:\n\treturn {}\n",
        "language": "Python",
        "preserveJSON": false,
        "inPaths": ["email", "firstName", "lastName"],
        "outPaths": ["email", "first_name", "last_name"]
      }
    },
    "inSchema": {
      "kind": "object",
      "properties": [
        {"name": "email", "type": {"kind": "string"}},
        {"name": "firstName", "type": {"kind": "string"}},
        {"name": "lastName", "type": {"kind": "string"}},
        {"name": "country", "type": {"kind": "string"}},
        {"name": "updated_at", "type": {"kind": "string", "maxLength": 60}}
      ]
    },
    "outSchema": {
      "kind": "object",
      "properties": [
        {"name": "first_name", "type": {"kind": "string", "maxLength": 100}, "readOptional": true, "description": "First name"},
        {"name": "last_name", "type": {"kind": "string", "maxLength": 100}, "readOptional": true, "description": "Last name"},
        {"name": "email", "type": {"kind": "string", "maxLength": 254}, "readOptional": true, "description": "Email"}
      ]
    }
  }'
{
  "id": 285017124
}
Update pipeline
Update a source pipeline that imports users from a file.
Request
-
:id
int Required
The ID of the source file pipeline.
-
name
string Required
The pipeline's name.
Must be a non-empty string with a maximum of 60 characters. -
enabled
boolean
Indicates if the pipeline is enabled. Use the Set status endpoint to change only the pipeline's status.
-
format
string Required
The file format. It corresponds to the code of a file connector.
Possible values: "csv", "excel", "parquet" or "json". -
path
string Required
The file path relative to the root path defined in the file storage connection. Refer to the file storage connector documentation for details on the specific format.
Must be a non-empty string with a maximum of 1024 characters. -
sheet
string Conditionally Required
The sheet name. It can only be used with the "excel" format, where it is required.
When provided, it must have a length between 1 and 31 characters, must not start or end with a single quote ('), and cannot contain any of the following characters: *, /, :, ?, [, \, and ]. -
compression
string
The compression format of the file. It is empty if the file is not compressed.
Note that an Excel file is inherently compressed, so no compression format needs to be specified unless the file has been further compressed.
Possible values: "", "Zip", "Gzip" or "Snappy". -
formatSettings
nullable json
The specific settings of the pipeline, which vary based on the file connector specified in the format field. Please refer to the page that documents the settings for each connector type.
-
filter
nullable object
The filter applied to the users in the file. If it's not null, only the users that match the filter will be included.
See the filters documentation for more details.
-
filter.logical
string Required
The logical operator that combines the filter's conditions.
Possible values: "and" or "or". -
filter.conditions
array of object Required
The filter's conditions.
-
property
string Required
The name or path of the property. If the property has a json type, it can include a JSON path. -
operator
string Required
The condition's operator. The allowed values depend on the property's type.
Possible values: "is", "is not", "is less than", "is less than or equal to", "is greater than", "is greater than or equal to", "is between", "is not between", "contains", "does not contain", "is one of", "is not one of", "starts with", "ends with", "is before", "is on or before", "is after", "is on or after", "is true", "is false", "is empty", "is not empty", "is null", "is not null", "exists" or "does not exist". -
values
array of string
The values the operator applies to, if any. These depend on both the operator and the property's type, including whether they're present and how many there are.
-
-
-
userIDColumn
string Required
The column in the file that uniquely identifies each user in the connection. It serves as the single, unique identifier for each user record, ensuring that each user can be distinctly referenced.
Only columns with types corresponding to the following Krenalis types can be used as an identity: string, int, uuid, and json.
Must be a non-empty string with a maximum of 1024 characters. -
updatedAtColumn
string
The column that stores the date when a user record was last updated. It tracks the most recent modification made to the user's data, helping to identify when changes occurred.
The value of this column is used for incremental imports, where only records that have been modified since the last import need to be processed.
Only columns with types corresponding to the following Krenalis types can be used as the update time: string, datetime, date, and json.
It cannot be longer than 1024 characters. -
updatedAtFormat
string Conditionally Required
The format of the value in the update time column. It can be set to "ISO8601" if the column value follows the ISO 8601 format. If format is "excel", it can also be set to "Excel". Otherwise, it should follow a format accepted by the Python strftime function.
This field is only required if updatedAtColumn is provided, is not empty, and has the type string or json.
Must be a non-empty string with a maximum of 64 characters. -
incremental
boolean
Determines whether users are imported incrementally:
true: only users whose update time is equal to or later than the update time of the last imported user are imported.
false: all users are imported again, regardless of their update time. This is the default value.
If set to true, a column for the update time must be specified (i.e., updatedAtColumn is not null). -
transformation
object Conditionally Required
The mapping or function responsible for transforming file users into user identities linked to the pipeline. Once the identity resolution process is complete, the user identities associated with all pipelines are merged into unified users.
One of either a mapping or a function must be provided, but not both. The one that is not provided can be either missing or set to null.
-
transformation.function
nullable object Conditionally Required
The transformation function. A JavaScript or Python function that, given a user in the file, returns an identity.
-
transformation.function.source
string Required
The source code of the JavaScript or Python function.
Must be a non-empty string with a maximum of 50000 characters. -
transformation.function.language
string Required
The language of the function.
Possible values: "JavaScript" or "Python". -
transformation.function.preserveJSON
boolean
Specifies whether JSON values are passed to and returned from the function as strings, keeping their original format without any encoding or decoding.
-
transformation.function.inPaths
array of string Required
The paths of the properties that will be passed to the function. At least one path must be present.
-
transformation.function.outPaths
array of string Required
The paths of the properties that may be returned by the function. At least one path must be present.
-
-
-
inSchema
schema Required
The schema for the properties used in the filter, the identity column, the update time column, and the input properties for the transformation.
When importing users from files, this should be a subset of the file schema.
-
outSchema
schema Required
The schema for the output properties of the transformation.
When importing users from files, this should be a subset of the profile schema.
Response
No response.
curl -X PUT https://example.com/v1/pipelines/705981339 \
  -H "Authorization: Bearer api_xxxxxxx" \
  --json '{
    "name": "Newsletter Subscribers",
    "enabled": true,
    "format": "excel",
    "path": "subscribers.xlsx",
    "sheet": "Sheet1",
    "formatSettings": {
      "HasColumnNames": true
    },
    "filter": {
      "logical": "and",
      "conditions": [
        {
          "property": "country",
          "operator": "is",
          "values": ["US"]
        }
      ]
    },
    "userIDColumn": "email",
    "updatedAtColumn": "updated_at",
    "updatedAtFormat": "ISO8601",
    "incremental": true,
    "transformation": {
      "function": {
        "source": "def transform(user: dict) -> dict:\n\treturn {}\n",
        "language": "Python",
        "preserveJSON": false,
        "inPaths": ["email", "firstName", "lastName"],
        "outPaths": ["email", "first_name", "last_name"]
      }
    },
    "inSchema": {
      "kind": "object",
      "properties": [
        {"name": "email", "type": {"kind": "string"}},
        {"name": "firstName", "type": {"kind": "string"}},
        {"name": "lastName", "type": {"kind": "string"}},
        {"name": "country", "type": {"kind": "string"}},
        {"name": "updated_at", "type": {"kind": "string", "maxLength": 60}}
      ]
    },
    "outSchema": {
      "kind": "object",
      "properties": [
        {"name": "first_name", "type": {"kind": "string", "maxLength": 100}, "readOptional": true, "description": "First name"},
        {"name": "last_name", "type": {"kind": "string", "maxLength": 100}, "readOptional": true, "description": "Last name"},
        {"name": "email", "type": {"kind": "string", "maxLength": 254}, "readOptional": true, "description": "Email"}
      ]
    }
  }'
Get pipeline
Get a source pipeline that imports users from a file.
Request
-
:id
int Required
The ID of the source file pipeline.
Response
-
id
int
The ID of the source pipeline.
-
name
string
The pipeline's name.
It is not longer than 60 characters. -
connector
string
The code of the connection's connector.
-
connectorType
string
The type of the connection's connector. It is always "FileStorage" when the pipeline imports users from a file.
Possible values: "Application", "Database", "FileStorage", "SDK", "MessageBroker" or "Webhook". -
connection
int
The ID of the connection from which the file is read. It is a source file storage connection.
-
connectionRole
string
The role of the pipeline's connection. It is always "Source" when the pipeline imports users from a file.
Possible values: "Source" or "Destination". -
target
string
The entity on which the pipeline operates. It is always "User" when the pipeline imports users from a file.
Possible values: "User" or "Event". -
enabled
boolean
Indicates if the pipeline is enabled.
-
format
string
The file format. It corresponds to the code of a file connector.
Possible values: "csv", "excel", "parquet" or "json". -
path
string
The file path relative to the root path defined in the file storage connection. Refer to the file storage connector documentation for details on the specific format.
It is not longer than 1024 characters. -
sheet
nullable string
The name of the sheet. It is empty if the format is not "excel".
-
compression
string
The compression format of the file. It is empty if the file is not compressed.
Note that an Excel file is inherently compressed, so no compression format needs to be specified unless the file has been further compressed.
Possible values: "", "Zip", "Gzip" or "Snappy". -
userIDColumn
string
The column in the file that uniquely identifies each user in the connection.
-
updatedAtColumn
nullable string
The column that stores the timestamp of the last update to a user record. It is null if no such column exists.
-
updatedAtFormat
nullable string
The format of the value in the update time column. It is null if no such column exists or if the corresponding Krenalis type is datetime or date.
It is "ISO8601" if the column value follows the ISO 8601 format. It is "Excel" if the format is "excel" and the column value follows the Excel format. Otherwise, it follows the format accepted by the Python strftime function. -
incremental
boolean
Indicates whether users are imported incrementally:
true: only users whose update time is equal to or later than the update time of the last imported user are imported.
false: all users are imported again, regardless of their update time.
-
transformation
object
The mapping or function responsible for transforming file users into user identities linked to the pipeline. Once the identity resolution process is complete, the user identities associated with all pipelines are merged into unified users.
One of either a mapping or a function is present, but not both. The one that is not present is null.
-
transformation.mapping
nullable object with string values
The transformation mapping. A key represents a property path in the profile schema, and its corresponding value is an expression. This expression can reference columns of the file.
-
transformation.function
nullable object
The transformation function. A JavaScript or Python function that, given a user in the file, returns an identity.
-
transformation.function.source
string
The source code of the JavaScript or Python function.
It is not longer than 50000 characters. -
transformation.function.language
string
The language of the function.
Possible values: "JavaScript" or "Python". -
transformation.function.preserveJSON
boolean
Specifies whether JSON values are passed to and returned from the function as strings, keeping their original format without any encoding or decoding.
-
transformation.function.inPaths
array of string
The paths of the properties that will be passed to the function. At least one path must be present.
-
transformation.function.outPaths
array of string
The paths of the properties that may be returned by the function. At least one path must be present.
-
-
-
inSchema
schema
The schema for the properties used in the filter, the identity column, the update time column, and the input properties for the transformation.
-
outSchema
schema
The schema for the output properties of the transformation.
-
running
boolean
Indicates if the pipeline is running.
-
scheduleStart
nullable int
The start time of the schedule in minutes, counting from 00:00. It specifies the minute when the first scheduled run of the day begins. Subsequent runs occur based on the interval defined by the schedule period. If the scheduler is disabled, this value is null.
-
schedulePeriod
nullable string
The schedule period, which determines how often the import runs automatically. If it is null, the scheduler is disabled, and no automatic run will occur.
To change the schedule period, use the Set schedule period endpoint.
Possible values: "5m", "15m", "30m", "1h", "2h", "3h", "6h", "8h", "12h" or "24h".
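For example, with scheduleStart set to 15 and schedulePeriod set to "1h", runs begin at 00:15 and repeat hourly. A small sketch of how a day's run times could be derived from these two fields; the derivation is an assumption based on the field descriptions above, not documented server behavior:

```python
# Map the documented period codes to minutes; this mapping is an assumption
# inferred from the codes themselves.
PERIODS = {"5m": 5, "15m": 15, "30m": 30, "1h": 60, "2h": 120,
           "3h": 180, "6h": 360, "8h": 480, "12h": 720, "24h": 1440}

def run_times(schedule_start: int, schedule_period: str) -> list:
    """Return the day's run times as HH:MM strings, starting at scheduleStart."""
    step = PERIODS[schedule_period]
    return [f"{m // 60:02d}:{m % 60:02d}"
            for m in range(schedule_start, 24 * 60, step)]

print(run_times(15, "3h"))
# ['00:15', '03:15', '06:15', '09:15', '12:15', '15:15', '18:15', '21:15']
```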
curl https://example.com/v1/pipelines/705981339 \
  -H "Authorization: Bearer api_xxxxxxx"
{
"id": 705981339,
"name": "Newsletter Subscribers",
"connector": "sftp",
"connectorType": "FileStorage",
"connection": 1371036433,
"connectionRole": "Source",
"target": "User",
"enabled": true,
"format": "excel",
"path": "subscribers.xlsx",
"sheet": "Sheet1",
"userIDColumn": "email",
"updatedAtColumn": "updated_at",
"updatedAtFormat": "ISO8601",
"incremental": true,
"transformation": {
"function": {
"source": "const transform = (user) => { ... }",
"language": "JavaScript",
"preserveJSON": false,
"inPaths": [
"email",
"firstName",
"lastName"
],
"outPaths": [
"email",
"first_name",
"last_name"
]
}
},
"inSchema": {
"kind": "object",
"properties": [
{
"name": "email",
"type": {
"kind": "string"
}
},
{
"name": "firstName",
"type": {
"kind": "string"
}
},
{
"name": "lastName",
"type": {
"kind": "string"
}
},
{
"name": "country",
"type": {
"kind": "string"
}
},
{
"name": "updated_at",
"type": {
"kind": "string",
"maxLength": 60
}
}
]
},
"outSchema": {
"kind": "object",
"properties": [
{
"name": "first_name",
"type": {
"kind": "string",
"maxLength": 100
},
"readOptional": true,
"description": "First name"
},
{
"name": "last_name",
"type": {
"kind": "string",
"maxLength": 100
},
"readOptional": true,
"description": "Last name"
},
{
"name": "email",
"type": {
"kind": "string",
"maxLength": 254
},
"readOptional": true,
"description": "Email"
}
]
},
"running": false,
"scheduleStart": 15,
"schedulePeriod": "1h"
}