Import users from files

This type of pipeline imports users from a file into the workspace's data warehouse. It is available only for source file storage connections.

Create pipeline

Create a source pipeline that imports users from a file.

Request

  • name

    string Required

    The pipeline's name.

    Must be a non-empty string with a maximum of 60 characters.
  • connection

    int Required

    The ID of the connection from which to read the users. It must be a source file storage.

  • target

    string Required

    The entity on which the pipeline operates, which must be "User" in order to create a pipeline that imports users.

    Possible values: "User".
  • enabled

    boolean

    Indicates if the pipeline is enabled once created.

  • format

    string Required

    The file format. It corresponds to the code of a file connector.

    Possible values: "csv", "excel", "parquet" or "json".
  • path

    string Required

    The file path relative to the root path defined in the file storage connection. Refer to the file storage connector documentation for details on the specific format.

    Must be a non-empty string with a maximum of 1024 characters.
  • sheet

    string Conditionally Required

    The sheet name. It can only be used with the "excel" format, where it is required.

    When provided, it must be between 1 and 31 characters long, must not start or end with a single quote ('), and must not contain any of the following characters: *, /, :, ?, [, \, or ].
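
These constraints can be checked client-side before submitting a request. A minimal sketch (the function name is illustrative, not part of the API):

```python
# Characters that are not allowed anywhere in a sheet name.
_FORBIDDEN = set('*/:?[\\]')

def is_valid_sheet_name(name: str) -> bool:
    """Check a sheet name against the documented constraints."""
    if not 1 <= len(name) <= 31:
        return False
    if name.startswith("'") or name.endswith("'"):
        return False
    return not any(ch in _FORBIDDEN for ch in name)
```
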

  • compression

    string

    The compression format of the file. It is empty if the file is not compressed.

    Note that an Excel file is inherently compressed, so no compression format needs to be specified unless the file has been further compressed.

    Possible values: "", "Zip", "Gzip" or "Snappy".
  • formatSettings

    nullable json

    The specific settings of the pipeline, which vary based on the file connector specified in the format field.

    Please refer to the page that documents the settings for each connector type.

  • filter

    nullable object

    The filter applied to the users in the file. If it's not null, only the users that match the filter will be included.

    See the filters documentation for more details.

    • filter.logical

      string Required Possible values: "and" or "or".
    • filter.conditions

      array of object Required

      The filter's conditions.

      • property

        string Required

        The name or path of the property. If the property has a json type, it can include a json path.

      • operator

        string Required

        The condition's operator. The allowed values depend on the property's type.

        Possible values: "is", "is not", "is less than", "is less than or equal to", "is greater than", "is greater than or equal to", "is between", "is not between", "contains", "does not contain", "is one of", "is not one of", "starts with", "ends with", "is before", "is on or before", "is after", "is on or after", "is true", "is false", "is empty", "is not empty", "is null", "is not null", "exists" or "does not exist".
      • values

        array of string

        The values the operator applies to, if any. Whether any values are present, and how many, depends on both the operator and the property's type.
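
To illustrate the filter semantics, here is a sketch of how such an object selects records. It is purely illustrative (not the service's implementation) and covers only the "is" and "is one of" operators:

```python
def matches(record: dict, flt: dict) -> bool:
    """Evaluate a filter object against a single record.

    Illustrative only: handles just the "is" and "is one of" operators.
    """
    def check(cond: dict) -> bool:
        value = record.get(cond["property"])
        if cond["operator"] == "is":
            return value == cond["values"][0]
        if cond["operator"] == "is one of":
            return value in cond["values"]
        raise NotImplementedError(cond["operator"])

    results = [check(c) for c in flt["conditions"]]
    return all(results) if flt["logical"] == "and" else any(results)
```

For example, the filter used in the request example below keeps a record only when its country column equals "US".
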

  • userIDColumn

    string Required

    The column in the file that uniquely identifies each user in the connection. It serves as the unique identifier for each user record.

    Only columns with types corresponding to the following Krenalis types can be used as an identity: string, int, uuid, and json.

    Must be a non-empty string with a maximum of 1024 characters.
  • updatedAtColumn

    string

    The column that stores the date when a user record was last updated. It tracks the most recent modification made to the user's data, helping to identify when changes occurred.

    The value of this column is used for incremental imports, where only records that have been modified since the last import need to be processed.

    Only columns with types corresponding to the following Krenalis types can be used as the update time: string, datetime, date, and json.

    It cannot be longer than 1024 characters.
  • updatedAtFormat

    string Conditionally Required

    The format of the value in the update time column. Set it to "ISO8601" if the column value follows the ISO 8601 format. If the format is "excel", it can also be set to "Excel". Otherwise, it must follow a format accepted by the Python strftime function.

    This field is required only if updatedAtColumn is provided, is not empty, and has type string or json.

    Must be a non-empty string with a maximum of 64 characters.
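
A custom format can be sanity-checked against a sample column value with Python itself. The sample value and format below are illustrative:

```python
from datetime import datetime

# A sample value from the update time column and a candidate
# strftime-style format for updatedAtFormat.
sample = "2024-03-01 17:45:00"
fmt = "%Y-%m-%d %H:%M:%S"

# strptime raises ValueError if the format does not match the sample.
parsed = datetime.strptime(sample, fmt)
```
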
  • incremental

    boolean

    Determines whether users are imported incrementally:

    • true: only users whose update time is equal to or later than the last imported user's change time are imported.
    • false: all users are imported again, regardless of their update time. This is the default.

    If set to true, a column for the update time must be specified (i.e., updatedAtColumn is not null).
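
Conceptually, the incremental flag amounts to the following selection step (a sketch, not the service's implementation; the watermark, i.e. the last imported user's change time, would be carried over from the previous run):

```python
def select_for_import(rows, incremental, watermark, column="updated_at"):
    """Return the rows to import.

    Full import: every row. Incremental import: only rows whose update
    time is equal to or later than the watermark.
    """
    if not incremental or watermark is None:
        return list(rows)
    return [row for row in rows if row[column] >= watermark]
```
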

  • transformation

    object Conditionally Required

    The mapping or function responsible for transforming file users into user identities linked to the pipeline. Once the identity resolution process is complete, the user identities associated with all pipelines are merged into unified users.

    Either a mapping or a function must be provided, but not both. The one that is not provided can be either missing or set to null.

    • transformation.function

      nullable object Conditionally Required

      The transformation function. A JavaScript or Python function that, given a user in the file, returns an identity.

      • transformation.function.source

        string Required

        The source code of the JavaScript or Python function.

        Must be a non-empty string with a maximum of 50000 characters.
      • transformation.function.language

        string Required

        The language of the function.

        Possible values: "JavaScript" or "Python".
      • transformation.function.preserveJSON

        boolean

        Specifies whether JSON values are passed to and returned from the function as strings, keeping their original format without any encoding or decoding.

      • transformation.function.inPaths

        array of string Required

        The paths of the properties that will be passed to the function. At least one path must be present.

      • transformation.function.outPaths

        array of string Required

        The paths of the properties that may be returned by the function. At least one path must be present.
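
As an illustration, a Python function suitable for transformation.function.source could look like the following. The property names match the request example below; the email normalization is an arbitrary illustrative choice:

```python
def transform(user: dict) -> dict:
    """Map a file row (inPaths properties) to an identity (outPaths properties)."""
    return {
        "email": user["email"].strip().lower(),
        "first_name": user["firstName"],
        "last_name": user["lastName"],
    }
```
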

  • inSchema

    schema Required

    The schema for the properties used in the filter, the identity column, the update time column, and the input properties for the transformation.

    When importing users from files, this should be a subset of the file schema.

  • outSchema

    schema Required

    The schema for the output properties of the transformation.

    When importing users from files, this should be a subset of the profile schema.

Response

  • id

    int

    The ID of the pipeline.

POST /v1/pipelines
curl https://example.com/v1/pipelines \
  -H "Authorization: Bearer api_xxxxxxx" \
  --json '{
    "name": "Newsletter Subscribers",
    "connection": 230527183,
    "target": "User",
    "enabled": true,
    "format": "excel",
    "path": "subscribers.xlsx",
    "sheet": "Sheet1",
    "formatSettings": {
      "HasColumnNames": true
    },
    "filter": {
      "logical": "and",
      "conditions": [
        {
          "property": "country",
          "operator": "is",
          "values": ["US"]
        }
      ]
    },
    "userIDColumn": "email",
    "updatedAtColumn": "updated_at",
    "updatedAtFormat": "ISO8601",
    "incremental": true,
    "transformation": {
      "function": {
        "source": "def transform(user: dict) -> dict:\n\treturn {}\n",
        "language": "Python",
        "preserveJSON": false,
        "inPaths": ["email", "firstName", "lastName"],
        "outPaths": ["email", "first_name", "last_name"]
      }
    },
    "inSchema": {
      "kind": "object",
      "properties": [
        { "name": "email", "type": { "kind": "string" } },
        { "name": "firstName", "type": { "kind": "string" } },
        { "name": "lastName", "type": { "kind": "string" } },
        { "name": "country", "type": { "kind": "string" } },
        { "name": "updated_at", "type": { "kind": "string", "maxLength": 60 } }
      ]
    },
    "outSchema": {
      "kind": "object",
      "properties": [
        {
          "name": "first_name",
          "type": { "kind": "string", "maxLength": 100 },
          "readOptional": true,
          "description": "First name"
        },
        {
          "name": "last_name",
          "type": { "kind": "string", "maxLength": 100 },
          "readOptional": true,
          "description": "Last name"
        },
        {
          "name": "email",
          "type": { "kind": "string", "maxLength": 254 },
          "readOptional": true,
          "description": "Email"
        }
      ]
    }
  }'
Response
{
  "id": 285017124
}
Errors
  • 404: workspace does not exist
  • 422: connection does not exist
  • 422: format does not exist
  • 422: format settings are not valid
  • 422: transformation language is not supported

Update pipeline

Update a source pipeline that imports users from a file.

Request

  • :id

    int Required

    The ID of the source file pipeline.

  • name

    string Required

    The pipeline's name.

    Must be a non-empty string with a maximum of 60 characters.
  • enabled

    boolean

    Indicates if the pipeline is enabled. Use the Set status endpoint to change only the pipeline's status.

  • format

    string Required

    The file format. It corresponds to the code of a file connector.

    Possible values: "csv", "excel", "parquet" or "json".
  • path

    string Required

    The file path relative to the root path defined in the file storage connection. Refer to the file storage connector documentation for details on the specific format.

    Must be a non-empty string with a maximum of 1024 characters.
  • sheet

    string Conditionally Required

    The sheet name. It can only be used with the "excel" format, where it is required.

    When provided, it must be between 1 and 31 characters long, must not start or end with a single quote ('), and must not contain any of the following characters: *, /, :, ?, [, \, or ].

  • compression

    string

    The compression format of the file. It is empty if the file is not compressed.

    Note that an Excel file is inherently compressed, so no compression format needs to be specified unless the file has been further compressed.

    Possible values: "", "Zip", "Gzip" or "Snappy".
  • formatSettings

    nullable json

    The specific settings of the pipeline, which vary based on the file connector specified in the format field.

    Please refer to the page that documents the settings for each connector type.

  • filter

    nullable object

    The filter applied to the users in the file. If it's not null, only the users that match the filter will be included.

    See the filters documentation for more details.

    • filter.logical

      string Required Possible values: "and" or "or".
    • filter.conditions

      array of object Required

      The filter's conditions.

      • property

        string Required

        The name or path of the property. If the property has a json type, it can include a json path.

      • operator

        string Required

        The condition's operator. The allowed values depend on the property's type.

        Possible values: "is", "is not", "is less than", "is less than or equal to", "is greater than", "is greater than or equal to", "is between", "is not between", "contains", "does not contain", "is one of", "is not one of", "starts with", "ends with", "is before", "is on or before", "is after", "is on or after", "is true", "is false", "is empty", "is not empty", "is null", "is not null", "exists" or "does not exist".
      • values

        array of string

        The values the operator applies to, if any. Whether any values are present, and how many, depends on both the operator and the property's type.

  • userIDColumn

    string Required

    The column in the file that uniquely identifies each user in the connection. It serves as the unique identifier for each user record.

    Only columns with types corresponding to the following Krenalis types can be used as an identity: string, int, uuid, and json.

    Must be a non-empty string with a maximum of 1024 characters.
  • updatedAtColumn

    string

    The column that stores the date when a user record was last updated. It tracks the most recent modification made to the user's data, helping to identify when changes occurred.

    The value of this column is used for incremental imports, where only records that have been modified since the last import need to be processed.

    Only columns with types corresponding to the following Krenalis types can be used as the update time: string, datetime, date, and json.

    It cannot be longer than 1024 characters.
  • updatedAtFormat

    string Conditionally Required

    The format of the value in the update time column. Set it to "ISO8601" if the column value follows the ISO 8601 format. If the format is "excel", it can also be set to "Excel". Otherwise, it must follow a format accepted by the Python strftime function.

    This field is required only if updatedAtColumn is provided, is not empty, and has type string or json.

    Must be a non-empty string with a maximum of 64 characters.
  • incremental

    boolean

    Determines whether users are imported incrementally:

    • true: only users whose update time is equal to or later than the last imported user's change time are imported.
    • false: all users are imported again, regardless of their update time. This is the default.

    If set to true, a column for the update time must be specified (i.e., updatedAtColumn is not null).

  • transformation

    object Conditionally Required

    The mapping or function responsible for transforming file users into user identities linked to the pipeline. Once the identity resolution process is complete, the user identities associated with all pipelines are merged into unified users.

    Either a mapping or a function must be provided, but not both. The one that is not provided can be either missing or set to null.

    • transformation.function

      nullable object Conditionally Required

      The transformation function. A JavaScript or Python function that, given a user in the file, returns an identity.

      • transformation.function.source

        string Required

        The source code of the JavaScript or Python function.

        Must be a non-empty string with a maximum of 50000 characters.
      • transformation.function.language

        string Required

        The language of the function.

        Possible values: "JavaScript" or "Python".
      • transformation.function.preserveJSON

        boolean

        Specifies whether JSON values are passed to and returned from the function as strings, keeping their original format without any encoding or decoding.

      • transformation.function.inPaths

        array of string Required

        The paths of the properties that will be passed to the function. At least one path must be present.

      • transformation.function.outPaths

        array of string Required

        The paths of the properties that may be returned by the function. At least one path must be present.

  • inSchema

    schema Required

    The schema for the properties used in the filter, the identity column, the update time column, and the input properties for the transformation.

    When importing users from files, this should be a subset of the file schema.

  • outSchema

    schema Required

    The schema for the output properties of the transformation.

    When importing users from files, this should be a subset of the profile schema.

Response

No response.
PUT /v1/pipelines/:id
curl -X PUT https://example.com/v1/pipelines/705981339 \
  -H "Authorization: Bearer api_xxxxxxx" \
  --json '{
    "name": "Newsletter Subscribers",
    "enabled": true,
    "format": "excel",
    "path": "subscribers.xlsx",
    "sheet": "Sheet1",
    "formatSettings": {
      "HasColumnNames": true
    },
    "filter": {
      "logical": "and",
      "conditions": [
        {
          "property": "country",
          "operator": "is",
          "values": ["US"]
        }
      ]
    },
    "userIDColumn": "email",
    "updatedAtColumn": "updated_at",
    "updatedAtFormat": "ISO8601",
    "incremental": true,
    "transformation": {
      "function": {
        "source": "def transform(user: dict) -> dict:\n\treturn {}\n",
        "language": "Python",
        "preserveJSON": false,
        "inPaths": ["email", "firstName", "lastName"],
        "outPaths": ["email", "first_name", "last_name"]
      }
    },
    "inSchema": {
      "kind": "object",
      "properties": [
        { "name": "email", "type": { "kind": "string" } },
        { "name": "firstName", "type": { "kind": "string" } },
        { "name": "lastName", "type": { "kind": "string" } },
        { "name": "country", "type": { "kind": "string" } },
        { "name": "updated_at", "type": { "kind": "string", "maxLength": 60 } }
      ]
    },
    "outSchema": {
      "kind": "object",
      "properties": [
        {
          "name": "first_name",
          "type": { "kind": "string", "maxLength": 100 },
          "readOptional": true,
          "description": "First name"
        },
        {
          "name": "last_name",
          "type": { "kind": "string", "maxLength": 100 },
          "readOptional": true,
          "description": "Last name"
        },
        {
          "name": "email",
          "type": { "kind": "string", "maxLength": 254 },
          "readOptional": true,
          "description": "Email"
        }
      ]
    }
  }'
Errors
  • 404: workspace does not exist
  • 404: pipeline does not exist
  • 422: format does not exist
  • 422: format settings are not valid
  • 422: transformation language is not supported

Get pipeline

Get a source pipeline that imports users from a file.

Request

  • :id

    int Required

    The ID of the source file pipeline.

Response

  • id

    int

    The ID of the source pipeline.

  • name

    string

    The pipeline's name.

    It is not longer than 60 characters.
  • connector

    string

    The code of the connection's connector.

  • connectorType

    string

    The type of the connection's connector. It is always "FileStorage" when the pipeline imports users from a file.

    Possible values: "Application", "Database", "FileStorage", "SDK", "MessageBroker" or "Webhook".
  • connection

    int

    The ID of the connection from which the file is read. It is a source file storage.

  • connectionRole

    string

    The role of the pipeline's connection. It is always "Source" when the pipeline imports users from a file.

    Possible values: "Source" or "Destination".
  • target

    string

    The entity on which the pipeline operates. It is always "User" when the pipeline imports users from a file.

    Possible values: "User" or "Event".
  • enabled

    boolean

    Indicates if the pipeline is enabled.

  • format

    string

    The file format. It corresponds to the code of a file connector.

    Possible values: "csv", "excel", "parquet" or "json".
  • path

    string

    The file path relative to the root path defined in the file storage connection. Refer to the file storage connector documentation for details on the specific format.

    It is not longer than 1024 characters.
  • sheet

    nullable string

    The name of the sheet. It is empty if the format is not "excel".

  • compression

    string

    The compression format of the file. It is empty if the file is not compressed.

    Note that an Excel file is inherently compressed, so no compression format needs to be specified unless the file has been further compressed.

    Possible values: "", "Zip", "Gzip" or "Snappy".
  • userIDColumn

    string

    The column in the file that uniquely identifies each user in the connection.

  • updatedAtColumn

    nullable string

    The column that stores the timestamp of the last update to a user record. It is null if no such column exists.

  • updatedAtFormat

    nullable string

    The format of the value in the update time column. It is null if no such column exists or if the corresponding Krenalis type is datetime or date.

    It is "ISO8601" if the column value follows the ISO 8601 format. It is "Excel" if the format is "excel" and the column value follows the Excel format. Otherwise, it follows the format accepted by the Python strftime function.

  • incremental

    boolean

    Indicates whether users are imported incrementally:

    • true: only users whose update time is equal to or later than the last imported user's change time are imported.
    • false: all users are imported again, regardless of their update time.
  • transformation

    object

    The mapping or function responsible for transforming file users into user identities linked to the pipeline. Once the identity resolution process is complete, the user identities associated with all pipelines are merged into unified users.

    Either a mapping or a function is present, but not both. The one that is not present is null.

    • transformation.mapping

      nullable object with string values

      The transformation mapping. A key represents a property path in the profile schema, and its corresponding value is an expression. This expression can reference columns of the file.

    • transformation.function

      nullable object

      The transformation function. A JavaScript or Python function that, given a user in the file, returns an identity.

      • transformation.function.source

        string

        The source code of the JavaScript or Python function.

        It is not longer than 50000 characters.
      • transformation.function.language

        string

        The language of the function.

        Possible values: "JavaScript" or "Python".
      • transformation.function.preserveJSON

        boolean

        Specifies whether JSON values are passed to and returned from the function as strings, keeping their original format without any encoding or decoding.

      • transformation.function.inPaths

        array of string

        The paths of the properties that will be passed to the function. At least one path must be present.

      • transformation.function.outPaths

        array of string

        The paths of the properties that may be returned by the function. At least one path must be present.

  • inSchema

    schema

    The schema for the properties used in the filter, the identity column, the update time column, and the input properties for the transformation.

  • outSchema

    schema

    The schema for the output properties of the transformation.

  • running

    boolean

    Indicates if the pipeline is running.

  • scheduleStart

    nullable int

    The start time of the schedule in minutes, counting from 00:00. It specifies the minute when the first scheduled run of the day begins. Subsequent runs occur based on the interval defined by the scheduler period. If the scheduler is disabled, this value is null.

  • schedulePeriod

    nullable string

    The schedule period, which determines how often the import runs automatically. If it is null, the scheduler is disabled, and no automatic run will occur.

    To change the schedule period, use the Set schedule period endpoint.

    Possible values: "5m", "15m", "30m", "1h", "2h", "3h", "6h", "8h", "12h" or "24h".
GET /v1/pipelines/:id
curl https://example.com/v1/pipelines/705981339 \
-H "Authorization: Bearer api_xxxxxxx"
Response
{
  "id": 705981339,
  "name": "Newsletter Subscribers",
  "connector": "sftp",
  "connectorType": "FileStorage",
  "connection": 1371036433,
  "connectionRole": "Source",
  "target": "User",
  "enabled": true,
  "format": "excel",
  "path": "subscribers.xlsx",
  "sheet": "Sheet1",
  "userIDColumn": "email",
  "updatedAtColumn": "updated_at",
  "updatedAtFormat": "ISO8601",
  "incremental": true,
  "transformation": {
    "function": {
      "source": "const transform = (user) => { ... }",
      "language": "JavaScript",
      "preserveJSON": false,
      "inPaths": [
        "email",
        "firstName",
        "lastName"
      ],
      "outPaths": [
        "email",
        "first_name",
        "last_name"
      ]
    }
  },
  "inSchema": {
    "kind": "object",
    "properties": [
      {
        "name": "email",
        "type": {
          "kind": "string"
        }
      },
      {
        "name": "firstName",
        "type": {
          "kind": "string"
        }
      },
      {
        "name": "lastName",
        "type": {
          "kind": "string"
        }
      },
      {
        "name": "country",
        "type": {
          "kind": "string"
        }
      },
      {
        "name": "updated_at",
        "type": {
          "kind": "string",
          "maxLength": 60
        }
      }
    ]
  },
  "outSchema": {
    "kind": "object",
    "properties": [
      {
        "name": "first_name",
        "type": {
          "kind": "string",
          "maxLength": 100
        },
        "readOptional": true,
        "description": "First name"
      },
      {
        "name": "last_name",
        "type": {
          "kind": "string",
          "maxLength": 100
        },
        "readOptional": true,
        "description": "Last name"
      },
      {
        "name": "email",
        "type": {
          "kind": "string",
          "maxLength": 254
        },
        "readOptional": true,
        "description": "Email"
      }
    ]
  },
  "running": false,
  "scheduleStart": 15,
  "schedulePeriod": "1h"
}
Errors
  • 404: workspace does not exist
  • 404: pipeline does not exist