CSV-Based Import Connections: Deep Dive

BlueConic CSV Connections

BlueConic offers several connections that support bi-directional data transfer to and from other systems via CSV files. The authentication steps vary depending on the system you're connecting to. For details on setting up and running these connections, see their individual Knowledge Base articles.

This article provides a deep dive into how these CSV-based connections import data into BlueConic from other systems. (See the individual connection pages for details on authentication and setup.) See CSV-Based Connections: Authentication Deep Dive for tips on authenticating and troubleshooting the connection setup process, and CSV-Based Export Connections for a deep dive into how CSV-based exports work.

Importing data via CSV Connections

The steps to set up import connection goals are essentially the same for all CSV-based connections. Make sure you are familiar with the BlueConic CSV best practices.

Selecting source files to import

When you configure an import goal for a CSV connection, start by selecting which file(s) to import. When testing, you typically select a single file by navigating to the correct path and selecting the file.

Note: If the full file path is available, click the pencil icon in the file browser to go directly to that path.

CSV-connections-file browser pencil icon.png

Using historical loads and incremental loads

After validating the data, you typically perform a historical load first, followed by incremental loads. Make sure the columns are exactly the same in the historical and incremental loads, so you can reuse the configuration. For incremental loads, typically a daily file is dropped with a date indicator in the filename. You can use a wildcard in the file name pattern to make sure all matching files are picked up.

Note: When the settings of the import goal change (for example, a mapping is added), all matching files will be imported again. This can be a lot of data to import when files are not cleaned up on the file server. To prevent all files from being processed again, you can use the “rename files” option (under the advanced options of file and field handling). In that case, all “.done” files will be ignored. It is recommended to enable this option once data validation has been completed, so you are confident everything is loaded correctly.

import csv files rename files.png

Using delta files for import

Although we recommend that you use incremental or delta files to import data, not all systems are able to provide delta files. If that's the case, you can select “I do not use a delta file” (under the advanced options of file and field handling). In this case, BlueConic downloads the file and compares it with the previously imported file (the file name doesn't have to be the same).

import csv files delta files.png
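
When no delta file is available, the comparison works roughly like this minimal Python sketch (the file names and the whole-row comparison are illustrative assumptions; BlueConic's internal implementation may differ):

    # Sketch of delta detection between two CSV snapshots.
    import csv

    def read_rows(path):
        """Read a CSV file and return its rows as a set of tuples."""
        with open(path, newline="", encoding="utf-8") as f:
            return set(tuple(row) for row in csv.reader(f))

    previous = read_rows("customers_previous.csv")  # previously imported file
    current = read_rows("customers_today.csv")      # newly downloaded file

    # Only rows that are new or changed since the previous file
    # need to be processed.
    delta = current - previous
    print(f"{len(delta)} of {len(current)} rows changed since the last import")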

Linking files for import

Linking files together is useful when multiple files are delivered as input for the import. This becomes especially relevant when you want to map data to a timeline event type that contains nesting. For example, when importing orders there is typically a header file and an order line file (needed because of the one-to-many relationship), and often also a product file. The expected result looks something like this:

csv connections linked files.png

You can link the files together in this way:

csv connections linked files 2.png

The order lines are linked to the order based on a shared order ID, and the product file is linked to the order line based on the product ID (this could also be a SKU).

How this is handled in the import (a code sketch follows these steps):

  1. The linked files are downloaded.

  2. Each linked file is added to a local database, grouped by the identifier used for the linking. In this case, “orderlines.csv” and “products.csv” are downloaded. For each order line, BlueConic checks whether an entry with that order ID already exists in the local database; if found, the order line is added to it, otherwise a new entry is created. The same applies to the products file. During the run you see something like this, where the number of lines refers to the entries in the linked files:

    csv linked files3.png
  3. The “orders.csv” is handled line-by-line.

  4. Based on the linking, an array of nested objects is added to the object, using the data in the local database. The key used for these nested objects is the ID assigned to the linked file entry (generated in the connection). The outcome could look like the example below. In this case the “175079” key refers to the order lines, and the “342254” key refers to the product data. Note: although the order line to product relationship is one-to-one, it is still added as an array, because there could be multiple (or no) matches. When you select a field from a linked file in the mapping, it is stored as a dot expression, for example “175079.342254.id”; when this is mapped to a profile property, all matching product IDs are stored in that profile property.

    {
      "orderid": "ORD_0",
      "date": "11/6/2022 10:59",
      "customerid": "CUST-0",
      "promotion": "limited edition",
      "shop": "SHOP-5",
      "175079": [
        {
          "orderid": "ORD_0",
          "price": "USD 178.00",
          "product": "PRD-66130",
          "quantity": "2",
          "342254": [
            {
              "id": "PRD-66130",
              "url": "https://www.taylor-store.com/product/mens-mocha-belt/",
              "name": "Men’s Mocha Belt",
              "description": "",
              "image": "https://www.taylor-store.com/wp-content/uploads/2014/07/brown-belt-1.jpg",
              "inStock": "true",
              "categories": "Men,Belts",
              "price": "$89"
            }
          ]
        },
        {
          "orderid": "ORD_0",
          "price": "USD 149.00",
          "product": "PRD-66031",
          "quantity": "1",
          "342254": [
            {
              "id": "PRD-66031",
              "url": "https://www.taylor-store.com/product/womens-black-belt/",
              "name": "Women’s Black Belt",
              "description": "We’re now offering a thinner, slim-waist style of our no-holes belt. It’s a perfect “finishing touch” accessory for a dress or cardigan and it defines your natural waist in a flattering way. Leather strap is 7/8″ (2.3 cm) wide. Each SlideBelts ratchet belt is Trim-to-Fit, which means we send you a full length (48″ waist max.) leather that you can customize to your size.",
              "image": "https://www.taylor-store.com/wp-content/uploads/2014/07/wbelt.png",
              "inStock": "true",
              "categories": "Women,Women's Belts",
              "price": "$149"
            }
          ]
        }
      ]
    }
  5. This JSON structure is the input for the data processors step. If you want to know the input for a data processor, you can use the Log Data Processor, which writes the JSON to the run log (which can be downloaded from the connection's run history).
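
To make the mechanism concrete, here is a minimal Python sketch of steps 2 through 4, using the file and column names from the example above (the in-memory grouping is illustrative; BlueConic's internal implementation may differ):

    # Sketch of the linking mechanism. "175079" and "342254" stand in for
    # the IDs the connection assigns to the linked file entries.
    import csv
    from collections import defaultdict

    def group_by(path, key):
        """Load a linked file into a local lookup, grouped by the link key."""
        lookup = defaultdict(list)
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                lookup[row[key]].append(row)
        return lookup

    order_lines = group_by("orderlines.csv", "orderid")  # step 2
    products = group_by("products.csv", "id")            # step 2

    with open("orders.csv", newline="", encoding="utf-8") as f:
        for order in csv.DictReader(f):                  # step 3: line by line
            lines = order_lines.get(order["orderid"], [])
            for line in lines:                           # step 4: nest the data
                # Always an array, even though order line to product is 1-1
                line["342254"] = products.get(line["product"], [])
            order["175079"] = lines
            # "order" now has the nested structure shown above and is the
            # input for the data processors step.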

Important tips about importing

  • When there is no matching product (based on the ID), the array will be empty and the import will proceed as usual. Because this is not logged and does not show up in the run history table, it can easily go unnoticed, resulting in timeline events with order lines that are missing product information.

  • The data from a linked file is only used when it can be related to data from the main file (based on the selected matching fields). This means that with the following setup:

    csv linked files 4.png

    Only orders from customers who are also in the customer file will be picked up. If you want to import all orders, make sure the orders file is selected as the main file. The setup above will not work correctly if the customers file only contains data from customers who updated their personal data.

Which files are picked up in which order

There is sometimes confusion about which files are picked up by a CSV connection, and in which order. The connections work as follows (a code sketch of this logic follows below):

  1. Check which files match the file pattern. For example, *.csv matches all files with the extension “csv”.

  2. All files that end with “.done” are ignored.

  3. When the connection goal settings are unchanged since the last connection run, all files with a last modified date older than the last run date are ignored.

  4. All remaining files are processed in order of their last modified date: oldest first.

  5. When an import stops in the middle of a file, the line number is stored, and the already processed lines are skipped in the next run.

This means the date timestamp in the filename is not taken into account when determining the order of the files. Note: we recommend using a timestamp in the file name to prevent files from being overwritten and data from being lost.
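
The selection rules can be summarized in a short Python sketch (the function signature and the settings comparison are illustrative assumptions):

    # Sketch of the file selection and ordering rules above.
    import fnmatch
    import os

    def files_to_process(directory, pattern, last_run, settings_changed):
        candidates = []
        for name in os.listdir(directory):
            if not fnmatch.fnmatch(name, pattern):  # 1. match the file pattern
                continue
            if name.endswith(".done"):              # 2. skip processed files
                continue
            mtime = os.path.getmtime(os.path.join(directory, name))
            if not settings_changed and mtime < last_run:
                continue                            # 3. skip files older than the last run
            candidates.append((mtime, name))
        # 4. oldest last modified date first (filename timestamps are ignored)
        return [name for _, name in sorted(candidates)]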

When a main file that has linked files is picked up and a wildcard is used, the following logic applies. Let's assume the import is configured as:

csv import linked files5.png

and the following files are present:

  • TransactionHeader_01-01-2023.txt

  • TransactionHeader_02-01-2023.txt

  • LineItemTotal_01-01-2023.txt

  • LineItemTotal_02-01-2023.txt

  • MaterialMaster.txt

Because “TransactionHeader_01-01-2023.txt” is a main file (not a linked file), it will be picked up. The connection looks at the file pattern definition and sees that “_01-01-2023” is used in place of the asterisk. The same identifier (“_01-01-2023”) is then used to select the linked file (when the linked file pattern also contains an asterisk), resulting in “LineItemTotal_01-01-2023.txt”. Because “MaterialMaster.txt” does not contain an asterisk, this material file is used for all header files. This means that for the import of “TransactionHeader_02-01-2023.txt”, only the files “LineItemTotal_02-01-2023.txt” and “MaterialMaster.txt” are used.
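
A minimal Python sketch of this substitution, using the file patterns from the example (the helper function is illustrative, not BlueConic's actual implementation):

    # Sketch of reusing the main file's wildcard match to select linked files.
    import re

    def linked_file_for(main_pattern, main_file, linked_pattern):
        """Extract what the * matched in the main file and reuse it."""
        regex = re.escape(main_pattern).replace(r"\*", "(.*)")
        match = re.fullmatch(regex, main_file)
        if match and "*" in linked_pattern:
            return linked_pattern.replace("*", match.group(1))
        return linked_pattern  # no wildcard: used for all main files

    # -> "LineItemTotal_01-01-2023.txt"
    print(linked_file_for("TransactionHeader*.txt",
                          "TransactionHeader_01-01-2023.txt",
                          "LineItemTotal*.txt"))
    # -> "MaterialMaster.txt" (no asterisk, so used for every header file)
    print(linked_file_for("TransactionHeader*.txt",
                          "TransactionHeader_01-01-2023.txt",
                          "MaterialMaster.txt"))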

Applying import data processors

Using data processors makes it possible to filter out records or to change or enrich data. The mechanism works as follows:

data processors for csv connections.png

In the example, a very simple JSON is the input for the chain, and the different steps show how the eventual output is constructed based on the defined data processors. The output of the last data processor is the input for the matching and mapping.

The Date format conversion data processor can be useful for these imports as well to change the format of dates being imported.

Note: The order of the data processors you use in the connection makes a difference, so consider the order as you add data processors to a connection.
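
As an illustration, here is a minimal Python sketch of such a chain (the processors and field names are hypothetical; in BlueConic you configure data processors in the connection rather than writing code like this):

    # Sketch of a data processor chain: each processor takes a record and
    # returns it modified, or None to filter it out.
    def cleanse_email(record):
        if "email" in record:
            record["cleansed email address"] = record["email"].strip().lower()
        return record

    def drop_test_records(record):
        return None if record.get("shop") == "TEST" else record

    def apply_chain(record, processors):
        for processor in processors:    # order matters: each step receives
            record = processor(record)  # the output of the previous one
            if record is None:
                return None             # record filtered out
        return record                   # input for matching and mapping

    result = apply_chain(
        {"email": " Jane@Example.com ", "shop": "SHOP-5"},
        [cleanse_email, drop_test_records],
    )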

Validating the date format

Before importing, make sure the date format is configured correctly. This is especially relevant when you are importing timeline events: the date will be used for the timeline event, and if the format or timezone is not defined correctly, new events will be created, resulting in duplicate events. Not only the event ID but also the event date is taken into account to determine whether an event should be updated or created.

To validate the event date, you can do the following (the parsing pitfall is sketched in code after these steps):

  1. In the file selection, open the preview of the file:

    csv connections timezone.png
  2. Copy one of the example values. It is recommended to use a value with a day higher than 12, so you can tell the day and the month apart.

  3. Go to the advanced section of the “file and field handling” step.

  4. When the date value does not contain a timezone indicator, make sure you select the timezone of the system the data is coming from.

  5. Enter the copied date in the preview to see if it is parsed correctly:

    csv connections timezone2.png
  6. If a file contains multiple date fields that are not in the same format, you can use the Date format conversion data processor to translate them to one format. Note: the date format from the “file and field handling” step is applied after the data processors, so make sure the data processor converts dates to the same format as the other columns, and use the date format setting from the “file and field handling” step to make sure they are stored correctly in BlueConic.

  7. It is recommended to check the date after the import.

Note: Dates in the BlueConic user interface are translated to your local timezone.
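
The following minimal Python sketch shows why a value with a day higher than 12 matters, and why the timezone selection is important (the date value, formats, and timezone are illustrative):

    # Sketch of date format validation: an ambiguous value parses in both
    # formats, so only a day higher than 12 reveals a wrong format.
    from datetime import datetime
    from zoneinfo import ZoneInfo

    value = "25/6/2022 10:59"  # day 25: unambiguous

    print(datetime.strptime(value, "%d/%m/%Y %H:%M"))  # 2022-06-25 10:59:00
    try:
        datetime.strptime(value, "%m/%d/%Y %H:%M")     # there is no month 25
    except ValueError as e:
        print("wrong format detected:", e)

    # Without a timezone indicator in the value itself, the source system's
    # timezone must be selected explicitly, or the event date shifts on import.
    parsed = datetime.strptime(value, "%d/%m/%Y %H:%M")
    print(parsed.replace(tzinfo=ZoneInfo("America/New_York")))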

Matching

BlueConic matches imported records to profiles using unique identifiers during connection imports. When matching is set up as follows:

matching csv profile identifiers.png

BlueConic applies the following logic for matching (sketched in code after this list):

  • If there is a matching profile based on customer ID, the first matching profile is used.

  • If not, but there is a matching profile based on email, the first profile matching on email is used.

  • If not, a new profile is created, because the “Allow the creation of new profiles” option is checked.

Note: Matching is applied to the output of the import data processors, so you can also use a field that was added by a data processor, for example “cleansed email address”.
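
A minimal Python sketch of this cascade (the profile store and field names are hypothetical stand-ins for BlueConic's internals):

    # Sketch of the matching logic: customer ID first, then email,
    # then create a new profile if allowed.
    def match_profile(record, profiles, allow_creation=True):
        # 1. First matching profile on customer ID
        for profile in profiles:
            if record.get("customer_id") and profile.get("customer_id") == record["customer_id"]:
                return profile
        # 2. Otherwise, first matching profile on email
        for profile in profiles:
            if record.get("email") and profile.get("email") == record["email"]:
                return profile
        # 3. Otherwise, create a new profile (if allowed)
        if allow_creation:
            new_profile = {"customer_id": record.get("customer_id"),
                           "email": record.get("email")}
            profiles.append(new_profile)
            return new_profile
        return None  # no match and creation not allowed: record is skipped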

Mapping

The mapping step is where you specify where BlueConic should store the imported data fields. The output of the last data processor is the input for the mapping. For profile properties, all matching values based on the selected field are stored in the profile property. This means that when a dot expression field representing the product ID is selected (for example, from a linked file), all matching product IDs are stored.

The following strategies can be selected (sketched in code after this list):

  • Add: the values will be added to the existing values in the profile

  • Set: the new values will overwrite all values (if any) existing in the profile. Note: when the new values are empty, the existing values are kept

  • Set or clear: the same as “set” but when empty values are supplied, the existing value is cleared

  • Set if empty: the new values will only be stored in the profile when there are no existing values for this profile property in the profile

  • Sum: only relevant for number, decimal, and currency fields: the new values will be summed with the existing values. So when the profile contains “10” and an array of “5” and “3” is supplied, the end result is “18”.
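
To make the differences concrete, here is a minimal Python sketch of these strategies applied to a multi-valued profile property (the function is illustrative; BlueConic applies the strategies internally):

    # Sketch of the mapping strategies: "existing" holds the current profile
    # values, "new" holds the imported values.
    def apply_strategy(strategy, existing, new):
        if strategy == "add":
            return existing + new
        if strategy == "set":
            return new if new else existing  # empty input keeps old values
        if strategy == "set_or_clear":
            return new                       # empty input clears the values
        if strategy == "set_if_empty":
            return existing if existing else new
        if strategy == "sum":                # number/decimal/currency only
            return [sum(float(v) for v in existing + new)]
        raise ValueError(f"unknown strategy: {strategy}")

    print(apply_strategy("sum", ["10"], ["5", "3"]))  # -> [18.0]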

Importing Timeline event data

Importing timeline events works differently because you can have a one-to-many relationship. Let’s assume we have the following setup:

mapping timeline events.png

The following logic is applied (the update-or-create rule is sketched in code after this list):

  • When the outcome for the selected event ID property (in this case “orderid”) is empty, no event will be created, because the event ID is required

  • When the event ID and event date match an existing event for the same profile, the event is overwritten; otherwise, a new event is created. This can be challenging if there's a mistake in the timezone selection: after correcting the timezone and re-importing, duplicate events may occur because the event dates no longer match.

  • For nested events, the connection must determine how to loop over the data to get the list of nested events. It does this by looking at the field in the mapping with the smallest dot expression that is still larger than the main event's. So in this case “orderlines > price” is taken, and the connection loops over the “orderlines” to create the nested events. This means you cannot create nested events from the same file as is used for the main event, which makes sense because the nested events should be a one-to-many relationship. If you still want to do this, you can define the same file as both a main file and a linked file, and use the columns from the linked file in the nested events mapping.
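
A minimal Python sketch of the update-or-create rule and the timezone pitfall (the event store is a hypothetical stand-in for a profile's timeline):

    # Sketch of event matching: the event ID and the event date together
    # form the key, so a shifted date creates a duplicate instead of an update.
    events = {}  # existing timeline events for one profile

    def upsert_event(event_id, event_date, data):
        events[(event_id, event_date)] = data  # overwrite if matched, else create

    upsert_event("ORD_0", "2022-06-11T10:59:00Z", {"shop": "SHOP-5"})
    upsert_event("ORD_0", "2022-06-11T10:59:00Z", {"shop": "SHOP-6"})  # update
    upsert_event("ORD_0", "2022-06-11T14:59:00Z", {"shop": "SHOP-6"})  # duplicate
    print(len(events))  # 2: the corrected timezone produced a second event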

Importing groups

Importing to groups is very similar to importing to profiles. The main difference is that a group can only be matched on a group ID. Typically the feed contains an identifier for the group, for example a company ID. If this is not the case, you can use a data processor (such as the JavaScript data processor) to generate a unique identifier.
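
For example, here is a minimal Python sketch of deriving a stable group ID from a company name, comparable to what a JavaScript data processor could do (the “company” field name is hypothetical):

    # Sketch of generating a deterministic group ID when the feed has none.
    import hashlib

    def derive_group_id(record):
        # Hash a normalized company name so the same company always
        # yields the same group ID across imports.
        normalized = record["company"].strip().lower()
        return hashlib.sha1(normalized.encode("utf-8")).hexdigest()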

Resources

Contact BlueConic Support if you need assistance with CSV-based connections.
