Add Amazon S3 as a data source for Importers

You must add a data source before you can run an Importer. This article explains how to add Amazon S3 as a data source for importing data.

Key terms for Amazon S3

Amazon S3-related terms

  • AWS Account ID: An AWS Account ID is a unique 12-digit number assigned to each Amazon Web Services (AWS) account.
  • External ID: A user-defined string that serves as a security mechanism used to authenticate and secure access from third-party entities.
  • ARN: An Amazon Resource Name (ARN) uniquely identifies an AWS resource. An ARN is required whenever you need to specify a resource unambiguously across all of AWS, such as in IAM policies, Amazon Relational Database Service (Amazon RDS) tags, and API calls.
  • Bucket: An Amazon S3 bucket is a storage container where you can store and manage data in the form of objects. It serves as the top-level container for your data in S3 and acts like a folder in which you organize and store your files.
  • CSV Path Prefix: A path prefix refers to the logical organization of objects within a bucket. A prefix is the portion of the key that comes before the object name.
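
To make the key terms concrete, here is a minimal Python sketch of how an object key splits into a prefix and an object name, and of the 12-digit shape of an AWS Account ID. The helper names and the sample key are hypothetical and exist only for illustration; they are not part of Persona or AWS.

```python
def split_key(key: str) -> tuple[str, str]:
    """Split an S3 object key into (prefix, object name) at the last '/'."""
    prefix, sep, name = key.rpartition("/")
    # Keep the trailing '/' on the prefix so prefix + name rebuilds the key.
    return prefix + sep, name


def looks_like_account_id(value: str) -> bool:
    """Return True if the value has the 12-digit shape of an AWS Account ID."""
    return len(value) == 12 and value.isascii() and value.isdigit()


# Hypothetical key: the prefix is everything before the object name.
prefix, name = split_key("imports/2024/accounts.csv")
print(prefix)  # imports/2024/
print(name)    # accounts.csv
```

A key with no "/" has an empty prefix, which corresponds to an object stored at the top level of the bucket.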

Adding Amazon S3 as a data source

Step 1: Initiating the creation of a data source in your Persona dashboard

Click Imports in the Dashboard’s navigation bar. Then click on + Add new data source in the top right corner of the Imports page.

Select a data source

When creating a data source for use in Importers, you are required to select a source from which to import data. In this case, you would select S3.

Click S3 to display Persona’s AWS Account ID (a 12-digit identification number) and an auto-generated External ID. You will need both values to complete the setup of your new data source.

Once you have the External ID and Persona's AWS Account ID, keep this browser window open and proceed to Step 2 in a separate browser window. There, you will create an AWS Reader Role that allows Persona to connect to the Amazon S3 bucket containing the data you want to bring into your Persona instance.

Step 2: Creating an AWS Reader Role in the AWS S3 console

In your AWS console, create a new IAM role. Select AWS account as the Trusted entity type, then copy and paste the AWS Account ID and External ID from the data source form.

AWS S3 Console: Creating a new role with the Trusted Entity Type

On the “Add permissions” step, grant the role the AmazonS3ReadOnlyAccess permission.

AWS S3 Console: Adding AmazonS3ReadOnlyAccess permission

Name and create your role.

Amazon S3 Console: Name and create your new role
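
For reference, a role that trusts another AWS account and requires an External ID typically ends up with a trust policy shaped like the one built below. This is a sketch under assumptions, not the exact document your console will produce: the function name is hypothetical, and the account ID and External ID are placeholders that you would replace with the values shown in the Persona data source form.

```python
import json


def build_trust_policy(account_id: str, external_id: str) -> dict:
    """Build an IAM trust policy that lets another AWS account assume a role,
    but only when it presents the expected External ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The trusted AWS account (placeholder value in the example).
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                # The External ID condition guards against confused-deputy access.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values for illustration only.
policy = build_trust_policy("123456789012", "example-external-id")
print(json.dumps(policy, indent=2))
```

Comparing the role's Trust relationships tab against this shape can help confirm that both the account ID and the External ID were pasted correctly.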

Step 3: Finish creating the data source in your Persona dashboard

The Reader role you created in Step 2 has its own Amazon Resource Name (ARN), which serves as its unique identifier. Return to the browser window with the Persona dashboard open and paste the ARN into the data source form from Step 1.
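
Before pasting, it can help to check that the value you copied really is a role ARN. The sketch below parses the general ARN layout (arn:partition:service:region:account-id:resource) and checks for an IAM role. The function names and the sample ARN are hypothetical.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its five fields after the leading 'arn'."""
    # Limit the split so that any ':' inside the resource part is preserved.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])


def is_iam_role_arn(arn: str) -> bool:
    """Return True if the string parses as an ARN for an IAM role."""
    try:
        parsed = parse_arn(arn)
    except ValueError:
        return False
    return parsed.service == "iam" and parsed.resource.startswith("role/")


# Hypothetical role ARN; IAM ARNs leave the region field empty.
print(is_iam_role_arn("arn:aws:iam::123456789012:role/persona-reader"))  # True
print(is_iam_role_arn("arn:aws:s3:::my-bucket"))                          # False
```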

To complete the connection, enter the S3 region, Bucket name, and CSV Path Prefix in the data source form. Once that information has been entered, click Connect.

On the next page, you will see the CSVs found in the bucket under the CSV Path Prefix you provided. From this list, select all of the CSVs you want to import into Persona.
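
To preview which files a given prefix is likely to match, you can apply the same idea to a list of object keys. This sketch assumes that matching means "the key starts with the prefix and ends in .csv"; the article does not state Persona's exact matching rule, and the function name and keys are hypothetical.

```python
def find_csv_keys(keys: list[str], prefix: str) -> list[str]:
    """Return the keys under the given prefix that look like CSV files."""
    return sorted(
        key
        for key in keys
        if key.startswith(prefix) and key.lower().endswith(".csv")
    )


# Hypothetical bucket contents.
bucket_keys = [
    "imports/accounts.csv",
    "imports/transactions.CSV",
    "imports/selfie_photo_1.jpeg",
    "archive/old_accounts.csv",
]

print(find_csv_keys(bucket_keys, "imports/"))
# ['imports/accounts.csv', 'imports/transactions.CSV']
```

If an expected file is missing from the list in the dashboard, checking the prefix for a missing or extra "/" is a reasonable first step.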

Creating a data source: Reviewing found CSVs

An example of what the list of CSVs found within the Amazon S3 bucket will look like as you complete the data source creation process.

Data source-specific considerations

Every data source has nuances specific to how its data is stored. The considerations below help you normalize your data and ensure that Importers run successfully.

Importing images using an Amazon S3 data source

Make sure that each image is located in the same bucket as the imported CSV file.

Importing images from S3: object_key

Each image in the bucket has its own object_key that uniquely references the image in the bucket.

The CSV file from Step 3 for your import should include a column containing the object_key of the image you want to import. During the import process, the referenced image will be downloaded and attached to the corresponding Account or Transaction.

Example

The Importer configuration below imports data from a CSV file named image_import.csv.


ref_id,name_first,name_last,phone_number,birth_date,email,file_id
account_1,John,Smith,+14151111111,1967-01-01,john_smith@withpersona.com,selfie_photo_1.jpeg

Amazon S3 data source: Example of image import

When you view the Importer in the Persona dashboard, each row in the CSV file represents a new Account in Persona, including details such as the reference_id, name, phone number, birth date, and email. The last column in the CSV (file_id) contains the object_key of an image stored in the same bucket as the CSV. This image will be downloaded and attached as the account's selfie photo.
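
Before running the import, you may want to confirm that every object_key in the CSV refers to an object that actually exists in the bucket. The sketch below does this for the example row above. The function name and the set of bucket keys are hypothetical; in practice you would obtain the keys from your own bucket listing.

```python
import csv
import io

# The example CSV from this article, inlined for illustration.
CSV_TEXT = (
    "ref_id,name_first,name_last,phone_number,birth_date,email,file_id\n"
    "account_1,John,Smith,+14151111111,1967-01-01,"
    "john_smith@withpersona.com,selfie_photo_1.jpeg\n"
)


def missing_images(csv_text: str, bucket_keys: set[str], column: str = "file_id") -> list[str]:
    """Return object keys referenced in the CSV column that are not in the bucket."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = []
    for row in reader:
        key = row.get(column)
        # Skip empty cells; report keys that have no matching object.
        if key and key not in bucket_keys:
            missing.append(key)
    return missing


# Hypothetical bucket listing: the CSV and the image share one bucket.
print(missing_images(CSV_TEXT, {"image_import.csv", "selfie_photo_1.jpeg"}))  # []
print(missing_images(CSV_TEXT, {"image_import.csv"}))  # ['selfie_photo_1.jpeg']
```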

Plans Explained

Amazon S3 data source for Importers by plan

Amazon S3 as an available data source:

  • Startup Program: Available
  • Essential Plan: Available
  • Growth Plan: Available
  • Enterprise Plan: Available

Learn more about pricing and plans.
