Generating a Dataset

In Fides, a dataset is a YAML configuration file that describes a collection of data, such as a database. Fides uses these dataset YAML files as a map of your database to automatically process privacy requests. A dataset describes where categories of personal data (e.g. user contact info) can be found and how fields in tables or collections are related to each other, so that Fides can safely traverse the data when processing privacy requests.
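To make the "map" idea concrete, here is a minimal sketch of the structure a dataset file might parse to, written as a Python dict. The key names (`fides_key`, `collections`, `fields`, `data_categories`, `fides_meta`, `references`) follow the commonly documented Fides layout but should be treated as assumptions here. Every table, field, and key value is hypothetical, as are the two helper functions.

```python
# Hypothetical dataset, shown as the Python structure a dataset YAML file
# might parse to. Key names are assumed from the commonly documented Fides
# layout; all table and field names are invented for illustration.
dataset = {
    "fides_key": "example_app_db",
    "name": "Example application database",
    "collections": [
        {
            "name": "users",
            "fields": [
                {"name": "id", "data_categories": ["user.unique_id"]},
                {"name": "email", "data_categories": ["user.contact.email"]},
            ],
        },
        {
            "name": "orders",
            "fields": [
                {
                    "name": "user_id",
                    "data_categories": ["user.unique_id"],
                    # Relationship: orders.user_id refers to users.id.
                    "fides_meta": {
                        "references": [
                            {"dataset": "example_app_db", "field": "users.id"}
                        ]
                    },
                },
                {
                    "name": "shipping_address",
                    "data_categories": ["user.contact.address"],
                },
            ],
        },
    ],
}


def fields_in_category(ds, prefix):
    """Return 'collection.field' paths whose data category is prefix or a child of it."""
    paths = []
    for collection in ds["collections"]:
        for field in collection["fields"]:
            categories = field.get("data_categories", [])
            if any(c == prefix or c.startswith(prefix + ".") for c in categories):
                paths.append(f"{collection['name']}.{field['name']}")
    return paths


def declared_references(ds):
    """Return (source, target) pairs for every field reference in the dataset."""
    pairs = []
    for collection in ds["collections"]:
        for field in collection["fields"]:
            for ref in field.get("fides_meta", {}).get("references", []):
                pairs.append((f"{collection['name']}.{field['name']}", ref["field"]))
    return pairs


contact_fields = fields_in_category(dataset, "user.contact")
links = declared_references(dataset)
print(contact_fields)
print(links)
```

The first helper answers "where does contact data live?", and the second lists the relationships that would let a traversal move from one collection to another.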

You can think of Datasets as a map for any collection of data in Fides. For this reason, you'll see Datasets for databases as well as third-party SaaS applications.

Dataset YAML files can be written locally and then uploaded to Fides, or created directly in the UI-based Dataset Editor.

However, we recommend preparing the annotations and information that will be required to support privacy requests ahead of time. Please review the following to prepare:

Uploading a dataset

To upload a dataset configuration file that has been manually created:

  1. Navigate to Data map → Manage datasets.
  2. Click Create new dataset.
  3. Click Upload a new dataset YAML.
  4. Paste the dataset configuration in YAML into the editor.
  5. Click Create dataset.

Generating a dataset in the UI

Creating a dataset

To create a new dataset using the Dataset Editor:

  1. Navigate to Data map → Manage datasets.
  2. Click Create new dataset.
  3. Enter the dataset configuration into the editor.
  4. Click Create dataset.

You can also create a new dataset using the Dataset Editor when creating an integration. Learn more in our guide for Linking datasets.

Generating a dataset from a database

To generate a new dataset by connecting to a database:

  1. Navigate to Data map → Manage datasets.
  2. Click Create new dataset.
  3. Click Connect to a database.
  4. Paste the database connection string into the Database URL field.
  5. Click Generate dataset.

For help building the database connection string, see the SQLAlchemy documentation.

PostgreSQL example: postgresql://<user>:<password>@<hostname>:<port>/<database>
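If the username or password contains characters that are special in URLs, such as `@` or `/`, they need to be percent-encoded before being placed in the connection string. The sketch below assembles the PostgreSQL form shown above using only the Python standard library. The helper name `postgres_url` and all credential values are hypothetical.

```python
from urllib.parse import quote_plus


def postgres_url(user, password, hostname, port, database):
    """Build a postgresql:// URL, percent-encoding the user and password."""
    # quote_plus escapes characters such as '@' and '/' that would otherwise
    # be misread as URL delimiters.
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{hostname}:{port}/{database}"
    )


# Hypothetical credentials, for illustration only.
url = postgres_url("fides_reader", "p@ss/word", "db.example.com", 5432, "app")
print(url)
```

The resulting string can be pasted into the Database URL field. Avoid committing real credentials to source control or saving them in shell history.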

Note: Licensed users can optionally classify their dataset in this step. Learn more in our guide for Classifying datasets.

For more details and examples, please see our guide for Generating resources.