Cloud Storage Advanced Settings
  • 30 May 2023


Warning:

We do not recommend changing advanced settings unless you are an experienced Panoply user.

For users who have some experience working with their data in Panoply, several settings can be customized for this data source.

  1. Files Encoding: The following encodings are available:
    • UTF-8
    • ISO-8859-1
    • Windows-1251
    • Windows-1252
    • Windows-1254
    • Other
      When Other is selected, the data source automatically detects each file’s encoding during extraction. This is useful when you do not know the file encoding or when you are selecting multiple files with different encodings.
  2. Skip XML attributes: When collecting XML files, some of the returned XML fields might have attributes attached to them. Select this option to skip all of the XML attributes and ingest only the XML values. For example, for a score element that carries attributes and contains the value 100, Panoply ingests only the value 100 into the score column.
  3. Destination Schema: This is the name of the target schema in which the data is saved. The default schema for data warehouses built on Google BigQuery is panoply. The default schema for data warehouses built on Amazon Redshift is public. The schema cannot be changed once a source has been collected.
  4. Destination: This is the name of the table Panoply will create with your data. The default destination is google_cloud_storage.
  5. Primary Key: A primary key is the column, or set of columns, whose values uniquely identify a row. Panoply automatically selects the primary key using an available id column. If your data does not contain an id column, you may configure the primary key manually by choosing the columns to use. See Primary Keys for more information.
  6. Exclude: The Exclude option allows you to exclude certain data, such as names, addresses, or other personally identifiable information. Enter the column names of the data to exclude.
  7. Parse String: If the data to be collected contains JSON stored as text, enter the text attributes whose JSON should be parsed.
  8. Truncate: Truncate deletes all the current data stored in the destination tables, but not the tables themselves. Afterwards, Panoply recollects all the available data for this data source.
  9. Click Save Changes and then Collect.
    • The data source appears grayed out while the collection runs.
    • You may add additional data sources while this collection runs.
    • You can monitor this collection from the Jobs page or the Data Sources page.
    • After a successful collection, navigate to the Tables page to review the data results.
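
The Other option under Files Encoding relies on automatic detection. This article does not describe how Panoply performs that detection, so the following Python sketch is only an illustration of one simple fallback approach: try each candidate encoding in order and keep the first that decodes without error. The function name guess_encoding and the candidate order are assumptions made for this example.

```python
# Hypothetical sketch: not Panoply's actual detection logic.
# Candidate order is an assumption; strict encodings go first.
CANDIDATES = ("utf-8", "windows-1252", "iso-8859-1")

def guess_encoding(raw: bytes, candidates=CANDIDATES) -> str:
    """Return the first candidate encoding that decodes raw without error."""
    for name in candidates:
        try:
            raw.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit")

utf8_bytes = "café".encode("utf-8")
legacy_bytes = "café".encode("windows-1252")
print(guess_encoding(utf8_bytes))
print(guess_encoding(legacy_bytes))
```

ISO-8859-1 accepts any byte sequence, so it is placed last as a catch-all. Real detectors usually add statistical checks, because several single-byte encodings can decode the same bytes without error.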
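
The effect of Skip XML attributes can be illustrated with Python's standard xml.etree.ElementTree module. This is a sketch under stated assumptions, not Panoply's implementation: the unit attribute, the record wrapper element, and the score_unit naming used when attributes are kept are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the "unit" attribute is invented for this example.
SAMPLE = '<record><score unit="points">100</score></record>'

def extract_fields(xml_text: str, skip_attributes: bool) -> dict:
    """Flatten child elements into a dict, optionally dropping attributes."""
    row = {}
    for child in ET.fromstring(xml_text):
        row[child.tag] = child.text
        if not skip_attributes:
            # Keep each attribute as its own field next to the value
            # (the naming scheme here is an assumption).
            for name, value in child.attrib.items():
                row[f"{child.tag}_{name}"] = value
    return row

print(extract_fields(SAMPLE, skip_attributes=True))
print(extract_fields(SAMPLE, skip_attributes=False))
```

With skip_attributes=True, only the score value is kept, which mirrors the behavior described for this option.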
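
To make the Exclude and Parse String settings concrete, the sketch below applies both to a single row in Python. It is an approximation for illustration only: the function apply_settings, its parameters, and the sample column names are hypothetical, and Panoply's own handling of parsed JSON may differ.

```python
import json

def apply_settings(row: dict, exclude=(), parse_string=()) -> dict:
    """Drop excluded columns, then parse JSON text in the named columns."""
    result = {k: v for k, v in row.items() if k not in exclude}
    for column in parse_string:
        # Only parse values that are actually text.
        if isinstance(result.get(column), str):
            result[column] = json.loads(result[column])
    return result

# Hypothetical row: "email" is treated as personally identifiable data,
# and "payload" holds JSON stored as text.
row = {"id": 7, "email": "a@example.com", "payload": '{"plan": "pro", "seats": 3}'}
cleaned = apply_settings(row, exclude=("email",), parse_string=("payload",))
print(cleaned)
```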
