MongoDB Advanced Settings
  • 05 Jul 2023

We do not recommend changing advanced settings unless you are an experienced Panoply user.

For users who have some experience working with their data in Panoply, there are a number of items that can be customized for this data source.

  1. Auth Source: For most sources this will be "admin".
  2. Replica Set: Enter your replica set name.
  3. Destination Schema: This is the name of the target schema where the data is saved. The default schema for data warehouses built on Google BigQuery is panoply. The default schema for data warehouses built on Amazon Redshift is public. This cannot be changed once a source has been collected.
  4. Destination: The tables where the data is stored. Panoply selects a default destination.
    • The default name of each destination table in Panoply consists of the prefix mongo followed by _{__collection}, where __collection is a dynamic field that represents the name of the collection in your Mongo database. For example, if the collection name is customers, the resulting table will be mongo_customers.
    • To prefix all table names with your own prefix, enter the desired prefix with the addition of _{__collection}. For example: my_prefix_{__collection}
  5. Primary Key: The Primary Key is the field or combination of fields that Panoply will use as the deduplication key when collecting data. Panoply sets the primary key depending on whether the data warehouse is built on BigQuery or Redshift. To learn more about primary keys in general, see Primary Keys.

Any user-entered primary key will be used across all the tables selected.
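
The destination naming pattern described in item 4 can be illustrated with a short Python sketch. This is not Panoply's implementation: the function name destination_table is hypothetical, and the sketch assumes that {__collection} is filled in by plain string substitution.

```python
def destination_table(template: str, collection: str) -> str:
    """Expand the {__collection} placeholder in a destination template.

    Hypothetical helper for illustration only; it assumes the placeholder
    is replaced verbatim with the MongoDB collection name.
    """
    return template.replace("{__collection}", collection)


# Default template: the prefix "mongo" followed by _{__collection}.
print(destination_table("mongo_{__collection}", "customers"))      # mongo_customers

# Custom prefix, as entered in the Destination field.
print(destination_table("my_prefix_{__collection}", "customers"))  # my_prefix_customers
```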

  6. Incremental Key: By default, Panoply fetches all of your MongoDB data on each run. To collect only the data added since the previous run, enter a column name to use as your incremental key. The column must be logically incremental, meaning its values only increase over time, as with a timestamp or a sequential ID. Panoply will keep track of the maximum value reached during the previous run and will start there on the next run.

  • Incremental Key configurations
    • If no Incremental Key is configured, Panoply collects all the Mongo data on each run for the selected Mongo tables or views.
    • If the Incremental Key is configured by column name, but not the column value, Panoply collects all data, and then automatically configures the column value at the end of a successful run.
    • If the Incremental Key is configured by column name and the column value (manually or automatically), then on the first collection, Panoply will use that value as the place to begin the collection.
      • The value is updated at the end of a successful collection to the last value collected.
      • In future collections, the new value is used as the starting value: Panoply collects only the records whose Incremental Key value is greater than the value at which the previous collection ended.
    • When an Incremental Key is configured, Panoply will look for that key in each of the selected tables and views. If the table or view does not have the column indicated as the Incremental Key, it must be collected as a separate instance of the data source.
    • Some records in a table or view may have a ‘null’ value for the incremental key, or may not contain the incremental key at all. In these situations, Panoply omits these records instead of failing the entire data source.

If you set an incremental key, you can only collect one table per instance of the MongoDB data source.
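
The Incremental Key behavior described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Panoply's actual code: the function collect_incrementally, the sample field updated_at, and the in-memory list of records are all hypothetical. The sketch also assumes that records without a usable key value are omitted on every run, including the first.

```python
from typing import Any, Iterable, Optional


def collect_incrementally(
    records: Iterable[dict],
    key: str,
    last_value: Optional[Any] = None,
) -> tuple[list[dict], Optional[Any]]:
    """Return (collected records, value to store for the next run).

    Hypothetical helper for illustration only.
    """
    collected = []
    for record in records:
        value = record.get(key)
        if value is None:
            # Key is null or absent: omit the record rather than fail the run.
            continue
        # With no stored value, everything is collected; otherwise only
        # records strictly greater than the stored value.
        if last_value is None or value > last_value:
            collected.append(record)
    # After a successful run, the stored value moves to the last value collected.
    new_value = max((r[key] for r in collected), default=last_value)
    return collected, new_value


rows = [
    {"_id": 1, "updated_at": 10},
    {"_id": 2, "updated_at": 20},
    {"_id": 3, "updated_at": None},  # null key: omitted
    {"_id": 4},                      # missing key: omitted
]

# First run: column name configured, no column value yet.
first, stored = collect_incrementally(rows, "updated_at")

# Later run: only records above the stored value are collected.
rows.append({"_id": 5, "updated_at": 30})
second, stored = collect_incrementally(rows, "updated_at", stored)
```

In this sketch the first run collects the records with _id 1 and 2 and stores 20; the second run collects only the record with _id 5 and stores 30.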

  7. Click Save Changes and then click Collect.
    • The data source appears grayed out while the collection runs.
    • You may add other data sources while this collection runs.
    • You can monitor this collection from the Jobs page or the Data Sources page.
    • After a successful collection, navigate to the Tables page to review the data results.
