Google Sheets Data Dictionary
  • 01 Jul 2021

Because Google Sheets data comes from a spreadsheet file, Panoply cannot provide a data dictionary. Panoply does, however, automatically build the data schema for the collected data. The following points are useful to know about these automations:

  • A column in a table uses the same data type for all values in that column. Panoply automatically chooses the data type for each column based on the available values. This matters for this data source in particular: if even one value in a column contains text, the entire column is treated as data type Text.
    • For example, the following combination of values in a single column will be data type Number:
      • 10000
      • 10,000
      • 10.10
    • For example, the following combination of values in a single column will be data type Text:
      • 10000
      • 10,000
      • 10.10
      • 10000x
  • Regarding data types, values that use a comma as the decimal separator (such as "12,45") can be imported as data type Number, with some restrictions.
    • The "location" of the Google Sheet determines whether "12,45" is treated as a number or as text. See the discussion of decimal point and comma and the Google Sheets API documentation on ValueRenderOption.
    • If someone in the United States, using the United States version of Google Sheets, enters "12,45" into a Google Sheet cell, Google automatically formats that value as Text. Even if you manually change the cell format to Number, Google treats the value as Text when it is added to Panoply.
    • If someone in France, using the French version of Google Sheets, enters "12,45" into a Google Sheet cell, Google automatically formats that value as a Number.
  • Dates are collected as formatted strings.
  • For each sheet, Panoply opens the individual sheet (tab) and collects the values row by row.
  • A column with a header but without values will be ignored. This is a limitation built into the Data Engine.
  • Empty columns and empty rows are not collected.
  • The following metadata columns are added by Panoply to the destination table(s):
    • id - If the user does not enter a primary key, and no id column exists in the source, Panoply will insert an id. Formatted as a GUID, such as 2cd570d1-a11d-4593-9d29-9e2488f0ccc2
    • __updatetime - Formatted as a datetime, such as 2018-06-26T01:26:14.695Z
    • __senttime - Formatted as a datetime, such as 2018-06-26T01:26:14.695Z
    • __tablename - The name of the sheet (tab), in Google Sheets, where the data originated. Formatted as <filename>_<sheet name>, such as app_install_metrics_app_installs.
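
The rules in the list above can be illustrated with a minimal Python sketch. It is not Panoply's implementation: the function names (looks_numeric, infer_column_type, collect_sheet), the comma_decimal flag, and the two regular expressions are hypothetical and only approximate the documented behavior (one data type per column, a "location"-dependent decimal comma, skipped empty rows and columns, and the added metadata columns).

```python
import re
import uuid
from datetime import datetime, timezone

# Assumed number patterns; the real detection logic is not documented here.
US_NUMBER = re.compile(r"-?(\d{1,3}(,\d{3})+|\d+)(\.\d+)?")  # 10000, 10,000, 10.10
COMMA_DECIMAL_NUMBER = re.compile(r"-?\d+(,\d+)?")           # 12,45


def looks_numeric(value, comma_decimal=False):
    """True if the whole cell value matches the number pattern for the sheet."""
    pattern = COMMA_DECIMAL_NUMBER if comma_decimal else US_NUMBER
    return pattern.fullmatch(value.strip()) is not None


def infer_column_type(values, comma_decimal=False):
    """One data type per column: a single non-numeric value makes it Text."""
    if all(looks_numeric(v, comma_decimal) for v in values):
        return "Number"
    return "Text"


def collect_sheet(filename, sheet_name, rows, comma_decimal=False):
    """Collect one sheet (tab) row by row; rows[0] is the header row."""
    header, *data = rows
    # Empty rows are not collected.
    data = [r for r in data if any(cell.strip() for cell in r)]
    column_types, kept = {}, []
    for i, name in enumerate(header):
        values = [r[i] for r in data if i < len(r) and r[i].strip()]
        if not values:
            continue  # a column without values is ignored, even with a header
        kept.append(i)
        column_types[name] = infer_column_type(values, comma_decimal)
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    records = []
    for r in data:
        record = {header[i]: (r[i] if i < len(r) else "") for i in kept}
        # Assumption: an id is generated only when the source has no id column.
        record.setdefault("id", str(uuid.uuid4()))
        record["__updatetime"] = now
        record["__senttime"] = now
        record["__tablename"] = f"{filename}_{sheet_name}"
        records.append(record)
    return column_types, records


rows = [
    ["amount", "note", "unused"],
    ["10000", "10000", ""],
    ["10,000", "10000x", ""],
    ["10.10", "10.10", ""],
    ["", "", ""],
]
column_types, records = collect_sheet("app_install_metrics", "app_installs", rows)
```

In this sketch the "amount" column comes out as Number and the "note" column as Text, because "10000x" forces the whole column to Text. The "unused" column and the empty last row are dropped. Calling infer_column_type(["12,45"]) gives Text, while infer_column_type(["12,45"], comma_decimal=True) gives Number, mirroring the United States and France examples.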
