- 12 Jul 2023
- 2 Minutes to read
- Print
- DarkLight
- PDF
Google Drive Data Dictionary
- Updated on 12 Jul 2023
- 2 Minutes to read
- Print
- DarkLight
- PDF
Because Google Drive data comes from your files, Panoply cannot provide a complete data dictionary. However, we can provide you information about how Panoply automates the data schema and how certain file and data types are handled.
Supported File Types
Panoply supports the following file types for the Google Drive connector:
- Archive (.tar and .gzip)
- JSON (.json) See our sample .json file
- Character-delimited files (.csv and .tsv) See our sample .csv file or sample .tsv file
- Microsoft Excel (.xlsx) See our sample .xlsx file
- Plain text (.txt)
- XML (.xml)
Special Requirements / Handling
Several of the supported file types have special requirements or handling.
Excel (.xslx) Files
Panoply collects each sheet (tab) within the file as a unique table. The name of the sheet will be appended to the googledrive_<file>
table name. For example googledrive_metrics_january
. Each sheet is opened and collected as one table. Empty rows are not imported.
Text (.txt) Files
Text files txt files can be tab delimited, which Panoply will automatically use to mark the columns. Other delimiters may also be used, but the user will have to specify that delimeter on the Advanced Settings tab.
Header Handling
The first row of the spreadsheet or file with values will be considered the header for the table. If a column has data but nothing in the header row, Panoply will add a header consisting of the word column
+ the column letter, such as column A
.
For duplicate headers, Panoply will rename the second instance of the header by appending the header with column
and the column letter, such as Name Column A
. Columns with a header and no data will not be collected.
Data Types
- A column in a table uses the same data type for all values in that column. Panoply automatically chooses the data type for each column based on the available values. This is important to note for this data source. If even one value in a column has text, then the entire column is considered data type Text.
- For example, the following combination of values in a single column will be data type Number:
- 10000
- 10,000
- 10.10
- For example, the following combination of values in a single column will be data type Text:
- 10000
- 10,000
- 10.10
- 10000x
- For example, the following combination of values in a single column will be data type Number:
- Dates are formatted as formatted strings.
- Empty columns and empty rows are not collected.
Metadata
The following metadata columns are added by Panoply to the destination table(s):
id
- If the user does not enter a primary key, and no id column exists in the source, Panoply will insert anid
. Formatted as a GUID, such as2cd570d1-a11d-4593-9d29-9e2488f0ccc2
__sheetname
: This is added only for Excel files. It is the name of the worksheet where the data originated, formatted as<__file>_<__sheet name>
, such as Sales Metrics_2018Q2.__senttime
: Formatted as a datetime, such as2018-06-26T01:26:14.695Z
.__updatetime
: Formatted as a datetime, such as2018-06-26T01:26:14.695Z
.__folder
: This is the name of the folder where the file was located.__file
: This is the name of the file Panoply collected.__xml_root
: The root element (XML)__xml_item
: The name of a child element (XML) - represent a record