Tables#

Define API Tables.

class gcloud.bigquery.table.SchemaField(name, field_type, mode='NULLABLE', description=None, fields=None)[source]#

Bases: object

Describe a single field within a table schema.

Parameters:

name (str) – the name of the field
field_type (str) – the type of the field (one of ‘STRING’, ‘INTEGER’, ‘FLOAT’, ‘BOOLEAN’, ‘TIMESTAMP’ or ‘RECORD’)
mode (str) – the mode of the field (one of ‘NULLABLE’, ‘REQUIRED’, or ‘REPEATED’)
description (str) – optional description for the field
fields (list of `SchemaField`, or None) – subfields (requires `field_type` of ‘RECORD’).
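The constraints listed for `SchemaField` can be illustrated with a short sketch. To keep it runnable without the library installed, it uses a hypothetical `namedtuple` stand-in that mirrors the documented constructor signature; the attribute names and the `check_field` helper are assumptions made for illustration, not part of the documented API.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the documented constructor signature:
# SchemaField(name, field_type, mode='NULLABLE', description=None, fields=None)
SchemaField = namedtuple(
    "SchemaField",
    "name field_type mode description fields",
    defaults=("NULLABLE", None, None),
)

FIELD_TYPES = {"STRING", "INTEGER", "FLOAT", "BOOLEAN", "TIMESTAMP", "RECORD"}
MODES = {"NULLABLE", "REQUIRED", "REPEATED"}


def check_field(field):
    """Raise ValueError if a field breaks the documented constraints."""
    if field.field_type not in FIELD_TYPES:
        raise ValueError("unknown field_type: %r" % (field.field_type,))
    if field.mode not in MODES:
        raise ValueError("unknown mode: %r" % (field.mode,))
    if field.fields and field.field_type != "RECORD":
        raise ValueError("subfields require field_type 'RECORD'")
    # Recurse into subfields of RECORD fields.
    for sub in field.fields or ():
        check_field(sub)


schema = [
    SchemaField("full_name", "STRING", mode="REQUIRED"),
    SchemaField("age", "INTEGER"),  # mode defaults to 'NULLABLE'
    SchemaField("phones", "RECORD", mode="REPEATED", fields=[
        SchemaField("kind", "STRING"),
        SchemaField("number", "STRING", mode="REQUIRED"),
    ]),
]
for field in schema:
    check_field(field)
```

A list like `schema` is the kind of value the `schema` argument of `Table` expects, built from real `SchemaField` instances rather than the stand-in.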

class gcloud.bigquery.table.Table(name, dataset, schema=())[source]#

Bases: object

Tables represent a set of rows whose values correspond to a schema.

See: https://cloud.google.com/bigquery/docs/reference/v2/tables

Parameters:

name (str) – the name of the table
dataset (`gcloud.bigquery.dataset.Dataset`) – The dataset which contains the table.
schema (list of `SchemaField`) – The table’s schema

create(client=None)[source]#

API call: create the table via a POST request

See: https://cloud.google.com/bigquery/docs/reference/v2/tables/insert

Parameters:	client (`gcloud.bigquery.client.Client` or `NoneType`) – the client to use. If not passed, falls back to the `client` stored on the current dataset.

created#

Datetime at which the table was created.

Return type:	`datetime.datetime`, or `NoneType`
Returns:	the creation time (None until set from the server).

dataset_name#

Name of dataset containing the table.

Return type:	str
Returns:	the ID (derived from the dataset).

delete(client=None)[source]#

API call: delete the table via a DELETE request

See: https://cloud.google.com/bigquery/docs/reference/v2/tables/delete

Parameters:	client (`gcloud.bigquery.client.Client` or `NoneType`) – the client to use. If not passed, falls back to the `client` stored on the current dataset.

description#

Description of the table.

Return type:	str, or `NoneType`
Returns:	The description as set by the user, or None (the default).

etag#

ETag for the table resource.

Return type:	str, or `NoneType`
Returns:	the ETag (None until set from the server).

exists(client=None)[source]#

API call: test for the existence of the table via a GET request

See: https://cloud.google.com/bigquery/docs/reference/v2/tables/get

Parameters:	client (`gcloud.bigquery.client.Client` or `NoneType`) – the client to use. If not passed, falls back to the `client` stored on the current dataset.

expires#

Datetime at which the table will be removed.

Return type:	`datetime.datetime`, or `NoneType`
Returns:	the expiration time, or None

fetch_data(max_results=None, page_token=None, client=None)[source]#

API call: fetch the table data via a GET request

See: https://cloud.google.com/bigquery/docs/reference/v2/tabledata/list

Note

This method assumes that its instance’s schema attribute is up-to-date with the schema as defined on the back-end: if the two schemas are not identical, the values returned may be incomplete. To ensure that the local copy of the schema is up-to-date, call the table’s reload method.

Parameters:

max_results (integer or `NoneType`) – maximum number of rows to return.
page_token (str or `NoneType`) – token representing a cursor into the table’s rows.
client (`gcloud.bigquery.client.Client` or `NoneType`) – the client to use. If not passed, falls back to the `client` stored on the current dataset.

Return type:	tuple
Returns:	`(row_data, total_rows, page_token)`, where `row_data` is a list of tuples, one per result row, containing only the values; `total_rows` is a count of the total number of rows in the table; and `page_token` is an opaque string which can be used to fetch the next batch of rows (`None` if no further batches can be fetched).
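Because `fetch_data` returns one batch plus a `page_token`, reading a whole table means looping until the token comes back as `None`. The sketch below shows that loop. `iter_all_rows` is a hypothetical helper, and `FakeTable` is an in-memory stand-in (an assumption, used so the example runs without a BigQuery connection) that returns the documented `(row_data, total_rows, page_token)` tuple.

```python
def iter_all_rows(table, page_size=None):
    """Yield every row by following page tokens until none remains."""
    token = None
    while True:
        rows, _total, token = table.fetch_data(
            max_results=page_size, page_token=token)
        for row in rows:
            yield row
        if token is None:  # no further batches can be fetched
            break


class FakeTable(object):
    """Hypothetical stand-in that returns the documented tuple shape."""

    def __init__(self, rows):
        self._rows = list(rows)

    def fetch_data(self, max_results=None, page_token=None, client=None):
        # The token is opaque to callers; here it simply encodes an offset.
        start = int(page_token or 0)
        stop = len(self._rows) if max_results is None else start + max_results
        page = self._rows[start:stop]
        next_token = str(stop) if stop < len(self._rows) else None
        return page, len(self._rows), next_token


table = FakeTable([("Phred", 32), ("Wylma", 29), ("Bharney", 33)])
all_rows = list(iter_all_rows(table, page_size=2))
```

With a real table, calling `reload` first (as the note above recommends) keeps the local schema in step with the back-end before rows are fetched.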

friendly_name#

Title of the table.

Return type:	str, or `NoneType`
Returns:	The name as set by the user, or None (the default).

classmethod from_api_repr(resource, dataset)[source]#

Factory: construct a table given its API representation

Parameters:

resource (dict) – table resource representation returned from the API
dataset (`gcloud.bigquery.dataset.Dataset`) – The dataset containing the table.

Return type:	`gcloud.bigquery.table.Table`
Returns:	Table parsed from `resource`.

insert_data(rows, row_ids=None, skip_invalid_rows=None, ignore_unknown_values=None, template_suffix=None, client=None)[source]#

API call: insert table data via a POST request

See: https://cloud.google.com/bigquery/docs/reference/v2/tabledata/insertAll

Parameters:

rows (list of tuples) – Row data to be inserted. Each tuple should contain data for each schema field on the current table and in the same order as the schema fields.
row_ids (list of string) – Unique ids, one per row being inserted. If not passed, no de-duplication occurs.
skip_invalid_rows (boolean or `NoneType`) – whether to skip rows with invalid data.
ignore_unknown_values (boolean or `NoneType`) – whether to ignore columns beyond the schema.
template_suffix (str or `NoneType`) – treat `name` as a template table and provide a suffix. BigQuery will create the table `<name> + <template_suffix>` based on the schema of the template table. See: https://cloud.google.com/bigquery/streaming-data-into-bigquery#template-tables
client (`gcloud.bigquery.client.Client` or `NoneType`) – the client to use. If not passed, falls back to the `client` stored on the current dataset.

Return type:	list of mappings
Returns:	One mapping per row with insert errors: the “index” key identifies the row, and the “errors” key contains a list of the mappings describing one or more problems with the row.
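The return value of `insert_data` only covers rows that failed, so a caller usually maps each “index” back to the row that was submitted. A minimal sketch of that step follows. `rows_with_errors` is a hypothetical helper, and the sample `insert_errors` value (including the keys inside each error mapping) is made up to match the shape described above rather than taken from a real response.

```python
def rows_with_errors(rows, insert_errors):
    """Pair each rejected row with the problems reported for it."""
    return [(rows[entry["index"]], entry["errors"]) for entry in insert_errors]


rows = [("Phred", 32), ("Wylma", "n/a"), ("Bharney", 33)]

# Hypothetical return value: one mapping per row with insert errors.
insert_errors = [
    {"index": 1, "errors": [{"reason": "invalid", "message": "bad INTEGER"}]},
]

failed = rows_with_errors(rows, insert_errors)
for row, problems in failed:
    print("row %r was rejected: %r" % (row, problems))
```

An empty list from `insert_data` means no row reported an insert error.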

location#

Location in which the table is hosted.

Return type:	str, or `NoneType`
Returns:	The location as set by the user, or None (the default).

modified#

Datetime at which the table was last modified.

Return type:	`datetime.datetime`, or `NoneType`
Returns:	the modification time (None until set from the server).

num_bytes#

The size of the table in bytes.

Return type:	integer, or `NoneType`
Returns:	the byte count (None until set from the server).

num_rows#

The number of rows in the table.

Return type:	integer, or `NoneType`
Returns:	the row count (None until set from the server).

patch(client=None, friendly_name=<object object>, description=<object object>, location=<object object>, expires=<object object>, view_query=<object object>, schema=<object object>)[source]#

API call: update individual table properties via a PATCH request

See: https://cloud.google.com/bigquery/docs/reference/v2/tables/patch

Parameters:

client (`gcloud.bigquery.client.Client` or `NoneType`) – the client to use. If not passed, falls back to the `client` stored on the current dataset.
friendly_name (str or `NoneType`) – title of the table.
description (str or `NoneType`) – description of the table.
location (str or `NoneType`) – location in which the table is hosted.
expires (`datetime.datetime` or `NoneType`) – point in time at which the table expires.
view_query (str) – SQL query defining the table as a view
schema (list of `SchemaField`) – fields describing the schema

Raises:	ValueError for invalid value types.
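The `<object object>` defaults in the `patch` signature suggest a sentinel value, which lets a method distinguish "argument not passed" from an explicit `None`. The sketch below illustrates that general pattern only. `build_patch`, `_MARKER`, and the dictionary keys are hypothetical names, and this is not the library's actual implementation.

```python
_MARKER = object()  # unique sentinel meaning "argument was not passed"


def build_patch(friendly_name=_MARKER, description=_MARKER):
    """Collect only the properties the caller explicitly supplied."""
    partial = {}
    if friendly_name is not _MARKER:
        partial["friendly_name"] = friendly_name
    if description is not _MARKER:
        # An explicit None is kept, so it can mean "clear this property".
        partial["description"] = description
    return partial


nothing = build_patch()                      # no properties touched
cleared = build_patch(description=None)      # description explicitly unset
renamed = build_patch(friendly_name="People")
```

Under this pattern, omitting an argument leaves the property alone, while passing `None` is a deliberate value, which matches the `str or NoneType` types listed for the parameters.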

path#

URL path for the table’s APIs.

Return type:	str
Returns:	the path based on project and dataset name.

project#

Project bound to the table.

Return type:	str
Returns:	the project (derived from the dataset).

reload(client=None)[source]#

API call: refresh table properties via a GET request

See: https://cloud.google.com/bigquery/docs/reference/v2/tables/get

Parameters:	client (`gcloud.bigquery.client.Client` or `NoneType`) – the client to use. If not passed, falls back to the `client` stored on the current dataset.

schema#

Table’s schema.

Return type:	list of `SchemaField`
Returns:	fields describing the schema

self_link#

URL for the table resource.

Return type:	str, or `NoneType`
Returns:	the URL (None until set from the server).

table_id#

ID for the table resource.

Return type:	str, or `NoneType`
Returns:	the ID (None until set from the server).

table_type#

The type of the table.

Possible values are “TABLE” or “VIEW”.

Return type:	str, or `NoneType`
Returns:	the table type (None until set from the server).

update(client=None)[source]#

API call: update table properties via a PUT request

See: https://cloud.google.com/bigquery/docs/reference/v2/tables/update

Parameters:	client (`gcloud.bigquery.client.Client` or `NoneType`) – the client to use. If not passed, falls back to the `client` stored on the current dataset.

upload_from_file(file_obj, source_format, rewind=False, size=None, num_retries=6, allow_jagged_rows=None, allow_quoted_newlines=None, create_disposition=None, encoding=None, field_delimiter=None, ignore_unknown_values=None, max_bad_records=None, quote_character=None, skip_leading_rows=None, write_disposition=None, client=None)[source]#

Upload the contents of this table from a file-like object.

The content type of the upload will be one of:

- The value passed in to the function (if any)
- text/csv

Parameters:

file_obj (file) – A file handle opened in binary mode for reading.
source_format (str) – one of ‘CSV’ or ‘NEWLINE_DELIMITED_JSON’. job configuration option; see `gcloud.bigquery.job.LoadJob()`
rewind (boolean) – If True, seek to the beginning of the file handle before writing the file to Cloud Storage.
size (int) – The number of bytes to read from the file handle. If not provided, we’ll try to guess the size using `os.fstat()`. (If the file handle is not from the filesystem this won’t be possible.)
num_retries (integer) – Number of upload retries. Defaults to 6.
allow_jagged_rows (boolean) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
allow_quoted_newlines (boolean) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
create_disposition (str) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
encoding (str) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
field_delimiter (str) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
ignore_unknown_values (boolean) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
max_bad_records (integer) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
quote_character (str) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
skip_leading_rows (integer) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
write_disposition (str) – job configuration option; see `gcloud.bigquery.job.LoadJob()`
client (`gcloud.storage.client.Client` or `NoneType`) – Optional. The client to use. If not passed, falls back to the `client` stored on the current dataset.

Return type:	`gcloud.bigquery.jobs.LoadTableFromStorageJob`
Returns:	the job instance used to load the data (e.g., for querying status)
Raises:	`ValueError` if `size` is not passed in and cannot be determined, or if the `file_obj` can be detected to be a file opened in text mode.
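The two `ValueError` conditions can be checked before a file is handed to `upload_from_file`. The following is a minimal sketch of such a pre-flight check, assuming the size is guessed with `os.fstat()` as described for the `size` parameter. `prepare_upload` is a hypothetical helper and does not reproduce the library's internal logic.

```python
import io
import os
import tempfile


def prepare_upload(file_obj, size=None):
    """Return the byte count to upload, or raise ValueError."""
    # A handle opened in text mode exposes a mode string without "b".
    mode = getattr(file_obj, "mode", None)
    if isinstance(mode, str) and "b" not in mode:
        raise ValueError("open the file in binary mode, e.g. open(path, 'rb')")
    if size is not None:
        return size
    try:
        # Only works for handles backed by a real file descriptor.
        return os.fstat(file_obj.fileno()).st_size
    except (AttributeError, OSError):
        raise ValueError("size cannot be determined; pass size explicitly")


with tempfile.TemporaryFile() as handle:  # binary mode by default
    handle.write(b"full_name,age\nPhred,32\n")
    handle.flush()
    detected = prepare_upload(handle)

# An in-memory buffer has no file descriptor, so size must be given.
buffered = prepare_upload(io.BytesIO(b"Phred,32\n"), size=9)
```

For handles that are not backed by the filesystem, passing `size` explicitly avoids the `ValueError` described above.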

view_query#

SQL query defining the table as a view.

Return type:	str, or `NoneType`
Returns:	The query as set by the user, or None (the default).