How to start with google bigquery

Here is the list of thing I noticed when started to work with bigquery.
First you need to create a project — https://cloud.google.com/bigquery/docs/quickstarts/quickstart-web-ui.
That’s how you create a service key for bigquery — https://cloud.google.com/docs/authentication/getting-started. Select your project on the very top of the page, and then create JSON key. And don’t forget to set is as a Product Owner for the project (to make sure you don’t have any troubles with permissions).
After that you need to specify path to that file via environment variable GOOGLE_APPLICATION_CREDENTIALS.
Now you are ready to start the small project. Here is an example on using public data — https://gitlab.com/bullgare/bigquery-covid19.

Queries example

Bigquery is mostly an SQL, but it has some additions. For instance, it can have nested array of structures.
For instance, you could have a list of products inside your order with a scheme like that:

Orders table:
order.id
order.total
order.products.id

Products table:
product.id
product.name

You can select orders with product ids separated by a semicolon like that:

SELECT 
	o.id,
	o.total,
	ARRAY_TO_STRING(ARRAY_AGG(ARRAY(SELECT id FROM o.products)), ";", "") AS product_ids
FROM 
	[...].orders

More on that — https://stackoverflow.com/a/40943459/801426

Or you can join product names from Products table:

WITH (
	SELECT 
		o.id,
		o.total,
		p.name AS product_name
	FROM 
		[...].orders
	LEFT JOIN
		UNNEST(o.products) AS op
	INNER JOIN
		[...].products AS p
	ON
		p.id = op.id
) AS original
SELECT original.id, original.total, ARRAY_TO_STRING(ARRAY_AGG(DISTINCT original.product_name), ";", "") AS product_names

Here is a list of functions — https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators.
Here is an article on how to work with arrays — https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays.

Working with bigquery in go

You can use module cloud.google.com/go/bigquery.

Here is how you can work with query parameters — https://cloud.google.com/bigquery/docs/parameterized-queries.
Here is a little example in go:

q := client.Query(`
  SELECT
    o.order_id,
    o.total,
    DATETIME(o.order_placed_at) AS order_placed_at,
  FROM [...].orders AS o
  WHERE
    o.order_id IN UNNEST(@order_ids)
    AND
    o.order_placed_at BETWEEN TIMESTAMP_SUB(@current_date, INTERVAL @interval_days_from DAY) AND TIMESTAMP_SUB(@current_date, INTERVAL @interval_days_to DAY)
  ORDER BY o.vendor.vendor_code, order_placed_at
  LIMIT 2
  ;`)

	q.Parameters = []bigquery.QueryParameter{
		{
			Name:  "current_date",
			Value: time.Now().Add(24 * time.Hour).Format("2006-01-02"),
		},
		{
			Name:  "interval_days_from",
			Value: 3,
		},
		{
			Name:  "interval_days_to",
			Value: 1,
		},
		{
			Name:  "order_ids",
			Value: []int64(1,2,3),
		},
	}

	err := q.Read(ctx)
	if err != nil {
		return fmt.Errorf("reading data for query: %w", err)
	}

	for {
		var data [your-model]
		err := iter.Next(&data)
		if err == iterator.Done {
			break
		}
		if err != nil {
			return fmt.Errorf("iterating results: %w", err)
		}

		[work with your data]
	}


You could do a healthcheck with a query like this:

SELECT TIMESTAMP_SUB("2020-04-20", INTERVAL 0 DAY);

LEAVE A COMMENT