Playing with G Suite MDM mobile device data using Go

Our desktop support & G Suite admin folks needed a simple, fast command-line tool to query basic info about our company’s mobile devices (which are all managed using G Suite’s built-in MDM).

So I wrote one.

Since this tool needs to be run via command-line, it can’t use any interactive or browser-based authentication, so we need to use a service account for authentication.

Pre-requisites (GCP & G Suite):

  • Create a service account in your G Suite domain (Google Cloud Console –> IAM & Admin –> Service Accounts)
  • Be sure that you have checked the “Enable G Suite Domain-wide Delegation” box as this will give your new service account the appropriate permissions to query the G Suite Admin API
  • Download the service account JSON credentials file. Open it up and find the client_id. Make a note of it!
  • Now we need to authorize the client_id of your service account for the necessary API scopes. In the Admin Console for your G Suite domain (Admin Console –> Security –> Advanced Settings –> Authentication –> Manage API Client Access), add your client_id in the “Client Name” box, and add
    https://www.googleapis.com/auth/admin.directory.device.mobile.readonly
    in the “One or more API scopes” box
  • Click the “Authorize” button when you’re finished

Pre-requisites (Go)
You’ll need to “go get” a few packages:

  • go get -u golang.org/x/oauth2/google
  • go get -u google.golang.org/api/admin/directory/v1
  • go get -u github.com/dustin/go-humanize

Pre-requisites (Environment Variables)
Because it’s never good to store runtime configuration within code, you’ll notice that the code references several environment variables. Setup them up to suit your preference but something like this will do:

export GSUITE_COMPANYID="A01234567"
export SVC_ACCOUNT_CREDS_JSON="/home/rickt/dev/adminsdk/golang/creds.json
export GSUITE_ADMINUSER="foo@bar.com"

And finally, the code
Gist URL: https://gist.github.com/rickt/199ca2be87522496e83de77bd5cd7db2

Automated insert of CSV data into Bigquery via GCS bucket + Python

i wanted to try out the automatic loading of CSV data into Bigquery, specifically using a Cloud Function that would automatically run whenever a new CSV file was uploaded into a Google Cloud Storage bucket.

it worked like a champ. here’s what i did to PoC:

  1. generate a CSV file with 1000 lines of dummy data via https://www.mockaroo.com/ with their default schema so it looks like:
$ head -3 testdata.csv

id,first_name,last_name,email,gender,ip_address

1,Andres,Berthot,aberthot0@pbs.org,Male,104.204.241.0

2,Iris,Zwicker,izwicker1@icq.com,Female,61.87.224.4
  • make a test Google Cloud Storage bucket:
     $ gsutil mb gs://csvtestbucket 
  • install the necessary python bits & pieces:
    $ pip3 install google-cloud-bigquery --upgrade
  • make a Bigquery dataset:
    $ bq mk --dataset rickts-dev-project:csvtestdataset
  • make a table within that dataset to match the CSV schema:
  • $ bq mk -t csvtestdataset.csvtable \
    
    id:INTEGER,first_name:STRING,last_name:STRING,email:STRING,gender:STRING,ip_address:STRING
  • eyeball the table in the Bigquery dataset and verify it is clean and fresh:
  • now its time to write some python. here’s mine:
  • because you don’t want to hardcode things like bucket/table/dataset names in code, create a yaml file that will store your deployment-specific configuration in environment variables
  •  BUCKET: csvtestbucket
    
    DATASET: csvtestdataset
    
    TABLE: csvtable
    
    VERSION: v14
  • create a requirements.txt file for the necessary imports
  • google-cloud
    
    google-cloud-bigquery
  • create a .gcloudignore file so that your yaml or CSV files will not be deployed into GCP
  • *csv
    
    *yaml
  • at this point, your folder should look something like this:
  • $ ls
    
    env.yaml  main.py  requirements.txt  testdata.csv
  • now we are ready to deploy the cloud function, we will add a trigger on the storage bucket that will fire every time a new file is added to the bucket. here we are creating a cloud function named “csv_loader”:
  • $ gcloud beta functions deploy csv_loader \
    
    --runtime=python37 \
    
    --trigger-resource=gs://csvtestbucket \
    
    --trigger-event=google.storage.object.finalize \
    
    --entry-point=csv_loader \
    
    --env-vars-file=env.yaml
  • ok, the function is deployed, yay! copy your test data CSV into the test bucket:
  • $ gsutil cp testdata.csv gs://csvtestbucket/
    
    Copying file://testdata.csv [Content-Type=text/csv]...
    
    - [1 files][ 60.4 KiB/ 60.4 KiB]
    
    Operation completed over 1 objects/60.4 KiB.  
  •  now that we have copied a CSV file in the bucket, the function should fire! check the cloud function logs:
  • $ gcloud functions logs read
    
    [ ... snipped for brevity ... ]
    
    D      csv_loader  274732139359754  2018-10-22 20:48:27.852  Function execution started
    
    I      csv_loader  274732139359754  2018-10-22 20:48:28.492  Starting job 9ca2f39c-539f-454d-aa8e-3299bc9f7287
    
    I      csv_loader  274732139359754  2018-10-22 20:48:28.492  Function=csv_loader, Version=v14
    
    I      csv_loader  274732139359754  2018-10-22 20:48:28.492  File: testdata2.csv
    
    I      csv_loader  274732139359754  2018-10-22 20:48:31.022  Job finished.
    
    I      csv_loader  274732139359754  2018-10-22 20:48:31.136  Loaded 1000 rows.
    
    D      csv_loader  274732139359754  2018-10-22 20:48:31.139  Function execution took 3288 ms, finished with status: 'ok'

    looks like the function ran as expected!

  • lets eyeball the bigquery table again, and see if the row count has changed
  • $ bq show csvtestdataset.csvtable
    
    Table rickts-dev-project:csvtestdataset.csvtable
    
    Last modified           Schema           Total Rows   Total Bytes   Expiration   Time Partitioning   Labels
    
    ----------------- ----------------------- ------------ ------------- ------------ ------------------- --------
    
    22 Oct 13:48:29   |- id: integer          1000         70950
    
    |- first_name: string
    
    |- last_name: string
    
    |- email: string
    
    |- gender: string
    
    |- ip_address: string

    great! there are now 1000 rows. looking good.

  • as a final check, lets compare the first 3 rows of the CSV
  • $ egrep '^[1,2,3],' testdata.csv
    
    1,Andres,Berthot,aberthot0@pbs.org,Male,104.204.241.0
    
    2,Iris,Zwicker,izwicker1@icq.com,Female,61.87.224.4
    
    3,Aime,Gladdis,agladdis2@hugedomains.com,Female,29.55.250.191

    with the first 3 rows of the bigquery table

    $ bq query 'select * from csvtestdataset.csvtable \
    
    where id IN (1,2,3)'
    
    Waiting on bqjob_r6a3239576845ac4d_000001669d987208_1 ... (0s) Current status: DONE
    
    +----+------------+-----------+---------------------------+--------+---------------+
    
    | id | first_name | last_name |           email           | gender |  ip_address   |
    
    +----+------------+-----------+---------------------------+--------+---------------+
    
    |  1 | Andres     | Berthot   | aberthot0@pbs.org         | Male   | 104.204.241.0 |
    
    |  2 | Iris       | Zwicker   | izwicker1@icq.com         | Female | 61.87.224.4   |
    
    |  3 | Aime       | Gladdis   | agladdis2@hugedomains.com | Female | 29.55.250.191 |
    
    +----+------------+-----------+---------------------------+--------+---------------+

    and whaddyaknow, they match! w00t!

    proof of concept: complete!

    conclusion: cloud functions are pretty great.