chore: automate clean-acr with github action workflow #1735

Merged
merged 53 commits into from
Nov 21, 2022
421854f
hello workflow
niehaus59 Nov 19, 2022
6d103ec
make pre-commit executable
niehaus59 Nov 19, 2022
2170a82
clean acr
niehaus59 Nov 19, 2022
69c37ca
gh-set-secret
niehaus59 Nov 20, 2022
558d6e9
Merge branch 'master' of https://github.com/niehaus59/SynapseML
niehaus59 Nov 20, 2022
c086f2d
Merge branch 'master' of https://github.com/niehaus59/SynapseML
niehaus59 Nov 20, 2022
de51224
pat->github
niehaus59 Nov 20, 2022
351e2e9
echo
niehaus59 Nov 20, 2022
60f6303
env
niehaus59 Nov 20, 2022
d27c842
Create gh-set-secret.yml
niehaus59 Nov 20, 2022
cdc680f
Update gh-set-secret.yml
niehaus59 Nov 20, 2022
c26b06d
Update gh-set-secret.yml
niehaus59 Nov 20, 2022
7e88474
merge
niehaus59 Nov 20, 2022
0b86690
Create manual.yml
niehaus59 Nov 20, 2022
621b081
Update manual.yml
niehaus59 Nov 20, 2022
fb14fe5
merge
niehaus59 Nov 20, 2022
3c65a26
Update manual.yml
niehaus59 Nov 20, 2022
e801a5a
Update manual.yml
niehaus59 Nov 20, 2022
cb812a4
Update manual.yml
niehaus59 Nov 20, 2022
faf1c1e
use azurecli for keyvault access
niehaus59 Nov 20, 2022
00bfc9a
Merge branch 'master' of https://github.com/niehaus59/SynapseML
niehaus59 Nov 20, 2022
f0bca3d
remove pip cache
niehaus59 Nov 21, 2022
9ee2004
remove columns
niehaus59 Nov 21, 2022
9f26227
fix indent
niehaus59 Nov 21, 2022
26b7cf3
fix env var name
niehaus59 Nov 21, 2022
db12711
split off script file
niehaus59 Nov 21, 2022
c019718
azurecli@v1
niehaus59 Nov 21, 2022
1091033
shorten path
niehaus59 Nov 21, 2022
8d04a2a
lengthen path
niehaus59 Nov 21, 2022
4b31429
add query option to az cmd
niehaus59 Nov 21, 2022
f1b208a
re-indent
niehaus59 Nov 21, 2022
a903133
re-indent again
niehaus59 Nov 21, 2022
3266182
echo
niehaus59 Nov 21, 2022
655dd15
print
niehaus59 Nov 21, 2022
d04e50b
test maniehtestkv
niehaus59 Nov 21, 2022
6cd81c4
back to azure kv task
niehaus59 Nov 21, 2022
5932567
back to mmlspark-keys
niehaus59 Nov 21, 2022
7724e3c
quot arg
niehaus59 Nov 21, 2022
0486a65
typo
niehaus59 Nov 21, 2022
f2d1e7a
use popen for pipeline-run
niehaus59 Nov 21, 2022
3102332
run through deletions in whatif mode
niehaus59 Nov 21, 2022
fafafb8
print result code
niehaus59 Nov 21, 2022
7ac27bc
delete result code
niehaus59 Nov 21, 2022
3b5483a
format changes and check result of transfer
niehaus59 Nov 21, 2022
fecd849
remove tqdn
niehaus59 Nov 21, 2022
77e42cd
remove tqdn
niehaus59 Nov 21, 2022
293efca
restore actual deletion
niehaus59 Nov 21, 2022
d51f268
formatize prints
niehaus59 Nov 21, 2022
9c4b30d
delete cruft
niehaus59 Nov 21, 2022
fd6fb90
switch from manual to cron
niehaus59 Nov 21, 2022
cb63756
sundays at 1am
niehaus59 Nov 21, 2022
b45e33f
chmod pre-commit
niehaus59 Nov 21, 2022
a126c9f
Merge branch 'master' into manieh/clean-acr-3
mhamilton723 Nov 21, 2022
44 changes: 44 additions & 0 deletions .github/workflows/clean-acr.yml
@@ -0,0 +1,44 @@
# Notes: To access key vault and grab the connection string, we first need a service principal.
# We need to add that service principal as a Reader in the RBAC for the key vault in question,
# as well as adding it with Get and List permissions in the key vault's access policies.
# Then we need to store that service principal's info as a GitHub secret.
# We then use that secret here as the credentials for logging into Azure.
# Instructions are here: https://learn.microsoft.com/en-us/azure/developer/github/github-key-vault
# In our case, the service principal is called synapseml-clean-acr.
# The github secret is a repository secret called clean_acr.
# It is backed up in the mmlspark-keys vault by secret clean-acr-github-actions-info.
# The secret has an expiration date (currently 11/20/2024), so it will need to be renewed at some point.

name: Clean ACR

on:
  schedule:
    - cron: "0 1 * * 0" # every sunday at 1am

jobs:
  clean-acr:
    name: Clean ACR
    runs-on: ubuntu-latest
    steps:
      - name: Azure Login
        uses: azure/login@v1
        with:
          creds: ${{ secrets.clean_acr }}
      # TODO: The docs say that Azure/get-keyvault-secrets@v1 is deprecated but are vague on what to use instead.
      # Keep an eye on how this continues to work.
      - name: Get connection string
        uses: Azure/get-keyvault-secrets@v1
        with:
          keyvault: "mmlspark-keys"
          secrets: "clean-acr-connection-string"
        id: getSecret
      - name: checkout repo content
        uses: actions/checkout@v2 # checkout the repo
      - name: setup python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'
      - run: pip install azure-storage-blob azure-identity
      - name: execute clean acr
        run: python .github/workflows/scripts/clean-acr.py "${{ steps.getSecret.outputs.clean-acr-connection-string }}"
        shell: sh
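
The last step passes the connection string to the script as a command-line argument. As a hedged alternative, not part of this PR, the script could read it from an environment variable first and fall back to `sys.argv[1]`, which keeps the secret out of the process argument list. The variable name `CLEAN_ACR_CONNECTION_STRING` and the helper `resolve_conn_string` are hypothetical, and the workflow step would need a matching `env:` entry for this to work.

```python
import os
import sys
from typing import Mapping, Sequence

# Hypothetical variable name; the merged workflow does not define it.
ENV_VAR = "CLEAN_ACR_CONNECTION_STRING"


def resolve_conn_string(argv: Sequence[str], env: Mapping[str, str]) -> str:
    """Return the connection string from the environment, else from argv[1]."""
    value = env.get(ENV_VAR)
    if not value and len(argv) > 1:
        # Fallback keeps the script compatible with the existing workflow step.
        value = argv[1]
    if not value:
        raise ValueError(f"Set {ENV_VAR} or pass the connection string as the first argument")
    return value


# Typical call site inside the script:
#   conn_string = resolve_conn_string(sys.argv, os.environ)
```

Taking `argv` and `env` as parameters, rather than reading the globals inside the function, keeps the lookup easy to exercise without touching the real environment.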
54 changes: 54 additions & 0 deletions .github/workflows/scripts/clean-acr.py
@@ -0,0 +1,54 @@
import os
import json
from azure.storage.blob import BlobClient
from azure.identity import DefaultAzureCredential
import sys
import subprocess

credential = DefaultAzureCredential()
"""
run this if sas expires and place result in keyvault under secret name

IMPORT_SAS=?$(az storage container generate-sas \
    --name acrbackup \
    --account-name mmlspark \
    --expiry 2023-01-01 \
    --permissions rawdl \
    --https-only \
    --output tsv)
echo $IMPORT_SAS
"""

acr = "mmlsparkmcr"
container = "acrbackup"
rg = "marhamil-mmlspark"
pipeline = "mmlsparkacrexport3"

conn_string = sys.argv[1]

os.system('az extension add --name acrtransfer')

repos = json.loads(os.popen(f"az acr repository list -n {acr}").read())
for repo in repos:
    tags = json.loads(os.popen(
        f"az acr repository show-tags -n {acr} --repository {repo} --orderby time_desc").read())

    for tag in tags:
        target_blob = repo + "/" + tag + ".tar"
        image = repo + ":" + tag

        backup_exists = BlobClient.from_connection_string(
            conn_string, container_name=container, blob_name=target_blob).exists()
        if not backup_exists:
            result = os.system(f"az acr pipeline-run create --resource-group {rg} --registry {acr} --pipeline {pipeline} --name {str(abs(hash(target_blob)))} --pipeline-type export --storage-blob {target_blob} -a {image}")
            assert result == 0
            print(f"Transferred {target_blob}")
        else:
            print(f"Skipped existing {image}")

        backup_exists = BlobClient.from_connection_string(
            conn_string, container_name=container, blob_name=target_blob).exists()
        if backup_exists:
            print(f"Deleting {image}")
            result = os.system(f"az acr repository delete --name {acr} --image {image} --yes")
            assert result == 0
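
The script names each pipeline run with `abs(hash(target_blob))`. Python salts `hash()` for strings per process unless `PYTHONHASHSEED` is fixed, so the same blob gets a different run name on every invocation. The sketch below is one possible variation, not the merged code. It derives a digits-only name from a SHA-1 digest so the name is reproducible, builds the `az` commands as argument lists for `subprocess.run`, and adds a dry-run switch in the spirit of the "whatif mode" commit. The helpers `stable_run_name`, `export_command`, `delete_command` and `run` are hypothetical names; the `az` flags are copied from the script above.

```python
import hashlib
import subprocess
from typing import List

# Values taken from the script in this PR.
ACR = "mmlsparkmcr"
RG = "marhamil-mmlspark"
PIPELINE = "mmlsparkacrexport3"


def stable_run_name(target_blob: str) -> str:
    """Digits-only name that is identical across interpreter runs."""
    # hashlib digests do not depend on PYTHONHASHSEED, unlike hash() on str.
    digest = hashlib.sha1(target_blob.encode("utf-8")).hexdigest()
    return str(int(digest, 16) % 10**18)


def export_command(repo: str, tag: str) -> List[str]:
    """Argument list for exporting one image to blob storage."""
    target_blob = f"{repo}/{tag}.tar"
    return [
        "az", "acr", "pipeline-run", "create",
        "--resource-group", RG,
        "--registry", ACR,
        "--pipeline", PIPELINE,
        "--name", stable_run_name(target_blob),
        "--pipeline-type", "export",
        "--storage-blob", target_blob,
        "-a", f"{repo}:{tag}",
    ]


def delete_command(repo: str, tag: str) -> List[str]:
    """Argument list for deleting one image from the registry."""
    return ["az", "acr", "repository", "delete",
            "--name", ACR, "--image", f"{repo}:{tag}", "--yes"]


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command in dry-run mode; otherwise execute it and raise on failure."""
    if dry_run:
        print("would run:", " ".join(cmd))
        return
    # check=True raises CalledProcessError on a non-zero exit code,
    # which replaces the `assert result == 0` pattern.
    subprocess.run(cmd, check=True)


# Dry-run usage: nothing is executed, the commands are only printed.
run(export_command("some-repo", "1.0"))
run(delete_command("some-repo", "1.0"))
```

Passing a list to `subprocess.run` avoids shell quoting problems with unusual repository or tag names, and `check=True` still fails when Python runs with `-O`, which strips `assert` statements.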