1.6.0

@burritojustice released this 09 Oct 06:52
· 118 commits to master since this release

Version 1.6 of the Data Hub CLI makes it even easier to upload your data, with support for zipped shapefiles and xls/xlsx files, as well as batch uploads! It also has enhancements to the groupby option, and here xyz join has a new property search feature that makes virtual spaces even easier to use. Also, the tokens generated by show are now temporary by default.

💾 💿 📼 Upload enhancements 🔢 🔤 🔠

More filetypes, batch uploads, and better error detection and correction!

  • you can now upload zipped shapefiles and Excel spreadsheets (.xls,.xlsx)
  • have 100 GeoJSON files sitting in a folder? batch uploads are now possible! upload -f /directory/path/ --batch filetype
  • you can also upload multiple files or URLs at once using comma-separated paths + filenames -- here xyz upload -f file1,file2,/path1/file3,/path2/file4 or here xyz upload -f "url1","url2","url3"
  • if a streaming upload is failing because a chunk is too large for the API gateway, we automatically reduce the size of the chunk until it uploads successfully. If we determine that a single feature is too large on its own, we notify you and move along to the next chunk
  • we notify you when a feature has coordinates greater than 180 or less than -180 and therefore cannot be uploaded
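To make the chunk-shrinking behavior a bit more concrete, here is a minimal Python sketch of the idea. It is not the CLI's actual code: the `send` callable, the `max_chunk` default, the halving strategy, and the use of `ValueError` to signal "payload too large" are all assumptions made for illustration.

```python
def upload_with_shrinking_chunks(features, send, max_chunk=200):
    """Upload features in chunks, halving the chunk size whenever a chunk fails.

    `send` is a hypothetical callable that raises ValueError when the payload
    is too large for the gateway. Returns the features that were skipped
    because they were too large even when sent on their own.
    """
    skipped = []
    i = 0
    size = max_chunk
    while i < len(features):
        chunk = features[i:i + size]
        try:
            send(chunk)
        except ValueError:
            if len(chunk) == 1:
                # A single feature is too large: report it and move along.
                skipped.append(chunk[0])
                i += 1
                size = max_chunk
            else:
                # Retry the same position with a smaller chunk.
                size = max(1, len(chunk) // 2)
            continue
        # Chunk went through: advance and go back to the full chunk size.
        i += len(chunk)
        size = max_chunk
    return skipped
```

The function returns the skipped features so the caller can report them, which mirrors the "notify you and move along" step described above.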

Changes in how we assign the feature ID:

  • We've changed the default behavior of the CLI during upload to respect a GeoJSON feature ID if it is present. (Previously we made it a property hash by default, but we've seen an evolution in user behavior where more of you have actually unique and meaningful IDs.) As before, if there is no feature ID, the CLI will generate a hash of the feature properties and use that as the feature ID. This means if there are duplicate features, only one will be uploaded.
  • -o now overrides an existing feature ID and generates a hash of the properties to use as the ID. Alternatively, you can choose an existing property (or properties) as the feature ID with -i. These options are useful when you are uploading multiple datasets with unique features but where the feature IDs overlap, especially from public data portals.
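The ID rules above can be sketched roughly as follows. This is an illustration only: the function name, the `override` and `id_properties` parameters (standing in for -o and -i), the SHA-256 hash, and the "-" joiner for multiple properties are assumptions, not what the CLI necessarily does internally.

```python
import hashlib
import json


def assign_feature_id(feature, override=False, id_properties=None):
    """Pick a feature ID following the rules described in the release notes.

    - id_properties (like -i): build the ID from the named properties.
    - override (like -o): ignore any existing ID and hash the properties.
    - otherwise: keep an existing ID, or hash the properties if there is none.
    """
    props = feature.get("properties") or {}
    if id_properties:
        # Assumed joiner; the real CLI may combine properties differently.
        return "-".join(str(props[name]) for name in id_properties)
    if not override and feature.get("id") is not None:
        return feature["id"]
    # Canonical JSON so identical properties always hash to the same ID.
    canonical = json.dumps(props, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the hash depends only on the properties, two features with identical properties end up with the same ID, which is why only one of a set of duplicates gets uploaded.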

👨👧👶 --groupby enhancements 👫 👨‍👦‍👦 👨‍👩‍👦‍👦

  • --flatten: instead of a nested object, create a string delimited by : to reflect the logical hierarchy
  • --promote: hoists a list of properties that don't need to be repeated in each grouped feature (e.g. the name of a region in electoral riding results, grouped by party)
  • you can learn more about the power of groupby
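As a rough picture of what flattening means here, the Python sketch below turns a nested object into keys joined by ":". The function name and the sample data are made up for illustration, and we are assuming the delimiter ends up in the property names; the CLI's real output may be shaped differently.

```python
def flatten(nested, delimiter=":", prefix=""):
    """Flatten a nested dict into a single level, joining key paths with ':'."""
    flat = {}
    for key, value in nested.items():
        path = f"{prefix}{delimiter}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse, carrying the path so the hierarchy stays visible.
            flat.update(flatten(value, delimiter, path))
        else:
            flat[path] = value
    return flat


# Hypothetical grouped result: votes per party, nested under a year.
grouped = {"2019": {"liberal": 10, "ndp": 5}, "name": "Riding A"}
print(flatten(grouped))
```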

👫 👭 👬 join enhancements 👨‍👩‍👧‍👦 👨‍👩‍👧 👨‍👩‍👧‍👧

  • use Data Hub's property search to create a Virtual Space based on the feature ID in one space and a property in another space. (If a property is not found, we add a no_match tag)
  • use --filter to restrict the property search by yet another property (e.g. only look for counties in a particular state in a national dataset where county names are not unique)
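The matching logic behind this kind of join can be sketched in plain Python. Virtual Spaces do this on the Data Hub side, so the code below is only a local illustration: the function name, the parameters, the direction of the lookup, and the way the no_match tag is stored in a `tags` list are all assumptions.

```python
def join_by_property(id_features, lookup_features, key, filter_by=None):
    """Join features by matching a feature ID against a property value.

    For each feature in id_features, look for a feature in lookup_features
    whose property `key` equals that feature's ID. `filter_by` narrows the
    search to candidates whose properties match every given name/value pair.
    """
    filter_by = filter_by or {}
    index = {}
    for candidate in lookup_features:
        props = candidate.get("properties", {})
        passes = all(props.get(name) == value for name, value in filter_by.items())
        if key in props and passes:
            # Keep the first match for each property value.
            index.setdefault(props[key], props)

    joined = []
    for feature in id_features:
        merged = dict(feature.get("properties", {}))
        match = index.get(feature["id"])
        tags = []
        if match is None:
            tags.append("no_match")
        else:
            merged.update(match)
        joined.append({"id": feature["id"], "properties": merged, "tags": tags})
    return joined
```

With a national dataset where county names repeat, passing something like `filter_by={"state": "CA"}` keeps an "Orange" county in one state from matching the record for another.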

📗 ✏️ 🚫 set a space to read-only 📒 ✏️ 🚫

  • here xyz config --readonly true
  • useful in preventing unintended modifications to an upstream virtual space

🚀 🛰 👽 space specific tokens 🚀 🛰 👽

  • show -w and show -v now generate a token just for that space. By default, a temporary token lasting 48 hours is generated. You can generate a permanent token for that space using --permanent or -x