Change Log

v0.4.0 (2019-11-21)

New Features

  • Added ability to delete branch names/pointers from a local repository via both API and CLI. (#128) @rlizzo
  • Added local keyword arg to arrayset key/value iterators to return only locally available samples. (#131) @rlizzo
  • Ability to change the backend storage format and options applied to an arrayset after initialization. (#133) @rlizzo
  • Added blosc compression to HDF5 backend by default on PyPI installations. (#146) @rlizzo
  • Added benchmarking suite to test for performance regressions in PRs. (#155) @rlizzo
  • Added new backend optimized to speed up access to fixed-size arraysets. (#160) @rlizzo

Improvements

  • Removed msgpack and pyyaml dependencies. Cleaned up and improved remote client/server code. (#130) @rlizzo
  • Multiprocess Torch DataLoaders allowed on Linux and macOS. (#144) @rlizzo
  • Added CLI options commit, checkout, arrayset create, & arrayset remove. (#150) @rlizzo
  • Plugin system revamp. (#134) @hhsecond
  • Documentation improvements and typo fixes. (#156) @alessiamarcolini
  • Removed implicit removal of arrayset schema from checkout if every sample was removed from arrayset. This could potentially leave dangling accessors which may or may not self-destruct (as expected) in certain edge cases. (#159) @rlizzo
  • Added type codes to hash digests so that calculation function can be updated in the future without breaking repos written in previous Hangar versions. (#165) @rlizzo
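The idea behind the type-code entry (#165) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `0=` prefix layout, the choice of `blake2b`, and the names `HASHERS`, `tagged_digest`, and `verify` are hypothetical and are not taken from Hangar's implementation. The sketch only shows why a prefix lets old digests stay verifiable after the default hash function changes.

```python
import hashlib


def _digest_v0(data: bytes) -> str:
    # Hypothetical "version 0" digest function (an assumption, not Hangar's).
    return hashlib.blake2b(data, digest_size=20).hexdigest()


# Hypothetical registry mapping a type code to the function that produced it.
# A future format would add a new entry instead of replacing this one.
HASHERS = {"0": _digest_v0}
CURRENT_CODE = "0"


def tagged_digest(data: bytes, code: str = CURRENT_CODE) -> str:
    # Prefix the digest with its type code so readers know how it was computed.
    return f"{code}={HASHERS[code](data)}"


def verify(data: bytes, tagged: str) -> bool:
    # Recompute with whichever function the stored type code names.
    code, _, _ = tagged.partition("=")
    return tagged_digest(data, code) == tagged


stored = tagged_digest(b"sample bytes")
print(stored.split("=")[0], verify(b"sample bytes", stored))
```

Because verification dispatches on the stored code, changing `CURRENT_CODE` later would not invalidate digests written earlier.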

Bug Fixes

  • Programmatic access to repository log contents now returns branch heads alongside other log info. (#125) @rlizzo
  • Fixed minor bug in types of values allowed for Arrayset names vs Sample names. (#151) @rlizzo
  • Fixed issue where using checkout object to access a sample in multiple arraysets would try to create a namedtuple instance with invalid field names. Now incompatible field names are automatically renamed with their positional index. (#161) @rlizzo
  • Explicitly raise error if commit argument is set while checking out a repository with write=True. (#166) @rlizzo
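The renaming behavior described in the namedtuple fix (#161) matches what the Python standard library does with `rename=True`. The sketch below uses only `collections.namedtuple`; the arrayset names are hypothetical, and whether Hangar relies on this exact flag internally is an assumption.

```python
from collections import namedtuple

# Hypothetical arrayset names. "1st-labels" is not a valid identifier and
# "class" is a Python keyword, so neither can be a namedtuple field as-is.
arrayset_names = ["images", "1st-labels", "class"]

# rename=True replaces each invalid field name with "_" plus its position.
Samples = namedtuple("Samples", arrayset_names, rename=True)

print(Samples._fields)  # ('images', '_1', '_2')
```

Without `rename=True`, the same call raises `ValueError` on the first invalid name.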

Breaking changes

  • New commit reference serialization format is incompatible with repositories written in version 0.3.0 or earlier.

v0.3.0 (2019-09-10)

New Features

  • API addition allowing reading and writing arrayset data from a checkout object directly. (#115) @rlizzo
  • Data importers, exporters, and viewers via CLI for common file formats. Includes plugin system for easy extensibility in the future. (#103) (@rlizzo, @hhsecond)

Improvements

  • Added tutorial on working with remote data. (#113) @rlizzo
  • Added tutorial on TensorFlow and PyTorch dataloaders. (#117) @hhsecond
  • Large performance improvement to diff/merge algorithm (~30x previous). (#112) @rlizzo
  • New commit hash algorithm which is much more reproducible in the long term. (#120) @rlizzo
  • HDF5 backend updated to increase speed of reading/writing compressed chunks of variable-sized datasets. (#120) @rlizzo

Bug Fixes

  • Fixed ML dataloader errors for a number of edge cases surrounding partial-remote data and non-common keys. (#110) (@hhsecond, @rlizzo)

Breaking changes

  • New commit hash algorithm is incompatible with repositories written in version 0.2.0 or earlier.

v0.2.0 (2019-08-09)

New Features

  • Numpy memory-mapped array file backend added. (#70) @rlizzo
  • Remote server data backend added. (#70) @rlizzo
  • Selection heuristics to determine appropriate backend from arrayset schema. (#70) @rlizzo
  • Partial remote clones and fetch operations now fully supported. (#85) @rlizzo
  • CLI has been placed under test coverage, added interface usage to docs. (#85) @rlizzo
  • TensorFlow and PyTorch Machine Learning Dataloader Methods (Experimental Release). (#91) lead: @hhsecond, co-author: @rlizzo, reviewed by: @elistevens

Improvements

  • Record format versioning and standardization so as not to break backwards compatibility in the future. (#70) @rlizzo
  • Backend addition and update developer protocols and documentation. (#70) @rlizzo
  • Read-only checkout arrayset sample get methods are now multithread and multiprocess safe. (#84) @rlizzo
  • Read-only checkout metadata sample get methods are thread safe if used within a context manager. (#101) @rlizzo
  • Samples can be assigned integer names in addition to string names. (#89) @rlizzo
  • Forgetting to close a write-enabled checkout before terminating the Python process will close the checkout automatically in many situations. (#101) @rlizzo
  • Repository software version compatibility methods added to ensure upgrade paths in the future. (#101) @rlizzo
  • Many tests added (including support for Mac OSX on Travis-CI). lead: @rlizzo, co-author: @hhsecond

Bug Fixes

  • Diffs for fast-forward merges now return sensible results. (#77) @rlizzo
  • Many type annotations added, and developer documentation improved. @hhsecond & @rlizzo

Breaking changes

  • Renamed all references to datasets in the API / world-view to arraysets.
  • These are backwards incompatible changes. For all versions > 0.2, repository upgrade utilities will be provided if breaking changes occur.

v0.1.1 (2019-05-24)

Bug Fixes

  • Fixed typo in README which was uploaded to PyPI.

v0.1.0 (2019-05-24)

New Features

  • Remote client-server config negotiation and administrator permissions. (#10) @rlizzo
  • Allow single python process to access multiple repositories simultaneously. (#20) @rlizzo
  • Fast-Forward and 3-Way Merge and Diff methods now fully supported and behaving as expected. (#32) @rlizzo

Improvements

  • Initial test-case specification. (#14) @hhsecond
  • Checkout test-case work. (#25) @hhsecond
  • Metadata test-case work. (#27) @hhsecond
  • Any potential failure cases raise exceptions instead of silently returning. (#16) @rlizzo
  • Many usability improvements in a variety of commits.

Bug Fixes

  • Ensure references to checkout arrayset or metadata objects cannot operate after the checkout is closed. (#41) @rlizzo
  • Sensible exception classes and error messages raised in a variety of situations (many commits). @hhsecond & @rlizzo
  • Many minor issues addressed.

API Additions

  • Refer to API documentation (#23)

Breaking changes

  • All repositories written with previous versions of Hangar are liable to break when using this version. Please upgrade versions immediately.

v0.0.0 (2019-04-15)

  • First Public Release of Hangar!