Changelog

2.28.0 - 2020-09-25

2.27.0 - 2020-09-18

2.26.0 - 2020-09-11

2.25.0 - 2020-08-28

2.24.0 - 2020-08-22

  • Move toolbar to above and fix y axis (#4043) Julia Signell

  • Make behavior clearer for how to get worker dashboard (#4047) Julia Signell

  • Worker dashboard clean up (#4046) Julia Signell

  • Add a default argument to the datasets and a possibility to override datasets (#4052) Nils Braun

  • Discover HTTP endpoints (#3744) Martin Durant

2.23.0 - 2020-08-14

2.22.0 - 2020-07-31

2.21.0 - 2020-07-17

2.20.0 - 2020-07-02

2.19.0 - 2020-06-19

2.18.0 - 2020-06-05

2.17.0 - 2020-05-26

2.16.0 - 2020-05-08

2.15.2 - 2020-05-01

2.15.1 - 2020-04-28

2.15.0 - 2020-04-24

2.14.0 - 2020-04-03

2.13.0 - 2020-03-25

2.12.0 - 2020-03-06

2.11.0 - 2020-02-19

2.10.0 - 2020-01-28

2.9.3 - 2020-01-17

2.9.2 - 2020-01-16

2.9.1 - 2019-12-27

2.9.0 - 2019-12-06

2.8.1 - 2019-11-22

2.8.0 - 2019-11-14

2.7.0 - 2019-11-08

This release drops support for Python 3.5

2.6.0 - 2019-10-15

2.5.2 - 2019-10-04

2.5.1 - 2019-09-27

2.5.0 - 2019-09-27

2.4.0 - 2019-09-13

2.3.2 - 2019-08-23

2.3.1 - 2019-08-22

2.3.0 - 2019-08-16

2.2.0 - 2019-07-31

2.1.0 - 2019-07-08

2.0.1 - 2019-06-26

We neglected to include python_requires= in our setup.py file, resulting in confusion for Python 2 users who erroneously get packages for 2.0.0. This is fixed in 2.0.1 and we have removed the 2.0.0 files from PyPI.
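
As a rough illustration of why the missing keyword mattered: an installer compares the running interpreter against the declared specifier and skips releases that do not match. The sketch below is a simplified stand-in for that check. The `">=3.5"` value and the `satisfies_minimum` helper are assumptions for illustration; the changelog does not state the exact specifier, and real installers handle far more specifier forms.

```python
import sys

# Assumed specifier; in setup.py this would be passed as
# setup(..., python_requires=">=3.5"). The changelog does not give the value.
PYTHON_REQUIRES = ">=3.5"


def satisfies_minimum(specifier, version_info):
    """Simplified check for a single '>=X.Y' specifier (hypothetical helper)."""
    if not specifier.startswith(">="):
        raise ValueError("this sketch only handles '>=X.Y' specifiers")
    minimum = tuple(int(part) for part in specifier[2:].split("."))
    # Compare only as many components as the specifier names.
    return tuple(version_info[:len(minimum)]) >= minimum


# A Python 2.7 interpreter would be steered away from such a release.
print(satisfies_minimum(PYTHON_REQUIRES, (2, 7, 16)))
# The interpreter running this sketch.
print(satisfies_minimum(PYTHON_REQUIRES, sys.version_info))
```

Without the declaration there is nothing to compare against, so an installer on Python 2 has no reason to skip the release.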

2.0.0 - 2019-06-25

1.28.1 - 2019-05-13

This is a small bugfix release due to a config change upstream.

1.28.0 - 2019-05-08

1.27.1 - 2019-04-29

1.27.0 - 2019-04-12

1.26.1 - 2019-03-29

1.26.0 - 2019-02-25

1.25.3 - 2019-01-31

1.25.2 - 2019-01-04

1.25.1 - 2018-12-15

1.25.0 - 2018-11-28

1.24.2 - 2018-11-15

1.24.1 - 2018-11-09

1.24.0 - 2018-10-26

1.23.3 - 2018-10-05

1.23.2 - 2018-09-17

1.23.1 - 2018-09-06

1.23.0 - 2018-08-30

1.22.1 - 2018-08-03

1.22.0 - 2018-06-14

1.21.8 - 2018-05-03

1.21.7 - 2018-05-02

1.21.6 - 2018-04-06

1.21.5 - 2018-03-31

1.21.4 - 2018-03-21

1.21.3 - 2018-03-08

1.21.2 - 2018-03-05

1.21.1 - 2018-02-22

1.21.0 - 2018-02-09

1.20.2 - 2017-12-07

1.20.1 - 2017-11-26

1.20.0 - 2017-11-17

1.19.3 - 2017-10-16

  • Handle None case in profile.identity (GH#1456)

  • Asyncio rewrite (GH#1458)

  • Add rejoin function partner to secede (GH#1462)

  • Nested compute (GH#1465)

  • Use LooseVersion when comparing Bokeh versions (GH#1470)
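
The last entry above concerns version-aware comparison. A minimal sketch of why plain string comparison goes wrong for version numbers follows. It uses a hypothetical `version_key` helper rather than `LooseVersion` itself, because `distutils` is not available in recent Python releases, and it only keeps the numeric components.

```python
import re


def version_key(version):
    """Turn '0.12.10' into (0, 12, 10) so components compare numerically.

    Hypothetical helper for illustration; it ignores non-digit suffixes.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


# Compared as strings, '0.12.10' sorts before '0.12.9' because '1' < '9'.
string_says_older = "0.12.10" < "0.12.9"

# Compared component by component, 0.12.10 is the newer release.
key_says_newer = version_key("0.12.10") > version_key("0.12.9")

print(string_says_older, key_says_newer)
```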

1.19.2 - 2017-10-06

  • as_completed doesn’t block on cancelled futures (GH#1436)

  • Notify waiting threads/coroutines on cancellation (GH#1438)

  • Set Future(inform=True) as default (GH#1437)

  • Rename Scheduler.transition_story to story (GH#1445)

  • Future uses default client by default (GH#1449)

  • Add keys= keyword to Client.call_stack (GH#1446)

  • Add get_current_task to worker (GH#1444)

  • Ensure that Client remains asynchronous before ioloop starts (GH#1452)

  • Remove “click for worker page” in bokeh plot (GH#1453)

  • Add Client.current() (GH#1450)

  • Clean handling of restart timeouts (GH#1442)

1.19.1 - September 25th, 2017

  • Fix tool issues with TaskStream plot (GH#1425)

  • Move profile module to top level (GH#1423)

1.19.0 - September 24th, 2017

  • Avoid storing messages in message log (GH#1361)

  • fileConfig does not disable existing loggers (GH#1380)

  • Offload upload_file disk I/O to separate thread (GH#1383)

  • Add missing SSLContext (GH#1385)

  • Collect worker thread information from sys._current_frames (GH#1387)

  • Add nanny timeout (GH#1395)

  • Restart worker if memory use goes above 95% (GH#1397)

  • Track workers memory use with psutil (GH#1398)

  • Track scheduler delay times in workers (GH#1400)

  • Add time slider to profile plot (GH#1403)

  • Change memory-limit keyword to refer to maximum number of bytes (GH#1405)

  • Add cancel(force=) keyword (GH#1408)
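
The thread-information entry above relies on `sys._current_frames`, which is part of the standard library. The sketch below shows the general idea of sampling what each thread is doing. The `task` and `sample_thread_stacks` names are hypothetical, and this is not the profiler the library ships.

```python
import sys
import threading
import traceback


def task(started, stop):
    """Stand-in for a task running in a worker thread."""
    started.set()   # signal that this frame is now on the thread's stack
    stop.wait()     # block until the main thread lets us finish


def sample_thread_stacks():
    """Map each live thread id to the function names on its current stack."""
    samples = {}
    for thread_id, frame in sys._current_frames().items():
        stack = traceback.extract_stack(frame)
        samples[thread_id] = [entry.name for entry in stack]
    return samples


started = threading.Event()
stop = threading.Event()
worker = threading.Thread(target=task, args=(started, stop))
worker.start()
started.wait()                      # make sure the worker is inside task()

samples = sample_thread_stacks()    # one snapshot of every thread
worker_stack = samples[worker.ident]

stop.set()
worker.join()
print(worker_stack)
```

Sampling repeatedly and counting how often each function appears gives a statistical profile without instrumenting the task code.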

1.18.2 - September 2nd, 2017

  • Silently pass on cancelled futures in as_completed (GH#1366)

  • Fix unicode keys error in Python 2 (GH#1370)

  • Support numeric worker names

  • Add dask-mpi executable (GH#1367)

1.18.1 - August 25th, 2017

  • Clean up forgotten keys in fire-and-forget workloads (GH#1250)

  • Handle missing extensions (GH#1263)

  • Allow recreate_exception on persisted collections (GH#1253)

  • Add asynchronous= keyword to blocking client methods (GH#1272)

  • Restrict to horizontal panning in bokeh plots (GH#1274)

  • Rename client.shutdown to client.close (GH#1275)

  • Avoid blocking on event loop (GH#1270)

  • Avoid cloudpickle errors for Client.get_versions (GH#1279)

  • Yield on Tornado IOStream.write futures (GH#1289)

  • Assume async behavior if inside a sync statement (GH#1284)

  • Avoid error messages on closing (GH#1297), (GH#1296) (GH#1318) (GH#1319)

  • Add timeout= keyword to get_client (GH#1290)

  • Respect timeouts when restarting (GH#1304)

  • Clean file descriptor and memory leaks in tests (GH#1317)

  • Deprecate Executor (GH#1302)

  • Add timeout to ThreadPoolExecutor.shutdown (GH#1330)

  • Clean up AsyncProcess handling (GH#1324)

  • Allow unicode keys in Python 2 scheduler (GH#1328)

  • Avoid leaking stolen data (GH#1326)

  • Improve error handling on failed nanny starts (GH#1337), (GH#1331)

  • Make Adaptive more flexible

  • Support --contact-address and --listen-address in worker (GH#1278)

  • Remove old dworker, dscheduler executables (GH#1355)

  • Exit workers if nanny process fails (GH#1345)

  • Auto pep8 and flake (GH#1353)

1.18.0 - July 8th, 2017

1.17.1 - June 14th, 2017

  • Remove Python 3.4 testing from travis-ci (GH#1157)

  • Remove ZMQ Support (GH#1160)

  • Fix memoryview nbytes issue in Python 2.7 (GH#1165)

  • Re-enable counters (GH#1168)

  • Improve scheduler.restart (GH#1175)

1.17.0 - June 9th, 2017

  • Reevaluate worker occupancy periodically during scheduler downtime (GH#1038) (GH#1101)

  • Add AioClient asyncio-compatible client API (GH#1029) (GH#1092) (GH#1099)

  • Update Keras serializer (GH#1067)

  • Support TLS/SSL connections for security (GH#866) (GH#1034)

  • Always create new worker directory when passed --local-directory (GH#1079)

  • Support pre-scattering data when using joblib frontend (GH#1022)

  • Make workers more robust to failure of sizeof function (GH#1108) and writing to disk (GH#1096)

  • Add is_empty and update methods to as_completed (GH#1113)

  • Remove _get coroutine and replace with get(..., sync=False) (GH#1109)

  • Improve API compatibility with async/await syntax (GH#1115) (GH#1124)

  • Add distributed Queues (GH#1117) and shared Variables (GH#1128) to enable inter-client coordination

  • Support direct client-to-worker scattering and gathering (GH#1130) as well as performance enhancements when scattering data

  • Style improvements for bokeh web dashboards (GH#1126) (GH#1141) as well as a removal of the external bokeh process

  • HTML reprs for Future and Client objects (GH#1136)

  • Support nested collections in client.compute (GH#1144)

  • Use normal client API in asynchronous mode (GH#1152)

  • Remove old distributed.collections submodule (GH#1153)

1.16.3 - May 5th, 2017

  • Add bokeh template files to MANIFEST (GH#1063)

  • Don’t set worker_client.get as default get (GH#1061)

  • Clean up logging on Client().shutdown() (GH#1055)

1.16.2 - May 3rd, 2017

  • Support async with Client syntax (GH#1053)

  • Use internal bokeh server for default diagnostics server (GH#1047)

  • Improve styling of bokeh plots when empty (GH#1046) (GH#1037)

  • Support efficient serialization for sparse arrays (GH#1040)

  • Prioritize newly arrived work in worker (GH#1035)

  • Prescatter data with joblib backend (GH#1022)

  • Make client.restart more robust to worker failure (GH#1018)

  • Support preloading a module or script in dask-worker or dask-scheduler processes (GH#1016)

  • Specify network interface in command line interface (GH#1007)

  • Client.scatter supports a single element (GH#1003)

  • Use blosc compression on all memoryviews passing through comms (GH#998)

  • Add concurrent.futures-compatible Executor (GH#997)

  • Add as_completed.batches method and return results (GH#994) (GH#971)

  • Allow worker_clients to optionally stay within the thread pool (GH#993)

  • Add bytes-stored and tasks-processing diagnostic histograms (GH#990)

  • Run supports non-msgpack-serializable results (GH#965)

1.16.1 - March 22nd, 2017

  • Use inproc transport in LocalCluster (GH#919)

  • Add structured and queryable cluster event logs (GH#922)

  • Use connection pool for inter-worker communication (GH#935)

  • Robustly shut down spawned worker processes at shutdown (GH#928)

  • Worker death timeout (GH#940)

  • More visual reporting of exceptions in progressbar (GH#941)

  • Render disk and serialization events to task stream visual (GH#943)

  • Support async for / await protocol (GH#952)

  • Ensure random generators are re-seeded in worker processes (GH#953)

  • Upload sourcecode as zip module (GH#886)

  • Replay remote exceptions in local process (GH#894)

1.16.0 - February 24th, 2017

  • First come first served priorities on client submissions (GH#840)

  • Can specify Bokeh internal ports (GH#850)

  • Allow stolen tasks to return from either worker (GH#853), (GH#875)

  • Add worker resource constraints during execution (GH#857)

  • Send small data through Channels (GH#858)

  • Better estimates for SciPy sparse matrix memory costs (GH#863)

  • Avoid stealing long running tasks (GH#873)

  • Maintain fortran ordering of NumPy arrays (GH#876)

  • Add --scheduler-file keyword to dask-scheduler (GH#877)

  • Add serializer for Keras models (GH#878)

  • Support uploading modules from zip files (GH#886)

  • Improve titles of Bokeh dashboards (GH#895)

1.15.2 - January 27th, 2017

  • Fix a bug where arrays with large dtypes or shapes were being improperly compressed (GH#830 GH#832 GH#833)

  • Extend as_completed to accept new futures during iteration (GH#829)

  • Add --nohost keyword to dask-ssh startup utility (GH#827)

  • Support scheduler shutdown of remote workers, useful for adaptive clusters (GH#811 GH#816 GH#821)

  • Add Client.run_on_scheduler method for running debug functions on the scheduler (GH#808)

1.15.1 - January 11th, 2017

  • Make compatible with Bokeh 0.12.4 (GH#803)

  • Avoid compressing arrays if not helpful (GH#777)

  • Optimize inter-worker data transfer (GH#770) (GH#790)

  • Add --local-directory keyword to worker (GH#788)

  • Enable workers to join the cluster with their own data. Useful if a worker leaves and comes back (GH#785)

  • Resolve thread safety bug when using local_client (GH#802)

  • Resolve scheduling issues in worker (GH#804)
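
The entry on avoiding unhelpful compression can be pictured as a size check before sending. The sketch below is one possible form of that idea using `zlib` on raw bytes. The `compress_if_helpful` name, the 0.9 threshold, and the choice of `zlib` are all assumptions for illustration; the changelog does not describe the actual heuristic.

```python
import os
import zlib


def compress_if_helpful(payload, min_ratio=0.9):
    """Return ('zlib', data) only when compression shrinks the payload enough.

    Hypothetical helper: the threshold and codec are illustrative assumptions.
    """
    compressed = zlib.compress(payload)
    if len(compressed) > min_ratio * len(payload):
        return None, payload        # not worth it: send the original bytes
    return "zlib", compressed


repetitive = b"abcd" * 2500         # highly regular data compresses well
noisy = os.urandom(10000)           # random bytes are effectively incompressible

codec_a, data_a = compress_if_helpful(repetitive)
codec_b, data_b = compress_if_helpful(noisy)
print(codec_a, len(data_a))
print(codec_b, len(data_b))
```

Returning the codec name alongside the bytes lets the receiver know whether it has to decompress at all.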

1.15.0 - January 2nd, 2017

  • Major Worker refactor (GH#704)

  • Major Scheduler refactor (GH#717) (GH#722) (GH#724) (GH#742) (GH#743)

  • Add check (default is False) option to Client.get_versions to raise if the versions don’t match on client, scheduler & workers (GH#664)

  • Future.add_done_callback executes in separate thread (GH#656)

  • Clean up numpy serialization (GH#670)

  • Support serialization of Tornado v4.5 coroutines (GH#673)

  • Use cPickle instead of pickle in Python 2 (GH#684)

  • Use Forkserver rather than Fork on Unix in Python 3 (GH#687)

  • Support abstract resources for per-task constraints (GH#694) (GH#720) (GH#737)

  • Add TCP timeouts (GH#697)

  • Add embedded Bokeh server to workers (GH#709) (GH#713) (GH#738)

  • Add embedded Bokeh server to scheduler (GH#724) (GH#736) (GH#738)

  • Add more precise timers for Windows (GH#713)

  • Add Versioneer (GH#715)

  • Support inter-client channels (GH#729) (GH#749)

  • Scheduler Performance improvements (GH#740) (GH#760)

  • Improve load balancing and work stealing (GH#747) (GH#754) (GH#757)

  • Run Tornado coroutines on workers

  • Avoid slow sizeof call on Pandas dataframes (GH#758)

1.14.3 - November 13th, 2016

  • Remove custom Bokeh export tool that implicitly relied on nodejs (GH#655)

  • Clean up scheduler logging (GH#657)

1.14.2 - November 11th, 2016

  • Support more numpy dtypes in custom serialization (GH#627) (GH#630) (GH#636)

  • Update Bokeh plots (GH#628)

  • Improve spill to disk heuristics (GH#633)

  • Add Export tool to Task Stream plot

  • Reverse frame order in loads for very many frames (GH#651)

  • Add timeout when waiting on write (GH#653)

1.14.0 - November 3rd, 2016

  • Add Client.get_versions() function to return software and package information from the scheduler, workers, and client (GH#595)

  • Improved windows support (GH#577) (GH#590) (GH#583) (GH#597)

  • Clean up rpc objects explicitly (GH#584)

  • Normalize collections against known futures (GH#587)

  • Add key= keyword to map to specify keynames (GH#589)

  • Custom data serialization (GH#606)

  • Refactor the web interface (GH#608) (GH#615) (GH#621)

  • Allow user-supplied Executor in Worker (GH#609)

  • Pass Worker kwargs through LocalCluster

1.13.3 - October 15th, 2016

  • Schedulers can retire workers cleanly

  • Add Future.add_done_callback for concurrent.futures compatibility

  • Update web interface to be consistent with Bokeh 0.12.3

  • Close streams explicitly, avoiding race conditions and supporting more robust restarts on Windows.

  • Improved shuffle performance for dask.dataframe

  • Add adaptive allocation cluster manager

  • Reduce administrative overhead when dealing with many workers

  • dask-ssh --log-directory . no longer errors

  • Microperformance tuning for the scheduler
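
For readers unfamiliar with the `concurrent.futures` interface that the `Future.add_done_callback` entry above refers to, here is a minimal sketch using only the standard library. It shows the stdlib behaviour the entry names as the compatibility target, not Dask's own `Future`; the `record` callback is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor

results = []


def record(future):
    """Callback invoked with the finished future as its only argument."""
    results.append(future.result())


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(pow, 2, 10)
    future.add_done_callback(record)
# Leaving the with-block waits for the task, so the callback has run by now.

print(results)
```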

1.13.2

  • Revert dask_worker to use fork rather than subprocess by default

  • Scatter retains type information

  • Bokeh always uses subprocess rather than spawn

1.13.1

  • Fix critical Windows error with dask_worker executable

1.13.0

  • Rename Executor to Client (GH#492)

  • Add --memory-limit option to dask-worker, enabling spill-to-disk behavior when running out of memory (GH#485)

  • Add --pid-file option to dask-worker and dask-scheduler (GH#496)

  • Add upload_environment function to distribute conda environments. This is experimental, undocumented, and may change without notice. (GH#494)

  • Add workers= keyword argument to Client.compute and Client.persist, supporting location-restricted workloads with Dask collections (GH#484)

  • Add optional dask_worker= keyword to client.run functions, which receives the worker or nanny object

  • Add nanny=False keyword to Client.run, allowing for the execution of arbitrary functions on the nannies as well as normal workers

1.12.2

This release adds some new features and removes dead code

  • Publish and share datasets on the scheduler between many clients (GH#453). See Publish Datasets.

  • Launch tasks from other tasks (experimental) (GH#471). See Launch Tasks from Tasks.

  • Remove unused code, notably the Center object and older client functions (GH#478)

  • Executor() and LocalCluster() are now robust to Bokeh’s absence (GH#481)

  • Removed s3fs and boto3 from requirements. These have moved to Dask.

1.12.1

This release is largely a bugfix release, recovering from the previous large refactor.

  • Fixes from previous refactor
    • Ensure idempotence across clients

    • Stress test losing scattered data permanently

  • IPython fixes
    • Add start_ipython_scheduler method to Executor

    • Add %remote magic for workers

    • Clean up code and tests

  • Pool connections to maintain reuse and reduce number of open file handles

  • Re-implement work stealing algorithm

  • Support cancellation of tuple keys, such as occur in dask.arrays

  • Start synchronizing against worker data that may be superfluous

  • Improve bokeh plots styling
    • Add memory plot tracking number of bytes

    • Make the progress bars more compact and align colors

    • Add workers/ page with workers table, stacks/processing plot, and memory

  • Add this release notes document

1.12.0

This release was largely a refactoring release. Internals were changed significantly without many new features.

  • Major refactor of the scheduler to use transitions system

  • Tweak protocol to traverse down complex messages in search of large bytestrings

  • Add dask-submit and dask-remote

  • Refactor HDFS writing to align with changes in the dask library

  • Executor reconnects to scheduler on broken connection or failed scheduler

  • Support sklearn.externals.joblib as well as normal joblib