Progress Bars
Three progress bar utilities are provided, all leveraging the excellent tqdm library.
progbar
A simple iterable wrapper, much like the default tqdm wrapper. It can be used on any iterable to display a progress bar as it gets iterated:
for x in progbar(my_list):
    do_something_slow(x)
However, unlike the standard tqdm function, progbar has two additional, useful behaviors: first, it automatically uses the ipywidgets progress bar when run inside a Jupyter notebook; second, if given an integer n, it automatically creates range(n) to iterate on. Both of these features are available in the tqdm library, but as separate functions; progbar wraps them into a single intuitive call. It also accepts a verbose flag that can be set to False to suppress the progress bar based on runtime variables, if so desired.
miniutils.progress_bar.progbar(iterable, *a, verbose=True, **kw)
Prints a progress bar as the iterable is iterated over.
Parameters:
- iterable – The iterable to iterate over (or an integer n, which is treated as range(n))
- a – Positional arguments passed to tqdm (or tqdm_notebook, if in a Jupyter notebook)
- verbose – Whether or not to print the progress bar at all
- kw – Keyword arguments passed to tqdm
Returns: The iterable that will report a progress bar
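The behaviors described for progbar can be sketched in a few lines. This is a minimal illustration, not the miniutils source: the name progbar_sketch is hypothetical, it relies on tqdm.auto to choose between the console and notebook bars, and the fallback to a plain iterable when tqdm is not installed is an assumption added only so the sketch runs anywhere.

```python
def progbar_sketch(iterable, *a, verbose=True, **kw):
    """Hypothetical stand-in for progbar: wrap an iterable (or int n) in a tqdm bar."""
    if isinstance(iterable, int):
        iterable = range(iterable)  # integer shorthand: n means range(n)
    if not verbose:
        return iterable  # bar disabled: hand back the plain iterable
    try:
        # tqdm.auto selects the notebook widget when one is available
        from tqdm.auto import tqdm
    except ImportError:
        return iterable  # assumption for this sketch: no tqdm means no bar
    return tqdm(iterable, *a, **kw)


# Usage: the verbose flag can come from a runtime setting
for x in progbar_sketch(3, verbose=False):
    pass
```

Returning the untouched iterable when verbose is false keeps the calling loop identical in both cases, which is what makes a runtime switch convenient.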
parallel_progbar
A parallel mapper based on multiprocessing that replaces Pool.map. In attempting to use Pool.map, I've had issues with unintuitive errors and, of course, have wanted a progress bar for the map job. parallel_progbar addresses both:
results = parallel_progbar(do_something_slow, my_list)
# Equivalent to a parallel version of [do_something_slow(x) for x in my_list]
This creates a pool of processes and maps the function over the items of the provided list in parallel.
Starmap behavior:
results = parallel_progbar(do_something_slow, my_list, starmap=True)
# Equivalent to a parallel version of [do_something_slow(*x) for x in my_list]
And/or flatmap behavior:
results = parallel_progbar(make_more_things, my_things, flatmap=True)
# Equivalent to a parallel version of [y for x in my_things for y in make_more_things(x)]
It also supports runtime disabling, a limit on the number of parallel processes, shuffling before mapping (in case the order of your list puts, say, a few of the slowest items near the end), and an optional second progress bar when performing a flatmap. This second bar reports the number of items output (y in the case above), while the main progress bar tracks the number of finished inputs (x).
miniutils.progress_bar.parallel_progbar(*args, **kwargs)
Performs a parallel mapping of the given iterable, reporting a progress bar as values get returned.
Parameters:
- mapper – The mapping function to apply to elements of the iterable
- iterable – The iterable to map
- nprocs – The number of processes (defaults to the number of CPUs)
- starmap – If true, the iterable is expected to contain tuples and the mapper function gets each element of a tuple as an argument
- flatmap – If true, flatten out the returned values if the mapper function returns a list of objects
- shuffle – If true, randomly sort the elements before processing them. This might help provide more uniform runtimes if processing different objects takes different amounts of time.
- verbose – Whether or not to print the progress bar
- verbose_flatmap – If performing a flatmap, whether or not to report each object as it’s returned
- timeout – The number of seconds to wait for each worker process after completing
- kwargs – Any other keyword arguments to pass to the progress bar (see progbar)
Returns: A list of the returned objects, in the same order as provided
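To make the starmap and flatmap flags concrete, here is a sequential reference for the results the equivalences above describe. It is only a sketch: reference_map is a hypothetical helper, it runs in a single process with no progress bar, and applying both flags together (unpack the arguments, then flatten) is an assumption drawn from the "and/or" wording rather than from the miniutils source.

```python
from itertools import chain


def reference_map(mapper, items, starmap=False, flatmap=False):
    """Sequential stand-in for the list the parallel call is described as returning."""
    if starmap:
        # each item is a tuple that is unpacked into positional arguments
        results = [mapper(*x) for x in items]
    else:
        results = [mapper(x) for x in items]
    if flatmap:
        # each result is itself an iterable; concatenate them in input order
        results = list(chain.from_iterable(results))
    return results


def repeat(value, times):
    """Toy mapper: returns a list, so it suits flatmap."""
    return [value] * times


pairs = [("a", 2), ("b", 1)]
nested = reference_map(repeat, pairs, starmap=True)
flat = reference_map(repeat, pairs, starmap=True, flatmap=True)
```

Comparing a parallel run against a sequential reference like this on a small input is one way to check that the flags are set as intended before launching a long job.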
iparallel_progbar
This behaves exactly like parallel_progbar, but produces an unordered generator instead of a list, yielding results as soon as they're available. It also accepts a max_cache argument that limits the number of computed results held for the generator at once.
for result in iparallel_progbar(do_something_slow, my_list):
    print("Result {} done!".format(result))
miniutils.progress_bar.iparallel_progbar(*args, **kwargs)
Performs a parallel mapping of the given iterable, reporting a progress bar as values get returned. Yields objects as soon as they're computed, but does not guarantee that they'll be in the correct order.
Parameters:
- mapper – The mapping function to apply to elements of the iterable
- iterable – The iterable to map
- nprocs – The number of processes (defaults to the number of CPUs)
- starmap – If true, the iterable is expected to contain tuples and the mapper function gets each element of a tuple as an argument
- flatmap – If true, flatten out the returned values if the mapper function returns a list of objects
- shuffle – If true, randomly sort the elements before processing them. This might help provide more uniform runtimes if processing different objects takes different amounts of time.
- verbose – Whether or not to print the progress bar
- verbose_flatmap – If performing a flatmap, whether or not to report each object as it’s returned
- max_cache – Maximum number of mapped objects to permit in the queue at once
- timeout – The number of seconds to wait for each worker process after completing
- kwargs – Any other keyword arguments to pass to the progress bar (see progbar)
Returns: A generator yielding the returned objects, in whatever order they finish being computed
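The combination of unordered yielding and a bounded result cache can be illustrated with the standard library alone. The sketch below is an analogy, not the miniutils implementation: it uses threads rather than processes, the names unordered_map_sketch and nworkers are hypothetical, and error handling is omitted (a mapper that raises would stall the generator).

```python
import queue
import threading


def unordered_map_sketch(mapper, items, nworkers=2, max_cache=2):
    """Yield mapper(x) for each x as it finishes; output order is not guaranteed."""
    items = list(items)
    tasks = queue.Queue()
    for item in items:
        tasks.put(item)
    # Bounded queue: a worker blocks on put() while max_cache results are waiting
    results = queue.Queue(maxsize=max_cache)

    def worker():
        while True:
            try:
                item = tasks.get_nowait()
            except queue.Empty:
                return  # no work left
            results.put(mapper(item))

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(nworkers)]
    for t in threads:
        t.start()
    for _ in range(len(items)):
        yield results.get()  # hands over whichever result is ready first
    for t in threads:
        t.join()


squares = sorted(unordered_map_sketch(lambda x: x * x, range(5)))
```

Because the order depends on scheduling, the usage line sorts the results before relying on them. A small max_cache keeps memory bounded when the consumer is slower than the workers, at the cost of workers occasionally waiting.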