Progress Bars

Three progress bar utilities are provided, all leveraging the excellent tqdm library.

progbar

A simple iterable wrapper, much like the default tqdm wrapper. It can be used on any iterable to display a progress bar as it gets iterated:

for x in progbar(my_list):
    do_something_slow(x)

However, unlike the standard tqdm function, progbar has two additional, useful behaviors. First, it automatically uses the ipywidgets progress bar when run inside a Jupyter notebook. Second, if given an integer n, it automatically creates range(n) to iterate on. Both of these features are available in the tqdm library, but as separate functions (tqdm_notebook and trange); progbar wraps them into a single call. It also accepts a verbose flag, which can be set to False to suppress the progress bar based on runtime variables, if so desired.
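The behaviors described above can be approximated in a few lines. The following is only a minimal sketch, not the miniutils implementation: the name progbar_sketch is hypothetical, and it assumes tqdm.auto is an acceptable way to pick the notebook or console bar. If tqdm is not installed, the sketch simply returns the plain iterable.

```python
def progbar_sketch(iterable, *a, verbose=True, **kw):
    """Hypothetical stand-in for progbar; not the miniutils implementation."""
    # An integer n is treated as range(n), as described above.
    if isinstance(iterable, int):
        iterable = range(iterable)
    # With verbose disabled, hand back the iterable with no bar at all.
    if not verbose:
        return iterable
    try:
        # tqdm.auto selects the notebook widget bar or the console bar.
        from tqdm.auto import tqdm
    except ImportError:
        # Assumption for this sketch: degrade silently when tqdm is missing.
        return iterable
    return tqdm(iterable, *a, **kw)


# The bar is switched off here, so only the iteration itself happens.
squares = [x * x for x in progbar_sketch(4, verbose=False)]
```

In this sketch the verbose value could come from a command-line flag or a logging level, so the same loop runs with or without a bar.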

miniutils.progress_bar.progbar(iterable, *a, verbose=True, **kw)

Prints a progress bar as the iterable is iterated over

Parameters:

iterable – The iterable to iterate over (an integer n is treated as range(n))
a – Arguments to get passed to tqdm (or tqdm_notebook, if in a Jupyter notebook)
verbose – Whether or not to print the progress bar at all
kw – Keyword arguments to get passed to tqdm

Returns:

The iterable that will report a progress bar

parallel_progbar

A parallel mapper based on multiprocessing that replaces Pool.map. In using Pool.map, I’ve run into unintuitive errors and, of course, wanted a progress bar for my map job. parallel_progbar solves both:

results = parallel_progbar(do_something_slow, my_list)
# Equivalent to a parallel version of [do_something_slow(x) for x in my_list]

This produces a pool of processes, and performs a map function in parallel on the items of the provided list.

Starmap behavior:

results = parallel_progbar(do_something_slow, my_list, starmap=True)
# Equivalent to a parallel version of [do_something_slow(*x) for x in my_list]

And/or flatmap behavior:

results = parallel_progbar(make_more_things, my_things, flatmap=True)
# Equivalent to a parallel version of [y for x in my_things for y in make_more_things(x)]
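To make the starmap and flatmap semantics concrete, here is a small serial sketch using only the standard library. It does not call parallel_progbar; the functions add, neighbors and span are hypothetical examples, and the code only shows what results the two flags are described as producing.

```python
from itertools import chain, starmap


def add(a, b):
    return a + b


def neighbors(x):
    return [x - 1, x + 1]


def span(lo, hi):
    return list(range(lo, hi))


pairs = [(1, 2), (3, 4)]
things = [10, 20]

# starmap: each tuple is unpacked into the mapper's arguments.
starmapped = list(starmap(add, pairs))

# flatmap: each returned list is flattened into one output sequence.
flatmapped = list(chain.from_iterable(neighbors(x) for x in things))

# Both together: unpack the arguments, then flatten the returned lists.
both = list(chain.from_iterable(starmap(span, pairs)))
```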

It also supports runtime disabling, a limit on the number of parallel processes, shuffling before mapping (in case the order of your list puts, say, the slowest items near the end), and an optional second progress bar when performing a flatmap. This second bar reports the number of items output (y in the example above), while the main progress bar counts the number of finished inputs (x).
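One way to shuffle the processing order while still returning results in the order provided is to shuffle indices rather than the items themselves. The sketch below is serial and purely illustrative; shuffled_map and its seed parameter are hypothetical, and whether miniutils restores order this way is an assumption.

```python
import random


def shuffled_map(mapper, items, seed=None):
    """Process items in random order, but return results in input order."""
    items = list(items)
    order = list(range(len(items)))
    # Shuffle the indices so that slow items are spread out over the run.
    random.Random(seed).shuffle(order)
    results = [None] * len(items)
    for i in order:
        # Each result is stored at its original position.
        results[i] = mapper(items[i])
    return results


restored = shuffled_map(lambda x: x * x, [1, 2, 3, 4], seed=0)
```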

miniutils.progress_bar.parallel_progbar(*args, **kwargs)

Performs a parallel mapping of the given iterable, reporting a progress bar as values get returned

Parameters:

mapper – The mapping function to apply to elements of the iterable
iterable – The iterable to map
nprocs – The number of processes (defaults to the number of CPUs)
starmap – If true, the iterable is expected to contain tuples and the mapper function gets each element of a tuple as an argument
flatmap – If true, flatten out the returned values if the mapper function returns a list of objects
shuffle – If true, randomly sort the elements before processing them. This might help provide more uniform runtimes if processing different objects takes different amounts of time.
verbose – Whether or not to print the progress bar
verbose_flatmap – If performing a flatmap, whether or not to report each object as it’s returned
timeout – The number of seconds to wait for each worker process after completing
kwargs – Any other keyword arguments to pass to the progress bar (see progbar)

Returns:

A list of the returned objects, in the same order as provided

iparallel_progbar

This has the exact same behavior as parallel_progbar, but produces an unordered generator instead of a list, yielding results as soon as they’re available. It also permits a max_cache argument that allows you to limit the number of computed results available to the generator.

for result in iparallel_progbar(do_something_slow, my_list):
    print("Result {} done!".format(result))
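The difference between an ordered list and an unordered generator can be illustrated with the standard library. This sketch swaps in concurrent.futures threads instead of multiprocessing to stay short and self-contained; ordered_map, unordered_imap and slow_square are hypothetical names, not part of miniutils.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(x):
    # Larger inputs take longer, so completion order differs from input order.
    time.sleep(0.01 * x)
    return x * x


def ordered_map(mapper, items, nworkers=4):
    """Return a list of results in the same order as the inputs."""
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        return list(pool.map(mapper, items))


def unordered_imap(mapper, items, nworkers=4):
    """Yield each result as soon as it is ready, in completion order."""
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        futures = [pool.submit(mapper, x) for x in items]
        for future in as_completed(futures):
            yield future.result()


ordered = ordered_map(slow_square, [3, 1, 2])
unordered = list(unordered_imap(slow_square, [3, 1, 2]))
```

The unordered variant lets the consumer start working on early results while slower inputs are still being processed, at the cost of losing the input order.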

miniutils.progress_bar.iparallel_progbar(*args, **kwargs)

Performs a parallel mapping of the given iterable, reporting a progress bar as values get returned. Yields objects as soon as they’re computed, but does not guarantee that they’ll be in the original order.

Parameters:

mapper – The mapping function to apply to elements of the iterable
iterable – The iterable to map
nprocs – The number of processes (defaults to the number of CPUs)
starmap – If true, the iterable is expected to contain tuples and the mapper function gets each element of a tuple as an argument
flatmap – If true, flatten out the returned values if the mapper function returns a list of objects
shuffle – If true, randomly sort the elements before processing them. This might help provide more uniform runtimes if processing different objects takes different amounts of time.
verbose – Whether or not to print the progress bar
verbose_flatmap – If performing a flatmap, whether or not to report each object as it’s returned
max_cache – Maximum number of mapped objects to permit in the queue at once
timeout – The number of seconds to wait for each worker process after completing
kwargs – Any other keyword arguments to pass to the progress bar (see progbar)

Returns:

A generator yielding the returned objects, in whatever order they finish being computed
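The idea behind a max_cache limit can be sketched with a bounded queue: the producer blocks once the queue is full, so computed results never pile up faster than the consumer reads them. This is an illustration only, using a single worker thread; bounded_imap is a hypothetical name, and the actual mechanism inside iparallel_progbar is an assumption.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the results


def bounded_imap(mapper, items, max_cache=2):
    """Yield mapped results while holding at most max_cache of them in memory."""
    results = queue.Queue(maxsize=max_cache)

    def worker():
        for x in items:
            # put() blocks while the queue already holds max_cache results.
            results.put(mapper(x))
        results.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = results.get()
        if item is _DONE:
            return
        yield item


incremented = list(bounded_imap(lambda x: x + 1, range(5), max_cache=2))
```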