Progress Bars

Three progress bar utilities are provided, all leveraging the excellent tqdm library.

progbar

A simple iterable wrapper, much like the default tqdm wrapper. It can be used on any iterable to display a progress bar as it gets iterated:

for x in progbar(my_list):
    do_something_slow(x)

However, unlike the standard tqdm function, progbar has two additional, useful behaviors: first, it automatically uses the ipywidgets-based progress bar when run inside a Jupyter notebook; second, if given an integer n, it automatically creates range(n) to iterate over. Both of these features exist in the tqdm library, but as separate functions; progbar wraps them into a single intuitive call. It also accepts a verbose flag that can be set to False to suppress the progress bar based on runtime conditions, if so desired.
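
For example, a minimal sketch of both behaviors (do_work here is just a placeholder for whatever slow per-item work you need):

from miniutils.progress_bar import progbar

def do_work(i):
    ...  # stand-in for some slow per-item work

# Passing an integer iterates over range(10) with a progress bar
for i in progbar(10):
    do_work(i)

# The bar can be switched off at runtime without restructuring the loop
for i in progbar(10, verbose=False):
    do_work(i)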

miniutils.progress_bar.progbar(iterable, *a, verbose=True, **kw)[source]

Prints a progress bar as the iterable is iterated over

Parameters:
  • iterable – The iterable to iterate over
  • a – Positional arguments passed through to tqdm (or tqdm_notebook, if in a Jupyter notebook); see the example below
  • verbose – Whether or not to print the progress bar at all
  • kw – Keyword arguments passed through to tqdm
Returns:

The iterable that will report a progress bar
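
Since extra positional and keyword arguments are forwarded to tqdm, its usual options can be supplied directly. For instance (a small sketch; desc and total are ordinary tqdm keyword arguments, and the generator is purely illustrative):

from miniutils.progress_bar import progbar

def items():
    # A generator whose length tqdm cannot infer on its own
    for i in range(1000):
        yield i

# desc and total are passed straight through to tqdm
for x in progbar(items(), desc="Processing", total=1000):
    pass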

parallel_progbar

A parallel mapper based on multiprocessing that replaces Pool.map. When using Pool.map, I’ve run into unintuitive errors and, of course, wanted a progress bar for my map job. Both issues are addressed by parallel_progbar:

results = parallel_progbar(do_something_slow, my_list)
# Equivalent to a parallel version of [do_something_slow(x) for x in my_list]

This spawns a pool of processes and applies the mapping function in parallel to the items of the provided iterable.

Starmap behavior:

results = parallel_progbar(do_something_slow, my_list, starmap=True)
# Equivalent to a parallel version of [do_something_slow(*x) for x in my_list]

And/or flatmap behavior:

results = parallel_progbar(make_more_things, my_things, flatmap=True)
# Equivalent to a parallel version of [y for x in my_things for y in make_more_things(x)]

It also supports runtime disabling, limiting the number of parallel processes, shuffling the inputs before mapping (in case the order of your list puts, say, a few of the slowest items near the end), and even an optional second progress bar when performing a flatmap. This second bar reports the number of items output (y in the case above), while the main progress bar tracks the number of finished inputs (x).
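
Putting a few of these options together (a sketch; make_more_things and my_things are placeholders, and the keyword names match the signature below):

from miniutils.progress_bar import parallel_progbar

def make_more_things(x):
    return [x, x * 2, x * 3]  # each input expands into several outputs

my_things = list(range(100))

results = parallel_progbar(
    make_more_things, my_things,
    nprocs=4,              # cap the worker pool at 4 processes
    shuffle=True,          # randomize order in case the slow items cluster together
    flatmap=True,          # flatten each returned list into a single result list
    verbose_flatmap=True,  # show the second bar counting flattened outputs
)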

miniutils.progress_bar.parallel_progbar(mapper, iterable, nprocs=None, starmap=False, flatmap=False, shuffle=False, verbose=True, verbose_flatmap=None)[source]

Performs a parallel mapping of the given iterable, reporting a progress bar as values get returned

Parameters:
  • mapper – The mapping function to apply to elements of the iterable
  • iterable – The iterable to map
  • nprocs – The number of processes (defaults to the number of CPUs)
  • starmap – If true, the iterable is expected to contain tuples and the mapper function gets each element of a tuple as an argument
  • flatmap – If true, flatten out the returned values if the mapper function returns a list of objects
  • shuffle – If true, randomly shuffle the elements before processing them. This might help provide more uniform runtimes if processing different objects takes different amounts of time.
  • verbose – Whether or not to print the progress bar
  • verbose_flatmap – If performing a flatmap, whether or not to report each object as it’s returned
Returns:

A list of the returned objects, in the same order as provided

iparallel_progbar

This has the same behavior as parallel_progbar, but produces an unordered generator instead of a list, yielding results as soon as they’re available. It also accepts a max_cache argument that lets you limit the number of computed results made available to the generator at once.

for result in iparallel_progbar(do_something_slow, my_list):
    print("Result {} done!".format(result))

miniutils.progress_bar.iparallel_progbar(mapper, iterable, nprocs=None, starmap=False, flatmap=False, shuffle=False, verbose=True, verbose_flatmap=None, max_cache=-1)[source]

Performs a parallel mapping of the given iterable, reporting a progress bar as values get returned. Yields objects as soon as they’re computed, but does not guarantee that they’ll be in the correct order.

Parameters:
  • mapper – The mapping function to apply to elements of the iterable
  • iterable – The iterable to map
  • nprocs – The number of processes (defaults to the number of CPUs)
  • starmap – If true, the iterable is expected to contain tuples and the mapper function gets each element of a tuple as an argument
  • flatmap – If true, flatten out the returned values if the mapper function returns a list of objects
  • shuffle – If true, randomly shuffle the elements before processing them. This might help provide more uniform runtimes if processing different objects takes different amounts of time.
  • verbose – Whether or not to print the progress bar
  • verbose_flatmap – If performing a flatmap, whether or not to report each object as it’s returned
  • max_cache – The maximum number of computed results to keep available to the generator at once, as described above
Returns:

A generator of the returned objects, yielded in whatever order they’re done being computed