.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/time_series/plot_qge_numbacs_vs_scipy_reduced.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_time_series_plot_qge_numbacs_vs_scipy_reduced.py: NumbaCS vs SciPy/NumPy -- QGE ============================= Compare run times for flow map and FTLE between NumbaCS and a pure SciPy/NumPy implementation for the QGE. .. GENERATED FROM PYTHON SOURCE LINES 9-24 .. code-block:: Python # Author: ajarvis # Hardware: Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz, Cores = 4, Threads = 8 import numpy as np from scipy.interpolate import RegularGridInterpolator from scipy.integrate import odeint from numbacs.integration import flowmap_grid_2D from numbacs.flows import get_interp_arrays_2D, get_flow_2D from numbacs.diagnostics import ftle_grid_2D import time from math import copysign, log from multiprocessing import Pool from functools import partial .. GENERATED FROM PYTHON SOURCE LINES 25-29 Get flow data and create NumbaCS flow ------------------------------------- Load velocity data, set up domain, set the integration span and direction, create interpolant of velocity data and retrieve necessary arrays. .. GENERATED FROM PYTHON SOURCE LINES 29-65 .. code-block:: Python # load in qge velocity data u = np.load("../data/qge/qge_u.npy") v = np.load("../data/qge/qge_v.npy") # set up domain nt, nx, ny = u.shape x = np.linspace(0, 1, nx) y = np.linspace(0, 2, ny) t = np.linspace(0, 1, nt) dx = x[1] - x[0] dy = y[1] - y[0] # use reduced domain or scipy will take much too long s = 4 x = x[::s] y = y[::s] t = t[::s] u = u[::s, ::s, ::s] v = v[::s, ::s, ::s] nx = len(x) ny = len(y) # set integration span and integration direction t0 = 0.0 T = 0.1 params = np.array([copysign(1, T)]) # important this is an array of type float # get interpolant arrays of velocity field grid_vel, C_eval_u, C_eval_v = get_interp_arrays_2D(t, x, y, u, v) # get flow to be integrated funcptr = get_flow_2D(grid_vel, C_eval_u, C_eval_v, extrap_mode="linear") .. GENERATED FROM PYTHON SOURCE LINES 66-69 Scipy interpolant and ODE function ---------------------------------- Create interpolant and function for SciPy ode solver. .. GENERATED FROM PYTHON SOURCE LINES 69-81 .. code-block:: Python ui = RegularGridInterpolator((t, x, y), u, method="linear", bounds_error=False, fill_value=0.0) vi = RegularGridInterpolator((t, x, y), v, method="linear", bounds_error=False, fill_value=0.0) def odeint_fun(yy, tt): pt = np.array([tt, yy[0], yy[1]]) return ui(pt)[0], vi(pt)[0] .. GENERATED FROM PYTHON SOURCE LINES 82-88 SciPy flow map and FTLE functions --------------------------------- Create functions to compute flow maps and FTLE using standard SciPy/Numpy methods. Uses scipy.integrate.odeint (implements LSODA method) for particle integration. The scipy.integrate.solve_ivp function is newer and allows the use of other solvers but odeint is faster even when solve_ivp uses LSODA as its method. .. GENERATED FROM PYTHON SOURCE LINES 88-120 .. code-block:: Python tspan = np.array([t0, t0 + T]) def scipy_odeint_flowmap_par(t0, y0): tspan = np.array([t0, t0 + T]) sol = odeint(odeint_fun, y0, tspan, rtol=1e-6, atol=1e-8) flowmap = sol[-1, :] return flowmap def numpy_ftle_par(fm, inds): i, j = inds absT = abs(T) dxdx = (fm[i + 1, j, 0] - fm[i - 1, j, 0]) / (2 * dx) dxdy = (fm[i, j + 1, 0] - fm[i, j - 1, 0]) / (2 * dy) dydx = (fm[i + 1, j, 1] - fm[i - 1, j, 1]) / (2 * dx) dydy = (fm[i, j + 1, 1] - fm[i, j - 1, 1]) / (2 * dy) off_diagonal = dxdx * dxdy + dydx * dydy C = np.array([[dxdx**2 + dydx**2, off_diagonal], [off_diagonal, dxdy**2 + dydy**2]]) max_eig = np.linalg.eigvalsh(C)[-1] if max_eig > 1: ftle = 1 / (2 * absT) * log(max_eig) else: ftle = 0 return ftle .. GENERATED FROM PYTHON SOURCE LINES 121-126 Compute SciPy/Numpy flow map, FTLE ---------------------------------- Compute flowmap, FTLE, and calculate run times for the SciPy/NumPy implementation. For this problem on this hardware, computing flow map and FTLE parallel in space (as opposed to parallel in time) was the faster implementation. .. GENERATED FROM PYTHON SOURCE LINES 126-172 .. code-block:: Python # set initial conditions n = 2 t0span = np.linspace(0, 0.1, n) [X, Y] = np.meshgrid(x, y, indexing="ij") Y0 = np.column_stack((X.ravel(), Y.ravel())) sftle = np.zeros((n, nx - 2, ny - 2), np.float64) # set parallel pool to use maximum number of threads for this hardware, # open pool num_threads = 8 pl = Pool(num_threads) # create inds to pass to ftle function xinds = np.arange(1, nx - 1) yinds = np.arange(1, ny - 1) [I, J] = np.meshgrid(xinds, yinds, indexing="ij") inds = np.column_stack((I.ravel(), J.ravel())) # compute flowmap and ftle parallel in space sfmtt = 0 sftt = 0 for k, t0 in enumerate(t0span): ks = time.perf_counter() func = partial(scipy_odeint_flowmap_par, t0) res = np.array(pl.map(func, Y0)).reshape(nx, ny, 2) kf = time.perf_counter() sfmtt += kf - ks fks = time.perf_counter() func2 = partial(numpy_ftle_par, res) sftle[k, :, :] = np.array(pl.map(func2, inds)).reshape(nx - 2, ny - 2) fkf = time.perf_counter() sftt += fkf - fks pl.close() pl.terminate() print("SciPy/NumPy flowmap and FTLE took " + f"{sfmtt + sftt:.5f} seconds for {n} iterates") print("Mean time for SciPy/NumPy flowmap and FTLE -- " + f"{(sfmtt + sftt) / n:.5f} seconds\n") print(f"Scipy flowmap took {sfmtt:.5} seconds for {n:1d} iterates") print(f"Mean time for Scipy flowmap -- {sfmtt / n:.5} seconds\n") print(f"NumPy ftle took {sftt:.5} seconds for {n:1d} iterates") print(f"Mean time for NumPy ftle -- {sftt / n:.5} seconds\n") .. rst-class:: sphx-glr-script-out .. code-block:: none SciPy/NumPy flowmap and FTLE took 820.20150 seconds for 2 iterates Mean time for SciPy/NumPy flowmap and FTLE -- 410.10075 seconds Scipy flowmap took 820.06 seconds for 2 iterates Mean time for Scipy flowmap -- 410.03 seconds NumPy ftle took 0.14228 seconds for 2 iterates Mean time for NumPy ftle -- 0.071141 seconds .. GENERATED FROM PYTHON SOURCE LINES 173-178 Compute NumbaCS flow map, FTLE -------------------------------- Compute flowmap, FTLE, and calculate run times for the NumbaCS implementation. For this problem on this hardware, computing flow map and FTLE parallel in space (as opposed to parallel in time) was the faster implementation. .. GENERATED FROM PYTHON SOURCE LINES 178-223 .. code-block:: Python ftle = np.zeros((n, nx, ny), np.float64) # first call and record warmup times wfm = time.perf_counter() flowmap_wu = flowmap_grid_2D(funcptr, t0, T, x, y, params) wu_fm = time.perf_counter() - wfm print(f"Flowmap with warm-up took {wu_fm:.5f} seconds") wf = time.perf_counter() ftle[0, :, :] = ftle_grid_2D(flowmap_wu, T, dx, dy) wu_f = time.perf_counter() - wf print(f"FTLE with warm-up took {wu_f:.5f} seconds") # initialize runtime counters fmtt = wu_fm ftt = wu_f # loop over initial times, compute flowmap and ftle for k, t0 in enumerate(t0span[1:]): ks = time.perf_counter() flowmap = flowmap_grid_2D(funcptr, t0, T, x, y, params) kf = time.perf_counter() fmtt += kf - ks fks = time.perf_counter() ftle[k, :, :] = ftle_grid_2D(flowmap, T, dx, dy) fkf = time.perf_counter() ftt += fkf - fks print("NumbaCS flowmap and FTLE took " + f"{fmtt + ftt:.5f} for {n:1d} iterates") print(f"Mean time for flowmap and FTLE -- {(fmtt + ftt) / n:.5f} seconds (w/ warmup)") print( "Mean time for flowmap and FTLE -- " + f"{(fmtt - wu_fm + ftt - wu_f) / (n - 1):.5f} seconds (w/o warmup)\n" ) print(f"NumbaCS flowmap_grid_2D took {fmtt:.5f} seconds for {n:1d} iterates") print(f"Mean time for flowmap_grid_2D -- {fmtt / n:.5f} seconds (w/ warmup)") print( "Mean time for flowmap_grid_2D -- " + f"{(fmtt - wu_fm) / (n - 1):.5f} seconds (w/o warmup)\n" ) print(f"NumbaCS ftle_grid_2D took {ftt:.5f} seconds for {n:1d} iterates") print(f"Mean time for ftle_grid_2D -- {ftt / n:.5f} seconds (w/ warmup)") print(f"Mean time for ftle_grid_2D -- {(ftt - wu_f) / (n - 1):.5f} seconds (w/o warmup)") .. rst-class:: sphx-glr-script-out .. code-block:: none Flowmap with warm-up took 1.69785 seconds FTLE with warm-up took 0.00160 seconds NumbaCS flowmap and FTLE took 1.83230 for 2 iterates Mean time for flowmap and FTLE -- 0.91615 seconds (w/ warmup) Mean time for flowmap and FTLE -- 0.13285 seconds (w/o warmup) NumbaCS flowmap_grid_2D took 1.82910 seconds for 2 iterates Mean time for flowmap_grid_2D -- 0.91455 seconds (w/ warmup) Mean time for flowmap_grid_2D -- 0.13125 seconds (w/o warmup) NumbaCS ftle_grid_2D took 0.00320 seconds for 2 iterates Mean time for ftle_grid_2D -- 0.00160 seconds (w/ warmup) Mean time for ftle_grid_2D -- 0.00159 seconds (w/o warmup) .. GENERATED FROM PYTHON SOURCE LINES 224-231 Compare timings --------------- Compare timings and quantify speed-up. The second and third columns quantify the speed-up gained using NumbaCS. The second column includes warm-up time, the speed-up would increase as *n* grows larger. The third column ignores the warm-up time and quantifies the speed-up as *n* goes to infinity and the warm-up time becomes negligible. This represents the theoretical speed-up. .. GENERATED FROM PYTHON SOURCE LINES 231-256 .. code-block:: Python stt = sfmtt + sftt ntt = fmtt + ftt stpi = (sfmtt + sftt) / n ntpi = (ntt - wu_fm - wu_f) / (n - 1) d1 = 5 d2 = 2 data = [ [round(stt, d1), "--", "--"], [round(ntt, d1), round(stt / ntt, d2), round(stpi / ntpi, d2)], ] times = [f"total time (n={n})", "speedup", "speedup (n->inf)"] methods = ["SciPy/NumPy", "NumbaCS"] format_row = "{:>25}" * (len(data[0]) + 1) print(format_row.format("", *times)) for name, vals in zip(methods, data): print(format_row.format(name, *vals)) .. rst-class:: sphx-glr-script-out .. code-block:: none total time (n=2) speedup speedup (n->inf) SciPy/NumPy 820.2015 -- -- NumbaCS 1.8323 447.63 3087.04 .. GENERATED FROM PYTHON SOURCE LINES 257-266 .. note:: The SciPy interpolation package creates a bottleneck when used to solve odes and has a large effect on the overall runtime. For this reason, we only run for 2 iterates or the code would take much too long. As *n* increaes, the speed-up would increase quite quickly as the warm-up time of the NumbaCS implementation becomes less significant. Regardless, the NumbaCS implementation achieves a drastic speed-up when used on numerical velocity data. This is largely achieved by the numbalsoda and interpolation packages, both of which utilize Numba. .. rst-class:: sphx-glr-timing **Total running time of the script:** (13 minutes 42.299 seconds) .. _sphx_glr_download_auto_examples_time_series_plot_qge_numbacs_vs_scipy_reduced.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_qge_numbacs_vs_scipy_reduced.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_qge_numbacs_vs_scipy_reduced.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: plot_qge_numbacs_vs_scipy_reduced.zip ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_