Each section includes timed examples, generally sorted from slowest to fastest.


Use pathlib instead of os

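Python's pathlib offers a set of object-oriented abstractions for working with paths. Note that in the timings below the os equivalents are several times faster, so the case for pathlib here is readability rather than raw speed.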

import pathlib
import os

directory = "parent/child/"
new_folder = "folder"
new_file = "file.txt"

Joining pathname components

%timeit os.path.join(directory, new_folder, new_file)
%timeit str(pathlib.Path(directory) / new_folder / new_file)
854 ns ± 13.5 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
7.33 µs ± 198 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Get the current working directory

%timeit os.getcwd()
%timeit pathlib.Path.cwd()
410 ns ± 9.45 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
3.88 µs ± 190 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Find the basename for a path

%timeit os.path.basename("/path/file.suffix")
%timeit pathlib.Path("/path/file.suffix").name
372 ns ± 7.28 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
2.6 µs ± 38.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
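
Outside IPython, the same comparison can be sketched with the standard-library timeit module. This is a minimal sketch rather than the original benchmark: the helper name compare and the repeat count are assumptions, and the snippet first checks that both APIs agree before timing them.

```python
import os
import pathlib
import timeit

directory = "parent/child/"
new_folder = "folder"
new_file = "file.txt"

# Build the same path with both APIs.
joined_os = os.path.join(directory, new_folder, new_file)
joined_pathlib = str(pathlib.Path(directory) / new_folder / new_file)

# Extract the final path component with both APIs.
name_os = os.path.basename("/path/file.suffix")
name_pathlib = pathlib.Path("/path/file.suffix").name


def compare(label, call_os, call_pathlib, number=10_000):
    """Hypothetical helper: time two callables and print seconds per call."""
    t_os = timeit.timeit(call_os, number=number) / number
    t_pathlib = timeit.timeit(call_pathlib, number=number) / number
    print(f"{label}: os {t_os:.2e} s, pathlib {t_pathlib:.2e} s")


compare(
    "join",
    lambda: os.path.join(directory, new_folder, new_file),
    lambda: str(pathlib.Path(directory) / new_folder / new_file),
)
compare("cwd", os.getcwd, pathlib.Path.cwd)
```

Absolute numbers will differ by machine and Python version, so treat the output as a relative comparison only.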

Use NumPy for arrays

NumPy has a lot of built-in functions that are highly optimized for array operations and are usually faster than your own implementation or one based on Python's math module.

import math
import numpy as np

a = range(10000)
%timeit [i**2 for i in a]
%timeit [math.pow(i, 2) for i in a]
%timeit np.square(np.array(a))
1.84 ms ± 38.2 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
1.08 ms ± 26.2 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
583 µs ± 54.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit [i**0.5 for i in a]
%timeit [math.sqrt(i) for i in a]
%timeit np.sqrt(np.array(a))
909 µs ± 47.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
766 µs ± 20.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
542 µs ± 13.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

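Note: If you apply a math function using a built-in function like map, it might be even faster than NumPy! Keep in mind that the NumPy timings above also include the cost of converting the range to an array with np.array(a).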
%timeit list(map(math.sqrt, a))
413 µs ± 7.19 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
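
The pure-Python variants above can be checked and timed without IPython. The following is a minimal sketch using the standard-library timeit module; the variable names and the repeat count are assumptions. A plausible explanation for map being quick, stated here as an assumption rather than a measurement, is that it calls the C-level math.sqrt directly without running loop bytecode for each element.

```python
import math
import timeit

a = range(10000)

# Three pure-Python ways to take square roots; each builds a list of floats.
by_power = [i**0.5 for i in a]
by_sqrt = [math.sqrt(i) for i in a]
by_map = list(map(math.sqrt, a))

# Time each variant over the same input.
variants = {
    "i**0.5 comprehension": lambda: [i**0.5 for i in a],
    "math.sqrt comprehension": lambda: [math.sqrt(i) for i in a],
    "map(math.sqrt)": lambda: list(map(math.sqrt, a)),
}
for label, func in variants.items():
    seconds = timeit.timeit(func, number=100) / 100
    print(f"{label:25s} {seconds * 1e6:.0f} µs per call")
```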

Use built-in functions

Built-in functions like map, min, max, all, etc. are faster than applying a function in an explicit loop. In this example a list comprehension is also faster than the loop, and map is faster still.

words = ["one", "two", "three"] * 1000


def loop_apply(words):
    L = []
    for word in words:
        L.append(word.upper())


def list_comprehension_apply(words):
    [i.upper() for i in words]


def map_apply(words):
    list(map(str.upper, words))
%timeit loop_apply(words)
%timeit list_comprehension_apply(words)
%timeit map_apply(words)
202 µs ± 15.9 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
134 µs ± 1.9 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
90.8 µs ± 1.63 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
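
In real code these functions would normally return what they build. The sketch below is a variant of the functions above that returns its result, so the three approaches can be checked against each other before they are timed. The timing loop and the repeat count are assumptions, not the original benchmark.

```python
import timeit

words = ["one", "two", "three"] * 1000


def loop_apply(words):
    # Explicit loop with repeated append calls.
    result = []
    for word in words:
        result.append(word.upper())
    return result


def list_comprehension_apply(words):
    # Same transformation as a list comprehension.
    return [word.upper() for word in words]


def map_apply(words):
    # Same transformation with the built-in map.
    return list(map(str.upper, words))


for func in (loop_apply, list_comprehension_apply, map_apply):
    seconds = timeit.timeit(lambda: func(words), number=200) / 200
    print(f"{func.__name__:26s} {seconds * 1e6:.1f} µs per call")
```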

Importing libraries

Importing a library and then calling its functions through the module name is usually faster than importing specific names, but be careful with larger libraries. Note that each timed statement below includes the import itself.

%timeit from math import sqrt; sqrt(50)
%timeit import math; math.sqrt(50)
459 ns ± 28.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
127 ns ± 6.92 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
%timeit from numpy import square; square(2)
%timeit import numpy; numpy.square(2)
1.07 µs ± 84.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
732 ns ± 7.72 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Note: When importing a larger library, like Pandas, the gain can be minimal or disappear entirely.
%timeit import pandas; pandas.DataFrame()
%timeit from pandas import DataFrame; DataFrame()
98.7 µs ± 2.19 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
101 µs ± 2.06 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
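
Because the import statement is part of each timed statement above, it can help to separate the import from the call. The sketch below is an illustration under that assumption: it checks that a repeated import returns the module already cached in sys.modules, then times only the function call with the import moved into timeit's setup. The variable names are hypothetical, and no particular ordering of the two results is implied.

```python
import sys
import timeit

import math
from math import sqrt

# After the first import, the module object is cached in sys.modules,
# so a repeated import statement is a lookup rather than a reload.
cached = sys.modules["math"]
import math as math_again

same_module = math_again is cached
same_function = sqrt is math.sqrt

# Time only the call; the import runs once in setup and is not measured.
number = 100_000
t_attribute = timeit.timeit("math.sqrt(50)", setup="import math", number=number)
t_name = timeit.timeit("sqrt(50)", setup="from math import sqrt", number=number)

print(f"math.sqrt(50): {t_attribute / number * 1e9:.1f} ns per call")
print(f"sqrt(50):      {t_name / number * 1e9:.1f} ns per call")
```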

String formatting

F-string formatting is the cleanest, fastest solution.

%timeit str(12) + " is a number"
%timeit "{} is a number".format(12)
%timeit "%s is a number" % (12)
%timeit f"{12} is a number"
143 ns ± 1.83 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
143 ns ± 6.12 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
98.4 ns ± 8.2 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
80.4 ns ± 2.36 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

Note: %s formatting can be better for readability of longer strings.
long_string = (
    "This is a slightly longer string that needs %s, %s, %s, and %s in it."
    % (123, 456, 789, 101112)
)
long_string = f"This is a slightly longer string that needs {123}, {456}, {789}, and {101112} in it."
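
As a quick check, the four styles above can be run on a variable instead of a literal to confirm that they produce the same string. This is a minimal sketch with assumed names (value, statements) and an assumed repeat count. It also wraps the % argument in a one-element tuple, since a bare tuple value on the right-hand side would otherwise be unpacked as multiple arguments.

```python
import timeit

value = 12

by_concat = str(value) + " is a number"
by_format = "{} is a number".format(value)
by_percent = "%s is a number" % (value,)  # one-element tuple, safe for any value
by_fstring = f"{value} is a number"

# Time each style as a statement string, with `value` supplied via globals.
statements = {
    "concat": 'str(value) + " is a number"',
    "format": '"{} is a number".format(value)',
    "percent": '"%s is a number" % (value,)',
    "f-string": 'f"{value} is a number"',
}
number = 100_000
for label, stmt in statements.items():
    seconds = timeit.timeit(stmt, globals={"value": value}, number=number)
    print(f"{label:10s} {seconds / number * 1e9:.1f} ns per call")
```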

String concatenation

Use the join method for faster string concatenations.

%%timeit
output = ""
for word in words:
    output += word
165 µs ± 5.45 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%%timeit
output = "".join(words)
14.9 µs ± 434 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
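
The two cells above can be wrapped in functions and compared outside IPython. This is a minimal sketch: the function names and the repeat count are assumptions, and the words list is redefined here so the snippet is self-contained.

```python
import timeit

words = ["one", "two", "three"] * 1000


def concat_loop(words):
    # Repeated += builds the string one piece at a time.
    output = ""
    for word in words:
        output += word
    return output


def concat_join(words):
    # join builds the final string in a single call.
    return "".join(words)


for func in (concat_loop, concat_join):
    seconds = timeit.timeit(lambda: func(words), number=500) / 500
    print(f"{func.__name__:12s} {seconds * 1e6:.1f} µs per call")
```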

Nested iteration

Itertools is a great option for good readability, but list comprehensions are FAST!

import itertools

a = range(100)


def nested_for_loop(a, b, c):
    L = []
    for i in a:
        for j in b:
            for k in c:
                L.append((i, j, k))


def nested_itertools(a, b, c):
    L = []
    for p in itertools.product(a, b, c):
        L.append(p)


def nested_list_comprehension(a, b, c):
    L = [(i, j, k) for i in a for j in b for k in c]
%timeit nested_for_loop(a, a, a)
%timeit nested_itertools(a, a, a)
%timeit nested_list_comprehension(a, a, a)
87 ms ± 2.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
87.1 ms ± 10.2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
62 ms ± 939 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
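
One more variant worth trying is to hand itertools.product straight to list() instead of appending inside a loop. The sketch below is an assumption-level illustration, not part of the original timings: it checks that this variant yields the same tuples in the same order as the list comprehension, and prints timings without claiming which one wins.

```python
import itertools
import timeit

a = range(20)  # smaller than the range(100) above so the sketch runs quickly


def nested_list_comprehension(a, b, c):
    return [(i, j, k) for i in a for j in b for k in c]


def nested_product_list(a, b, c):
    # Assumption: let list() consume the iterator instead of appending in a loop.
    return list(itertools.product(a, b, c))


for func in (nested_list_comprehension, nested_product_list):
    seconds = timeit.timeit(lambda: func(a, a, a), number=50) / 50
    print(f"{func.__name__:26s} {seconds * 1e3:.2f} ms per call")
```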

Initialization of lists and dictionaries

Using [] and {} to initialize an empty list or dictionary is faster than calling the list() or dict() constructors.

%timeit list()
%timeit []
57 ns ± 9.35 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
13.6 ns ± 0.293 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
%timeit dict()
%timeit {}
70.1 ns ± 1.69 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
14 ns ± 0.355 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
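
One way to see where the difference comes from is to look at the bytecode with the standard-library dis module. The sketch below assumes CPython, where opcode names are an implementation detail that can change between versions; the helper name opnames is hypothetical. The literal compiles to a dedicated build instruction, while the constructor form has to look up a name and call it.

```python
import dis


def opnames(expression):
    """Hypothetical helper: list the bytecode instruction names for an expression."""
    code = compile(expression, "<example>", "eval")
    return [instruction.opname for instruction in dis.Bytecode(code)]


literal_ops = opnames("[]")
call_ops = opnames("list()")

print("[]     ->", literal_ops)
print("list() ->", call_ops)
print("{}     ->", opnames("{}"))
print("dict() ->", opnames("dict()"))
```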