👀 AOT and JIT compilation


Just-in-time:

  • PyPy: compile the most frequent code
  • Translate some code to LLVM (Numba). then compile it

Ahead-of-time:

  • Python C-API: Writing code in C/C++/Rust and making an interface understandable for CPython
  • Cython (uses C-API of Python interpreter)
  • Numba can do AOT

📗 Back to the notebook…



🏁 Any questions?

[1]: %timeit -n100 X.dot(Y)

148 µs ± 6.04 µs per loop


🐍 Python: multiprocessing/multithreading


🐍 Python: threadingmultiprocessing/multi


🐍 thonPy: multi/multithreading/processing


Process

  • Process is a launched program
  • Every process has isolated resources:
    • virtual memory space
    • pointer to execution place
    • call stack
    • system resources, e.g. file descriptors
  • Alternative: a thread

Thread

  • Threads are executed independently
  • Threads are executed inside some process and can share memory space and system resources
  • Managed by OS

Concurrency vs. Parallelism


dEnqiA2.png


Regular stuff

import math
def integrate(f, a, b, *, n_iter=1000):
... acc=0
... step=(b-a)/n_iter
... for i in range(n_iter):
...     acc += f(a + i * step) * step
... return acc
...
>>> integrate(math.cos, 0, math.pi / 2) 1.0007851925466296
>>> integrate(math.sin, 0, math.pi) 1.9999983550656637

Multithreading!

from functools import partial
def integrate_faster(f, a, b, *, n_jobs, n_iter=1000):
    executor = ThreadPoolExecutor(max_workers=n_jobs)
    spawn = partial(executor.submit, integrate, f, 
                    n_iter=n_iter // n_jobs)
    step=(b-a)/n_jobs 
    fs=[spawn(a+i*step,a+(i+1)*step)
        for i in range(n_jobs)]
    return sum(f.result() for f in as_completed(fs))
 
>>> integrate_faster(math.cos, 0, math.pi / 2, n_jobs=2) 1.0003926476775074
>>> integrate_faster(math.sin, 0, math.pi, n_jobs=2) 1.9999995887664657

Benchmark

In [1]: timeit -n100
   ...: integrate_faster(math.cos, 0, math.pi / 2,
   ...:                 n_iter=10**6,
   ...:                 n_jobs=2)
100 loops, best of 3: 283 ms per loop
In [3]: cython
...: from libc.math cimport cos
...: def integrate(f, double a, double b, long n_iter):
...: #             ^
   ...:     cdef double
   ...:     cdef double
   ...:     cdef long i
   ...:     with nogil:
   ...:         for i in range(n_iter):
   ...:             acc += cos(a + i * step) * step
   ...:     return acc
In [3]: timeit -n100
   ...: integrate_faster(math.cos, 0, math.pi / 2,
   ...:                 n_iter=10**6, n_jobs=4)
100 loops, best of 3: 7.95 ms per loop

>>> import multiprocessing as mp
>>> p = mp.Process(target=countdown, args=(5, )) >>> p.start()
>>> 4 left
3 left
2 left
1 left
0 left
>>> p.name, p.pid
('Process-2', 65130)
>>> p.daemon
False
>>> p.join()
>>> p.exitcode
0

>>> def ponger(conn):
...     conn.send("pong")
...
>>> parent_conn, child_conn = mp.Pipe() >>> p = mp.Process(target=ponger,
... args=(child_conn, )) >>> p.start()
>>> parent_conn.recv()
'pong'
>>> p.join()

joblib

from joblib import Parallel, delayed
def integrate_faster(f, a, b, *, n_jobs, n_iter=1000, backend=None):
    step = (b - a) / n_jobs
    with Parallel(n_jobs=n_jobs,
                  backend=backend) as parallel: 
        fs = (delayed(integrate)(a + i * step,
                  a + (i + 1) * step,
                  n_iter=n_iter // n_jobs) 
              for i in range(n_jobs))
    return sum(parallel(fs))

🏁 Any questions?


Stuff to discuss

  • how to stay sane
  • practical deep learning
  • course work
  • exam

Useful links

In case you know something interesting, don’t hesitate to ping me!

Books

  1. Luciano Ramalho, Fluent Python, 1st edition, 2015.
  2. David M. Beazley, Brian K. Jones, Python Cookbook, 3rd edition, 2013.

Blogs

Python Tutorials – Real Python

Hynek’s Blog

Ship better Python software, faster

Courses

Learn computer programming | Online courses from JetBrains Academy

The Missing Semester of Your CS Education

GitHub Student Developer Pack

Many useful tools for developers. You can get this after confirming your education mail.

See education.github.com/pack.