
1.2 - Introduction to the `multiprocessing` module

September 2025

Key Concept

The `multiprocessing` module allows you to run code in parallel using multiple processor cores, enabling faster execution of computationally intensive tasks. It achieves this by creating and managing separate processes, each with its own memory space.

The `multiprocessing` module in Python provides a way to bypass the limitations of the Global Interpreter Lock (GIL) and achieve true parallelism, allowing programs to utilize multiple CPU cores simultaneously. This is particularly beneficial for CPU-bound tasks, where the execution time is dominated by computations rather than waiting for I/O operations. Instead of running code sequentially, the `multiprocessing` module allows you to distribute work across multiple processes, effectively speeding up your program.

Topics

Process Creation: Creating new processes to execute tasks concurrently.

At its core, the `multiprocessing` module enables the creation of independent processes. A process is essentially a separate instance of the Python interpreter, with its own memory space and execution context. This isolation is crucial because it allows processes to run concurrently without interfering with each other. You create a process using the `Process` class, passing it the function to execute; the `start()` method then launches that function in the new process, and `join()` waits for it to finish. Note that `Process` does not hand back the target function's return value, nor does it re-raise exceptions from the child in the parent: results have to travel through an explicit channel such as a `Queue` or `Pipe`, and the `exitcode` attribute tells you whether the child exited cleanly.
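
A minimal sketch of this pattern, with a made-up `greet()` task (not from the course): the result comes back through a `Queue`, since `Process` itself never returns the function's value.

    from multiprocessing import Process, Queue

    def greet(name, q):
        # runs in the child process; push the result into the queue
        q.put(f"hello from {name}")

    if __name__ == "__main__":
        q = Queue()
        p = Process(target=greet, args=("worker-1", q))
        p.start()                        # launch greet() in a new process
        print(q.get())                   # -> hello from worker-1
        p.join()                         # wait for the child to finish
        print("exit code:", p.exitcode)  # 0 means a clean exit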

Process Communication: Mechanisms for processes to exchange data (e.g., queues, pipes).

Since processes have separate memory spaces, they cannot directly access each other's variables, so a mechanism for communication is required. The `multiprocessing` module offers several ways to facilitate this communication (the message-passing options are sketched in code right after this list):

  • Pipes: A `Pipe` creates a connection between two processes, with a readable/writable connection object at each end. It is bidirectional by default; passing `duplex=False` gives a one-way pipe where one end only sends and the other only receives.
  • Queues: Queues are process- and thread-safe FIFO structures that allow processes to exchange data. They are particularly useful when several producers and consumers need to hand work items around without being wired to each other directly.
  • Shared Memory: Shared memory allows processes to access the same region of memory. This is the fastest form of communication, but it requires careful synchronization to avoid race conditions. The `multiprocessing` module provides `Value` and `Array` for creating and managing shared-memory objects.
  • Managers: Managers provide a way to share objects between processes, such as lists, dictionaries, and other data structures. They offer a convenient way to manage shared resources and avoid the complexities of manual synchronization.
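
A minimal sketch of the two message-passing options above, with made-up `producer()` and `echo()` tasks (shared memory and managers are not shown here).

    from multiprocessing import Process, Queue, Pipe

    def producer(q, label):
        q.put((label, label * 10))   # any picklable object can go on the queue

    def echo(conn):
        conn.send(conn.recv() + 1)   # read one value from the pipe, send a reply
        conn.close()

    if __name__ == "__main__":
        # Queue: several children push results, the parent collects them
        q = Queue()
        procs = [Process(target=producer, args=(q, i)) for i in range(3)]
        for p in procs:
            p.start()
        print(sorted(q.get() for _ in procs))   # [(0, 0), (1, 10), (2, 20)]
        for p in procs:
            p.join()

        # Pipe: a connection between exactly two endpoints (duplex by default)
        parent_end, child_end = Pipe()
        p = Process(target=echo, args=(child_end,))
        p.start()
        parent_end.send(41)
        print(parent_end.recv())                # 42
        p.join()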

Process Pools: Simplified way to manage a group of worker processes.

Creating and managing individual processes can be cumbersome, especially when dealing with a large number of tasks. The `multiprocessing` module provides the `Pool` class, which simplifies distributing tasks across a group of worker processes: a `Pool` object spins up a pool of workers and lets you submit tasks to it, which the workers then execute. The `Pool` class offers several submission methods, including `map`, `apply_async`, and `imap`. `map` applies a function to each element of an iterable and returns a list of results; `apply_async` submits a single task and returns an `AsyncResult` object, which can be used to retrieve the result later; `imap` is similar to `map`, but returns an iterator that yields results as they become available. All three are sketched below.
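
A minimal sketch of those three methods, using a made-up `square()` task:

    from multiprocessing import Pool

    def square(x):
        return x * x

    if __name__ == "__main__":
        with Pool(processes=4) as pool:
            # map: apply square() to every element, get a list back
            print(pool.map(square, range(8)))        # [0, 1, 4, 9, 16, 25, 36, 49]

            # apply_async: submit one task, retrieve the result later
            result = pool.apply_async(square, (10,))
            print(result.get())                      # 100 (blocks until ready)

            # imap: results arrive lazily, one at a time
            for value in pool.imap(square, range(4)):
                print(value)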

Global Interpreter Lock (GIL): Understanding its limitations on true parallelism in Python.

The Global Interpreter Lock (GIL) is a mechanism in CPython, the standard Python interpreter, that allows only one thread to execute Python bytecode at a time, even on multi-core processors. This limitation can hinder the performance of multi-threaded programs, especially CPU-bound ones. The `multiprocessing` module circumvents the GIL by creating separate processes, each with its own interpreter and memory space, so multiple processes can execute Python bytecode concurrently and make use of all available CPU cores. While the GIL prevents true parallelism within a single process, the `multiprocessing` module enables parallelism across processes, leading to significant performance improvements for CPU-intensive tasks. Keep in mind that communication between processes introduces overhead, so it is worth weighing the gains from parallelism against the communication costs. The sketch below makes the contrast concrete.
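
A minimal sketch of that contrast, timing a made-up CPU-bound loop with four threads and then four processes. Exact numbers depend on your machine, but on a multi-core CPU the process version should finish noticeably faster.

    import time
    from threading import Thread
    from multiprocessing import Process

    def busy(n=2_000_000):
        # pure-Python arithmetic: CPU-bound, so the GIL serializes it across threads
        total = 0
        for i in range(n):
            total += i * i

    def run_four(worker_cls, label):
        workers = [worker_cls(target=busy) for _ in range(4)]
        start = time.perf_counter()
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        print(f"{label}: {time.perf_counter() - start:.2f} s")

    if __name__ == "__main__":
        run_four(Thread, "4 threads (GIL-bound)")
        run_four(Process, "4 processes (true parallelism)")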

Exercise

Briefly consider a scenario where you could benefit from parallel processing. What tasks would be suitable?

Answer: Imagine you’re working with large satellite images (e.g., 10,000 × 10,000 pixels). You need to apply a filter (like edge detection or Gaussian blur).

  • Why parallel? Each pixel operation is mostly independent of others. The image can be divided into tiles (e.g., 100 × 100 chunks). Multiple processes can work on different tiles at the same time, and then you stitch the results together.
  • Similar tasks:
    • Applying transformations to frames in a video.
    • Running simulations on a grid (e.g., heat diffusion).
    • Large matrix multiplications (already heavily parallelized in libraries).

Parallel Image Tiles Example

    from multiprocessing import Process, Queue
    import numpy as np

    def process_tile(tile, q, idx):
        # Example: simple inversion filter
        q.put((idx, 255 - tile))

    if __name__ == "__main__":
        img = np.random.randint(0, 256, (2000, 2000), dtype=np.uint8)  # fake grayscale image
        tiles = np.array_split(img, 4)  # split into 4 chunks
        q = Queue()
        procs = []

        for i, tile in enumerate(tiles):
            p = Process(target=process_tile, args=(tile, q, i))
            p.start()
            procs.append(p)

        results = [q.get() for _ in range(len(tiles))]
        results.sort()  # sort by index
        img_out = np.vstack([r[1] for r in results])  # stitch back

        for p in procs:
            p.join()

        print("Processed image shape:", img_out.shape)


Common Pitfalls

  • Data Sharing: Remember that processes have separate memory spaces; data needs to be explicitly shared. The sketch below shows the classic symptom.
  • Overhead: Creating and managing processes has overhead; it's not always faster for small tasks.
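
A minimal sketch of the first pitfall, using a made-up counter: a child process works on its own copy of the parent's memory, so mutating an ordinary global there is invisible to the parent, while a `multiprocessing.Value` lives in shared memory and is visible to everyone.

    from multiprocessing import Process, Value

    counter = 0                      # ordinary global: NOT shared between processes

    def bump_plain():
        global counter
        counter += 1                 # only changes the child's private copy

    def bump_shared(shared):
        with shared.get_lock():      # synchronize access to the shared int
            shared.value += 1

    if __name__ == "__main__":
        shared = Value("i", 0)       # an int living in shared memory
        procs = ([Process(target=bump_plain) for _ in range(4)]
                 + [Process(target=bump_shared, args=(shared,)) for _ in range(4)])
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print("plain global:", counter)       # still 0 in the parent
        print("shared Value:", shared.value)  # 4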

Best Practices

  • Minimize Data Transfer: Reduce the amount of data passed between processes.
  • Use Process Pools: Leverage process pools for efficient task distribution. The sketch below combines both practices.
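
A minimal sketch combining both practices, reusing the tile idea from the exercise: a `Pool` handles worker management, and each task only receives (and returns) a single tile rather than the whole image.

    from multiprocessing import Pool
    import numpy as np

    def invert(tile):
        return 255 - tile                        # same inversion filter as above

    if __name__ == "__main__":
        img = np.random.randint(0, 256, (2000, 2000), dtype=np.uint8)
        tiles = np.array_split(img, 8)           # one row chunk per task
        with Pool(processes=4) as pool:
            img_out = np.vstack(pool.map(invert, tiles))
        print("Processed image shape:", img_out.shape)
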
⟵ 1.1 - HPC landscape overview · 1.3 - Using `Process`, `Queue`, `Pipe`, and `Pool` ⟶