Lean 4 (Meta)programming Cookbook

Deadlocking the Task System🔗

Contributors: Anirudh Gupta

Here we describe about deadlocks and how to prevent yourself from falling into this trap. This is less of a recipe but more of a conceptual understanding about blindly spawning too many Tasks.

To know basics of about Tasks, check out Spawning Tasks and Worker Threads before this.

What is a Deadlock? (The Stuck Pizza Shop)🔗

Imagine a pizza shop with only 4 chefs. These chefs are the Worker Threads. They are the only ones who can actually cook.

A Deadlock happens when all the chefs stop working because they are waiting for each other. Imagine 4 customers order a "Mystery Pizza."

  1. Each chef starts making the dough.

  2. Then, each chef realizes they need a secret sauce made by another chef.

  3. The Mistake: Instead of doing other work, every chef stands perfectly still with their hands out, saying: "I will not move until I get my sauce!"

Because all 4 chefs are standing still waiting, there is nobody left to actually cook the sauce. The shop is stuck forever. In programming, we call this Thread Starvation.

The Deadlocked Code🔗

In this example, we try to run 100000 tasks, this will throw an error as mentioned below. Since today machine's are modern, more powerful with multithreading and multicore processing, this number increases before the thread creation stops.

def potentialDeadlock (n : Nat := 100000) : IO Unit := do -- We try to start n tasks let tasks (List.range n).mapM fun i => IO.asTask do let subTask IO.asTask (pure i) -- ERROR: IO.wait blocks the Chef (Thread). -- If all Chefs are waiting here, -- nobody can start the subTask! match ( IO.wait subTask) with | .ok val => pure (val+1) | .error e => throw e -- The program will likely hang here forever for t in tasks do match ( IO.wait t) with | .ok res => IO.println s!"Result: {res}" | .error e => IO.println s!"Error: {e}" /- libc++abi: terminating due to uncaught exception of type lean::exception: failed to create thread -/ -- #eval potentialDeadlock

Why it fails: When you call IO.wait inside a task, you are telling the Worker Threads to sit down and wait. Since the number of threads is finite (limited), once they are all "sitting and waiting," there is no one left to run the subTask.

Solution Using IO.bindTask (The "Sticky Note" Way)🔗

The Safe Solution is to use Asynchronous Composition. Instead of making a chef wait, we give them a "Sticky Note."

When a chef finishes the dough, they write a note: "When the sauce is ready, whoever is free should finish this pizza." Then, the chef leaves the kitchen so another chef can use their spot to make the sauce!

def safeFromDeadlock (n : Nat := 1000000) : IO Unit := do let tasks (List.range n).mapM fun i => do let t1 IO.asTask (pure i) -- Use bindTask to "chain" the next part. -- does NOT block a thread but registers a callback. IO.bindTask t1 fun | .ok val => IO.asTask (pure (val + 1)) | .error e => throw e -- Now it's safe to wait from the "Outside" (Main Thread) for t in tasks do match ( IO.wait t) with | .ok res => IO.println s!"Result: {res}" | .error e => IO.println s!"Error: {e}" -- No error here, but it will take time since `n` is huge. -- #eval safeFromDeadlock

Why this is Better🔗

  • No Waiting: IO.bindTask doesn't make a chef stand still. It tells the shop manager to handle the hand-off later.

  • Thread Recycling: As soon as the first part of the task is done, the Worker Thread is released. It can immediately go back to the pool to work on the next task or a sub-task.

  • Efficiency: This allows you to handle thousands of tasks even if you only have a few CPU cores, because no thread is ever wasted just "sitting and waiting."