Lean 4 (Meta)programming Cookbook

Handling JSONL File๐Ÿ”—

Contributors: Anirudh Gupta, swarnava1207

JSONL (JSON Lines) is a format where each line is a valid JSON object. This is particularly useful for large datasets as it allows for stream processing and is less sensitive to file corruption than a single large JSON array. Check out the section on JSON for working with regular JSON files.

How to Read JSONL files๐Ÿ”—

To read a JSONL file, we use IO.FS.lines to get an array of strings and then parse each non-empty line.

def readJsonlFile (path : System.FilePath) : IO (Array Json) := do let lines โ† IO.FS.lines path let mut result := #[] for line in lines do let line := line.`String.trim` has been deprecated: Use `String.trimAscii` instead Note: The updated constant has a different type: String โ†’ String.Slice instead of String โ†’ Stringtrim if line.isEmpty then continue match Json.parse line with | .ok j => result := result.push j | .error err => throw <| IO.userError s!"Failed to parse line: {err}" return result

How to Write JSONL files๐Ÿ”—

When writing JSONL, each object must be rendered as a single line without internal newlines. You can use Lean.Json.compress to ensure the output is compact.

  1. Writing an array of JSON objects

def writeJsonlFile (path : System.FilePath) (data : Array Json) : IO Unit := do let lines := data.map (ยท.compress) IO.FS.writeFile path (String.intercalate "\n" lines.toList ++ "\n")
  1. Writing custom structures to JSONL

For large logs or datasets, it is better to use a handle and write line by line.

structure LogEntry where timestamp : String level : String message : String deriving ToJson def writeLogs (path : System.FilePath) (logs : Array LogEntry) : IO Unit := do IO.FS.withFile path .write fun handle => do for log in logs do handle.putStrLn (toJson log).compress