Use pnpm's @pnpm/cli.default-reporter for terminal output (via NDJSON)

## Summary

Introduce a typed `Reporter` trait inside pacquet whose call sites emit
strongly-typed log events. Ship one implementation that serializes those
events to newline-delimited JSON in pnpm's `@pnpm/core-loggers` schema and
pipes them through pnpm's reporter (`@pnpm/cli.default-reporter`) to render
terminal output. This gets us pnpm-identical progress, lifecycle, stats,
deprecation, peer-dependency, and summary output without reimplementing a
non-trivial reporter in Rust — and leaves a clean seam for a future native
Rust reporter that consumes the same events with no JSON round-trip.

The roadmap (#299) already lists "Implement progress reporting to the
terminal that looks the same as the one printed by pnpm" under Stage 1. This
issue proposes the cheapest path to satisfying that line item.

## Why pnpm's reporter, rather than a native Rust one

- Rendering parity with pnpm is the cardinal rule of this project. Reusing
  pnpm's reporter makes parity automatic instead of a perpetual chase. Any
  visual change pnpm makes lands in pacquet's output for free.
- The reporter is not small: ~20 log channels (`pnpm:progress`,
  `pnpm:fetching-progress`, `pnpm:stage`, `pnpm:lifecycle`, `pnpm:stats`,
  `pnpm:deprecation`, `pnpm:peer-dependency-issues`, `pnpm:summary`,
  `pnpm:request-retry`, `pnpm:execution-time`, etc.), driven through rxjs
  with throttling, an `ansi-diff` repaint loop, and chalk/boxen formatting.
  Reimplementing it just to throw it away when pacquet integrates into the
  pnpm CLI directly would be wasteful.
- pnpm already supports `--reporter=ndjson`, so the NDJSON schema (defined by
  `@pnpm/core-loggers`) is a stable, documented contract. Anything that emits
  records in that schema is automatically consumable by
  `@pnpm/cli.default-reporter`. We don't have to invent anything; we have to
  match.
- Once pacquet is integrated into the pnpm CLI as an install backend (the
  Integration Milestone in the roadmap), pnpm's reporter is what will be
  consuming pacquet's log events anyway. Aligning the schema now is a
  prerequisite for that work, not extra effort.

## Architecture: typed events, pluggable sinks

The crucial point is that **emission is a typed-event API, NDJSON is one
sink**. Call sites never construct JSON; they hand a Rust enum to a
`Reporter` trait. The wire format lives in the sink, not in the call sites.

`Reporter` follows the repo's established dependency-injection pattern
(#339): capability methods take **no `&self`** and are associated functions,
production sinks are unit structs, and call sites bind the implementation
through a generic parameter. This gives zero-cost dispatch (monomorphised
and inlined) and matches the shape already used in `crates/modules-yaml`
and `crates/cmd-shim`. Reporter-internal state — throttle map, MPSC sender
to the writer task, etc. — lives in module-level `static`s, not in `self`.

```rust
// Mirrors `@pnpm/core-loggers` 1:1, one variant per pnpm log channel.
#[derive(Serialize)]
#[serde(tag = "name", rename_all = "kebab-case")]
pub enum LogEvent {
    #[serde(rename = "pnpm:progress")]          Progress(ProgressLog),
    #[serde(rename = "pnpm:fetching-progress")] FetchingProgress(FetchingProgressLog),
    #[serde(rename = "pnpm:stage")]             Stage(StageLog),
    #[serde(rename = "pnpm:lifecycle")]         Lifecycle(LifecycleLog),
    // ... ~20 variants total
}

// Capability trait per the pnpm/pacquet#339 pattern: associated function, no `&self`.
pub trait Reporter {
    fn emit(event: &LogEvent);
}

// Today's sink. Unit struct; throttle state and the writer-task MPSC sender
// live in module-level `static`s initialised at startup. See Implementation
// notes for the concurrency model and error handling.
pub struct NdjsonReporter;
impl Reporter for NdjsonReporter {
    fn emit(e: &LogEvent) {
        // Serialize into a thread-local buffer, try_send onto a static MPSC
        // sender. Failures are swallowed (optionally `tracing::debug!`-d).
    }
}

// Tomorrow's sink: same shape, no JSON path. Renderer state (ansi-diff,
// throttling) also lives in module-level `static`s.
pub struct NativeReporter;
impl Reporter for NativeReporter {
    fn emit(e: &LogEvent) {
        match e {
            LogEvent::Progress(p)  => render_progress(p),
            LogEvent::Lifecycle(l) => render_lifecycle(l),
            // ...
        }
    }
}

// `--reporter=silent` is a no-op impl.
pub struct SilentReporter;
impl Reporter for SilentReporter {
    fn emit(_: &LogEvent) {}
}

// Call sites bind the generic; `R::emit` monomorphises away.
pub fn install<R: Reporter>(opts: InstallOpts) -> Result<()> {
    R::emit(&LogEvent::Stage(StageLog { /* ... */ }));
    // ...
}

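// Production entry point turbofishes the chosen sink.
// (`run` is an illustrative name for wherever the CLI dispatches.)
pub fn run(reporter_type: ReporterType, opts: InstallOpts) -> Result<()> {
    match reporter_type {
        ReporterType::Default | ReporterType::Ndjson => install::<NdjsonReporter>(opts),
        ReporterType::Silent                          => install::<SilentReporter>(opts),
    }
}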
```
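
For tests, the same capability shape admits a recording fake. The sketch below is illustrative only: `LogEvent` is cut down to a single hypothetical variant, `install` is a stand-in for a real emitting function, and `recorded_stages` plays the role of a `#[test]` body so the fake and its `static` stay local to it.

```rust
use std::sync::Mutex;

// Hypothetical, trimmed-down event type; the real `LogEvent` would carry
// one variant per pnpm log channel.
#[derive(Debug, Clone, PartialEq)]
pub enum LogEvent {
    Stage { prefix: String, stage: &'static str },
}

pub trait Reporter {
    fn emit(event: &LogEvent);
}

// Stand-in for a function under test: emits one event at start, one at end.
pub fn install<R: Reporter>(prefix: &str) {
    R::emit(&LogEvent::Stage { prefix: prefix.to_string(), stage: "importing_started" });
    R::emit(&LogEvent::Stage { prefix: prefix.to_string(), stage: "importing_done" });
}

// Shaped like a `#[test]` body: the fake and the `static` it records into
// are both declared inside the function, so nothing leaks between tests.
pub fn recorded_stages() -> Vec<LogEvent> {
    static EVENTS: Mutex<Vec<LogEvent>> = Mutex::new(Vec::new());

    struct RecordingReporter;
    impl Reporter for RecordingReporter {
        fn emit(event: &LogEvent) {
            EVENTS.lock().unwrap().push(event.clone());
        }
    }

    install::<RecordingReporter>("/project");
    // Drain the recording so a second call starts from an empty list.
    std::mem::take(&mut *EVENTS.lock().unwrap())
}
```

A real test would assert on the returned events directly, for example that exactly two `Stage` events were recorded in order.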

Design rules that make this replacement-friendly:

- **Define `LogEvent` to mirror `@pnpm/core-loggers` 1:1.** Both sinks share
  the same vocabulary; a future Rust reporter doesn't have to invent a new
  event taxonomy, it just renders the same events differently.
- **Don't let the wire format leak into call sites.** camelCase field names,
  the `"pnpm:..."` name strings, the bunyan-style envelope (`level`, `time`,
  `prefix`) — all of that lives in serde attributes, not in `LogEvent`'s
  in-memory shape. Call sites work with idiomatic Rust enums.
- **Carry data, not pre-formatted strings.** If a call site emits
  `msg: "Resolving foo@1.2.3..."`, both sinks are stuck with that wording
  forever. Emit a `package_id` plus a structured stage and let the reporter
  format. (pnpm's existing logs are mostly already shaped this way; follow
  upstream's lead.)
- **Throttling, repainting, and aggregation belong in the sink, not the
  emission layer.** A future native sink will reimplement those (rxjs-style
  throttle, `ansi-diff`-equivalent repaint loop), but that is
  reporter-internal work: it touches no call site and requires no JSON
  round-trip.
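
To make the "carry data, not pre-formatted strings" rule concrete, here is a small sketch. The payload fields, the `Status` enum, and both render functions are hypothetical and much simpler than pnpm's real progress log; the wording of the output is illustrative too. The point is only that two sinks can present the same events differently because the events carry identifiers and a status rather than a finished sentence.

```rust
// Hypothetical, simplified payload: identifiers and a status, no prose.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Resolved,
    Fetched,
}

pub struct ProgressLog<'a> {
    pub package_id: &'a str,
    pub status: Status,
}

// One sink might print a line per event...
pub fn render_line(log: &ProgressLog) -> String {
    let verb = match log.status {
        Status::Resolved => "resolved",
        Status::Fetched => "fetched",
    };
    format!("{verb} {}", log.package_id)
}

// ...while another aggregates the same events into a single counter line.
// That aggregation would not be possible if call sites emitted strings.
pub fn render_counters(logs: &[ProgressLog]) -> String {
    let resolved = logs.iter().filter(|l| l.status == Status::Resolved).count();
    let fetched = logs.iter().filter(|l| l.status == Status::Fetched).count();
    format!("resolved {resolved}, fetched {fetched}")
}
```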

## Implementation notes (Rust idioms)

These are the concerns a Rust-native reviewer would likely raise. The
answers are folded in up front so that the design captured here is the one
we would build.

- **Capability shape (no `&self`, generic threading).** Follows the DI
  pattern documented in pnpm/pacquet#339 and shipped in `crates/modules-yaml` /
  `crates/cmd-shim`. `Reporter::emit` is an associated function, not a
  method; functions that emit take a generic `R: Reporter` and call
  `R::emit(...)`; the production entry point turbofishes the concrete
  sink (`install::<NdjsonReporter>(opts)`). Result: dispatch monomorphises
  away, no `dyn` vtable, no `Arc` clone, and the shape stays consistent
  with the rest of the workspace. Reporter-internal state (throttle map,
  MPSC sender) lives in module-level `static`s — there is one production
  sink active per process, so global state is appropriate. Tests follow
  pnpm/pacquet#339's `static`-per-test pattern: a unit-struct fake declared inside the
  `#[test]` fn, recording into a `static Mutex<Vec<LogEvent>>` declared in
  the same body.
- **Relationship to `tracing`.** pacquet already uses `tracing` for
  developer-facing diagnostics (`crates/diagnostics/src/local_tracing.rs`),
  and that stays. The community-reflex question is "why not a
  `tracing::Subscriber`?" — pnpm's schema is a closed channel-keyed union
  with rich payloads, while `tracing` events are open key-value bags whose
  `Visit` API forces every sink to re-parse the same fields. A typed enum
  dispatching through a capability trait is a better fit for user-facing
  output. `tracing` continues to handle developer-facing logs, with no
  overlap.
- **Error handling.** `emit` must not panic or propagate errors. A
  serialization or pipe-write failure is swallowed (optionally surfaced
  via `tracing::debug!`) rather than crashing an install.
- **Sync emit, async-friendly offload.** `emit` is sync but called from
  async contexts. The recommended NDJSON path is a bounded MPSC channel
  feeding one dedicated writer task: emitters pre-serialize into a
  thread-local buffer and `try_send` onto a `static OnceLock<Sender<_>>`,
  and the writer drains and `write_all`s to stderr in batches. This
  decouples emit latency from pipe-write latency and avoids serializing
  tokio tasks on a stderr lock. (`std::io::stderr().lock()` per emit is
  the simpler fallback if benchmarks show contention isn't a problem.)
- **No `async fn emit`.** Making `emit` async would force `.await` at every
  call site for a fundamentally fire-and-forget operation, and would also
  conflict with the no-`self` capability shape. The MPSC offload above
  gets the same decoupling without coloring every emission point.
- **`emit(&LogEvent)` vs. per-channel methods.** Cargo's `Shell` exposes
  per-purpose methods (`status()`, `note()`, `warn()`). We're constrained
  to mirror pnpm's ~20 channels, so a single `emit(&LogEvent)` is the
  right call: smaller surface area, simpler to fake in tests, and the
  construct-an-enum cost is sub-millisecond on a multi-second install.
  Per-channel methods would shave nanoseconds per emit at the cost of
  ~20 trait methods and a more rigid surface — not worth it.
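
The sync-emit, offloaded-write model from the notes above could look roughly like this. It is a std-only sketch under several assumptions: a plain `std::thread` stands in for the writer task, records arrive already serialized as `&str`, there is no thread-local buffer or batching, and every function name here is hypothetical. A full channel means the record is dropped, which is one of the two overflow policies still to be decided.

```rust
use std::io::Write;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::sync::OnceLock;
use std::thread;

// Module-level state: the sending half of the bounded channel.
static SENDER: OnceLock<SyncSender<Vec<u8>>> = OnceLock::new();

/// Fire-and-forget handoff: never blocks, never panics.
/// Returns `false` when the channel is full or the writer has gone away.
pub fn try_enqueue(tx: &SyncSender<Vec<u8>>, record: &str) -> bool {
    let mut line = Vec::with_capacity(record.len() + 1);
    line.extend_from_slice(record.as_bytes());
    line.push(b'\n'); // NDJSON: one record per line
    tx.try_send(line).is_ok()
}

/// What a sink's `emit` might call once the record is serialized.
/// Failures are swallowed so reporting cannot crash an install.
pub fn emit_record(record: &str) {
    if let Some(tx) = SENDER.get() {
        let _ = try_enqueue(tx, record);
    }
}

/// Writer loop: drains the channel until every sender is dropped,
/// ignoring write errors, and hands the writer back when done.
pub fn drain<W: Write>(rx: Receiver<Vec<u8>>, mut out: W) -> W {
    for line in rx {
        let _ = out.write_all(&line);
    }
    out
}

/// Called once at startup; later calls are no-ops.
pub fn init_stderr_writer(capacity: usize) {
    let (tx, rx) = sync_channel(capacity);
    if SENDER.set(tx).is_ok() {
        thread::spawn(move || {
            drain(rx, std::io::stderr());
        });
    }
}
```

Keeping `try_enqueue` and `drain` independent of the `static` is a deliberate choice in this sketch: both can be exercised in a test with a local channel and a `Vec<u8>` writer, without touching process-wide state.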

## Proposed shape

1. Define `LogEvent` mirroring `@pnpm/core-loggers`' `Log` union, plus
   serde attributes that produce pnpm's wire format (bunyan envelope:
   `level`, `time`, `name`, plus channel-specific payload).
2. Define `trait Reporter` per pnpm/pacquet#339: associated `fn emit(event: &LogEvent)`
   with no `&self`. Functions that emit take a generic `R: Reporter` and
   call `R::emit(...)`. The production entry point turbofishes the chosen
   sink (`install::<NdjsonReporter>(opts)`).
3. Wire the existing call sites (tarball fetching, store linking, lifecycle
   scripts, install summary, etc.) to call `R::emit(...)`.
4. Implement `NdjsonReporter` as a unit struct that writes to stderr, one
   record per line. The writer-task MPSC sender, throttle map, and any
   other reporter-internal state live in module-level `static`s,
   initialised at startup (see Implementation notes).
5. By default, spawn `@pnpm/cli.default-reporter` (or pnpm itself in
   `--reporter=default` consumer mode) as a child process and pipe
   pacquet's stderr to its stdin. Open question — see below.
6. Support `--reporter=ndjson` (raw passthrough, no child process) and
   `--reporter=silent` (a `SilentReporter` unit struct with a no-op
   `emit`) to match pnpm's surface. Reporter selection happens at the
   entry point and threads through as the generic parameter — a small,
   bounded number of monomorphised copies of the install pipeline.
7. A native Rust reporter is explicitly out of scope for this issue. The
   capability shape makes it a drop-in addition (another unit-struct impl)
   later if and when the Integration Milestone is delayed or shelved.

## Performance

Cost decomposition per emit: enum construction, serialization (NDJSON path
only), and writer handoff. Dispatch is monomorphised away by the generic
capability shape, so there is no vtable cost. At install scale (~5K–10K
events for a 1300-package install), the dominant Rust-side cost is
allocation in the event payloads, not serialization.
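
One way to keep payload allocation down is to intern identifiers as `Arc<str>`, so that building an event bumps a reference count instead of copying a string. The `Interner` type below is a hypothetical, single-threaded sketch built on `std` only; crates such as `lasso` or `ustr` offer more complete versions of the same idea.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical interner: each distinct string is allocated once.
#[derive(Default)]
pub struct Interner {
    seen: HashSet<Arc<str>>,
}

impl Interner {
    /// Returns a shared handle to `s`, allocating only on first sight.
    pub fn intern(&mut self, s: &str) -> Arc<str> {
        if let Some(existing) = self.seen.get(s) {
            return Arc::clone(existing);
        }
        let fresh: Arc<str> = Arc::from(s);
        self.seen.insert(Arc::clone(&fresh));
        fresh
    }
}

/// Hypothetical payload holding interned identifiers.
pub struct ProgressLog {
    pub name: Arc<str>,
    pub version: Arc<str>,
}

/// Building an event clones two `Arc`s: no string data is copied.
pub fn progress_event(name: &Arc<str>, version: &Arc<str>) -> ProgressLog {
    ProgressLog {
        name: Arc::clone(name),
        version: Arc::clone(version),
    }
}
```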

Hygiene that keeps the Rust path near-optimal both today (NDJSON sink) and
after a native swap:

1. **Intern identifiers; don't allocate per event.** A naive
   `LogEvent::Progress(ProgressLog { name: pkg.to_string(), version: ver.to_string() })`
   allocates twice per emit. Intern package names/versions to `Arc<str>`
   (or via `lasso` / `ustr`) once when a package enters the resolver, then
   pass references through every event. This is the single largest
   Rust-side win and pays off again after the native swap.
2. **Serialize to a thread-local `Vec<u8>` then one `write_all`.**
   `serde_json::to_writer` issues many small writes; buffering matches what
   `tracing-subscriber::fmt` does and avoids amplifying writer-lock
   contention.
3. **MPSC channel for backpressure decoupling.** See Implementation notes.
   Bounded channel; on overflow, decide between drop (lossy, no stalls) and
   apply-backpressure (matches pnpm's in-process behavior).
4. **Throttle on the pacquet side, not just downstream.** pnpm's reporter
   throttles per-package progress to 200ms; if pacquet emits every event,
   serialization + pipe cost is paid for events the JS side will drop.
   Dedup `pnpm:fetching-progress` per package within a small window before
   serializing. This happens inside the `NdjsonReporter` sink, so call
   sites stay unthrottled. Take care to throttle within a channel only,
   not across channels, so the JS reporter's diff/repaint expectations
   stay intact.
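
A per-channel, per-package throttle of the kind item 4 describes might be sketched as follows. The `Throttle` type and its method are hypothetical; the clock is passed in as an argument so the behaviour can be tested deterministically, and the key allocates a `String` per check for brevity (interned ids would avoid that).

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical leading-edge throttle keyed by (channel, package id).
pub struct Throttle {
    window: Duration,
    last: HashMap<(&'static str, String), Instant>,
}

impl Throttle {
    pub fn new(window: Duration) -> Self {
        Throttle { window, last: HashMap::new() }
    }

    /// Returns `true` if the event should be serialized and sent.
    /// The first event for a key always passes; later ones pass only
    /// once `window` has elapsed since the last event that passed.
    pub fn should_emit(&mut self, channel: &'static str, package_id: &str, now: Instant) -> bool {
        let key = (channel, package_id.to_string());
        let too_soon = self
            .last
            .get(&key)
            .map_or(false, |&prev| now.duration_since(prev) < self.window);
        if too_soon {
            return false;
        }
        self.last.insert(key, now);
        true
    }
}
```

Because the channel name is part of the key, a burst on `pnpm:fetching-progress` never suppresses an event on another channel for the same package. A real sink would probably also need a rule for events that must never be dropped, such as a final completion record; that is left out here.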

**Ceiling, today (NDJSON sink):** with the four rules above, Rust-side cost
should stay below 1% of install wall time (an estimate, pending the
benchmarks below). The dominant cost is the JS reporter and the pipe, which
is structural and not fixable on the Rust side.

**Ceiling, after native swap:** the JS-side mask disappears, so Rust-side
cost matters more in absolute terms. The same hygiene continues to pay; the
renderer (`ansi-diff`-equivalent diff and repaint) becomes the dominant
cost and is reporter-internal work, unaffected by the trait shape. The
unified `emit(&LogEvent)` design loses on the order of nanoseconds per
event vs. per-channel methods — below the noise floor of an actual
install.

**Measurement.** This needs to be measured, not asserted.
`tasks/integrated-benchmark` is the vehicle. Benchmarks worth running once
an implementation lands:

- Cold install with `--reporter=silent` vs. `--reporter=ndjson` (raw, no
  Node) vs. `--reporter=default` (piped to Node) — isolates Rust-side cost
  from JS-side cost.
- The same three on a 1300-package install to expose any per-event
  linearity issue.
- Post-native-swap: native vs. ndjson-piped-to-Node, same install —
  quantifies how much of the gap is structural (Node) vs. fixable
  (renderer).

## Open questions

- **How is the JS reporter delivered?** Options:
  - Require a system Node + run `npx @pnpm/cli.default-reporter` (zero
    bundling, but adds a runtime dependency users have to satisfy).
  - Bundle a single-file `reporter.js` alongside the pacquet binary and
    require Node on PATH only.
  - Embed via an N-API addon once the Integration Milestone lands, removing
    the spawn entirely.
- **Schema version pinning.** `@pnpm/core-loggers` evolves. We should pin a
  specific pnpm version per pacquet release and add a CI check that the
  emitted schema still parses.
- **Error stream separation.** pnpm's reporter expects to control both
  stdout and stderr. We need to decide whether pacquet's own diagnostics
  (currently `tracing`-based) keep flowing to the user's stderr alongside
  rendered output, or get folded into the log stream as `pnpm:` events.

## References

- pnpm reporter package — `cli/default-reporter` ([v11 source](https://github.com/pnpm/pnpm/blob/3b12eb27de/cli/default-reporter/src/index.ts))
- Reporter dispatch wiring (`initReporter`, `--reporter=ndjson` branch) — [pnpm/src/reporter/index.ts](https://github.com/pnpm/pnpm/blob/3b12eb27de/pnpm/src/reporter/index.ts)
- Log type definitions — [`@pnpm/core-loggers`](https://github.com/pnpm/pnpm/tree/3b12eb27de/core/core-loggers/src)
- Roadmap line item — pnpm/pnpm#11633, "Implement progress reporting to the terminal that looks the same as the one printed by pnpm"


