feature/sentinel-ptr-type #2

Merged
markus merged 19 commits from feature/sentinel-ptr-type into unstable 2026-04-24 22:36:44 +02:00

Deliverables:

  • Root-cause fix for [*:0]T parser misclassification (new sentinel_ptr_type NodeKind).
  • Full audit parity across 15+ branch points in compiler/lower, CID, semantic layers, :service profile.
  • Narrow is_cstring_return child-count workaround deleted.
  • Two new test suites wired into CI (5 AST tests + 3 IR tests).
  • Bonus: documented multi-cstring segfault in std/os/fs.jan resolved.
  • Pre-sprint repair committed (Token re-export in libjanus.zig) — without it, unstable HEAD doesn't build.
Materialises the parser stack whose imports were wired in 84133faa but
whose source files were never committed. Without these, a clean checkout
cannot build — build.zig:287 and compiler/libjanus/libjanus.zig:42
reference compiler/parserstack.zig, which in turn imports
compiler/lexer/, compiler/cst/, and compiler/lower/.

The stack itself:
  - compiler/lexer/{lexer,tokens,keywords}.zig — profile-gated tokenizer
    traceable to PARSER-GRAMMAR.md §2.
  - compiler/cst/{parser,nodes,recovery}.zig — Pratt CST parser with
    explicit recovery actions, traceable to PARSER-GRAMMAR.md §4.
  - compiler/lower/{desugar,lowerer}.zig — CST to QTJIR lowering layer.
  - compiler/parserstack.zig — single-module aggregator required by
    Zig 0.16 module hygiene; colocates the three layers.

Verified: test-parser-trivia passes.
Two bugs in the `janus cid` computeSource path (Gap 6, P1 — documented
as FIXED in COMPILER_GAPS.md for this session):

1. The defer ran too early. `if (source_owned) { defer free(...); }`
   scopes the defer to the inner block, so the memory was freed before
   the computeSource / toString calls that use it — segfault on any
   invocation that read from a file rather than stdin.
   Fix: `defer if (source_owned) free(...);` keeps the guard but lets
   the defer run at end of the enclosing scope.

2. On computeSource failure the command returned normally, so the shell
   saw exit code 0 while stderr said "Error computing CID". Fix: exit
   with status 1 so pipelines and `set -e` see the failure.

Gap 7 (missing parseSource) tracked separately in
Janus/.agents/reports/2026-04-24-cid-computesource-missing-parse.md.
Adds the fallback dispatch path at the end of the subcommand chain:
if the first argument is not a known subcommand but IS a readable
regular file, treat the invocation as `janus run <file> <rest>`.

This is the path the kernel takes when executing a file with
`#!/usr/bin/env janus` as shebang — it feeds us the absolute script
path, not `run <path>`. Without this fallback, every shebanged .jan
script would hit the "Unknown command" branch.

Semantics match Unix conventions:
  - Everything after the script path becomes script args.
  - No flag consumption in this form. `--verbose` / `--trace` only
    flow through the explicit `janus run ...` form.
  - Non-zero exit codes propagate (clamped to u8 range, 1 otherwise).
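
The dispatch rule can be sketched as follows. This is an illustrative Python model, not the actual CLI code: the function name, the returned tuple shape, and the partial `KNOWN_SUBCOMMANDS` set are assumptions, and exit-code handling is left out.

```python
import os

# Partial, illustrative set; the real subcommand chain is not listed here.
KNOWN_SUBCOMMANDS = {"run", "cid"}


def dispatch(argv):
    """Classify argv (without the program name) as (command, script, args)."""
    if not argv:
        return ("help", None, [])
    first, rest = argv[0], argv[1:]
    if first in KNOWN_SUBCOMMANDS:
        # Explicit form: flags such as --verbose are parsed by the subcommand.
        return (first, None, rest)
    if os.path.isfile(first) and os.access(first, os.R_OK):
        # Shebang form: the kernel passes the script path first. Everything
        # after it is handed to the script untouched, flags included.
        return ("run", first, rest)
    return ("unknown", None, argv)
```

Checking known subcommands first keeps a file that happens to be named `run` or `cid` from shadowing the explicit form.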

Memory note: pairs with the Tier 1 shebang work shipped earlier today —
the tokenizer already skips a leading `#!` line at file start; this
closes the dispatch side of the same feature.
Lands the completed stdlib OS-layering sprint as one atomic unit.
All three phases ship together because the build.zig triple dispatch
references source paths that only exist once the std/sys/{linux,openbsd}
tree is in the repo — splitting phase-by-phase would leave intermediate
commits with dangling build references.

Doctrine anchor: Janus/.agents/doctrines/stdlib-os-layering.md

Layout changes — bridges collapse into per-triple backends:
  std/vfs_adapter.zig            → std/sys/linux/vfs_adapter.zig
                                   std/sys/openbsd/vfs_adapter.zig
  std/fs_atomic.zig              → std/sys/linux/fs_atomic.zig
  std/fs_temp.zig                → std/sys/linux/fs_temp.zig
  std/temp_fs.zig                → std/sys/linux/temp_fs.zig
  std/os/fs.zig                  → std/sys/linux/fs.zig
                                   std/sys/openbsd/fs.zig
  std/core/fs_ops.zig            → std/sys/linux/fs_ops.zig
                                   std/sys/openbsd/fs_ops.zig
  std/core/time.zig              → std/sys/linux/time.zig
                                   std/sys/openbsd/time.zig
  std/bridge/http_bridge.zig     → std/sys/linux/http.zig
                                   std/sys/openbsd/http.zig
  std/bridge/process_bridge.zig  → std/sys/linux/process.zig
                                   std/sys/openbsd/process.zig
  std/bridge/socket_bridge.zig   → std/sys/linux/net.zig
                                   std/sys/openbsd/net.zig
  std/bridge/sbi_daemon.zig      → std/sys/linux/sbi_daemon.zig
  std/ltp/bridge/l0_bridge.zig   → collapsed (subsumed by sys layer)
  std/net/bridge/net_bridge.zig  → collapsed (subsumed by sys/*/net.zig)

New facade layer (Janus-side, Advancement Loop):
  std/os/fs.jan      — fs facade, routes to std/sys/{triple}/fs.zig
  std/os/process.jan — process facade + CLI args (Tier 3)
  std/os/time.jan    — time facade, routes to std/sys/{triple}/time.zig

build.zig — canonical-triple gate (doctrine §2):
  * gateCanonicalTriple(target) panics on wasm-wasi (reserved but
    not implemented), warns on unsupported triples, silent-passes
    on linux-musl/linux-gnu/openbsd.
  * dispatchSysModule(triple, name, file) selects the right
    std/sys/{triple}/{file} source per build, replacing hardcoded
    per-OS module paths.
  * sys_random named module — dispatched so @import("sys_random")
    resolves to the right backend from inside grafted Zig files.
  * Noise test wires sys_random as a module dep so keys.zig routes
    entropy through the dispatched source.
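
A rough model of the gate and the dispatch, written in Python rather than build.zig. Mapping both Linux triples to the `linux` directory is an assumption drawn from the std/sys/{linux,openbsd} layout, and the module name `sys_fs` in the usage note is made up for illustration.

```python
import warnings

# Directory under std/sys/ for each supported triple (assumed mapping).
SYS_DIR = {"linux-musl": "linux", "linux-gnu": "linux", "openbsd": "openbsd"}
RESERVED = {"wasm-wasi"}


def gate_canonical_triple(triple):
    """Fail hard on reserved triples, warn on unsupported ones, pass the rest."""
    if triple in RESERVED:
        raise RuntimeError(f"{triple} is reserved but not implemented")
    if triple not in SYS_DIR:
        warnings.warn(f"unsupported triple: {triple}")
        return False
    return True


def dispatch_sys_module(triple, name, file):
    """Pick the per-triple backend source for a named module."""
    return name, f"std/sys/{SYS_DIR[triple]}/{file}"
```

For example, `dispatch_sys_module("openbsd", "sys_fs", "fs.zig")` yields the pair `("sys_fs", "std/sys/openbsd/fs.zig")`.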

compiler/qtjir — LLVM-C translate-c migration:
  * Zig 0.17 removed @cImport. New combined header
    compiler/qtjir/llvm_c.h is the single translate-c root;
    build.zig runs b.addTranslateC to produce the llvm_c module.
  * llvm_bindings.zig replaces the old @cImport block with
    @import("llvm_c"); other call sites are source-compatible.

src/pipeline.zig — named-module dependency resolver:
  * Scans grafted Zig sources for @import("<name>") and wires
    -M<name>=<path> + --dep <name> flags so the Zig compile can
    resolve named modules (noise, sys_random) from inside the
    graft root without escaping its module boundary.
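
The scan-and-wire step might look roughly like this. It is a Python sketch of the idea only: the regex, the function name, and the registry paths are placeholders, and the order in which the real resolver places these flags on the Zig command line is not modelled.

```python
import re

IMPORT_RE = re.compile(r'@import\("([^"]+)"\)')


def named_module_flags(source, registry):
    """Return (dep_flags, module_flags) for registered named modules.

    `registry` maps module name to source path. Imports that are not in
    the registry (std, builtin, relative .zig files) are left alone.
    """
    names = []
    for name in IMPORT_RE.findall(source):
        if name in registry and name not in names:
            names.append(name)  # keep first-seen order, drop duplicates
    dep_flags = []
    for name in names:
        dep_flags += ["--dep", name]
    module_flags = [f"-M{name}={registry[name]}" for name in names]
    return dep_flags, module_flags
```

Restricting the scan to names in the registry means an unknown import is passed through to the compiler unchanged, where it fails with the usual error.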

Verified before commit: the work was already tested by the user
across Phase 1, Phase 2, and Phase 3 sessions (1 + 2a + 2b + 2c + 2d).
Out of scope for this commit, filed separately:
  * shebang tokenizer (compiler/libjanus/tokenizer.zig) — paired
    with bfeae80f, follow-up commit.
  * Gap 6 regression CI check (.forgejo/workflows/ci.yml) — paired
    with d5242256, follow-up commit.
Tokenizer side of the Tier 1 shebang feature (dispatch side landed in
bfeae80f). At the start of tokenize(), if the source begins with `#!`
at byte 0, consume everything through the first newline so the shebang
line never reaches scanToken() and produces spurious lexer errors.

Standard scripting semantics:
  - Only the FILE START is honored. `#!` appearing mid-file has no
    special meaning (Janus has no `#` comment syntax).
  - The newline after the shebang is also consumed so line numbers
    start at 1 on the first real line of code.

Without this change, a valid `#!/usr/bin/env janus` shebang would
produce a tokenizer error and fail the lex pass — making the dispatch
fallback in bfeae80f useless.
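
The skip itself is small. A Python sketch of the behaviour described above, assuming a hypothetical helper that returns where scanning should begin and the line number to start counting from:

```python
def skip_shebang(source):
    """Return (offset, line) at which tokenizing should start."""
    if source.startswith("#!"):
        newline = source.find("\n")
        if newline == -1:
            # The whole file is a shebang line; nothing is left to scan.
            return len(source), 1
        # Consume the newline too, and keep the line counter at 1 so the
        # first real line of code is reported as line 1.
        return newline + 1, 1
    # `#!` anywhere other than byte 0 gets no special treatment.
    return 0, 1
```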
Pairs with d5242256 (Gap 6 cid computeSource use-after-free + exit code).
Adds an unmasked `./zig-out/bin/janus cid --expr "0"` call to the smoke
step — crucially WITHOUT the `|| echo ...` mask used by the surrounding
checks, because that pattern is exactly how the original regression hid
for weeks: the broken binary printed an error, returned 0, and CI
logged a cheerful fallback message.

If cid regresses again, the CI job fails loudly.
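
To see why the mask matters, the following Python sketch runs a stand-in failing command through `sh` with and without the `|| echo ...` pattern. It assumes a POSIX `sh` is available; `exit 3` replaces the real binary, and the echoed message is invented.

```python
import subprocess


def step_status(command):
    """Run a shell snippet the way a CI step would and return its status."""
    return subprocess.run(["sh", "-c", command], capture_output=True).returncode


FAILING_CHECK = "(exit 3)"  # stands in for a broken binary

# Masked: the echo succeeds, so the step reports success.
masked = step_status(f"{FAILING_CHECK} || echo 'check skipped'")

# Unmasked: the failure status reaches the CI runner.
unmasked = step_status(FAILING_CHECK)
```

Here `masked` is 0 and `unmasked` is 3, so only the unmasked form can fail the job.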
Add .sentinel_ptr_type to AstNode.NodeKind enum (in both compiler/astdb/core.zig
and the libjanus stub at compiler/libjanus/astdb/core.zig), placed adjacent to
.pointer_type as the parser comment at parser.zig:2844 always intended.

Add tests/test_sentinel_ptr_type_parse.zig: three characterization tests that
pin the AST shape for [*:0]u8, [*]u8, and []u8 parameter type annotations.
The [*:0]u8 test is intentionally RED — parser still emits .slice_type for
sentinel-bearing pointers (parser.zig:2845); Task 2 will change that emission.
The [*]u8 and []u8 tests are GREEN immediately.

Wire the new step as test-sentinel-ptr-type in build.zig following the
test-parser-graft / test-sema-imports pattern (janus_parser + astdb_core deps).
Flip the emission site at parser.zig:2846 from .slice_type to
.sentinel_ptr_type for the [*:sentinel]T many-pointer form.
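
The distinction the parser now draws can be illustrated on plain strings. This Python sketch is not the parser: it works on text rather than tokens, the function name is hypothetical, and the node kind emitted for `[*]T` is not stated in this PR, so that form is reported as `other`.

```python
def classify_bracket_type(text):
    """Return (node_kind, sentinel, element_type) for a bracketed type."""
    if text.startswith("[*:"):
        close = text.find("]")
        if close == -1:
            raise ValueError(f"unterminated sentinel pointer type: {text!r}")
        # Sentinel-bearing many-pointer: previously reported as slice_type.
        return ("sentinel_ptr_type", text[3:close], text[close + 1:])
    if text.startswith("[]"):
        # True slice: the only form that should reach the slice branch.
        return ("slice_type", None, text[2:])
    return ("other", None, text)
```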

Wire the now-green characterization test into the global test_step
in build.zig (was intentionally excluded while the suite was red).

All 5 tracked test targets remain fully green (118/118 tests pass);
all 3 AOT smoke demos build and run correctly. No destructive cascade
occurred — downstream lower.zig sites that need updating (parameter
lowering, isTypeAnnotation, semanticTypeForTypeNode, dead child-count
workaround) are catalogued in /tmp/janus-baseline/task-2-cascade.log
for Tasks 3-5.
Drop the is_cstring_return child-count hack from the return-type lowering
block (lower.zig:4421). Now that the parser emits .sentinel_ptr_type directly
(Task 2), the outer type-kind guard and inner if-chain both include it
alongside .pointer_type and .reference_type. The .slice_type branch now
handles only true []T slices. Add return-type and non-zero-sentinel
characterization tests (Tests 4-5). Update build.zig comments to reflect
Task 2 completion and Task 3+4 IR responsibility split.
Add .sentinel_ptr_type branch to the parameter-lowering dispatch in
lower.zig before the .slice_type branch. [*:sentinel]T parameters now
lower as flat pointers (type_name="ptr", is_slice=false), matching the
semantics of .pointer_type and .reference_type — no hidden _len companion
is injected on the callee side.

Add metadata-aware len-injection guard in lowerUserFunctionCall: before
injectArrayLenArg fires for each source argument, check the callee's
parameter metadata via the extern registry (for Zig FFI externs) and
already-lowered IR graphs (for user functions). Skip injection for any
argument position where the callee parameter is not is_slice. This closes
the call-site half of the cascade: string-literal args passed to [*:0]u8
parameters no longer generate a spurious hidden i64 length argument.
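
The guard reduces to one extra condition per argument. A Python sketch under simplifying assumptions: callee metadata is always available (the extern-registry and lowered-graph lookups are not modelled), arguments are `(name, is_array_like)` pairs, and the hidden length is represented by a `_len` suffix.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Callee parameter metadata; only the field the guard needs."""
    is_slice: bool


def build_call_args(args, callee_params):
    """Expand source args into lowered call args, injecting hidden lengths."""
    lowered = []
    for (name, is_array_like), param in zip(args, callee_params):
        lowered.append(name)
        # Inject the hidden length only when the callee expects a slice.
        if is_array_like and param.is_slice:
            lowered.append(f"{name}_len")
    return lowered
```

With a string literal passed to a flat-pointer parameter, `build_call_args([("path", True)], [Param(is_slice=False)])` returns `["path"]`, so the argument count matches the callee signature.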

Silence lowerStatement UNHANDLED debug log noise: add a .sentinel_ptr_type
no-op arm to the lowerStatement switch so type-annotation nodes encountered
in statement position are silently skipped rather than being caught by the
else branch and logged.

Acceptance criterion met: facade_fs_demo builds and outputs 1/0/-1
(was failing with LLVM verification error: incorrect number of arguments).

Tests: add tests/test_sentinel_ptr_type_lower.zig (3 tests, mirrors
test_lower_arrays.zig harness) pinning the lowered IR parameter shape
for [*:0]u8, []u8, and [*]u8 parameters. Wired into test_step.
- isTypeAnnotation: add .sentinel_ptr_type (covers 8+ call sites incl.
  atIntCastTargetSemantic, body-walkers, const scanners)
- is_compound in typeNodeToLayoutString struct scanner: add .sentinel_ptr_type
- typeNodeToLayoutString pointer branch: extend to .sentinel_ptr_type
- typeKindToSemanticType: extend .pointer_type arm to .sentinel_ptr_type
- atIntCastTargetSemantic: extend .pointer_type arm to .sentinel_ptr_type
- lowerExpression type-in-expression fallback: add .reference_type and .sentinel_ptr_type
- Module const scanners (x2): prioritize compound pointer/sentinel types
- lowerUserFunctionCall Phase 2: add forward-reference comment
- lowerStatement no-op arm: update comment to reflect Task 5 audit complete
CID computation now recognises [*:sentinel]T as a type node, so
content-addressable hashing of function signatures with sentinel-pointer
return or parameter types is correct.
isTypeNode now classifies [*:sentinel]T as a type node during semantic
resolution; resolveType handles it alongside pointer_type and reference_type.
Sentinel pointers ([*:0]u8 etc.) are as fundamental as plain *T for Zig
interop — the same rationale that unblocked pointer_type applies here.
type_inference.zig: extend pointer_type arm to sentinel_ptr_type
Some checks failed
Validation / test (pull_request) Waiting to run
Forbidden Paths Guard / guard (push) Waiting to run
gRPC Smoke / smoke (push) Waiting to run
Validation / test (push) Waiting to run
gRPC Smoke / smoke-musl (push) Failing after 2s
Forbidden Paths Guard / guard (pull_request) Waiting to run
gRPC Smoke / smoke (pull_request) Waiting to run
gRPC Smoke / smoke-musl (pull_request) Failing after 2s
Strategic Release Pipeline / 🧪 Sandbox - Experimental Innovation (pull_request) Has been skipped
Strategic Release Pipeline / 🔥 Forge - Alpha Integration (pull_request) Has been skipped
Strategic Release Pipeline / 🛡️ Crucible - Beta Quality Assurance (pull_request) Has been skipped
Strategic Release Pipeline / 🛡️ Crucible - Beta Quality Assurance-1 (pull_request) Has been skipped
Strategic Release Pipeline / 🛡️ Crucible - Beta Quality Assurance-2 (pull_request) Has been skipped
Strategic Release Pipeline / 🏰 Fortress - Production Release (pull_request) Has been skipped
Strategic Release Pipeline / 🏰 Fortress - Production Release-1 (pull_request) Has been skipped
Strategic Release Pipeline / 🏰 Fortress - Production Release-2 (pull_request) Has been skipped
Strategic Release Pipeline / 🏰 Fortress - Production Release-3 (pull_request) Has been skipped
Strategic Release Pipeline / 🗿 Bedrock - Enterprise LTS (pull_request) Has been skipped
Strategic Release Pipeline / 🚀 Release Orchestration (pull_request) Has been skipped
941c638835
Sentinel pointers follow the same stub resolution path as plain *T and &T;
all three return void until proper recursive type resolution is implemented.
markus merged commit 350d7b9de7 into unstable 2026-04-24 22:36:44 +02:00
Reference
janus/janus-lang!2