Summary
apm install against an Artifactory VCS proxy fronting GitLab cannot resolve projects whose path has three or more segments, i.e. projects nested inside at least one subgroup (e.g. group/subgroup/project, group/sub-a/sub-b/project). The downloader gets HTTP 404 because the parser cuts the path at the second segment and treats the rest as an in-repo virtual sub-path, so the proxy receives the wrong archive URL.
PR #1472 addresses this by replacing the parse-time heuristic with an authoritative install-time boundary probe.
Background
JFrog Artifactory exposes upstream Git hosts via a VCS-remote endpoint:
https://<artifactory-host>/<repo-key>/<owner>/<project>/archive/refs/heads/<ref>.zip
When the upstream behind a VCS remote is GitHub, paths always have the fixed owner/repo shape (two segments). When the upstream is GitLab, a project can sit at any subgroup depth:
GitHub : acme/widget
GitLab : acme/widget # flat
GitLab : acme/platform/widget # 1-level subgroup
GitLab : acme/platform/auth/widget # 2-level subgroup
GitLab : acme/platform/auth/v2/widget # arbitrary depth, no upper bound
APM exposes this proxy with two env vars:
PROXY_REGISTRY_URL=https://art.example.com/artifactory/<repo-key> — base URL for the proxy.
PROXY_REGISTRY_ONLY=1 — strict mode; refuse direct VCS fallback.
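As a rough illustration of how the endpoint template and the base URL fit together, the sketch below builds a branch archive URL for a project at any depth. The helper name archive_url is hypothetical and is not taken from the APM code base; only the URL shape comes from the template above.

```python
def archive_url(base_url: str, project_path: str, ref: str) -> str:
    """Build a branch archive URL following the VCS-remote template.

    base_url     -- value of PROXY_REGISTRY_URL (includes the repo key)
    project_path -- full project path, e.g. "acme/platform/widget"
    ref          -- branch name
    Hypothetical helper for illustration only.
    """
    base = base_url.rstrip("/")
    project = project_path.strip("/")
    return f"{base}/{project}/archive/refs/heads/{ref}.zip"
```

The same function covers the flat GitHub shape and every GitLab subgroup depth, because the project path is passed through as-is.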
The bug
For a 3+ segment dep under proxy-only mode:
# apm.yml
dependencies:
  apm:
    - group/subgroup/project#main

PROXY_REGISTRY_URL=https://art.example.com/artifactory/apm \
PROXY_REGISTRY_ONLY=1 \
apm install
APM (pre-fix) parses group/subgroup/project as owner=group, repo=subgroup, virtual_path=project — treating the third segment as an in-repo sub-path. It then asks the proxy for:
https://art.example.com/artifactory/apm/group/subgroup/archive/refs/heads/main.zip
which 404s on the proxy because no project sits at group/subgroup — the real project is one level deeper at group/subgroup/project. The install fails:
Failed to download package group/subgroup#main from Artifactory
(art.example.com/artifactory/apm). Last error: HTTP 404 ...
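The pre-fix behaviour described above can be reproduced with a simplified sketch. This is a reconstruction from the description, not APM's actual parser; naive_split is a hypothetical name.

```python
from typing import Optional, Tuple


def naive_split(dep: str) -> Tuple[str, str, str, Optional[str]]:
    """Split a shorthand dep the way the pre-fix parser is described to.

    The first two segments are always taken as owner/repo; everything
    after them is treated as an in-repo virtual path.
    Illustrative reconstruction only; assumes at least two segments.
    """
    path, _, ref = dep.partition("#")
    segments = path.split("/")
    owner, repo = segments[0], segments[1]
    virtual_path = "/".join(segments[2:])
    return owner, repo, virtual_path, ref or None


owner, repo, virtual_path, ref = naive_split("group/subgroup/project#main")
# The archive URL is then built from owner/repo only, which drops "project".
wrong_url = (
    "https://art.example.com/artifactory/apm/"
    f"{owner}/{repo}/archive/refs/heads/{ref}.zip"
)
```

Here wrong_url matches the failing request shown above: the real project segment ends up in virtual_path and never reaches the proxy.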
Why parse-time heuristics can't solve this
Earlier attempts tried to detect the boundary by inspecting segments for "well-known" marker directory names (skills/, prompts/, agents/, collections/, instructions/) and virtual file extensions (.prompt.md, .instructions.md, .chatmode.md, .agent.md). The list-based approach has two structural problems:
- Marker names are ambiguous. A GitLab subgroup or repo legitimately named agents (or prompts, etc.) is indistinguishable from the marker. Parse-time can't tell group/agents/project (where agents is a real subgroup) apart from group/agents/foo (where agents/foo is a virtual sub-path of repo group).
- The hard-coded list drifts. Every new APM primitive that introduces a marker has to be added to the constants in two places (the parser and the resolver). New file extensions need the same dual-update.
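The ambiguity in the first point can be seen in a toy version of a marker-based heuristic. The function below is a hypothetical stand-in, not the earlier APM implementation; only the marker names are taken from the list above.

```python
from typing import Tuple

# Marker directory names quoted in the text above.
MARKER_DIRS = {"skills", "prompts", "agents", "collections", "instructions"}


def guess_split(path: str) -> Tuple[str, str]:
    """Guess (repo_path, virtual_path) by cutting at the first marker.

    Hypothetical heuristic for illustration: the first non-leading
    segment that matches a marker name is assumed to start the
    virtual sub-path.
    """
    segments = path.split("/")
    for i, segment in enumerate(segments):
        if i > 0 and segment in MARKER_DIRS:
            return "/".join(segments[:i]), "/".join(segments[i:])
    return path, ""


# A real subgroup named "agents" and a virtual "agents/" directory
# are cut at the same place; the string alone cannot tell them apart.
real_subgroup = guess_split("group/agents/project")
virtual_dir = guess_split("group/agents/foo")
```

Both calls cut after group, so a project that really lives at group/agents/project would be requested at the wrong path. No amount of list maintenance changes that, since the information needed is on the server, not in the string.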
The result was a parse-time guess that was wrong often enough for nested-group paths that an end-user apm install would 404 with a confusing error pointing at the wrong owner/repo split.
The fix (PR #1472)
The boundary is determined at install time, not at parse time, by HEAD-probing the Artifactory archive URLs:
- Enumerate every plausible (owner, repo, virtual_path) split, shallow-first.
- For each candidate, HEAD the archive URL on the proxy (allow_redirects=False so a Bearer token can't leak cross-host on a redirect).
- The first candidate that responds 2xx or 3xx is the verified boundary.
- Rebuild the dependency reference at that split.
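The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: candidate_splits and probe_boundary are hypothetical names, owner and repo are collapsed into a single project path for brevity, and the HTTP call is injected as a head callable so the sketch stays free of network access. A real implementation would issue the HEAD request with redirects disabled and the proxy token attached.

```python
from typing import Callable, Iterator, Optional, Tuple


def candidate_splits(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (project_path, virtual_path) pairs, shallowest project first.

    Assumes a project path has at least two segments.
    """
    segments = [s for s in path.split("/") if s]
    for cut in range(2, len(segments) + 1):
        yield "/".join(segments[:cut]), "/".join(segments[cut:])


def probe_boundary(
    base_url: str,
    path: str,
    ref: str,
    head: Callable[[str], int],
) -> Optional[Tuple[str, str]]:
    """Return the first split whose archive URL answers 2xx or 3xx.

    `head` takes a URL and returns the HTTP status code of a HEAD
    request made without following redirects (hypothetical interface).
    Returns None when every candidate is rejected.
    """
    for project, virtual in candidate_splits(path):
        url = f"{base_url.rstrip('/')}/{project}/archive/refs/heads/{ref}.zip"
        if 200 <= head(url) < 400:
            return project, virtual
    return None


def fake_head(url: str) -> int:
    """Stand-in proxy: only group/subgroup/project exists."""
    return 200 if "/apm/group/subgroup/project/archive/" in url else 404


found = probe_boundary(
    "https://art.example.com/artifactory/apm",
    "group/subgroup/project",
    "main",
    fake_head,
)
```

With the stand-in proxy, the shallow candidate group/subgroup is rejected with 404 and the probe settles on group/subgroup/project with an empty virtual path.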
If every candidate is rejected, the resolver raises and explicitly distinguishes "missing repo" (every candidate returned a 4xx) from "auth problem" (every candidate returned 401/403), so a misconfigured PROXY_REGISTRY_TOKEN no longer masquerades as a missing repo. There is no silent fallback to a guess.
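One way to express that distinction is to classify the collected status codes once all candidates have failed. The exception names and the handling of mixed results below are assumptions made for illustration; the PR may structure this differently.

```python
from typing import Sequence


class ProxyAuthError(Exception):
    """Hypothetical: every probe was rejected with 401 or 403."""


class ProxyRepoNotFoundError(Exception):
    """Hypothetical: no candidate split exists on the proxy."""


def raise_for_rejections(statuses: Sequence[int]) -> None:
    """Raise the error matching a list of rejected probe statuses.

    Assumption: if every status is 401/403 the token is the likely
    culprit; anything else (including mixed results) is reported as
    a missing repo.
    """
    if statuses and all(s in (401, 403) for s in statuses):
        raise ProxyAuthError(
            "all probes returned 401/403; check PROXY_REGISTRY_TOKEN"
        )
    raise ProxyRepoNotFoundError(
        f"no candidate matched on the proxy (statuses: {list(statuses)})"
    )
```

Raising in both branches reflects the "no silent fallback" rule: the caller never gets a guessed split back.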
The mechanism mirrors the existing native-GitLab pattern (_try_resolve_gitlab_direct_shorthand) but uses the archive URL itself as the existence signal, so no separate metadata API call is needed against the proxy.
Coverage:
- Mode 1: explicit FQDN deps (<artifactory-host>/<prefix>/<owner>/<project>[/<more>]).
- Mode 2: bare shorthand (<owner>/<project>[/<more>]) under PROXY_REGISTRY_URL + PROXY_REGISTRY_ONLY=1. Audience-correct auth: Mode 2 uses the proxy's own PROXY_REGISTRY_TOKEN, not the upstream Git host token.
For users who need an explicit, deterministic answer without probing (e.g. air-gapped CI), the // empty-segment notation marks the repo/virtual boundary unambiguously:
<artifactory-host>/<prefix>/<owner>/<deep>/<project>//<virtual/sub-path>
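A minimal sketch of how such a reference could be split without any network call is shown below. The function name is hypothetical, and it assumes the reference carries no URL scheme (as in the notation above), since a leading https:// would itself contain a double slash.

```python
from typing import Optional, Tuple


def split_explicit_boundary(dep: str) -> Optional[Tuple[str, str]]:
    """Split a dep at the first '//' into (repo_part, virtual_path).

    Returns None when no '//' marker is present, in which case the
    caller would have to determine the boundary some other way.
    Hypothetical helper; assumes a scheme-less reference.
    """
    repo_part, marker, virtual_path = dep.partition("//")
    if not marker:
        return None
    return repo_part, virtual_path
```

For example, a reference ending in group/deep/project//skills/review would yield the repo part up to project and the virtual path skills/review.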
Linked PR
fix(artifactory): deterministic boundary probe for nested GitLab paths