[INFO][Security] Data handling & client IT sign-off — VPC install model

## Overview

**Type:** INFO — Security & data handling (client IT / finance review)
**Audience:** Client security teams, finance/compliance reviewers, delivery engineers
**Related:** [#13 Deployment](https://github.com/EngineXV/engineX/issues/13) · [#14 Storage](https://github.com/EngineXV/engineX/issues/14) · [#15 Integration + IT sign-off](https://github.com/EngineXV/engineX/issues/15) · [#4 Multi-tenant](https://github.com/EngineXV/engineX/issues/4) (Phase 2 — not required for VPC pilots)

This ticket is the **canonical security reference** for EngineX OSS deployed **in the client's cloud** (recommended pilot model). Use it for security questionnaires, IT sign-off, and finance due diligence.

---

## Executive summary (for security reviewers)

| Question | Answer |
|----------|--------|
| Where does customer data live? | **Client's cloud** — on disk at `~/.engine/` (or `ENGINE_HOME`) |
| Does EngineX require a customer database? | **No** — platform state is JSON files, not client Postgres |
| Is data sent to EngineX vendor cloud by default? | **No** — OSS runs entirely in client VPC unless optional Engine Cloud is configured |
| Multi-tenant isolation in OSS? | **No** — one install = one organization; separate install per client ([#14](https://github.com/EngineXV/engineX/issues/14)) |
| Are credentials encrypted at rest? | **Yes** — Fernet encryption when `ENGINE_CREDENTIAL_KEY` is set |
| Is the dashboard authenticated out of the box? | **No** — OSS `./engine serve` has **no built-in login**; client must use VPN/SSO/reverse proxy ([#15](https://github.com/EngineXV/engineX/issues/15) Section 3) |
| Does agent data go to LLM providers? | **Yes** — if using cloud LLM (Anthropic/OpenAI); prompts may contain business data — client chooses provider and DPA |
| SOC2 / ISO certified? | **Not claimed** — client evaluates risk for pilot; see checklist below |

---

## Deployment model (security boundary)

```mermaid
flowchart TB
 subgraph ClientVPC["Client VPC / private network"]
 Proxy["Reverse proxy TLS + SSO/VPN"]
 Eng["EngineX runtime CLI + optional :8787"]
 Vol["Encrypted volume ~/.engine/"]
 Eng --> Vol
 Proxy --> Eng
 end

 subgraph External["Outbound only (allowlisted)"]
 LLM["LLM API (client's choice)"]
 Int["Integrations Slack, Grafana, CRM, …"]
 end

 Users["Client users / approvers"] --> Proxy
 Eng --> LLM
 Eng --> Int
```

**Key principle:** EngineX vendor does **not** host customer production data in the default OSS model. The client controls network, disk encryption, access, and backup.

---

## Data classification

| Data type | Location | Encrypted at rest | Leaves client VPC? |
|-----------|----------|-------------------|---------------------|
| Session state, checkpoints | `~/.engine/agents/.../` | Volume encryption (client responsibility) | No |
| OAuth tokens, API keys | `~/.engine/credentials/` | **Yes** (Fernet + `ENGINE_CREDENTIAL_KEY`) | No |
| Credential bootstrap key | `~/.engine/secrets/credential_key` or env | File permissions / secret manager | No |
| Agent business inputs (documents, tickets) | Session memory + checkpoint JSON | Same as volume | Only if sent to **LLM** or **integration APIs** |
| Runtime logs | `~/.engine/.../runtime_logs/` | Volume encryption | No |
| LLM prompts/completions | Transient / provider-side | Provider-dependent | **Yes** — to chosen LLM vendor |
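
The "File permissions" entry for the credential bootstrap key can be checked during sign-off. A minimal sketch, assuming a POSIX host and assuming `ENGINE_HOME` replaces the `~/.engine` root when set; `key_file_is_private` and `default_key_path` are hypothetical helper names, not part of EngineX:

```python
import os
import stat
from pathlib import Path


def default_key_path() -> Path:
    """Resolve the key file location (assumption: ENGINE_HOME replaces ~/.engine)."""
    home = Path(os.environ.get("ENGINE_HOME", str(Path.home() / ".engine")))
    return home / "secrets" / "credential_key"


def key_file_is_private(path: Path) -> bool:
    """Return True if no group or other permission bits are set on the file."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A mode of `0600` passes; `0644` fails because group and others can read the key.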

**Client business databases** (Postgres, warehouse, ERP) are accessed **only if an agent tool is wired to them** — not for EngineX platform storage ([#15](https://github.com/EngineXV/engineX/issues/15)).

---

## Credentials & secrets

### Encryption at rest

- **`ENGINE_CREDENTIAL_KEY`** — Fernet key; encrypts OAuth tokens and stored API keys
- Storage: `core/engine/credentials/` → `EncryptedFileStorage`, `CredentialStore.with_encrypted_storage()`
- Bootstrap: key from env, `~/.engine/secrets/credential_key`, or generated on first setup
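
For teams that generate `ENGINE_CREDENTIAL_KEY` themselves, the sketch below produces and validates a value in the Fernet key format (32 random bytes, URL-safe base64 encoded) using only the standard library. It does not show how EngineX loads the key; both function names are hypothetical.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Generate a key in the Fernet format: 32 random bytes, URL-safe base64."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def is_valid_fernet_key(value: str) -> bool:
    """Check that the value decodes to exactly 32 bytes, as Fernet requires."""
    try:
        raw = base64.urlsafe_b64decode(value.encode("ascii"))
    except ValueError:  # covers bad padding and non-ASCII input
        return False
    return len(raw) == 32
```

A pre-start check could call `is_valid_fernet_key(os.environ.get("ENGINE_CREDENTIAL_KEY", ""))` and refuse to launch if it returns `False`.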

### Secret injection (client responsibility)

- [ ] LLM keys (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`) via env or secret manager — **not** in git/images
- [ ] Integration secrets (Grafana, Slack, CRM) via env or encrypted vault
- [ ] OAuth client secrets (`HUBSPOT_CLIENT_ID` / `SECRET`, etc.) for dashboard Connect flow
- [ ] Rotate keys per environment (dev/staging/prod separate installs)

### What EngineX does NOT do (OSS today)

- No built-in secrets rotation scheduler
- No HSM integration (client can inject via env from their vault)
- No automatic redaction of PII in logs (operational discipline required)
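
Because log redaction is left to the operator, one option is a `logging.Filter` attached to whatever handler writes runtime logs. This is a sketch only: it assumes the process logs through Python's standard `logging` module, covers email addresses and nothing else, and `RedactEmails` is a hypothetical name.

```python
import logging
import re

# Deliberately simple pattern; real PII coverage needs more than emails.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmails(logging.Filter):
    """Replace email addresses in log messages before they are written."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so values passed as args are also scrubbed.
        record.msg = EMAIL_RE.sub("[REDACTED_EMAIL]", record.getMessage())
        record.args = ()
        return True  # never drop the record, only rewrite it
```

Attach it with `handler.addFilter(RedactEmails())` on each handler that ships logs off the host.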

---

## Network security

### Recommended production controls ([#15](https://github.com/EngineXV/engineX/issues/15) Section 3)

| Control | Requirement |
|---------|-------------|
| **Dashboard (`:8787`)** | Never public internet without TLS + auth |
| **Access path** | VPN, Zero Trust, or corporate network + reverse proxy |
| **TLS** | Terminated at nginx/ALB/Cloudflare in front of EngineX |
| **Bind address** | Prefer `127.0.0.1:8787` behind proxy |
| **Inbound firewall** | Deny by default; allow only proxy/VPN sources |
| **Outbound firewall** | Allowlist LLM endpoints + agreed integration APIs |
| **Headless-only mode** | No `:8787` exposure; outbound-only |
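
The outbound allowlist belongs in the firewall, but the same list can be mirrored in an application-level check, for example inside a custom agent tool. A minimal sketch; the hostnames are illustrative placeholders for the client's approved list, and `is_outbound_allowed` is a hypothetical helper:

```python
from urllib.parse import urlsplit

# Illustrative entries only; the real list comes from the client's sign-off record.
ALLOWED_HOSTS = frozenset({"api.anthropic.com", "api.openai.com", "hooks.slack.com"})


def is_outbound_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests to exact hostnames on the allowlist."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in allowed_hosts
```

Matching exact hostnames rather than suffixes avoids accepting look-alike domains such as `api.anthropic.com.evil.example`.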

### OSS dashboard authentication gap (honest)

`./engine serve` today:

- **No** built-in user login / RBAC on the HTTP API
- **Mitigation:** Client IT places SSO (OIDC/SAML) or VPN in front of dashboard; or run headless only
- **Future:** Multi-tenant auth is [#4](https://github.com/EngineXV/engineX/issues/4) Phase 2 / Engine Cloud — **not required** for dedicated VPC installs

---

## Isolation & tenancy

| Model | Isolation mechanism | DB required? |
|-------|---------------------|--------------|
| **Client self-host (recommended)** | Separate install + separate `~/.engine` volume per client | No |
| **Dev vs prod** | Separate installs or separate volumes | No |
| **Multi-tenant SaaS** | Not in OSS — see [#4](https://github.com/EngineXV/engineX/issues/4) | Phase 2 |

**Never** run unrelated clients on one OSS install expecting tenant isolation — OSS does not enforce `tenant_id` on storage paths today.

---

## LLM & third-party data flow

```mermaid
sequenceDiagram
 participant Agent as EngineX agent
 participant Disk as ~/.engine (client VPC)
 participant LLM as LLM provider
 participant SaaS as Client integrations

 Agent->>Disk: Sessions, credentials, checkpoints
 Agent->>LLM: Prompts (may include business text)
 LLM-->>Agent: Completions
 Agent->>SaaS: API calls (Grafana, Slack, CRM)
```

**Client decisions:**

- [ ] Approved LLM vendor + **DPA** in place (Anthropic, OpenAI, or **local Ollama** in VPC for no external LLM)
- [ ] Data residency requirements → prefer in-VPC Ollama or approved regional API
- [ ] Minimize PII in prompts where possible
- [ ] Review integration scopes (OAuth scopes for HubSpot, Google, etc.)

---

## Human-in-the-loop (HITL)

- Approvers use dashboard (or terminal in dev) to approve/reject at `pause_nodes`
- Approval actions stored in session/checkpoint JSON on client disk
- **Access control:** Same as dashboard — VPN/SSO; named approver list is operational policy

---

## Backup, retention & deletion

| Item | Guidance |
|------|----------|
| **Backup** | Snapshot/replicate `~/.engine` volume (client backup tooling) |
| **Retention** | Client defines; no global retention policy in OSS |
| **Deletion** | Remove session dirs under `~/.engine/agents/<agent>/sessions/`; rotate credentials via dashboard/CLI |
| **Right to erasure** | Client controls disk — delete volume snapshot + session files |
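
Since OSS has no retention policy of its own, a scheduled job can enforce the client's. The sketch below assumes each session is a directory directly under a `sessions/` folder and uses the directory's modification time as a rough proxy for age (it changes when entries are added or removed, not on every file write). Function names are hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional


def find_expired_sessions(sessions_dir: Path, max_age_days: float,
                          now: Optional[float] = None) -> List[Path]:
    """List session directories last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p for p in sessions_dir.iterdir()
        if p.is_dir() and p.stat().st_mtime < cutoff
    )


def purge_sessions(expired: List[Path], dry_run: bool = True) -> List[Path]:
    """Delete the given directories; with dry_run=True nothing is removed."""
    if not dry_run:
        for path in expired:
            shutil.rmtree(path)
    return expired
```

Running with `dry_run=True` first gives a list to review before anything is deleted. Backups and snapshots still need their own expiry.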

---

## Security checklist (client IT sign-off)

Copy for security ticket / sign-off record:

### Infrastructure
- [ ] EngineX in **private subnet** (no unnecessary public IP)
- [ ] **Encrypted volume** for `~/.engine` / `ENGINE_HOME`
- [ ] Secrets from **cloud secret manager**, not baked into images or git
- [ ] Separate install for **dev / staging / prod**

### Application
- [ ] `ENGINE_CREDENTIAL_KEY` set and rotation documented
- [ ] `./engine setup-credentials` used for agent secrets
- [ ] Dashboard behind **TLS + VPN/SSO** (if UI enabled)
- [ ] Headless workers use least-privilege integration credentials (read-only DB where possible)

### Network
- [ ] Outbound allowlist documented (LLM + integrations)
- [ ] Inbound restricted to approved paths only

### LLM & compliance
- [ ] LLM vendor approved; DPA signed if required
- [ ] Alternative: **Ollama/local model** in VPC (no external LLM)
- [ ] Business data in prompts reviewed for pilot scope

### Operations
- [ ] Backup/restore tested for `~/.engine`
- [ ] Incident contacts documented (client ops + EngineX delivery)
- [ ] Hypercare period defined ([#15](https://github.com/EngineXV/engineX/issues/15) Section 4)

---

## Known limitations (OSS — disclose in reviews)

| Limitation | Mitigation |
|------------|------------|
| No dashboard login/RBAC | VPN/SSO proxy; or headless-only |
| Filesystem storage (no DB) | Volume backup; adequate for pilot scale ([#14](https://github.com/EngineXV/engineX/issues/14)) |
| Single-tenant per install | One client per deployment |
| LLM sends prompts externally | Client chooses vendor or local LLM |
| No SOC2 on EngineX OSS itself | Client VPC model reduces vendor data custody |

---

## What is out of scope for this ticket (Phase 2+)

- Multi-tenant auth & tenant isolation ([#4](https://github.com/EngineXV/engineX/issues/4))
- Engine Cloud hosted control plane (private)
- SOC2/ISO certification program
- Automated penetration test reports (client may run their own)

---

## Deliverables

- [ ] This issue = canonical security FAQ on GitHub
- [ ] Link from [#13](https://github.com/EngineXV/engineX/issues/13) and [#15](https://github.com/EngineXV/engineX/issues/15)
- [ ] Optional: 1-page PDF export for finance (GTM — not OSS dev assignment)
- [ ] Optional: `examples/deploy/SECURITY.md` mirror in repo (Dev 2 deploy pack)

---

## Definition of done

- [ ] Client security team can review this issue without a live call
- [ ] Finance/legal understands: data in **client VPC**, no platform DB, LLM is main external dependency
- [ ] IT sign-off checklist (above) used on every pilot ([#15](https://github.com/EngineXV/engineX/issues/15))

---

## Client FAQ — questions IT, finance, legal & procurement ask

Use this section to fill security questionnaires (SIG Lite, vendor risk forms, finance due diligence). Answers assume **recommended model: EngineX OSS self-hosted in client VPC** ([#13](https://github.com/EngineXV/engineX/issues/13)).

---

### A. Hosting & data location

**Q: Where is our data stored?**
A: On **your infrastructure** — a persistent disk/volume mounted at `~/.engine/` (or `ENGINE_HOME`) inside your VPC. Sessions, checkpoints, credentials, and logs are JSON/files on that volume. EngineX vendor does not host your production data in the default OSS model.

**Q: Do you store our data in your cloud?**
A: **No**, for standard client VPC deployment. The runtime runs in **your** AWS/GCP/Azure/on-prem network. Optional future **Engine Cloud** (vendor-hosted control plane) is separate and not required for pilots.

**Q: Which regions / countries is data processed in?**
A: **Wherever you deploy the EngineX VM/container.** You choose region, subnet, and data residency. If you use a **cloud LLM** (Anthropic/OpenAI), prompt data also flows to that provider's regions per **their** DPA — use **Ollama in-VPC** if prompts must not leave your network.

**Q: Can we keep everything inside our VPC with no external calls?**
A: **Partially.** Platform state stays in VPC. Agents still need **outbound** access to integrations you configure (Grafana, Slack, CRM, your APIs). For **zero external AI**, run a **local LLM** (Ollama) and only allowlist your internal APIs — no cloud LLM key required.

**Q: Do we need to give you VPN access to our network?**
A: **Not for production.** You operate the install. EngineX delivery may request **time-boxed** staging access during integration/hypercare ([#15](https://github.com/EngineXV/engineX/issues/15)) — document in SOW; prefer jump host + least privilege.

**Q: Can we use our existing Kubernetes / VM standards?**
A: **Yes.** EngineX is a Python process (`./engine run` or `./engine serve`) in your container or VM. You apply your golden images, patching, and pod security policies.

---

### B. Database & storage

**Q: Do we need to provision a database for EngineX?**
A: **No.** Platform persistence is **filesystem JSON**, not Postgres/MySQL ([#14](https://github.com/EngineXV/engineX/issues/14)).

**Q: Will EngineX connect to our production database?**
A: **Only if you configure an agent** to do so (e.g. read-only reconciliation queries). EngineX does not require DB credentials for itself. Prefer **read replicas**, **API gateways**, or **scoped service accounts** — see [#15](https://github.com/EngineXV/engineX/issues/15).

**Q: How do we back up EngineX data?**
A: Snapshot or replicate the **`~/.engine` volume** with your standard backup tool (EBS snapshot, Velero, rsync to DR). Test restore before go-live.

**Q: What happens if the disk fills up?**
A: Session/checkpoint writes may fail. Monitor disk usage; set retention policy (archive/delete old sessions). Size volume for pilot + growth (typical pilot: 10–50 GB depending on log volume).
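
Disk monitoring can start as a small check run from cron or a health probe. A sketch under stated assumptions: the 80% threshold is an arbitrary default, and both function names are hypothetical.

```python
import shutil
from pathlib import Path


def disk_usage_percent(path: Path) -> float:
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def needs_attention(percent_used: float, threshold: float = 80.0) -> bool:
    """True when usage is at or above the alert threshold."""
    return percent_used >= threshold
```

Point `disk_usage_percent` at the mounted `~/.engine` (or `ENGINE_HOME`) path and forward the result to the client's existing alerting.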

**Q: Is there a maximum data retention period?**
A: **You define it.** OSS does not auto-purge. Delete or archive session directories per your policy.

---

### C. Authentication & access control

**Q: Who can log into the EngineX dashboard?**
A: OSS `./engine serve` has **no built-in user accounts**. You control access via **VPN**, **Zero Trust**, and/or **SSO/reverse proxy** in front of `:8787` ([#15](https://github.com/EngineXV/engineX/issues/15) Section 3). Named HITL approvers are an **operational** list your team maintains.

**Q: Do you support SSO (Okta, Azure AD, SAML)?**
A: **Not natively in OSS today.** Place your IdP in front of the reverse proxy (standard pattern). Native SSO is planned for multi-tenant / Engine Cloud ([#4](https://github.com/EngineXV/engineX/issues/4) Phase 2).

**Q: Role-based access control (RBAC)?**
A: **Limited in OSS** — no per-user roles in the product. Mitigate with network access, separate dev/prod installs, and headless workers without UI for automated jobs.

**Q: How are API keys managed?**
A: Stored in **`~/.engine/credentials/`** (encrypted with `ENGINE_CREDENTIAL_KEY`) or injected via your secret manager into environment variables. Setup wizard: `./engine setup-credentials <agent>`.

**Q: Can we use AWS Secrets Manager / HashiCorp Vault?**
A: **Yes** — inject secrets at runtime into env vars or files; EngineX reads from env and encrypted local store. No vendor-specific Vault plugin required for pilot.

---

### D. Encryption

**Q: Is data encrypted at rest?**
A: **Credentials/tokens:** yes — Fernet encryption when `ENGINE_CREDENTIAL_KEY` is set. **Session/checkpoint files:** rely on **your volume encryption** (EBS encrypted, GCP CMEK, Azure disk encryption).

**Q: Is data encrypted in transit?**
A: **Your responsibility** for dashboard: TLS at reverse proxy. Outbound: HTTPS to LLM and integration APIs (standard TLS). Do not expose `:8787` without TLS.

**Q: Who holds the encryption keys?**
A: **`ENGINE_CREDENTIAL_KEY`** — you generate/store in env or secret manager. Volume encryption keys — your cloud KMS. EngineX vendor does not hold your production keys in VPC model.

**Q: Can we bring our own KMS (CMK)?**
A: For **disk/volume** — yes, via cloud provider. For **credential vault** — OSS uses Fernet key you supply; HSM integration is not built-in (future/custom).

---

### E. LLM & AI-specific questions

**Q: Does our data train your or OpenAI's models?**
A: EngineX OSS **does not train models**. For **Anthropic/OpenAI**, data handling is governed by **your API agreement** with that provider (typically API data not used for training on enterprise terms — verify your contract).

**Q: What data is sent to the LLM?**
A: **Prompts** built from agent context — may include document text, ticket content, log snippets, or reconciliation data depending on workflow. **Minimize PII** in pilot scope; use redaction where possible.

**Q: Can we use a private / on-prem LLM?**
A: **Yes** — [Ollama](https://ollama.com) or compatible local models via `~/.engine/configuration.json`. Prompts stay in your VPC.

**Q: Do you log prompts externally?**
A: Prompts/completions may appear in **local runtime logs** on your disk. They are **not** sent to EngineX vendor by default. Cloud LLM providers may log per their policy.

**Q: How do we prevent agents from hallucinating sensitive actions?**
A: Use **HITL `pause_nodes`** for approvals, validate→fix loops, and least-privilege tools (read-only DB). See [#10](https://github.com/EngineXV/engineX/issues/10) goal vs node criteria.

---

### F. Integrations & third parties

**Q: What third-party services can EngineX call?**
A: Only what **you configure**: LLM API, Grafana, Slack, HubSpot, Google Calendar, your internal REST APIs, etc. ([#13](https://github.com/EngineXV/engineX/issues/13) Section 5.3). Outbound firewall allowlist is recommended.

**Q: Do you pass our data to sub-processors?**
A: In **client VPC model**, **you** choose LLM and integration vendors. EngineX vendor is not a data processor hosting your files. List LLM + SaaS integrations in **your** sub-processor register.

**Q: OAuth — where are refresh tokens stored?**
A: Encrypted in `~/.engine/credentials/` on **your** disk after dashboard **Connect** flow (HubSpot, Google, Zoho).

**Q: Can integrations be read-only?**
A: **Yes** — recommended for pilots (read-only DB user, Grafana read token, Slack webhook post only). Write access requires explicit agent design + approval.

---

### G. Compliance & certifications

**Q: Are you SOC 2 / ISO 27001 certified?**
A: **EngineX OSS as software** — not a certified hosted service. **Client VPC deployment** means **you** operate the environment under **your** compliance program. We provide this security reference; you assess risk.

**Q: GDPR / right to erasure?**
A: Personal data in sessions/checkpoints is on **your disk**. Erasure = delete session files + backups per your GDPR process. EngineX vendor does not retain copies in VPC model.

**Q: HIPAA / PHI?**
A: **Not certified for HIPAA out of the box.** If PHI is in prompts or documents, require BAA with LLM provider, encrypt volume, restrict access, and legal review. Many pilots avoid PHI in scope.

**Q: PCI / card data?**
A: **Do not process PAN/CVV** through agents unless explicitly designed and approved. Not in default templates.

**Q: Audit trail for approvals?**
A: HITL decisions stored in session/checkpoint JSON + runtime logs on your volume. Export via ops console / session history. Retention = your policy.

**Q: Can we get a penetration test report?**
A: **Client may pentest their own deployment** in their VPC. Vendor pentest report — not published for OSS; scope for enterprise agreement if needed later.

---

### H. Operations & incident response

**Q: What is your SLA / uptime?**
A: **Pilot:** defined in **your** ops (systemd/K8s restart policies) + hypercare ([#15](https://github.com/EngineXV/engineX/issues/15)). EngineX vendor SLA applies to **support contract**, not your VPC infrastructure.

**Q: How do we patch vulnerabilities?**
A: `git pull` / release tag from [EngineXV/engineX](https://github.com/EngineXV/engineX), `uv sync`, restart service. You control patch cadence on your VM/container image.

**Q: What if EngineX is compromised?**
A: Isolate VM, rotate `ENGINE_CREDENTIAL_KEY` and all integration secrets, restore volume from clean backup, review runtime logs. Incident runbook is **joint** (client infra + EngineX delivery contact).

**Q: Do you have access to production after go-live?**
A: **By agreement only** — time-boxed support. Default: **client owns production access**.

**Q: Logging & SIEM integration?**
A: Forward **VM/container logs** and optionally `~/.engine/.../runtime_logs/` to your SIEM (CloudWatch, Datadog, Splunk). No built-in SIEM agent; standard file/stdout shipping.

---

### I. Commercial & procurement

**Q: What are we licensing?**
A: **EngineX OSS** — open-source runtime ([MIT License](https://github.com/EngineXV/engineX/blob/main/LICENSE) — verify on repo). Optional commercial support / Engine Cloud — separate agreement.

**Q: Is our data used to improve your product?**
A: **Not from your VPC install** — we do not receive your session files unless you explicitly share logs for support tickets.

**Q: Vendor lock-in?**
A: Agents are Python graphs in your repo/fork; data is JSON on your disk. You can export session data, stop the service, and delete the volume. LLM and integration choices are yours.

**Q: Can we review source code?**
A: **Yes** — public repo [github.com/EngineXV/engineX](https://github.com/EngineXV/engineX).

**Q: Insurance / cyber liability?**
A: **Vendor policy** — address in commercial contract; not covered by this technical ticket.

---

### J. Architecture choices clients ask about

**Q: Headless vs dashboard — which is more secure?**
A: **Headless** (`./engine run --daemon`) — smaller attack surface, no `:8787`. **Dashboard** — needed for OAuth Connect and browser HITL; secure with TLS + VPN/SSO ([#13](https://github.com/EngineXV/engineX/issues/13) Section 3).

**Q: One install for dev and prod?**
A: **No — use separate installs or volumes** to avoid credential and data bleed.

**Q: Multi-tenant — can we share one EngineX for two business units?**
A: **OSS does not isolate business units** on one install. Use **separate installs** per unit or wait for [#4](https://github.com/EngineXV/engineX/issues/4).

**Q: Do you need inbound ports open from the internet?**
A: **No** for headless. For dashboard: **only** through your proxy/VPN — not raw public `:8787`.

**Q: What ports are used?**
A: **8787** (dashboard/API, if enabled), **outbound 443** to LLM and integrations. No inbound agent port required for headless workers.
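
A deployment check can confirm that the dashboard bind address is loopback-only before the proxy is put in front of it. A minimal sketch; it assumes the bind setting is available as a `host:port` string, and `is_loopback_bind` is a hypothetical helper:

```python
import ipaddress


def is_loopback_bind(bind: str) -> bool:
    """True if a 'host:port' bind string targets a loopback address."""
    host, _, _port = bind.rpartition(":")
    host = host.strip("[]")  # allow bracketed IPv6 such as [::1]:8787
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:  # empty or non-IP host
        return False
```

`127.0.0.1:8787` and `[::1]:8787` pass; `0.0.0.0:8787` fails because it listens on every interface.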

---

### K. Questionnaire quick-fill (copy-paste block)

| Security questionnaire field | Standard answer (VPC model) |
|------------------------------|---------------------------|
| **Data storage location** | Customer cloud — customer-controlled region |
| **Data processor role** | Customer is controller; vendor provides software only |
| **Encryption at rest** | Yes — credentials encrypted; volume encryption by customer |
| **Encryption in transit** | TLS (customer-configured) + HTTPS outbound |
| **Authentication** | Customer VPN/SSO in front of UI; no native OSS login |
| **Multi-tenancy** | Dedicated install per customer (OSS) |
| **Database required** | No platform database |
| **Sub-processors** | Customer-selected LLM and SaaS APIs only |
| **SOC2** | Customer-operated environment; OSS not a certified hosted SaaS |
| **Pen test** | Customer may test their deployment |
| **Backup** | Customer snapshots `~/.engine` volume |
| **DR** | Customer DR policy on VM/volume |
| **Access to production** | Customer-controlled |

---

### L. Red flags — when to escalate to architecture review

Escalate with EngineX delivery + client CISO if the client requires **any** of:

- [ ] Multi-tenant SaaS on single URL for unrelated external customers → [#4](https://github.com/EngineXV/engineX/issues/4)
- [ ] PHI/PCI in agent scope without local LLM and legal sign-off
- [ ] Public internet dashboard without SSO
- [ ] Write access to production financial DB without HITL
- [ ] Requirement that **vendor** hosts and retains all session data (Engine Cloud scoping)

---

## Related links

- Deployment: [#13](https://github.com/EngineXV/engineX/issues/13)
- Storage (no DB): [#14](https://github.com/EngineXV/engineX/issues/14)
- Integration + TLS/VPN: [#15](https://github.com/EngineXV/engineX/issues/15)
- Multi-tenant (future): [#4](https://github.com/EngineXV/engineX/issues/4)
- Code: `core/engine/credentials/`, `core/engine/server/`, `core/engine/storage/`

---

*Version: 2026-06-28 — EngineX OSS security model + client FAQ (client VPC install)*
