docs: design persistent search runtime hosting

This commit is contained in:
mirivlad 2026-06-29 03:45:29 +08:00
parent bcedbc6143
commit f241dffd0b
1 changed files with 319 additions and 0 deletions

View File

@ -0,0 +1,319 @@
# Persistent Search Index And Runtime Provider Hosting Design
## Context
Phase 4 requires `verstak.search` to move beyond live recursive scans:
- keep search as an official plugin, not a core feature;
- persist a workspace-scoped search index;
- host `searchProviders` contributed by other plugins at runtime;
- preserve the local-first, readable vault model;
- avoid copying code or architecture from the old Verstak repository.
Current implementation status:
- `verstak.search` is a workspace item and contributes `searchProviders`;
- it searches while typing by walking files through `api.files.list`;
- it reads text-like files through `api.files.readText`;
- it opens file results through `api.workbench.openResource`;
- the contribution registry already exposes `searchProviders`;
- `api.commands.executeFor(pluginId, commandId, args)` already hosts frontend
provider handlers for other contribution types;
- desktop backend already has plugin-scoped JSON data methods
`ReadPluginDataJSON` and `WritePluginDataJSON`, but the frontend plugin API
currently exposes only `settings`.
No visual companion is needed for this design because the decision is about
runtime contracts and data flow, not layout.
## Assumptions
- The next implementation should be a small reversible platform step.
- Search remains replaceable plugin functionality; core should expose generic
storage and contribution execution only.
- `searchProviders[].handler` is treated as a command id. A provider plugin
must declare that command in `contributes.commands` and register its frontend
handler with `api.commands.register`.
- The initial persistent index is JSON-backed plugin data, not SQLite FTS and
not a background sidecar.
- The index is an optimization and discovery layer. User files remain the
source of truth.
## Alternatives Considered
### Recommended: plugin-owned index plus command-backed provider hosting
`verstak.search` owns its local index in plugin storage, uses public Files API
and events to keep it fresh, and fans out to contributed providers with
`api.commands.executeFor`.
Trade-offs:
- smallest core/runtime change;
- matches existing Files/Notes contribution execution pattern;
- keeps user-facing search outside core;
- JSON index is simpler than full text search but sufficient for the current
roadmap item.
### Core search service
Desktop core would own indexing and execute provider searches.
Trade-offs:
- easier to centralize later ranking and indexing;
- violates the current direction that search is a plugin-level user feature;
- makes core understand official search semantics too early.
### Sidecar or SQLite FTS indexer now
Introduce a dedicated indexer process or SQLite FTS schema immediately.
Trade-offs:
- better long-term scalability;
- too large for the current milestone;
- adds migration, lifecycle, and sync/cache policy decisions before the basic
provider runtime contract is proven.
## Chosen Design
Use the recommended approach.
`verstak.search` becomes both:
- the workspace search UI;
- the runtime host for all enabled `searchProviders`.
The desktop and SDK expose a generic frontend storage surface:
```ts
api.storage.data.read(name: string): Promise<Record<string, unknown>>
api.storage.data.write(name: string, data: Record<string, unknown>): Promise<void>
```
This maps to existing plugin-scoped backend methods. It requires
`storage.namespace`, follows current plugin ownership rules, and does not let a
plugin read another plugin's namespace.
The Search plugin stores its persistent index as plugin data named
`search-index`. Settings remain for user preferences only.
## Provider Runtime Contract
`searchProviders` keep their current manifest shape:
```json
{
"id": "verstak.search.vault-text",
"label": "Vault Text Search",
"handler": "verstak.search.searchVaultText"
}
```
The `handler` value must name a command declared by the same plugin:
```json
{
"contributes": {
"commands": [
{
"id": "verstak.search.searchVaultText",
"title": "Search Vault Text",
"handler": "searchVaultText"
}
],
"searchProviders": [
{
"id": "verstak.search.vault-text",
"label": "Vault Text Search",
"handler": "verstak.search.searchVaultText"
}
]
}
}
```
The provider registers the command handler at mount time:
```js
api.commands.register('verstak.search.searchVaultText', searchVaultText)
```
Runtime availability:
- this milestone does not auto-start unloaded frontend bundles or sidecars;
- a provider is executable only after its command handler is registered in the
current frontend runtime;
- declared but unregistered providers are skipped and reported as unavailable;
- later sidecar/background activation can extend this without changing the
provider manifest shape.
Provider input:
```ts
{
source: 'search',
providerId: string,
query: string,
workspaceRootPath: string,
limit: number
}
```
Provider output:
```ts
{
results: SearchResult[]
}
```
`SearchResult`:
```ts
{
id?: string,
path?: string,
title?: string,
snippet?: string,
matchType?: string,
providerId?: string,
providerLabel?: string,
type?: 'file' | 'folder' | 'activity' | 'worklog' | string,
openable?: boolean,
line?: number,
score?: number,
resource?: {
kind: 'vault-file',
path: string,
mode?: 'view' | 'edit'
}
}
```
The Search host normalizes missing optional fields. Invalid provider responses
are ignored with a visible status warning; they do not fail the whole search.
## Persistent Index Shape
The first index version is intentionally small:
```json
{
"version": 1,
"workspaceRootPath": "Project",
"builtAt": "2026-06-29T00:00:00Z",
"entries": [
{
"path": "Project/Docs/case.md",
"name": "case.md",
"type": "file",
"extension": "md",
"size": 1234,
"modifiedAt": "2026-06-29T00:00:00Z",
"text": "short normalized searchable text or snippet"
}
]
}
```
Rules:
- index only the current workspace root;
- store vault-relative slash paths;
- index folders and regular files;
- read content only for text-like files already handled by current Search;
- rely on the host `readText` limit for large text files;
- store short normalized text, not full arbitrary binary data;
- rebuild when version or workspace root differs.
## Index Lifecycle
On mount:
1. Read `api.storage.data.read('search-index')`.
2. If version/root matches, use it immediately.
3. If missing or stale, build an index from `api.files.list` and text reads.
4. Write the built index with `api.storage.data.write`.
5. Subscribe to `file.changed`.
On `file.changed`:
- if event path is outside the workspace, ignore it;
- for create/update, refresh that path's metadata/content if readable;
- for delete/trash, remove the path and descendants;
- for move, remove `fromPath` when present and refresh the new path;
- write the updated index after the change is applied.
If an incremental update fails, mark the index stale in UI and continue serving
the last usable index until rebuild succeeds.
## Search Flow
For a query shorter than two characters, return no results.
For a valid query:
1. Query local persistent index for path, name, folder, and text matches.
2. List enabled `searchProviders` through `api.contributions.list`.
3. Call each provider except duplicates that would recurse into the same
in-flight command.
4. Merge and normalize results.
5. Sort by score, then provider order, then path/title.
6. Render provider label, match type, path/title, snippet, and open action when
the result has a supported resource.
Provider failures are isolated. The status line reports how many providers
failed without hiding successful results.
## Error Handling
- Missing `storage` API: fall back to live scanning and show a degraded status.
- Missing `storage.namespace` permission: Search runs without persistence and
reports degraded persistence.
- Corrupt index JSON: ignore it, rebuild, and overwrite only after a successful
rebuild.
- Provider command not declared or not registered: skip that provider and show a
warning count.
- File read errors: skip that file, matching the current Search behavior.
- Event subscription errors: keep search usable, mark index refresh as manual.
## Testing And Verification
Expected test coverage:
- SDK types include `api.storage.data.read/write` and search result/provider
contracts.
- SDK mock API stores plugin data separately from settings.
- Desktop frontend bridge exposes `api.storage.data.read/write`.
- Desktop bridge smoke test verifies plugin data round-trip through Wails mock.
- Search plugin smoke test verifies:
- persisted index is read before scanning;
- missing/stale index triggers build and write;
- query uses persisted entries;
- `file.changed` updates or removes entries;
- provider fan-out uses `contributions.list('searchProviders')` and
`commands.executeFor`;
- provider failure does not hide local results.
- Roadmap and official plugin docs are updated after implementation.
Manual smoke after implementation:
- start desktop frontend/dev flow used by the repository;
- open Search workspace item;
- search a known text file;
- reload/remount Search and confirm results come from stored index;
- enable a test provider and confirm its results appear with provider label.
## Out Of Scope
- SQLite FTS;
- typo/layout tolerant search;
- binary OCR or PDF extraction;
- background sidecar indexing;
- cross-workspace global search;
- sync policy for index cache;
- journal/worklog/activity reconstruction implementation.
Those remain later roadmap items.