Persistent Search Index And Runtime Provider Hosting Design
Context
Phase 4 requires verstak.search to move beyond live recursive scans:
- keep search as an official plugin, not a core feature;
- persist a workspace-scoped search index;
- host searchProviders contributed by other plugins at runtime;
- preserve the local-first, readable vault model;
- avoid copying code or architecture from the old Verstak repository.
Current implementation status:
- verstak.search is a workspace item and contributes searchProviders;
- it searches while typing by walking files through api.files.list;
- it reads text-like files through api.files.readText;
- it opens file results through api.workbench.openResource;
- the contribution registry already exposes searchProviders;
- api.commands.executeFor(pluginId, commandId, args) already hosts frontend provider handlers for other contribution types;
- desktop backend already has plugin-scoped JSON data methods ReadPluginDataJSON and WritePluginDataJSON, but the frontend plugin API currently exposes only settings.
UX direction:
- the primary search entry point belongs in the workspace header next to the workspace title;
- verstak.search remains the owner of indexing and provider hosting;
- a standalone Search workspace item may stay as an expanded results surface, but it is not the primary entry point.
No visual companion is needed for this design because the decision is about runtime contracts and data flow, not layout.
Assumptions
- The next implementation should be a small reversible platform step.
- Search remains replaceable plugin functionality; core should expose generic storage and contribution execution only.
- searchProviders[].handler is treated as a command id. A provider plugin must declare that command in contributes.commands and register its frontend handler with api.commands.register.
- The initial persistent index is JSON-backed plugin data, not SQLite FTS and not a background sidecar.
- The index is an optimization and discovery layer. User files remain the source of truth.
Alternatives Considered
Recommended: plugin-owned index plus command-backed provider hosting
verstak.search owns its local index in plugin storage, uses public Files API
and events to keep it fresh, and fans out to contributed providers with
api.commands.executeFor.
Trade-offs:
- smallest core/runtime change;
- matches existing Files/Notes contribution execution pattern;
- keeps user-facing search outside core;
- JSON index is simpler than full text search but sufficient for the current roadmap item.
Core search service
Desktop core would own indexing and execute provider searches.
Trade-offs:
- easier to centralize later ranking and indexing;
- violates the current direction that search is a plugin-level user feature;
- makes core understand official search semantics too early.
Sidecar or SQLite FTS indexer now
Introduce a dedicated indexer process or SQLite FTS schema immediately.
Trade-offs:
- better long-term scalability;
- too large for the current milestone;
- adds migration, lifecycle, and sync/cache policy decisions before the basic provider runtime contract is proven.
Chosen Design
Use the recommended approach.
verstak.search becomes both:
- the workspace search runtime and expanded results UI;
- the runtime host for all enabled searchProviders.
The shell may render the compact input in the workspace header, but it should call into the Search plugin/runtime contract rather than implement search semantics in core.
The desktop and SDK expose a generic frontend storage surface:
api.storage.data.read(name: string): Promise<Record<string, unknown>>
api.storage.data.write(name: string, data: Record<string, unknown>): Promise<void>
This maps to the existing plugin-scoped backend methods ReadPluginDataJSON and
WritePluginDataJSON. It requires the storage.namespace permission, follows
current plugin ownership rules, and does not let a plugin read another
plugin's namespace.
The Search plugin stores its persistent index as plugin data named
search-index. Settings remain for user preferences only.
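As a rough illustration of the namespace rule, the following is a minimal in-memory sketch of how an SDK mock could model this surface. The names createMockStorage and PluginDataApi are hypothetical, and returning an empty object for a missing document is an assumption, not something this design specifies.

```typescript
// Hypothetical in-memory stand-in for the proposed api.storage.data surface.
type PluginData = Record<string, unknown>;

interface PluginDataApi {
  read(name: string): Promise<PluginData>;
  write(name: string, data: PluginData): Promise<void>;
}

// One shared backing map, keyed by "<pluginId>/<name>", so a plugin can only
// reach documents inside its own namespace.
function createMockStorage(backing: Map<string, string>, pluginId: string): PluginDataApi {
  const key = (name: string): string => pluginId + "/" + name;
  return {
    async read(name) {
      const raw = backing.get(key(name));
      // Assumption: a missing document reads as an empty object.
      return raw === undefined ? {} : (JSON.parse(raw) as PluginData);
    },
    async write(name, data) {
      // Serialize on write so callers cannot mutate stored state by reference.
      backing.set(key(name), JSON.stringify(data));
    },
  };
}
```

Two plugins sharing the same backing map see separate documents even when both use the name search-index.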
Provider Runtime Contract
searchProviders keep their current manifest shape:
{
  "id": "verstak.search.vault-text",
  "label": "Vault Text Search",
  "handler": "verstak.search.searchVaultText"
}
The handler value must name a command declared by the same plugin:
{
  "contributes": {
    "commands": [
      {
        "id": "verstak.search.searchVaultText",
        "title": "Search Vault Text",
        "handler": "searchVaultText"
      }
    ],
    "searchProviders": [
      {
        "id": "verstak.search.vault-text",
        "label": "Vault Text Search",
        "handler": "verstak.search.searchVaultText"
      }
    ]
  }
}
The provider registers the command handler at mount time:
api.commands.register('verstak.search.searchVaultText', searchVaultText)
Runtime availability:
- this milestone does not auto-start unloaded frontend bundles or sidecars;
- a provider is executable only after its command handler is registered in the current frontend runtime;
- declared but unregistered providers are skipped and reported as unavailable;
- later sidecar/background activation can extend this without changing the provider manifest shape.
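The availability rule can be sketched as a simple partition of declared providers. This is only an illustration: partitionProviders and the registeredCommandIds input are hypothetical, since this design does not say how the host learns which command handlers are currently registered.

```typescript
// Manifest shape from this document; the helper below is hypothetical.
interface SearchProviderContribution {
  id: string;
  label: string;
  handler: string; // command id declared in contributes.commands
}

interface ProviderPartition {
  available: SearchProviderContribution[];
  unavailable: SearchProviderContribution[];
}

// Splits declared providers by whether their command handler is registered
// in the current frontend runtime. `registeredCommandIds` is an assumed input.
function partitionProviders(
  declared: SearchProviderContribution[],
  registeredCommandIds: Set<string>,
): ProviderPartition {
  const result: ProviderPartition = { available: [], unavailable: [] };
  for (const provider of declared) {
    if (registeredCommandIds.has(provider.handler)) {
      result.available.push(provider);
    } else {
      // Declared but unregistered: skipped and reported as unavailable.
      result.unavailable.push(provider);
    }
  }
  return result;
}
```

The host would call only the available list and use the length of unavailable for its status warning.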
Provider input:
{
  source: 'search',
  providerId: string,
  query: string,
  workspaceRootPath: string,
  limit: number
}
Provider output:
{
  results: SearchResult[]
}
SearchResult:
{
  id?: string,
  path?: string,
  title?: string,
  snippet?: string,
  matchType?: string,
  providerId?: string,
  providerLabel?: string,
  type?: 'file' | 'folder' | 'activity' | 'worklog' | string,
  openable?: boolean,
  line?: number,
  score?: number,
  resource?: {
    kind: 'vault-file',
    path: string,
    mode?: 'view' | 'edit'
  }
}
The Search host normalizes missing optional fields. Invalid provider responses are ignored with a visible status warning; they do not fail the whole search.
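One possible shape for that normalization step is sketched below. The helper name normalizeProviderResponse and every default it fills in (generated id, fallback title, score of 0) are assumptions for illustration; this design does not define the defaults.

```typescript
// Subset of the SearchResult contract above; helper names are hypothetical.
interface SearchResult {
  id?: string;
  path?: string;
  title?: string;
  snippet?: string;
  providerId?: string;
  providerLabel?: string;
  type?: string;
  openable?: boolean;
  score?: number;
  resource?: { kind: 'vault-file'; path: string; mode?: 'view' | 'edit' };
}

interface NormalizedBatch {
  results: SearchResult[];
  warning?: string; // set when the whole response was unusable
}

function normalizeProviderResponse(
  response: unknown,
  providerId: string,
  providerLabel: string,
): NormalizedBatch {
  const raw = (response as { results?: unknown } | null)?.results;
  if (!Array.isArray(raw)) {
    // Invalid response: ignored with a warning, never thrown.
    return { results: [], warning: `Provider ${providerId} returned an invalid response` };
  }
  const results: SearchResult[] = [];
  for (const item of raw) {
    if (typeof item !== 'object' || item === null) continue; // skip malformed items
    const r = item as SearchResult;
    const path = typeof r.path === 'string' ? r.path : undefined;
    results.push({
      ...r,
      // Assumed defaults for missing optional fields.
      id: typeof r.id === 'string' ? r.id : `${providerId}:${path ?? results.length}`,
      title: typeof r.title === 'string' ? r.title : path ?? '(untitled)',
      providerId,
      providerLabel,
      score: typeof r.score === 'number' ? r.score : 0,
      openable: typeof r.openable === 'boolean' ? r.openable : r.resource !== undefined,
    });
  }
  return { results };
}
```

Because the function returns a warning instead of throwing, one misbehaving provider cannot fail the whole search.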
Persistent Index Shape
The first index version is intentionally small:
{
  "version": 1,
  "workspaceRootPath": "Project",
  "builtAt": "2026-06-29T00:00:00Z",
  "entries": [
    {
      "path": "Project/Docs/case.md",
      "name": "case.md",
      "type": "file",
      "extension": "md",
      "size": 1234,
      "modifiedAt": "2026-06-29T00:00:00Z",
      "text": "short normalized searchable text or snippet"
    }
  ]
}
Rules:
- index only the current workspace root;
- store vault-relative slash paths;
- index folders and regular files;
- read content only for text-like files already handled by current Search;
- rely on the host readText limit for large text files;
- store short normalized text, not full arbitrary binary data;
- rebuild when version or workspace root differs.
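A small sketch of how one entry could be built under these rules follows. The helper names, the 2000-character cap, and whitespace collapsing as the meaning of "normalized" are all assumptions; the design only requires slash paths and short text.

```typescript
interface IndexEntry {
  path: string;
  name: string;
  type: 'file' | 'folder';
  extension: string;
  size: number;
  modifiedAt: string;
  text: string;
}

// Assumed cap; the document only says the stored text should be short.
const MAX_TEXT_LENGTH = 2000;

// Converts backslashes to slashes and drops empty segments.
function toSlashPath(path: string): string {
  return path.split('\\').join('/').split('/').filter((part) => part.length > 0).join('/');
}

// Assumed normalization: collapse whitespace runs, trim, truncate.
function normalizeText(text: string): string {
  return text.replace(/\s+/g, ' ').trim().slice(0, MAX_TEXT_LENGTH);
}

function buildEntry(
  rawPath: string,
  type: 'file' | 'folder',
  size: number,
  modifiedAt: string,
  text: string,
): IndexEntry {
  const path = toSlashPath(rawPath);
  const name = path.slice(path.lastIndexOf('/') + 1);
  const dot = name.lastIndexOf('.');
  // Folders and dotfiles such as ".gitignore" get an empty extension.
  const extension = type === 'file' && dot > 0 ? name.slice(dot + 1).toLowerCase() : '';
  return {
    path,
    name,
    type,
    extension,
    size,
    modifiedAt,
    text: type === 'file' ? normalizeText(text) : '',
  };
}
```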
Index Lifecycle
On mount:
- Read api.storage.data.read('search-index').
- If version/root matches, use it immediately.
- If missing or stale, build an index from api.files.list and text reads.
- Write the built index with api.storage.data.write.
- Subscribe to file.changed.
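The mount steps above could be wired together roughly as follows. This is a sketch with injected dependencies: MountDeps, loadIndexOnMount, and isUsableIndex are hypothetical names, build stands in for the api.files.list walk, and subscribe stands in for the file.changed subscription.

```typescript
type SearchIndex = {
  version: number;
  workspaceRootPath: string;
  builtAt: string;
  entries: unknown[];
};

const INDEX_VERSION = 1;

// Type guard: rejects missing, corrupt, or stale documents.
function isUsableIndex(data: unknown, workspaceRootPath: string): data is SearchIndex {
  if (typeof data !== 'object' || data === null) return false;
  const d = data as Partial<SearchIndex>;
  return d.version === INDEX_VERSION
    && d.workspaceRootPath === workspaceRootPath
    && typeof d.builtAt === 'string'
    && Array.isArray(d.entries);
}

// Hypothetical dependency bundle so the flow can run without the real plugin API.
interface MountDeps {
  read(name: string): Promise<Record<string, unknown>>;
  write(name: string, data: Record<string, unknown>): Promise<void>;
  build(workspaceRootPath: string): Promise<SearchIndex>;
  subscribe(): void;
}

async function loadIndexOnMount(
  deps: MountDeps,
  workspaceRootPath: string,
): Promise<{ index: SearchIndex; rebuilt: boolean }> {
  let stored: unknown;
  try {
    stored = await deps.read('search-index');
  } catch {
    stored = undefined; // unreadable data is treated like a missing index
  }
  let index: SearchIndex;
  let rebuilt = false;
  if (isUsableIndex(stored, workspaceRootPath)) {
    index = stored; // version and root match: use immediately
  } else {
    index = await deps.build(workspaceRootPath);
    await deps.write('search-index', index); // overwrite only after a successful build
    rebuilt = true;
  }
  deps.subscribe();
  return { index, rebuilt };
}
```

If build throws, nothing is written, so a previously stored document is left untouched.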
On file.changed:
- if event path is outside the workspace, ignore it;
- for create/update, refresh that path's metadata/content if readable;
- for delete/trash, remove the path and descendants;
- for move, remove fromPath when present and refresh the new path;
- write the updated index after the change is applied.
If an incremental update fails, mark the index stale in UI and continue serving the last usable index until rebuild succeeds.
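The incremental rules can be expressed as a pure function over the entry list. The event shape below (kind, path, fromPath) is a hypothetical stand-in, because this design names the operations and fromPath but not the exact payload; applyFileChange and its helpers are likewise illustrative.

```typescript
type Entry = { path: string; text: string };

// Hypothetical event shape for file.changed.
interface FileChangedEvent {
  kind: 'create' | 'update' | 'delete' | 'trash' | 'move';
  path: string;
  fromPath?: string;
}

// True when `path` is `root` itself or a descendant of it.
function isInside(path: string, root: string): boolean {
  return path === root || path.startsWith(root + '/');
}

// Removes a path and all of its descendants.
function removeSubtree(entries: Entry[], path: string): Entry[] {
  return entries.filter((e) => !isInside(e.path, path));
}

// Returns the updated entry list. `refreshed` is the re-read entry for
// event.path, or undefined when that path was not readable.
function applyFileChange(
  entries: Entry[],
  root: string,
  event: FileChangedEvent,
  refreshed?: Entry,
): Entry[] {
  if (!isInside(event.path, root)) return entries; // outside the workspace
  if (event.kind === 'delete' || event.kind === 'trash') {
    return removeSubtree(entries, event.path);
  }
  let next = entries;
  if (event.kind === 'move' && event.fromPath !== undefined) {
    next = removeSubtree(next, event.fromPath);
  }
  if (refreshed === undefined) return next;
  // Replace any existing entry for the path with the refreshed one.
  return next.filter((e) => e.path !== event.path).concat([refreshed]);
}
```

This sketch refreshes only the moved path itself; a moved folder's descendants would need their own refresh, which is not shown here.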
Search Flow
For a query shorter than two characters, return no results.
For a valid query:
- Query local persistent index for path, name, folder, and text matches.
- List enabled searchProviders through api.contributions.list.
- Call each provider except duplicates that would recurse into the same in-flight command.
- Merge and normalize results.
- Sort by score, then provider order, then path/title.
- Render provider label, match type, path/title, snippet, and open action when the result has a supported resource.
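The local query and ordering steps might look like the sketch below. The score values, the descending score order, and the 'local-index' provider id are illustrative assumptions; this design lists the sort keys but does not define a ranking.

```typescript
type Entry = { path: string; name: string; text: string };
type Hit = { path: string; title: string; matchType: string; providerId: string; score: number };

const MIN_QUERY_LENGTH = 2;

// Case-insensitive substring match against name, then path, then text.
function queryIndex(entries: Entry[], query: string, limit: number): Hit[] {
  const q = query.trim().toLowerCase();
  if (q.length < MIN_QUERY_LENGTH) return []; // too short: no results
  const hits: Hit[] = [];
  for (const e of entries) {
    let matchType = '';
    let score = 0; // assumed scores: name 3, path 2, text 1
    if (e.name.toLowerCase().indexOf(q) !== -1) {
      matchType = 'name';
      score = 3;
    } else if (e.path.toLowerCase().indexOf(q) !== -1) {
      matchType = 'path';
      score = 2;
    } else if (e.text.toLowerCase().indexOf(q) !== -1) {
      matchType = 'text';
      score = 1;
    }
    if (score > 0) {
      hits.push({ path: e.path, title: e.name, matchType, providerId: 'local-index', score });
    }
  }
  return sortHits(hits, ['local-index']).slice(0, limit);
}

// Sort by score (descending, assumed), then provider order, then path.
function sortHits(hits: Hit[], providerOrder: string[]): Hit[] {
  const rank = (id: string): number => {
    const i = providerOrder.indexOf(id);
    return i === -1 ? providerOrder.length : i; // unknown providers sort last
  };
  return hits.slice().sort((a, b) =>
    b.score - a.score
    || rank(a.providerId) - rank(b.providerId)
    || (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
}
```

The same sortHits call can order the merged list once provider results have been normalized into the same shape.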
Provider failures are isolated. The status line reports how many providers failed without hiding successful results.
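Failure isolation during fan-out can be sketched with a per-provider try/catch. Here fanOut is a hypothetical helper, executeFor is a stand-in with the same argument order as api.commands.executeFor(pluginId, commandId, args), and the pluginId field on each provider is an assumption about what the contribution listing reports.

```typescript
type ProviderInput = {
  source: 'search';
  providerId: string;
  query: string;
  workspaceRootPath: string;
  limit: number;
};

// Assumed shape: the contribution listing is expected to report the owning plugin.
type Provider = { id: string; pluginId: string; handler: string };

// Stand-in for api.commands.executeFor(pluginId, commandId, args).
type ExecuteFor = (pluginId: string, commandId: string, args: ProviderInput) => Promise<unknown>;

type FanOutResult = {
  responses: { providerId: string; response: unknown }[];
  failed: number;
};

async function fanOut(
  providers: Provider[],
  executeFor: ExecuteFor,
  query: string,
  workspaceRootPath: string,
  limit: number,
): Promise<FanOutResult> {
  const result: FanOutResult = { responses: [], failed: 0 };
  await Promise.all(providers.map(async (p) => {
    try {
      const response = await executeFor(p.pluginId, p.handler, {
        source: 'search',
        providerId: p.id,
        query,
        workspaceRootPath,
        limit,
      });
      result.responses.push({ providerId: p.id, response });
    } catch {
      // One failing provider only increments the counter for the status line.
      result.failed += 1;
    }
  }));
  return result;
}
```

Responses arrive in completion order, so any provider-order tie-break has to be applied later during sorting.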
Error Handling
- Missing storage API: fall back to live scanning and show a degraded status.
- Missing storage.namespace permission: Search runs without persistence and reports degraded persistence.
- Corrupt index JSON: ignore it, rebuild, and overwrite only after a successful rebuild.
- Provider command not declared or not registered: skip that provider and show a warning count.
- File read errors: skip that file, matching the current Search behavior.
- Event subscription errors: keep search usable, mark index refresh as manual.
Testing And Verification
Expected test coverage:
- SDK types include api.storage.data.read/write and search result/provider contracts.
- SDK mock API stores plugin data separately from settings.
- Desktop frontend bridge exposes api.storage.data.read/write.
- Desktop bridge smoke test verifies plugin data round-trip through Wails mock.
- Search plugin smoke test verifies:
  - persisted index is read before scanning;
  - missing/stale index triggers build and write;
  - query uses persisted entries;
  - file.changed updates or removes entries;
  - provider fan-out uses contributions.list('searchProviders') and commands.executeFor;
  - provider failure does not hide local results.
- Roadmap and official plugin docs are updated after implementation.
Manual smoke after implementation:
- start desktop frontend/dev flow used by the repository;
- open Search workspace item;
- search a known text file;
- reload/remount Search and confirm results come from stored index;
- enable a test provider and confirm its results appear with provider label.
Out Of Scope
- SQLite FTS;
- typo/layout tolerant search;
- binary OCR or PDF extraction;
- background sidecar indexing;
- cross-workspace global search;
- sync policy for index cache;
- journal/worklog/activity reconstruction implementation.
Those remain later roadmap items.