DHI.Services.Documents: Internal Developer Guide (Core)
This guide explains the Documents module in DHI.Services.Documents: what it is, the core types, how grouping & metadata work, how to plug storage providers, how to validate uploads, and how to use it from your code with the built-in file system provider. (Cloud MIKECloud and MCLite providers are documented separately.)
What the Documents module does
At a glance:
- Upload & download binary files (any type) addressed by an ID.
- Attach metadata (key/value) to each file and search by metadata.
- Group documents (folder-like) and list by group.
- Validate uploads with pluggable validators (by filename pattern).
- Swap storage by changing the repository (files, cloud, databases).
Core layers follow the Domain Services pattern: entities, repository contracts, and services (each covered under Core types overview).

Key concepts
Document identity & grouping
- Each document has:
  - Id (TId) – the address used by the service to fetch/delete.
  - Name (string) – display name.
  - Group (string, optional) – logical folder/tenant/etc.
- Document<TId> inherits BaseGroupedEntity<TId>, giving Id, Name, Group, FullName, plus shared Metadata/Permissions.
File provider IDs. With the built-in file repository, TId = string and the repo interprets id as “full name” (group/name) using FullName helpers.
Examples: "reports/2024/summary.pdf" → Group=reports/2024, Name=summary.pdf; "images/logo.png" → Group=images, Name=logo.png.
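The mapping from a full name to its group and name can be sketched as follows. This is only an illustration that assumes the split happens at the last "/"; FullNameSketch and Split are hypothetical names, not the library's FullName helpers, whose exact behavior may differ.

```csharp
using System;

// Hypothetical helper for illustration; not part of DHI.Services.Documents.
public static class FullNameSketch
{
    // Splits "group/name" at the last '/'. Everything before it is the group.
    // An id without '/' is treated as having no group (null). This is an assumption.
    public static (string Group, string Name) Split(string fullName)
    {
        if (string.IsNullOrWhiteSpace(fullName))
            throw new ArgumentException("Full name must not be empty.", nameof(fullName));

        int cut = fullName.LastIndexOf('/');
        if (cut < 0)
            return (null, fullName);

        return (fullName.Substring(0, cut), fullName.Substring(cut + 1));
    }
}
```

Calling FullNameSketch.Split("reports/2024/summary.pdf") yields the group reports/2024 and the name summary.pdf, matching the first example above.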
Service vs Repository
- Service (DocumentService<TId>) – validation, events, convenience API, uniform errors.
- Repository (IDocumentRepository<TId>) – persistence contract. Any storage works as long as it implements this interface. Implement IGroupedDocumentRepository<TId> for folder-aware APIs.
Metadata
- Free-form key → string value pairs persisted with the file.
- Search with tokenized, case-insensitive “contains” across all values (AND semantics across tokens).
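To make the search semantics concrete, here is a minimal sketch of a tokenized, case-insensitive "contains" match over metadata values. It assumes tokens are separated by whitespace and that an empty filter matches everything; both are assumptions, and MetadataSearchSketch is a hypothetical name rather than the provider's actual implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of DHI.Services.Documents.
public static class MetadataSearchSketch
{
    // Returns true when every token of the filter occurs in at least one
    // metadata value (keys are not searched), ignoring case.
    public static bool Matches(IDictionary<string, string> metadata, string filter)
    {
        // Assumption: an empty filter matches every document.
        if (string.IsNullOrWhiteSpace(filter))
            return true;

        var tokens = filter.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

        return tokens.All(token =>
            metadata.Values.Any(value =>
                value != null &&
                value.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0));
    }
}
```

With metadata project=HarborX and reportDate=2024-06-30, the filter "harborx 2024" matches because each token appears in some value, while "harborx 2023" does not.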
Validators
- Upload-time checks driven by a filename regex and a Validate(Stream) body check.
- Validators only run when you pass a fileName parameter (see "Validating uploads" and "Parameters & special keys").
Core types overview
Entities
public class Document<TId> : BaseGroupedEntity<TId>
{
    // Grouped document: the group acts as a logical folder.
    public Document(TId id, string name, string group) : base(id, name, group) { }

    // Ungrouped document: the group is left null.
    public Document(TId id, string name) : base(id, name, null) { }
}
Repository contracts
public interface IDocumentRepository<TId> : IDiscreteRepository<Document<TId>, TId>
{
    (Stream stream, string fileType, string fileName) Get(TId id, ClaimsPrincipal user = null);
    IDictionary<string, string> GetMetadata(TId id, ClaimsPrincipal user = null);
    IDictionary<TId, IDictionary<string, string>> GetAllMetadata(ClaimsPrincipal user = null);
    IDictionary<TId, IDictionary<string, string>> GetMetadataByFilter(string filter, Parameters parameters = null, ClaimsPrincipal user = null);
    void Add(Stream stream, TId id, Parameters metadata, ClaimsPrincipal user = null);
    void Remove(TId id, ClaimsPrincipal user = null);
}

public interface IGroupedDocumentRepository<TId>
    : IDocumentRepository<TId>, IGroupedRepository<Document<TId>> { }

The optional ClaimsPrincipal user parameter lets providers enforce access control if desired.
Services
// API surface only; method bodies omitted.
public class DocumentService<TId>
{
    public ICollection<IValidator> Validators { get; }
    public int Count(ClaimsPrincipal user = null);
    public bool Exists(TId id, ClaimsPrincipal user = null);
    public (Stream stream, string fileType, string fileName) Get(TId id, ClaimsPrincipal user = null); // throws if missing
    public IEnumerable<Document<TId>> GetAll(ClaimsPrincipal user = null);
    public IEnumerable<TId> GetIds(ClaimsPrincipal user = null);
    public IDictionary<string, string> GetMetadata(TId id, ClaimsPrincipal user = null);
    public IDictionary<TId, IDictionary<string, string>> GetAllMetadata(ClaimsPrincipal user = null);
    public IDictionary<TId, IDictionary<string, string>> GetMetadataByFilter(string filter, Parameters parameters = null, ClaimsPrincipal user = null);
    public void Add(Stream stream, TId id, Parameters parameters = null, ClaimsPrincipal user = null);
    public void Add(string filePath, TId id, Parameters parameters = null, ClaimsPrincipal user = null);
    public void Remove(TId id, ClaimsPrincipal user = null);
    // Events: Putting/WasPut/Removing/Removed + Validating/Validated
}

public class GroupedDocumentService<TId> : DocumentService<TId>
{
    public IEnumerable<Document<TId>> GetByGroup(string group, ClaimsPrincipal user = null);
    public IEnumerable<Document<TId>> GetByGroups(IEnumerable<string> groups, ClaimsPrincipal user = null);
    public IEnumerable<string> GetFullNames(string group, ClaimsPrincipal user = null);
    public IEnumerable<string> GetFullNames(ClaimsPrincipal user = null);
}
Built-in provider: File system
public class FileDocumentRepository : BaseGroupedDocumentRepository<string>
{
    // root directory containing group/name files and .metadata.json sidecars
}
- Files: <root>/<group>/<name>
- Metadata: <root>/<group>/<name>.metadata.json
- GetIds() enumerates all files (ignores .metadata.json) and returns IDs like group/name.
- ContainsGroup(group) checks for the group directory's existence.
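The layout above can be sketched as a pair of path computations. This is an illustration under the assumption that IDs use "/" as the group separator; FileLayoutSketch, Resolve, and IsSidecar are hypothetical names and do not reflect how FileDocumentRepository is actually written.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of DHI.Services.Documents.
public static class FileLayoutSketch
{
    private const string SidecarSuffix = ".metadata.json";

    // Maps an id such as "reports/2024/summary.pdf" to the file path under
    // the root and to the path of its metadata sidecar.
    // Note: no sanitising of ".." segments is done here; a real provider
    // should reject ids that escape the root directory.
    public static (string FilePath, string MetadataPath) Resolve(string root, string id)
    {
        var relative = id.Replace('/', Path.DirectorySeparatorChar);
        var filePath = Path.Combine(root, relative);
        return (filePath, filePath + SidecarSuffix);
    }

    // True for sidecar files, which an id listing would skip.
    public static bool IsSidecar(string path) =>
        path.EndsWith(SidecarSuffix, StringComparison.OrdinalIgnoreCase);
}
```

Keeping the sidecar next to the file means a document and its metadata can be moved or backed up together with ordinary file tools.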
Validators
public interface IValidator
{
    bool CanValidate(string fileName);
    (bool validated, string message) Validate(Stream stream);
}

public abstract class BaseValidator : IValidator
{
    protected BaseValidator(string pattern); // regex for activation
    public bool CanValidate(string fileName);
    public abstract (bool validated, string message) Validate(Stream stream);
}
Guarding inputs (quick recap)
Use Guard.Against.* at public boundaries:
Null, NullOrEmpty, NullOrWhiteSpace, NullOrAnySpace, NegativeOrZero.
Typical tasks (with code)
Bootstrap a file-backed store
var repoRoot = @"C:\AppData\documents";
var repo = new FileDocumentRepository(repoRoot);

// With group helpers (GetByGroup, GetFullNames):
var docs = new GroupedDocumentService<string>(repo);

// Or, without group helpers:
var docs2 = new DocumentService<string>(repo);
Upload (stream)
var id = "reports/2024/summary.pdf";
using var fs = File.OpenRead(@"C:\files\summary.pdf");
var meta = new Parameters
{
    { "fileName", "summary.pdf" }, // triggers validators
    { "project", "HarborX" },
    { "reportDate", "2024-06-30" },
    { "author", "alice" }
};
docs.Add(fs, id, meta);
Note
The service rewinds the stream between validators and before saving.
Upload from path
docs.Add(@"C:\files\plot.png", "images/plot.png",
    new Parameters { { "fileName", "plot.png" }, { "title", "Wave spectrum" } });
Download
var (stream, fileType, fileName) = docs.Get("reports/2024/summary.pdf");
using (stream) // the caller owns and disposes the returned stream
using (var dest = File.Create($@"C:\out\{fileName}"))
{
    stream.CopyTo(dest);
}
List & search
// Requires: using System.Linq;
foreach (var d in docs.GetAll())
    Console.WriteLine($"{d.FullName} -> {string.Join(", ", d.Metadata.Select(kv => kv.Key + "=" + kv.Value))}");

foreach (var d in docs.GetByGroup("reports/2024"))
    Console.WriteLine(d.FullName);

foreach (var fullName in docs.GetFullNames())
    Console.WriteLine(fullName);

var hits = docs.GetMetadataByFilter("harborx 2024");
foreach (var (docId, md) in hits)
{
    // TryGetValue avoids a KeyNotFoundException when a key is absent;
    // GetValueOrDefault is not available on IDictionary<string, string>.
    md.TryGetValue("project", out var project);
    md.TryGetValue("reportDate", out var reportDate);
    Console.WriteLine($"{docId} -> {project} / {reportDate}");
}
Delete
docs.Remove("images/plot.png");
Validating uploads
Attach validators to DocumentService.Validators. They opt in via CanValidate(fileName) and then inspect the stream. Validators only run if Parameters["fileName"] is present.
Example: PNG signature
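public sealed class PngValidator : BaseValidator
{
    private static readonly byte[] Expected = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    public PngValidator() : base(@"\.png$") { }

    public override (bool validated, string message) Validate(Stream stream)
    {
        // Stream.Read may return fewer bytes than requested, so loop until
        // the 8-byte signature is filled or the stream ends.
        var sig = new byte[Expected.Length];
        int read = 0;
        while (read < sig.Length)
        {
            int n = stream.Read(sig, read, sig.Length - read);
            if (n == 0) break;
            read += n;
        }
        if (read != sig.Length) return (false, "File too short for PNG header.");

        for (int i = 0; i < sig.Length; i++)
            if (sig[i] != Expected[i]) return (false, "Invalid PNG signature.");
        return (true, "OK");
    }
}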
Register:
var docs = new DocumentService<string>(new FileDocumentRepository(@"C:\AppData\docs"));
docs.Validators.Add(new PngValidator());
If any matching validator fails, DocumentService.Add throws ArgumentException with the validator's message.
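The behavior described in this section (validators activate by filename, the stream is rewound between validators and before saving, and a failure surfaces as ArgumentException) can be sketched roughly as below. This is not the DocumentService source: ISketchValidator, NonEmptyValidator, and ValidationSketch are hypothetical stand-ins, and the case-insensitive regex and the seekable-stream requirement are assumptions of the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Local stand-in that mirrors the IValidator shape shown in this guide.
public interface ISketchValidator
{
    bool CanValidate(string fileName);
    (bool validated, string message) Validate(Stream stream);
}

// Hypothetical validator: activates on a filename regex, rejects empty files.
public sealed class NonEmptyValidator : ISketchValidator
{
    private readonly Regex _pattern;

    // Assumption: case-insensitive matching, chosen for this sketch only.
    public NonEmptyValidator(string pattern) =>
        _pattern = new Regex(pattern, RegexOptions.IgnoreCase);

    public bool CanValidate(string fileName) => _pattern.IsMatch(fileName);

    public (bool validated, string message) Validate(Stream stream) =>
        stream.ReadByte() >= 0 ? (true, "OK") : (false, "File is empty.");
}

public static class ValidationSketch
{
    // Runs every validator that accepts the file name. Requires a seekable stream.
    public static void RunValidators(IEnumerable<ISketchValidator> validators, Stream stream, string fileName)
    {
        // Without a file name nothing can opt in, so validation is skipped.
        if (fileName == null) return;

        foreach (var validator in validators)
        {
            if (!validator.CanValidate(fileName)) continue;

            stream.Seek(0, SeekOrigin.Begin); // rewind before each validator
            var (ok, message) = validator.Validate(stream);
            if (!ok) throw new ArgumentException(message);
        }

        stream.Seek(0, SeekOrigin.Begin); // rewind again so the save starts at byte 0
    }
}
```

Rewinding before each validator lets every check read from the start regardless of how much the previous one consumed, which is why validators written this way need no coordination with each other.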
Events & interception
- Validating / Validated (Type validatorType)
- Putting / WasPut (TId id) – before/after add
- Removing / Removed (TId id) – before/after delete
You can cancel in the "before" events via CancelEventArgs<T>.
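The cancel-in-a-"before"-event idea can be illustrated with a small self-contained sketch. SketchCancelEventArgs<T> and SketchStore are hypothetical stand-ins written for this example; the library's CancelEventArgs<T> and the service's event signatures may differ in naming and shape.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's CancelEventArgs<T>.
public sealed class SketchCancelEventArgs<T> : EventArgs
{
    public SketchCancelEventArgs(T item) => Item = item;
    public T Item { get; }
    public bool Cancel { get; set; }
}

// Hypothetical in-memory store exposing a before/after pair of events.
public sealed class SketchStore
{
    private readonly HashSet<string> _ids = new HashSet<string>();

    public event EventHandler<SketchCancelEventArgs<string>> Removing;
    public event EventHandler<string> Removed;

    public void Add(string id) => _ids.Add(id);
    public bool Exists(string id) => _ids.Contains(id);

    public void Remove(string id)
    {
        // Raise the "before" event; any handler may set Cancel.
        var args = new SketchCancelEventArgs<string>(id);
        Removing?.Invoke(this, args);
        if (args.Cancel) return; // cancelled: nothing is removed, no "after" event

        _ids.Remove(id);
        Removed?.Invoke(this, id);
    }
}
```

A handler such as store.Removing += (s, e) => { if (e.Item.StartsWith("archive/")) e.Cancel = true; }; would then keep everything under archive/ from being deleted.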
FileDocumentRepository details
- Ensures root exists.
- Add: writes file + .metadata.json sidecar.
- Get: returns (File.OpenRead(path), extensionWithoutDot, fileName). If missing → returns (null, null, null); the service wraps this (see "Errors & edge cases").
- Search: GetMetadataByFilter("foo bar") → every token must appear in some value (case-insensitive).
Note
Threading: lightweight; no global lock. For concurrent writers, prefer a DB/cloud provider.
Grouping API
Use GroupedDocumentService<TId> for:
- GetByGroup("a/b"), GetByGroups(...)
- GetFullNames("a/b") or across all groups
- ContainsGroup("a/b") (repo side)
Parameters & special keys
- Parameters is persisted as metadata by supporting repos.
- fileName (optional but recommended) – triggers validators; kept as metadata; does not affect storage path.
Security hooks
- Entities carry Permissions (from BaseEntity).
- File provider doesn't enforce permissions; enforce in your API or custom provider using ClaimsPrincipal.
Build your own provider
Implement IDocumentRepository<TId> (+ IGroupedDocumentRepository<TId> if folder-aware). Tips:
- Persist Name and Group separately; keep Id as canonical full name.
- Store metadata as JSON or in a KV table.
- Return readable streams + fileName + fileType (extension) so clients can save properly.
Discover providers & wire connections
Use connection helpers that create services via reflection:
- DocumentServiceConnection<TId>
- GroupedDocumentServiceConnection<TId>
Both expect:
- RepositoryType – assembly-qualified repo type name.
- ConnectionString – whatever your repo needs (directory path, DSN, etc.).
- [AppData] and [env:VAR] placeholders can be resolved by the Web API connection variants.
Services.Configure(new ConnectionRepository("[AppData]connections.json".Resolve()), lazyCreation: true);
Example connections.json (file repo)
{
  "docs-file": {
    "$type": "DHI.Services.Documents.DocumentServiceConnection`1[[System.String, System.Private.CoreLib]], DHI.Services.Documents",
    "RepositoryType": "DHI.Services.Documents.FileDocumentRepository, DHI.Services.Documents",
    "ConnectionString": "[AppData]documents",
    "Name": "Local documents (files)",
    "Id": "docs-file"
  }
}
For UI support, DocumentService<string>.GetRepositoryTypes(path) lists loaded types implementing IDocumentRepository<string>.
JSON converters
DocumentConverter<TId> (and DocumentConverter for string) support System.Text.Json. When exposing Document<T>, register the Documents converters to keep consistent shapes.
Errors & edge cases
- DocumentService.Get(id) → throws KeyNotFoundException if missing (it calls Contains first). Some providers' Get may return (null, null, null) directly; the service wraps it consistently via the Contains guard.
- Metadata search is values-only, contains, AND across tokens.
- Input streams are not disposed by the service after Add. Output streams from Get must be disposed by the caller.
- File provider is ideal for dev/test or single-host. For scale/concurrency, use DB/cloud.
Practical patterns
- Job outputs: store "jobId", "taskId", "runTimestamp" in metadata; filter later.
- Tenant isolation: encode tenant/project in Group, e.g. "acme/simulations/run-42/output.zip".
- Compliance: attach "checksum", "contentType" to metadata; enforce via validators/events.
- Soft delete: intercept Removing/Removed to archive instead of deleting.
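For the compliance pattern, a checksum can be computed from the upload stream before calling Add and stored under the "checksum" metadata key. The sketch below uses SHA-256 from the .NET base class library; the choice of SHA-256, the lowercase hex format, and the name ChecksumSketch are assumptions for illustration, not something the module prescribes.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper for illustration; not part of DHI.Services.Documents.
public static class ChecksumSketch
{
    // Returns the SHA-256 digest of the whole stream as lowercase hex.
    // Requires a seekable stream: it is rewound before hashing and again
    // afterwards so the same stream can be handed to the upload call.
    public static string Sha256Hex(Stream stream)
    {
        stream.Seek(0, SeekOrigin.Begin);
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(stream);
            stream.Seek(0, SeekOrigin.Begin);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

The resulting string can be added to the Parameters passed to Add, and a validator or event handler can later recompute it to detect corruption.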
Quick reference
| Need… | Use… | Notes |
|---|---|---|
| Basic upload/download + metadata | FileDocumentRepository + DocumentService<string> | IDs are group/name |
| Folder listing | GroupedDocumentService<string> | File repo supports groups |
| Tokenized metadata search | DocumentService.GetMetadataByFilter | Case-insensitive AND across tokens |
| Upload-time checks | DocumentService.Validators + BaseValidator | Triggered when "fileName" present |
| Swap storage w/o recompiling | *ServiceConnection<T> + config | Replace provider type/connection string |
| Multi-user auth in storage | Custom provider using ClaimsPrincipal | Enforce roles/tenants |
| Scale/concurrency | DB/cloud provider | See MCLite / MIKECloud guides |