DHI.Services.Documents — Internal Developer Guide (Core)

This guide explains the Documents module in DHI.Services.Documents: what it is, the core types, how grouping & metadata work, how to plug storage providers, how to validate uploads, and how to use it from your code with the built-in file system provider. (Cloud MIKECloud and MCLite providers are documented separately.)

What the Documents module does

At a glance:

  • Upload & download binary files (any type) addressed by an ID.
  • Attach metadata (key/value) to each file and search by metadata.
  • Group documents (folder-like) and list by group.
  • Validate uploads with pluggable validators (by filename pattern).
  • Swap storage by changing the repository (files, cloud, databases).

Core layers follow the Domain Services pattern:

[Diagram: Documents Module Flow]

Key concepts

Document identity & grouping

  • Each document has:
    • Id (TId) – the address used by the service to fetch/delete.
    • Name (string) – display name.
    • Group (string, optional) – logical folder/tenant/etc.
  • Document<TId> inherits BaseGroupedEntity<TId>, giving Id, Name, Group, FullName, plus shared Metadata/Permissions.

File provider IDs. With the built-in file repository, TId = string and the repo interprets id as the “full name” (group/name) using the FullName helpers. Examples: "reports/2024/summary.pdf" → Group=reports/2024, Name=summary.pdf; "images/logo.png" → Group=images, Name=logo.png.
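To make the ID convention concrete, here is a minimal stand-alone sketch that splits a full name into group and name at the last '/'. It is not the library's FullName helper: FullNameSketch and Split are hypothetical names, and treating an ID without a '/' as having no group is an assumption.

```csharp
using System;

public static class FullNameSketch
{
    // Splits "group/name" at the last '/'. Hypothetical helper, for illustration only.
    public static (string Group, string Name) Split(string fullName)
    {
        if (string.IsNullOrWhiteSpace(fullName))
            throw new ArgumentException("Full name must not be empty.", nameof(fullName));

        var index = fullName.LastIndexOf('/');
        return index < 0
            ? ((string)null, fullName) // assumption: no '/' means no group
            : (fullName.Substring(0, index), fullName.Substring(index + 1));
    }
}
```

Splitting at the last separator keeps nested groups such as reports/2024 intact.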

Service vs Repository

  • Service (DocumentService<TId>) – validation, events, convenience API, uniform errors.
  • Repository (IDocumentRepository<TId>) – persistence contract. Any storage works as long as it implements this interface. Implement IGroupedDocumentRepository<TId> for folder-aware APIs.

Metadata

  • Free-form key → string value pairs persisted with the file.
  • Search with tokenized, case-insensitive “contains” across all values (AND semantics across tokens).
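The search semantics can be sketched in a few lines. This is an illustration of the rule described above, not the repository's implementation; MetadataFilterSketch is a hypothetical name, and splitting the filter on whitespace is an assumption based on filters such as "harborx 2024".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MetadataFilterSketch
{
    // True when every whitespace-separated token occurs in at least one metadata value.
    // Keys are never searched. An empty filter matches everything in this sketch.
    public static bool Matches(IDictionary<string, string> metadata, string filter)
    {
        var tokens = filter.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return tokens.All(token =>
            metadata.Values.Any(value =>
                value != null &&
                value.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0));
    }
}
```

With metadata { project = HarborX, reportDate = 2024-06-30 }, the filter "harborx 2024" matches because each token appears in some value, while "harborx 2025" does not.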

Validators

  • Upload-time checks driven by filename regex and a Validate(Stream) body check.
  • Validators only run when the upload's Parameters contain a "fileName" entry (see “Validating uploads” and “Parameters & special keys”).

Core types overview

Entities

public class Document<TId> : BaseGroupedEntity<TId>
{
    public Document(TId id, string name, string group) : base(id, name, group) { }
    public Document(TId id, string name) : base(id, name, null) { }
}

Repository contracts

public interface IDocumentRepository<TId> : IDiscreteRepository<Document<TId>, TId>
{
    (Stream stream, string fileType, string fileName) Get(TId id, ClaimsPrincipal user = null);
    IDictionary<string, string> GetMetadata(TId id, ClaimsPrincipal user = null);
    IDictionary<TId, IDictionary<string, string>> GetAllMetadata(ClaimsPrincipal user = null);
    IDictionary<TId, IDictionary<string, string>> GetMetadataByFilter(string filter, Parameters parameters = null, ClaimsPrincipal user = null);

    void Add(Stream stream, TId id, Parameters metadata, ClaimsPrincipal user = null);
    void Remove(TId id, ClaimsPrincipal user = null);
}

public interface IGroupedDocumentRepository<TId>
    : IDocumentRepository<TId>, IGroupedRepository<Document<TId>> { }

ClaimsPrincipal user enables providers to enforce access control if desired.

Services

public class DocumentService<TId>
{
    public ICollection<IValidator> Validators { get; }

    public int Count(ClaimsPrincipal user = null);
    public bool Exists(TId id, ClaimsPrincipal user = null);

    public (Stream stream, string fileType, string fileName) Get(TId id, ClaimsPrincipal user = null); // throws if missing
    public IEnumerable<Document<TId>> GetAll(ClaimsPrincipal user = null);
    public IEnumerable<TId> GetIds(ClaimsPrincipal user = null);

    public IDictionary<string, string> GetMetadata(TId id, ClaimsPrincipal user = null);
    public IDictionary<TId, IDictionary<string, string>> GetAllMetadata(ClaimsPrincipal user = null);
    public IDictionary<TId, IDictionary<string, string>> GetMetadataByFilter(string filter, Parameters parameters = null, ClaimsPrincipal user = null);

    public void Add(Stream stream, TId id, Parameters parameters = null, ClaimsPrincipal user = null);
    public void Add(string filePath, TId id, Parameters parameters = null, ClaimsPrincipal user = null);
    public void Remove(TId id, ClaimsPrincipal user = null);

    // Events: Putting/WasPut/Removing/Removed + Validating/Validated
}

public class GroupedDocumentService<TId> : DocumentService<TId>
{
    public IEnumerable<Document<TId>> GetByGroup(string group, ClaimsPrincipal user = null);
    public IEnumerable<Document<TId>> GetByGroups(IEnumerable<string> groups, ClaimsPrincipal user = null);
    public IEnumerable<string> GetFullNames(string group, ClaimsPrincipal user = null);
    public IEnumerable<string> GetFullNames(ClaimsPrincipal user = null);
}

Built-in provider: File system

public class FileDocumentRepository : BaseGroupedDocumentRepository<string>
{
    // root directory containing group/name files and .metadata.json sidecars
}

  • Files: <root>/<group>/<name>
  • Metadata: <root>/<group>/<name>.metadata.json
  • GetIds() enumerates all files (ignores .metadata.json) and returns IDs like group/name.
  • ContainsGroup(group) checks for the group directory’s existence.
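The on-disk layout above can be explored with plain System.IO. The sketch below is a stand-alone illustration of that layout, not FileDocumentRepository code; FileLayoutSketch, EnumerateIds and SidecarPath are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileLayoutSketch
{
    private const string SidecarSuffix = ".metadata.json";

    // Lists IDs ("group/name") for every file under root, skipping metadata sidecars.
    public static IEnumerable<string> EnumerateIds(string root)
    {
        return Directory
            .EnumerateFiles(root, "*", SearchOption.AllDirectories)
            .Where(path => !path.EndsWith(SidecarSuffix, StringComparison.OrdinalIgnoreCase))
            .Select(path => Path.GetRelativePath(root, path).Replace(Path.DirectorySeparatorChar, '/'))
            .OrderBy(id => id, StringComparer.Ordinal);
    }

    // Path of the sidecar file that would hold the metadata for a given ID.
    public static string SidecarPath(string root, string id)
    {
        return Path.Combine(root, id.Replace('/', Path.DirectorySeparatorChar)) + SidecarSuffix;
    }
}
```

Normalizing the separator to '/' keeps IDs identical on Windows and Linux hosts.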

Validators

public interface IValidator
{
    bool CanValidate(string fileName);
    (bool validated, string message) Validate(Stream stream);
}

public abstract class BaseValidator : IValidator
{
    protected BaseValidator(string pattern); // regex for activation
    public bool CanValidate(string fileName);
    public abstract (bool validated, string message) Validate(Stream stream);
}

Guarding inputs (quick recap)

Use Guard.Against.* at public boundaries:

  • Null, NullOrEmpty, NullOrWhiteSpace, NullOrAnySpace, NegativeOrZero.

Typical tasks (with code)

Bootstrap a file-backed store

var repoRoot = @"C:\AppData\documents";
var repo = new FileDocumentRepository(repoRoot);

// Group helpers:
var docs = new GroupedDocumentService<string>(repo);
// Or:
var docs2 = new DocumentService<string>(repo);

Upload (stream)

var id = "reports/2024/summary.pdf";
using var fs = File.OpenRead(@"C:\files\summary.pdf");

var meta = new Parameters {
    { "fileName", "summary.pdf" },   // triggers validators
    { "project", "HarborX" },
    { "reportDate", "2024-06-30" },
    { "author", "alice" }
};

docs.Add(fs, id, meta);

Note

The service rewinds the stream between validators and before saving.

Upload from path

docs.Add(@"C:\files\plot.png", "images/plot.png",
         new Parameters { { "fileName", "plot.png" }, { "title", "Wave spectrum" }});

Download

var (stream, fileType, fileName) = docs.Get("reports/2024/summary.pdf");
using (stream)
using (var dest = File.Create($@"C:\out\{fileName}"))
{
    stream.CopyTo(dest);
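}

List & browse

// Requires: using System.Linq;
foreach (var d in docs.GetAll())
    Console.WriteLine($"{d.FullName}  ->  {string.Join(", ", d.Metadata.Select(kv => kv.Key + "=" + kv.Value))}");

foreach (var d in docs.GetByGroup("reports/2024"))
    Console.WriteLine(d.FullName);

foreach (var fullName in docs.GetFullNames())
    Console.WriteLine(fullName);

Search by metadata

var hits = docs.GetMetadataByFilter("harborx 2024");
foreach (var (docId, md) in hits)
{
    // A hit only guarantees that each token matched some value, so read keys defensively.
    // TryGetValue is used because GetValueOrDefault is not available on IDictionary<string, string>.
    md.TryGetValue("project", out var project);
    md.TryGetValue("reportDate", out var reportDate);
    Console.WriteLine($"{docId} -> {project} / {reportDate}");
}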

Delete

docs.Remove("images/plot.png");

Validating uploads

Attach validators to DocumentService.Validators. Each validator opts in via CanValidate(fileName) and then inspects the stream. Validators only run if Parameters["fileName"] is present.

Example: PNG signature

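public sealed class PngValidator : BaseValidator
{
    // The 8-byte signature that starts every PNG file.
    private static readonly byte[] Signature = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    public PngValidator() : base(@"\.png$") { }

    public override (bool validated, string message) Validate(Stream stream)
    {
        Span<byte> sig = stackalloc byte[8];

        // Stream.Read may return fewer bytes than requested, so read until
        // the buffer is full or the stream ends.
        var total = 0;
        while (total < sig.Length)
        {
            var read = stream.Read(sig.Slice(total));
            if (read == 0) return (false, "File too short for PNG header.");
            total += read;
        }

        return sig.SequenceEqual(Signature)
            ? (true, "OK")
            : (false, "Invalid PNG signature.");
    }
}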

Register:

var docs = new DocumentService<string>(new FileDocumentRepository(@"C:\AppData\docs"));
docs.Validators.Add(new PngValidator());

If any matching validator fails, DocumentService.Add throws ArgumentException with the validator’s message.
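The validation flow described in this section can be sketched as follows. This is not the service's source: ValidationSketch and TextNotEmptyValidator are hypothetical, and the IValidator interface is repeated from the contract shown earlier only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Same shape as the IValidator contract in this guide, repeated for self-containment.
public interface IValidator
{
    bool CanValidate(string fileName);
    (bool validated, string message) Validate(Stream stream);
}

// Hypothetical example validator: .txt files must not be empty.
public sealed class TextNotEmptyValidator : IValidator
{
    public bool CanValidate(string fileName) =>
        fileName.EndsWith(".txt", StringComparison.OrdinalIgnoreCase);

    public (bool validated, string message) Validate(Stream stream) =>
        stream.ReadByte() >= 0 ? (true, "OK") : (false, "Text file is empty.");
}

public static class ValidationSketch
{
    // Runs every validator that opts in for fileName and throws on the first failure.
    public static void RunValidators(IEnumerable<IValidator> validators, Stream stream, string fileName)
    {
        if (string.IsNullOrEmpty(fileName)) return; // no fileName, no validation

        foreach (var validator in validators)
        {
            if (!validator.CanValidate(fileName)) continue;

            stream.Seek(0, SeekOrigin.Begin); // rewind before each validator
            var (validated, message) = validator.Validate(stream);
            if (!validated) throw new ArgumentException(message);
        }

        stream.Seek(0, SeekOrigin.Begin); // rewind before saving
    }
}
```

Because the stream is rewound after the last validator, the caller can persist it from position 0 regardless of how much each validator consumed.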

Events & interception

  • Validating / Validated (Type validatorType)
  • Putting / WasPut (TId id) – before/after add
  • Removing / Removed (TId id) – before/after delete

You can cancel in the “before” events via CancelEventArgs<T>.
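The cancelable "before" event pattern looks roughly like the toy store below. Everything here is a stand-in: the local CancelEventArgs<T> only assumes a Cancel flag and an Item property, and EventedStoreSketch is hypothetical, so check the real event signatures on DocumentService before copying handler code.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical stand-in for the library's CancelEventArgs<T>.
public class CancelEventArgs<T> : CancelEventArgs
{
    public CancelEventArgs(T item) { Item = item; }
    public T Item { get; }
}

// Toy store that raises Removing before a delete and Removed after it.
public class EventedStoreSketch
{
    private readonly HashSet<string> _ids = new HashSet<string>();

    public event EventHandler<CancelEventArgs<string>> Removing;
    public event EventHandler<string> Removed;

    public void Add(string id) => _ids.Add(id);
    public bool Exists(string id) => _ids.Contains(id);

    public void Remove(string id)
    {
        var args = new CancelEventArgs<string>(id);
        Removing?.Invoke(this, args);
        if (args.Cancel) return; // a handler vetoed the delete

        if (_ids.Remove(id)) Removed?.Invoke(this, id);
    }
}
```

A handler such as store.Removing += (s, e) => e.Cancel = e.Item.StartsWith("images/"); would veto deletes in the images group while letting all others through.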

FileDocumentRepository details

  • Ensures root exists.
  • Add: writes the file plus a .metadata.json sidecar.
  • Get: returns (File.OpenRead(path), extensionWithoutDot, fileName). If the file is missing, it returns (null, null, null); the service guards against this (see “Errors & edge cases”).
  • Search: GetMetadataByFilter("foo bar") → every token must appear in some value (case-insensitive).

Note

Threading: the file provider is lightweight and takes no global lock. For concurrent writers, prefer a DB/cloud provider.

Grouping API

Use GroupedDocumentService<TId> for:

  • GetByGroup("a/b"), GetByGroups(...)
  • GetFullNames("a/b") or across all groups
  • ContainsGroup("a/b") (repo side)

Parameters & special keys

  • Parameters is persisted as metadata by supporting repos.
  • fileName (optional but recommended) – triggers validators; kept as metadata; does not affect storage path.

Security hooks

  • Entities carry Permissions (from BaseEntity).
  • File provider doesn’t enforce permissions; enforce in your API or custom provider using ClaimsPrincipal.
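As one possible shape for such a check, the sketch below allows access only when the first segment of a document's group matches a claim on the caller. The "tenant" claim type and the GroupAccessSketch class are assumptions for illustration; only the ClaimsPrincipal API itself is standard.

```csharp
using System;
using System.Linq;
using System.Security.Claims;

public static class GroupAccessSketch
{
    // Hypothetical claim type; use whatever your identity provider issues.
    public const string TenantClaim = "tenant";

    // Allows access when the first segment of the group matches one of the user's tenant claims.
    public static bool CanAccess(ClaimsPrincipal user, string group)
    {
        if (user == null || string.IsNullOrEmpty(group)) return false;

        var tenant = group.Split('/')[0];
        return user.FindAll(TenantClaim)
                   .Any(claim => string.Equals(claim.Value, tenant, StringComparison.OrdinalIgnoreCase));
    }
}
```

A check like this could run in an API controller before calling the service, or inside a custom provider that receives the ClaimsPrincipal argument.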

Build your own provider

Implement IDocumentRepository<TId> (+ IGroupedDocumentRepository<TId> if folder-aware). Tips:

  • Persist Name and Group separately; keep Id as canonical full name.
  • Store metadata as JSON or in a KV table.
  • Return readable streams + fileName + fileType (extension) so clients can save properly.
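The tips above can be tried out with a small in-memory store. This sketch mirrors the Add/Get/GetMetadata/Remove shapes of the repository contract but deliberately does not implement IDocumentRepository<TId>, since the real interface also requires the IDiscreteRepository members; the class name and the use of IDictionary<string, string> in place of Parameters are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class InMemoryDocumentStoreSketch
{
    private readonly Dictionary<string, (byte[] Content, Dictionary<string, string> Metadata)> _items
        = new Dictionary<string, (byte[], Dictionary<string, string>)>();

    public void Add(Stream stream, string id, IDictionary<string, string> metadata = null)
    {
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer);
            var copy = metadata == null
                ? new Dictionary<string, string>()
                : new Dictionary<string, string>(metadata);
            _items[id] = (buffer.ToArray(), copy); // overwrite on duplicate ID
        }
    }

    // Returns a readable stream plus fileType (extension without dot) and fileName.
    public (Stream stream, string fileType, string fileName) Get(string id)
    {
        if (!_items.TryGetValue(id, out var item)) return (null, null, null);

        var fileName = id.Substring(id.LastIndexOf('/') + 1);
        var fileType = Path.GetExtension(fileName).TrimStart('.');
        return (new MemoryStream(item.Content, writable: false), fileType, fileName);
    }

    public IDictionary<string, string> GetMetadata(string id) =>
        _items.TryGetValue(id, out var item) ? new Dictionary<string, string>(item.Metadata) : null;

    public bool Contains(string id) => _items.ContainsKey(id);
    public IEnumerable<string> GetIds() => _items.Keys.ToList();
    public void Remove(string id) => _items.Remove(id);
}
```

Copying the content and metadata on the way in and out keeps callers from mutating stored state, which is worth preserving in a real provider as well.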

Discover providers & wire connections

Use connection helpers that create services via reflection:

  • DocumentServiceConnection<TId>
  • GroupedDocumentServiceConnection<TId>

Both expect:

  • RepositoryType – assembly-qualified repo type name.
  • ConnectionString – whatever your repo needs (directory path, DSN, etc.). [AppData] and [env:VAR] placeholders can be resolved by the Web API connection variants.

Services.Configure(new ConnectionRepository("[AppData]connections.json".Resolve()), lazyCreation: true);

Example connections.json (file repo)

{
  "docs-file": {
    "$type": "DHI.Services.Documents.DocumentServiceConnection`1[[System.String, System.Private.CoreLib]], DHI.Services.Documents",
    "RepositoryType": "DHI.Services.Documents.FileDocumentRepository, DHI.Services.Documents",
    "ConnectionString": "[AppData]documents",
    "Name": "Local documents (files)",
    "Id": "docs-file"
  }
}

For UI support, DocumentService<string>.GetRepositoryTypes(path) lists loaded types implementing IDocumentRepository<string>.

JSON converters

DocumentConverter<TId> (and DocumentConverter for string) support System.Text.Json. When exposing Document<T>, register the Documents converters to keep consistent shapes.

Errors & edge cases

  • DocumentService.Get(id) → throws KeyNotFoundException if missing (it calls Contains first). Some providers’ Get may return (null,null,null) directly; the service wraps it consistently via the Contains guard.
  • Metadata search is values-only, contains, AND across tokens.
  • Input streams are not disposed by the service after Add. Output streams from Get must be disposed by the caller.
  • File provider is ideal for dev/test or single-host. For scale/concurrency, use DB/cloud.

Practical patterns

  • Job outputs: store "jobId", "taskId", "runTimestamp" in metadata; filter later.
  • Tenant isolation: encode tenant/project in Group, e.g. "acme/simulations/run-42/output.zip".
  • Compliance: attach "checksum", "contentType" to metadata; enforce via validators/events.
  • Soft delete: intercept Removing/Removed to archive instead of deleting.
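For the compliance pattern, a checksum can be computed before upload and stored under a "checksum" metadata key. The helper below is a minimal sketch using only standard .NET cryptography; ChecksumSketch and Sha256Hex are hypothetical names, and the choice of SHA-256 is an assumption.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class ChecksumSketch
{
    // Returns the lowercase hex SHA-256 of the stream, read from its current position,
    // and rewinds seekable streams so they can be uploaded afterwards.
    public static string Sha256Hex(Stream stream)
    {
        using (var sha = SHA256.Create())
        {
            var hash = sha.ComputeHash(stream);
            if (stream.CanSeek) stream.Seek(0, SeekOrigin.Begin);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

The resulting string can be added to the upload's Parameters alongside "fileName", and a validator or event handler can recompute it later to detect tampering.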

Quick reference

| Need… | Use… | Notes |
|---|---|---|
| Basic upload/download + metadata | FileDocumentRepository + DocumentService<string> | IDs are group/name |
| Folder listing | GroupedDocumentService<string> | File repo supports groups |
| Tokenized metadata search | DocumentService.GetMetadataByFilter | Case-insensitive AND across tokens |
| Upload-time checks | DocumentService.Validators + BaseValidator | Triggered when "fileName" present |
| Swap storage w/o recompiling | *ServiceConnection<T> + config | Replace provider type/connection string |
| Multi-user auth in storage | Custom provider using ClaimsPrincipal | Enforce roles/tenants |
| Scale/concurrency | DB/cloud provider | See MCLite / MIKECloud guides |