DHI.Services.MCLite for Documents — Internal Developer Guide¶
This page explains how to use the Documents domain with the MCLite provider. It’s written for developers wiring up repositories or extending behavior. For the MCLite engine overall, see MCLite Providers.
What this provider does¶
DHI.Services.Provider.MCLite.DocumentRepository persists documents and their metadata in an MCLite workspace. It implements grouped, path-based IDs (e.g., /Reports/2025/Q1/summary.pdf) and stores the document payload in chunked BLOB rows. It also supports folder/group operations, keyword filtering (including optional XML metadata search), and lightweight thumbnails.
Supported databases:
- PostgreSQL (default), SQLite, and SQL Server, selected via dbflavour in the connection string.
ID & grouping model¶
- FullName: A document’s ID is its full, absolute path, e.g. /Models/HEC/notes.docx.
  - Name: last segment (notes.docx)
  - Group: everything before (/Models/HEC)
- Root documents have no group (ID looks like /file.ext).
- Folders (aka groups) live in document_folder (aka Names.TableDocumentGroup).
- Membership between documents and folders is in document_folder_association.
- You can pass tree options in some APIs (e.g., ;nonrecursive for direct children only).
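The FullName rules above can be illustrated with a minimal sketch. DocumentPath is a hypothetical helper written for this guide, not a type from DHI.Services; it only mirrors the Group/Name split as described and may differ from the provider's internal handling.

```csharp
using System;

// Hypothetical helper (not part of DHI.Services): splits a full document path
// into its Group and Name parts, following the rules listed above.
public static class DocumentPath
{
    public static (string Group, string Name) Split(string fullName)
    {
        if (string.IsNullOrEmpty(fullName) || fullName[0] != '/')
            throw new ArgumentException("Full names must be absolute, e.g. /Models/HEC/notes.docx", nameof(fullName));

        int lastSlash = fullName.LastIndexOf('/');
        string name = fullName.Substring(lastSlash + 1);
        // A root document such as /file.ext has lastSlash == 0, so its group is empty.
        string group = fullName.Substring(0, lastSlash);
        return (group, name);
    }
}
```

For example, DocumentPath.Split("/Models/HEC/notes.docx") yields the group /Models/HEC and the name notes.docx, while a root document yields an empty group.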
Tip
Internally, the provider frequently resolves names/paths to GUIDs (document IDs, group IDs), but you always address entities by full paths.
Storage model (tables & fields)¶
- Documents (Names.TableDocument) – one row per document. Important fields the provider uses:
  - id (GUID), title (document name), author, summary, description, language, keywords, format (free text), modified_time, version (GUID), thumbnail_id (GUID → Blob), is_public (bool), exchangeable (bool), data_id (GUID → Blob)
- Blob (Names.TableBlob) – chunked storage of bytes
  - id (GUID), block_no (int), data (bytea/varbinary/blob)
- Folders (Names.TableDocumentGroup) – nested folders with parent_id
- Folder–Document association (Names.TableDocumentFolderAssociation)
- Metadata (Names.TableMetadata) – optional XML payload per entity
- EntityDescription / EntityType – generic entity catalog rows for “Document”
- Workspace schema: resolved from master.workspace by workspace name on connect
Repository class & key behaviors¶
DocumentRepository : BaseGroupedDocumentRepository<string>, IDocumentRepository<string>
Add¶
public override void Add(Stream stream, string id, Parameters parameters, ClaimsPrincipal user = null)
- Upsert behavior: if a doc with id exists, it’s removed first.
- Splits id into Group + Name; inserts a new document row with:
  - data_id = new Blob GUID (payload), thumbnail_id = new GUID (optional, see below)
  - Metadata taken from parameters (all optional): Author, Summary, Description, Language, Keywords, Format
  - is_public is set true by default.
- Folder association is created if Group is non-empty.
- Blob is inserted in 8 KB blocks into Names.TableBlob.
- Registers an EntityDescription (“Document”).
Thumbnails
Add always allocates a thumbnail_id but does not generate a thumbnail. If Blob rows exist for that ID, the thumbnail is surfaced in the metadata; populate Names.TableBlob for thumbnail_id in your own pipeline if you want one.
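The 8 KB block splitting that Add performs on the payload can be sketched as follows. BlobChunker is a hypothetical illustration, not the provider's code; in particular, numbering block_no from 0 is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical illustration of chunked Blob storage (not the provider's implementation).
public static class BlobChunker
{
    public const int BlockSize = 8 * 1024; // 8 KB per Blob row, as stated above

    // Yields (block_no, data) pairs; the last block may be shorter than BlockSize.
    // Assumption: block numbers start at 0.
    public static IEnumerable<(int BlockNo, byte[] Data)> Split(Stream stream)
    {
        var buffer = new byte[BlockSize];
        int blockNo = 0;
        while (true)
        {
            // Fill the buffer as far as possible; Stream.Read may return fewer bytes than requested.
            int filled = 0;
            while (filled < BlockSize)
            {
                int read = stream.Read(buffer, filled, BlockSize - filled);
                if (read == 0) break;
                filled += read;
            }
            if (filled == 0) yield break;

            var block = new byte[filled];
            Array.Copy(buffer, block, filled);
            yield return (blockNo++, block);

            if (filled < BlockSize) yield break; // a short block means the stream has ended
        }
    }
}
```

A 20,000-byte payload produces three blocks under this scheme (8,192 + 8,192 + 3,616 bytes), which is why large files translate into many Blob rows.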
Get¶
public override (Stream stream, string fileType, string fileName) Get(string id, ClaimsPrincipal user = null)
- Looks up the data_id and streams the concatenated Blob.
- fileType is the file extension without the dot (e.g., pdf).
- fileName is the last path segment (e.g., summary.pdf).
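As a rough sketch of what Get returns, the snippet below reassembles blocks in block_no order and derives fileType and fileName from the ID. BlobReader is hypothetical; returning an empty fileType for a name without an extension is an assumption, since the behavior for that case is not documented here.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical illustration (not the provider's implementation).
public static class BlobReader
{
    // Concatenates blocks in block_no order into one readable stream.
    public static Stream Concatenate(IEnumerable<(int BlockNo, byte[] Data)> blocks)
    {
        var output = new MemoryStream();
        foreach (var block in blocks.OrderBy(b => b.BlockNo))
            output.Write(block.Data, 0, block.Data.Length);
        output.Position = 0; // rewind so callers can read from the start
        return output;
    }

    // Derives (fileType, fileName) from a full document path.
    // Assumption: a name without a dot yields an empty fileType.
    public static (string FileType, string FileName) Describe(string id)
    {
        string fileName = id.Substring(id.LastIndexOf('/') + 1);
        int dot = fileName.LastIndexOf('.');
        string fileType = dot >= 0 ? fileName.Substring(dot + 1) : string.Empty;
        return (fileType, fileName);
    }
}
```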
Remove¶
- Removes the Blob (all blocks), folder association, document row, and entity description.
Contains / Count / GetIds¶
- Contains(id) checks existence by full path.
- Count() returns number of rows in document.
- GetIds() returns full names for all documents.
Metadata APIs¶
- GetMetadata(id) returns a flat dictionary: Title, Author, Summary, Language, Format, IsPublic, Description, Thumbnail (base64 PNG if present), Id (full path)
- GetAllMetadata() calls GetMetadataByFilter(string.Empty).
- GetMetadataByFilter(filter, parameters):
  - Splits filter on spaces → all keywords must match (AND semantics).
  - Searches case-insensitively across: Title/Name, Summary, Description, Author, Language.
  - Optional parameters:
    - defaultfolder (path): limits results to that folder and all descendants.
    - includexmlmetadata (true|false): if true and a filter is provided, we add a second pass against XML metadata in Names.TableMetadata. A hit requires every keyword to appear somewhere in the text nodes of the XML. Duplicates from the first pass are skipped.
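To make the AND semantics of the column filter concrete, here is a minimal sketch of how such a WHERE fragment could be assembled. KeywordFilter and the @kwN parameter names are hypothetical, and combining the columns with OR inside each keyword clause is an assumption about how "searches across" is realized; the provider's actual SQL may differ. LIKE wildcards inside keywords are not escaped in this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of a keyword filter builder (not the provider's SQL).
public static class KeywordFilter
{
    // Builds a WHERE fragment plus parameter values: each keyword must match
    // at least one column (OR across columns, AND across keywords).
    public static (string Sql, Dictionary<string, string> Parameters) Build(
        string filter, IReadOnlyList<string> columns)
    {
        var keywords = (filter ?? string.Empty)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var parameters = new Dictionary<string, string>();
        var clauses = new List<string>();
        for (int i = 0; i < keywords.Length; i++)
        {
            string name = "@kw" + i; // '@' is the documented parameter prefix
            parameters[name] = "%" + keywords[i].ToLowerInvariant() + "%";
            clauses.Add("(" + string.Join(" OR ",
                columns.Select(c => $"LOWER({c}) LIKE {name}")) + ")");
        }
        // An empty filter produces an empty fragment, i.e. no restriction.
        return (string.Join(" AND ", clauses), parameters);
    }
}
```

Passing the values as parameters rather than concatenating them into the SQL text keeps user-supplied keywords out of the statement itself.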
Note
The filter builder uses SQL LIKE on LOWER(column). For XML, we stream through text nodes (XmlReader) and confirm all keywords are present.
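The XML pass can be pictured with the following sketch, which streams text nodes through XmlReader and checks that every keyword is found. XmlKeywordScan is a hypothetical helper; treating CDATA sections as text and matching by lower-cased substring are assumptions.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;

// Hypothetical sketch of the XML metadata scan (not the provider's implementation).
public static class XmlKeywordScan
{
    // Returns true when every keyword occurs (case-insensitively) in some text node.
    // Element and attribute names are not searched.
    public static bool ContainsAll(string xml, IEnumerable<string> keywords)
    {
        var remaining = new HashSet<string>(keywords.Select(k => k.ToLowerInvariant()));
        if (remaining.Count == 0) return true; // nothing to look for

        using var reader = XmlReader.Create(new StringReader(xml));
        while (reader.Read())
        {
            if (reader.NodeType != XmlNodeType.Text && reader.NodeType != XmlNodeType.CDATA)
                continue;
            string text = reader.Value.ToLowerInvariant();
            remaining.RemoveWhere(k => text.Contains(k));
            if (remaining.Count == 0) return true; // stop early once all keywords are found
        }
        return false;
    }
}
```

Keywords may be satisfied by different text nodes, which matches the "somewhere in the text nodes" wording above.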
Listing by group¶
- ContainsGroup(group) – validates a folder path or returns true for empty path.
- GetByGroup(group) returns Document<string> objects:
  - If group ends with ;nonrecursive, returns only direct members.
  - Otherwise, returns members from the folder and all subfolders.
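The difference between direct and recursive membership can be expressed as a simple path test. GroupMembership is a hypothetical helper; the provider resolves membership through the association table rather than by comparing strings, and case-sensitive comparison is an assumption.

```csharp
using System;

// Hypothetical illustration of direct vs. recursive membership (path-based).
public static class GroupMembership
{
    // documentGroup: the Group part of a document's full name.
    // folder: the folder being listed, without any option suffix.
    public static bool IsMember(string documentGroup, string folder, bool recursive)
    {
        if (string.Equals(documentGroup, folder, StringComparison.Ordinal))
            return true; // direct member

        // Descendant check: require the "/" separator so that
        // /Reports/2025 does not match /Reports/20251.
        return recursive && documentGroup.StartsWith(folder + "/", StringComparison.Ordinal);
    }
}
```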
FullName expansion¶
- GetFullNames(group, user) supports TreeOptions via suffixes on the group:
  - ;nonrecursive → direct children (files + folders)
  - ;groupsonly → only folder full names
  - ;nonrecursive;groupsonly → direct child folders only
- For recursive full listing, call without ;nonrecursive.
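A minimal sketch of how the suffixes could be separated from the folder path is shown below. ListingOptions and GroupArgument are hypothetical names introduced for illustration (they are not the library's TreeOptions type), and rejecting unknown suffixes is an assumption.

```csharp
using System;

// Hypothetical stand-in for the library's tree options.
[Flags]
public enum ListingOptions { None = 0, NonRecursive = 1, GroupsOnly = 2 }

public static class GroupArgument
{
    // Splits "/path;opt1;opt2" into the folder path and the recognised options.
    public static (string Path, ListingOptions Options) Parse(string group)
    {
        string[] parts = group.Split(';');
        var options = ListingOptions.None;
        for (int i = 1; i < parts.Length; i++)
        {
            switch (parts[i].Trim().ToLowerInvariant())
            {
                case "nonrecursive": options |= ListingOptions.NonRecursive; break;
                case "groupsonly": options |= ListingOptions.GroupsOnly; break;
                case "": break; // tolerate a trailing ';'
                default:
                    throw new ArgumentException($"Unknown tree option '{parts[i]}'", nameof(group));
            }
        }
        return (parts[0], options);
    }
}
```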
Parameters reference (when adding or filtering)¶
On Add (document metadata):
- Author, Summary, Description, Language, Keywords, Format
- All are optional strings. If omitted, the DB receives empty strings.
- Format is free text (e.g., pdf, docx) and is not auto-derived.
On GetMetadataByFilter (filtering):
- defaultfolder – limit to this folder and descendants
- includexmlmetadata – true to include XML metadata scan (AND with column filters)
Connection & environment (MCLite)¶
The MCLite Db:
- Resolves the workspace schema name from master.workspace (workspace={name}).
- Chooses DB driver via dbflavour: PostgreSQL (default), SQLite, SqlServer
- Uses . table delimiter (PostgreSQL/SQL Server) or _ (SQLite).
- Parameter prefix is @.
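The table delimiter rule might look like this in code. DbFlavour and TableNames are hypothetical, and the sketch assumes the delimiter joins the workspace schema name and the table name; check the MCLite engine documentation for the exact naming scheme.

```csharp
// Hypothetical illustration of flavour-dependent table naming.
public enum DbFlavour { PostgreSQL, SQLite, SqlServer }

public static class TableNames
{
    // Assumption: the delimiter sits between the workspace schema and the table name.
    public static string Qualify(DbFlavour flavour, string schema, string table)
    {
        string delimiter = flavour == DbFlavour.SQLite ? "_" : ".";
        return schema + delimiter + table;
    }
}
```

Under that assumption, a document table in workspace1 would be addressed as workspace1.document on PostgreSQL or SQL Server and as workspace1_document on SQLite.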
Connection string keys (commonly used)¶
- PostgreSQL: database, host, port, username, password, workspace, dbflavour=PostgreSQL
- SQLite: database (file path), dbflavour=SQLite
- SQL Server: host, port, database, username, password, dbflavour=SqlServer
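For quick validation of a connection string before handing it to the repository, a small parser along these lines can help. ConnectionStringParser is a hypothetical helper, not part of MCLite; treating keys as case-insensitive is an assumption, while the PostgreSQL default follows the text above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for inspecting key=value;key=value connection strings.
public static class ConnectionStringParser
{
    public static Dictionary<string, string> Parse(string connectionString)
    {
        // Assumption: keys are compared case-insensitively.
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string pair in connectionString.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            int eq = pair.IndexOf('=');
            if (eq <= 0) throw new FormatException($"Expected key=value but found '{pair}'");
            result[pair.Substring(0, eq).Trim()] = pair.Substring(eq + 1).Trim();
        }
        // dbflavour is optional; PostgreSQL is the documented default.
        if (!result.ContainsKey("dbflavour")) result["dbflavour"] = "PostgreSQL";
        return result;
    }
}
```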
Using the repository directly (C#)¶
using System.IO;
using System.Security.Claims;
using DHI.Services.Documents;
using DHI.Services.Provider.MCLite;
// Build the connection string (PostgreSQL example)
var cs = "database=mc2014.2;host=localhost;port=5432;username=dss_admin;password=secretdss_admin;workspace=workspace1;dbflavour=PostgreSQL";
IDocumentRepository<string> repo = new DocumentRepository(cs);
// 1) Add a document (an existing document with the same id is replaced)
var id = "/Reports/2025/Q1/summary.pdf";
var meta = new Parameters {
    ["Author"] = "Jane Doe",
    ["Summary"] = "Quarterly summary",
    ["Language"] = "en",
    ["Keywords"] = "finance revenue",
    ["Description"] = "Q1 2025 performance",
    ["Format"] = "pdf"
};
using (var file = File.OpenRead(@"C:\docs\summary.pdf"))
{
    repo.Add(file, id, meta);
}
// 2) Fetch it back; dispose the returned stream once it has been copied
var (stream, fileType, fileName) = repo.Get(id);
using (stream)
using (var fs = File.Create($@"C:\out\{fileName}"))
{
    stream.CopyTo(fs);
}
// 3) Metadata lookups
var md = repo.GetMetadata(id); // Title, Author, Summary, etc.
var filtered = repo.GetMetadataByFilter(
    "revenue 2025",
    new Parameters {
        ["defaultfolder"] = "/Reports",
        ["includexmlmetadata"] = "true"
    }
);
// 4) List by group
var docs = repo.GetByGroup("/Reports/2025"); // recursive
var docsDirect = repo.GetByGroup("/Reports/2025;nonrecursive"); // direct members only
// 5) Remove (deletes blob blocks, folder association, document row, and entity description)
repo.Remove(id);
Via the Web API (quick reference)¶
Your Documents Web API already covers routes & auth. With the Connections entry (below) named mclite, typical calls look like:
GET /api/documents/mclite/ids
GET /api/documents/mclite/metadata?filter=revenue%202025&defaultfolder=/Reports&includexmlmetadata=true
GET /api/documents/mclite/file?path=/Reports/2025/Q1/summary.pdf
POST /api/documents/mclite/file (multipart/form-data with metadata fields)
DELETE /api/documents/mclite/file?path=/Reports/2025/Q1/summary.pdf
(See the Documents WebApi — Internal Guide for exact payloads, status codes, and auth.)
Connections module entries¶
Add these objects to your connections.json (or equivalent configuration) to enable the MCLite Documents provider via Web API:
{
  "type": "DHI.Services.Documents.WebApi.GroupedDocumentServiceConnection, DHI.Services.Documents.WebApi",
  "id": "mclite",
  "name": "MCLite (PostgreSQL)",
  "repositoryType": "DHI.Services.Provider.MCLite.DocumentRepository, DHI.Services.MCLite",
  "connectionString": "database=mc2014.2;host=localhost;port=5432;username=dss_admin;password=secretdss_admin;workspace=workspace1;dbflavour=PostgreSQL"
},
{
  "type": "DHI.Services.Documents.WebApi.GroupedDocumentServiceConnection, DHI.Services.Documents.WebApi",
  "id": "mclite-sqlite",
  "name": "MCLite (SQLite)",
  "repositoryType": "DHI.Services.Provider.MCLite.DocumentRepository, DHI.Services.MCLite",
  "connectionString": "database=[AppData]MCSQLiteTest.sqlite;dbflavour=SQLite"
}
Note
Use id (e.g., mclite) as the provider segment in your Web API URLs.
Performance & operational notes¶
- Chunk size is 8 KB per Blob row. Large files create many rows; indexes on (id, block_no) are recommended.
- No explicit transaction wraps Add end-to-end. If you need stricter atomicity (document + blob + association), wrap at a higher layer.
- is_public is inserted as true. This provider doesn’t inspect ClaimsPrincipal; authorization is expected to be enforced in the API layer.
- Thumbnails are optional. If you populate the Blob for thumbnail_id, the metadata projection will return a base64 PNG with black treated as transparent.
Common recipes¶
Add from byte[]
using var ms = new MemoryStream(bytes);
repo.Add(ms, "/Inbox/policy.docx", new Parameters { ["Format"] = "docx" });
Search by free text across multiple columns
var hits = repo.GetMetadataByFilter("coastal risk model",
    new Parameters { ["defaultfolder"] = "/Studies/Coastal" });
Search including XML metadata
var hits = repo.GetMetadataByFilter("nitrates 2023",
    new Parameters {
        ["defaultfolder"] = "/WaterQuality",
        ["includexmlmetadata"] = "true"
    });
List only folders under a path (names)
var folderNames = repo.GetFullNames("/Reports/2025;nonrecursive;groupsonly");
Troubleshooting¶
- “Schema … does not exist”: The workspace name must exist in master.workspace; the repository resolves schema_name from there.
- No results with defaultfolder: Ensure the folder path is correct (/-prefixed) and actually exists; the filter includes the folder and all descendants.
- Thumbnails not appearing: Add doesn’t generate them. Insert Blob blocks for thumbnail_id yourself.
- Format blank: Supply it in Parameters ("Format"="pdf" etc.); it isn’t inferred from the filename.
See also¶
- Documents Core (core concepts & types)
- Documents Web API (routes, wiring, auth)
- MCLite Providers (engine details, connection strings, workspace/schema model)