Skip to content

DHI.Services.OpenXML — Internal Developer Guide

Filesystem-backed provider that reads/writes .xlsx using the Open XML SDK (no Excel required). Covers: repository behavior, groups/IDs, read/write APIs, type inference, helpers to author workbooks, streaming, Web API wiring, and troubleshooting.


What this provider gives you

Scope Class Base type Read Write Grouped Notes
Folder of workbooks SpreadsheetRepository BaseSpreadsheetRepository<string>, IGroupedUpdatableRepository Stores each spreadsheet as a real .xlsx file beneath a root folder; supports groups & streaming.

Key capabilities

  • Groups → subfolders under the root; IDs are relative paths including .xlsx.
  • Read: Cell, Range, UsedRange, Named ranges, Formats (typed as Spreadsheets.CellFormat).
  • Write: Add/Update from Spreadsheet<string>; or write directly from memory with helpers below.
  • Streaming: download the physical file (GetStream).
  • Type inference when reading: Number / Integer / Boolean / DateTime / Text.

Quick start

using DHI.Services.Spreadsheets;
using DHI.Services.Provider.OpenXML;

// 1) Create the provider
var repo = new SpreadsheetRepository(@"C:\data\workbooks");

// 2) Create a Spreadsheet<string> to add/update
var wb = new Spreadsheet<string>("/reports/finance.xlsx", "finance.xlsx", "/reports");
wb.Data = new List<object[,]>
{
    new object[,]
    {
        { "Name", "Qty", "When" },
        { "Alpha", 1, DateTime.UtcNow },
        { "Beta",  2, DateTime.UtcNow.AddDays(1) }
    }
};
wb.Metadata["SheetNames"] = new List<string> { "Summary" };

// 3) Persist
repo.Add(wb);

// 4) Read a cell/range
var v    = repo.GetCellValue("/reports/finance.xlsx", "Summary", new Cell(1, 2)).Value; // DateTime
var used = repo.GetUsedRange("/reports/finance.xlsx", "Summary").Value;                  // object[,]

Concepts

Root, groups, and IDs

  • Root folder: passed to SpreadsheetRepository(rootFolder).
  • Group = subfolder under root (recursive).
  • ID / FullName = relative path including filename + .xlsx, e.g. /reports/finance.xlsx.
  • GetByGroup(group) returns all spreadsheets recursively beneath <root>/<group>/.

Only .xlsx is supported (enforced by SpreadsheetId).


Core repository APIs

Existence & listing

repo.Count();                         // number of .xlsx files
repo.Contains("/reports/finance.xlsx");
repo.ContainsGroup("/reports");       // folder exists?
repo.ContainsSheet("/reports/finance.xlsx", "Summary"); // case-insensitive

IEnumerable<string> ids       = repo.GetIds();
IEnumerable<string> fullNames = repo.GetFullNames();              // all
IEnumerable<string> byGroup   = repo.GetFullNames("/reports");    // under group

Read cells, ranges, named ranges, formats

// Single cell (0-based row/col)
var cellVal = repo.GetCellValue("/reports/finance.xlsx", "Summary", new Cell(1,2)).Value;

// Arbitrary range (inclusive)
var block = repo.GetRange("/reports/finance.xlsx", "Summary",
            new Range(new Cell(0,0), new Cell(2,2))).Value;         // object[,]

// Named range (must be sheet-scoped like "Summary!MyRange" in Excel)
var nr = repo.GetNamedRange("/reports/finance.xlsx", "Summary", "MyRange").Value; // object[,]

// Used range (provider trims trailing empty rows/cols)
var used = repo.GetUsedRange("/reports/finance.xlsx", "Summary").Value;           // object[,]

// Formats for used range (classification enum from Spreadsheets: Text/Number/Boolean/DateTime)
var fmts = repo.GetUsedRangeFormats("/reports/finance.xlsx", "Summary").Value;    // CellFormat[,]

Add / Update / Remove

repo.Add(wb);               // creates/overwrites the .xlsx file
repo.Update(wb);            // Remove + Add
repo.Remove("/reports/finance.xlsx");

repo.RemoveByGroup("/reports"); // delete <root>/reports recursively

Upload / Download file

// Upload (write stream to {id})
repo.AddStream("/imports/data.xlsx", name: "data.xlsx", group: "/imports", stream, user: null);

// Download (stream + metadata)
var (maybe, fileType, fileName) = repo.GetStream("/reports/finance.xlsx");

Helpers to author .xlsx directly (no service required)

From List<object[,]> (multiple sheets)

var sheets = new List<object[,]>
{
    new object[,] { { "A", 1 }, { "B", 2 } },           // Sheet1
    new object[,] { { "X", true }, { "Y", false } }     // Sheet2
};

sheets.SaveAsXlsx(
    filePath: Path.Combine(@"C:\data\workbooks", "quick\export.xlsx"),
    sheetNames: new List<string> { "Data", "Flags" } // optional; auto "Sheet{n}" otherwise
);

From DataSet

var ds = new DataSet();
var t  = new DataTable("Orders");
t.Columns.Add("Id",   typeof(int));
t.Columns.Add("When", typeof(DateTime));
t.Rows.Add(1, DateTime.UtcNow);
t.Rows.Add(2, DateTime.UtcNow.AddDays(1));
ds.Tables.Add(t);

ds.SaveAsXlsx(@"C:\data\workbooks\quick\orders.xlsx", includeFieldNames: true);

Safe worksheet names

"Finance:Q1/2025".AsValidSheetName("_");  // "Finance_Q1_2025" (max 31 chars)

Rules:

  • Replaces any of : \ / ? * [ ] with your surrogate (or removes if null).
  • Truncates to 31 characters.
  • Throws if surrogate itself contains invalid characters.

Type handling when reading

GetCellValue / GetRange / GetUsedRange interpret values using cell data type and number format:

  • Text
    • Inline/shared strings, or built-in 49 (@)string
  • Booleanbool (0/1)
  • Date/Time
    • Built-ins 14–22, 45–47, or custom formats (incl. id 164) that contain any of y/m/d (special-case: ignores [Red] color tokens to avoid false positives)DateTime
  • Number
    • Integer formats: 1, 3, 37–38 → parsed int (when safe)
    • Decimal/scientific/fraction formats: 2, 4, 11–13, 39–40, or custom 164double
  • Fallback: raw string

Other behavior

  • GetUsedRange trims trailing empty rows/columns.
  • Sheet lookup is case-insensitive.
  • Files are opened with FileShare.ReadWrite so you can read while another process has the file open.

Web API wiring

Config via Connections

"openxml-sps": {
  "$type": "DHI.Services.Spreadsheets.WebApi.SpreadsheetServiceConnection, DHI.Services.Spreadsheets.WebApi",
  "ConnectionString": "[AppData]",
  "RepositoryType": "DHI.Services.Provider.OpenXML.SpreadsheetRepository, DHI.Services.Provider.OpenXML",
  "Name": "OpenXML spreadsheets",
  "Id": "openxml-sps"
}

Config via ServiceLocator

using DHI.Services;
using DHI.Services.Spreadsheets;
using DHI.Services.Provider.OpenXML;

var root = "[AppData]".Resolve(); // expands to your app data folder
var repo = new SpreadsheetRepository(root);
var svc  = new SpreadsheetService(repo);

// the key becomes your {connectionId} in Web API routes
ServiceLocator.Register(svc, "openxml-sps");

Troubleshooting

  • “invalid sheet name” → the worksheet doesn’t exist (inspect Get(id).Value.Metadata["SheetNames"]).
  • Named range not found → the name must be sheet-scoped, referenced as SheetName!RangeName.
  • Empty used range → no SheetDimension or no data on the sheet.
  • File not found → check the root + ID (ID includes .xlsx).
  • Surrogate error in AsValidSheetName → your surrogate contains an invalid character.

Implementation notes (FYI)

  • SaveAsXlsx creates workbook parts and minimal styles (fonts/fills/borders) and a shared string table.
  • Reading resolves shared strings, cell formats (Stylesheet.CellFormats), and number formats (NumberingFormats) to classify values and formats (Spreadsheets.CellFormat).