DHI.Services.Policies — Internal Guide

DHI.Services.Policies is a tiny, opinionated wrapper around Polly-style resiliency for your domain code. It gives you ready-made retry policies with sensible defaults and consistent logging, so you don’t have to wire up WaitAndRetry for the hundredth time.

What you get:

  • Batteries-included retry for common scenarios:
    • Generic (sync + async) exception retries
    • File I/O retries (locked/missing files, transient IO)
    • HTTP retries (with status-code filtering)
    • External process retries (with overall timeout/kill)
    • Stream operation retries
  • Consistent backoff by default: 1s → 5s → 120s (overrideable)
  • Structured logging on each retry attempt
  • Cancellation support where it matters (async APIs)
  • Simple, explicit APIs (no DI ceremony required — just new the policy and call Execute/ExecuteAsync)

When to use which policy

| Use case | Policy | Retries on… | Notes |
|---|---|---|---|
| Any async operation with transient exceptions | AsyncExceptionRetryPolicy | Selected exception types you pass in | Your catch-all for async calls (DB, cloud SDKs, etc.) |
| Any sync operation with transient exceptions | ExceptionRetryPolicy | Selected exception types you pass in | The synchronous sibling |
| File operations (local/UNC shares) | FileAccessRetryPolicy | DirectoryNotFoundException, FileNotFoundException, IOException | Good for “file locked”, slow network share, eventual consistency |
| HTTP calls via HttpClient | HttpRetryPolicy | HttpRequestException, or non-success HTTP responses except 401 | We don’t retry Unauthorized (401) by design |
| Running an external process that may fail transiently | ProcessRetryPolicy | Any exception or non-zero ExitCode | Enforces an overall timeout and kills the process if exceeded |
| Flaky stream operations | StreamOperationsRetryPolicy | ObjectDisposedException, InvalidOperationException | Useful with piped/async I/O primitives |

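Don’t use retries for permanent failures (validation errors, 400/403, missing config, etc.). Let those fail fast. Note that HttpRetryPolicy only excludes 401, so it will still retry 400 and 403 responses; keep that in mind when you wrap calls that can legitimately return them.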


Quick start

File I/O with retries

var policy = new FileAccessRetryPolicy(logger);
policy.Execute(() =>
{
    // Anything that may transiently fail with IO exceptions
    using var stream = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.Read);
    // ...
});

HTTP with retries

var httpPolicy = new HttpRetryPolicy(logger);
var response = await httpPolicy.ExecuteAsync(() => httpClient.GetAsync(uri));
response.EnsureSuccessStatusCode();

Generic async retry on specific exceptions

var transientTypes = new[] { typeof(TimeoutException), typeof(OperationCanceledException) };
var retry = new AsyncExceptionRetryPolicy(transientTypes, logger);

var result = await retry.ExecuteAsync(async () =>
{
    // e.g., cloud SDK call
    return await myClient.DoSomethingAsync();
});

External process with overall timeout

var procPolicy = new ProcessRetryPolicy(TimeSpan.FromMinutes(10), logger);

using var process = new Process
{
    StartInfo = new ProcessStartInfo
    {
        FileName = "mytool.exe",
        Arguments = "--run",
        RedirectStandardOutput = true,
        UseShellExecute = false
    }
};

var finished = await procPolicy.ExecuteAsync(process, async () =>
{
    process.Start();
    await process.WaitForExitAsync();
    return process; // ExitCode will be checked; non-zero triggers retry
});

Policy catalog & API

AsyncExceptionRetryPolicy

  • Purpose: Retry async work that throws one of the exception types you specify.
  • Defaults: waits 1s, 5s, 120s.
  • Ctor
    • AsyncExceptionRetryPolicy(Type[] types, ILogger logger)
    • AsyncExceptionRetryPolicy(Type[] types, TimeSpan[] waitTimes, ILogger logger)
  • Exec
    • Task<TResult> ExecuteAsync(Func<Task<TResult>> function)
    • Task<TResult> ExecuteAsync(Func<CancellationToken, Task<TResult>> function, CancellationToken token)
    • Task ExecuteAsync(Func<Task> function)

Example

var retry = new AsyncExceptionRetryPolicy(
    new[] { typeof(TimeoutException) }, 
    logger);

await retry.ExecuteAsync(async ct =>
{
    // CancellationToken-aware version
    return await repo.SaveAsync(entity, ct);
}, cancellationToken);

ExceptionRetryPolicy

  • Purpose: Same as above, but for synchronous work.
  • Defaults: waits 1s, 5s, 120s.
  • Ctor
    • ExceptionRetryPolicy(Type[] types, ILogger logger)
    • ExceptionRetryPolicy(Type[] types, TimeSpan[] waitTimes, ILogger logger)
    • Overloads with tag exist for symmetry (see Logging below).
  • Exec
    • TResult Execute(Func<TResult> function)
    • TResult Execute(Func<CancellationToken, TResult> function, CancellationToken token)
    • void Execute(Action action)

Example

var retry = new ExceptionRetryPolicy(
    new[] { typeof(IOException) }, 
    logger);

// The File.Move overload with an overwrite parameter requires .NET Core 3.0 or later.
retry.Execute(() => File.Move(src, dst, overwrite: true));

FileAccessRetryPolicy (specialized)

  • Purpose: Convenience for file operations.
  • Retries: DirectoryNotFoundException, FileNotFoundException, IOException.
  • Ctor
    • FileAccessRetryPolicy(ILogger logger)
    • FileAccessRetryPolicy(TimeSpan[] waitTimes, ILogger logger)

Equivalent to new ExceptionRetryPolicy(_fileExceptionTypes, waitTimes, logger), where _fileExceptionTypes holds the three exception types listed above.


HttpRetryPolicy (specialized)

  • Purpose: Wrap HttpClient calls with smart retries.
  • Retries on:
    • HttpRequestException, or
    • Responses where !IsSuccessStatusCode and StatusCode != 401 Unauthorized
  • Ctor
    • HttpRetryPolicy(ILogger logger)
    • HttpRetryPolicy(TimeSpan[] waitTimes, ILogger logger[, string tag])
  • Exec
    • Task<HttpResponseMessage> ExecuteAsync(Func<Task<HttpResponseMessage>> action)
    • Task<HttpResponseMessage> Execute(Func<CancellationToken, Task<HttpResponseMessage>> func, CancellationToken token)

Example

var http = new HttpRetryPolicy(
    new[] { TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(3), TimeSpan.FromSeconds(10) },
    logger);

var resp = await http.ExecuteAsync(() => httpClient.PostAsync(url, content));
if (!resp.IsSuccessStatusCode) { /* handle */ }

Why skip 401? Auth problems don’t get better by retrying; fix the token/credentials instead.
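
The response rule above boils down to a simple predicate. Here's a minimal sketch of it; ShouldRetry is a hypothetical name used for illustration, and the package's internal check may well be written differently.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Illustrative only: mirrors the documented rule
// "retry when !IsSuccessStatusCode and StatusCode != 401".
static bool ShouldRetry(HttpResponseMessage response) =>
    !response.IsSuccessStatusCode &&
    response.StatusCode != HttpStatusCode.Unauthorized;

Console.WriteLine(ShouldRetry(new HttpResponseMessage(HttpStatusCode.ServiceUnavailable))); // True
Console.WriteLine(ShouldRetry(new HttpResponseMessage(HttpStatusCode.Unauthorized)));       // False
Console.WriteLine(ShouldRetry(new HttpResponseMessage(HttpStatusCode.OK)));                 // False
Console.WriteLine(ShouldRetry(new HttpResponseMessage(HttpStatusCode.BadRequest)));         // True (4xx other than 401 is retried too)
```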


ProcessRetryPolicy (specialized)

  • Purpose: Run an external Process with retries on failure and a hard overall timeout.
  • Retries on: Any exception or ExitCode != 0.
  • Ctor
    • ProcessRetryPolicy(TimeSpan overallTimeout, ILogger logger)
    • ProcessRetryPolicy(TimeSpan overallTimeout, TimeSpan[] waitTimes, ILogger logger[, string tag])
  • Exec
    • Task<Process> ExecuteAsync(Process process, Func<Task<Process>> action)
      • You construct & dispose the Process outside; the action should contain the repeatable launch/wait bits.

Logging: On each retry we log the exit code and up to 1000 chars of stdout to aid diagnostics.


StreamOperationsRetryPolicy (specialized)

  • Purpose: Smooth over occasional hiccups in stream pipelines.
  • Retries on: ObjectDisposedException, InvalidOperationException.
  • Ctor
    • StreamOperationsRetryPolicy(ILogger logger)
    • StreamOperationsRetryPolicy(TimeSpan[] waitTimes, ILogger logger)

If you see these a lot, consider whether the underlying stream lifecycle needs tightening; retries can help but shouldn’t mask design issues.


Customising backoff & exceptions

All policies that accept exception types or wait times can be tailored:

// Custom backoff: 0.5s, 1s, 2s, 5s (Select/ToArray need `using System.Linq;`)
var waits = new[] { 0.5, 1, 2, 5 }.Select(s => TimeSpan.FromSeconds(s)).ToArray();

var retry = new ExceptionRetryPolicy(
    new[] { typeof(SqlException), typeof(TimeoutException) },
    waits,
    logger);

Tips:

  • Keep the number of retries low and include at least one longer wait if you’re dealing with rate limits or cold starts.
  • Prefer specific exception types. Catch-all Exception is supported via AsyncExceptionRetryPolicy/ExceptionRetryPolicy, but use it intentionally.

Logging & correlation

Every retry attempt logs at Information level with the attempt number and wait duration. To correlate those entries with a job or request, wrap the call in a logger scope:

using (logger.BeginScope("job:{JobId}", jobId))
{
    var http = new HttpRetryPolicy(logger);
    var resp = await http.ExecuteAsync(() => client.GetAsync(url));
}

Some constructors expose a tag parameter. Current implementations create a scope only during construction; the retry callbacks themselves log outside that temporary scope. For reliable scoping, prefer using logger.BeginScope(...) around your Execute/ExecuteAsync call, as shown above.


Cancellation & timeouts

  • Async policies expose CancellationToken overloads; use them for cooperative cancellation.
  • For external processes, ProcessRetryPolicy enforces a hard overall timeout and kills the process if exceeded (you’ll see a warning log).

Best practices

  • Retry only idempotent work (or make it idempotent). Avoid retrying operations with side effects unless they’re safe to repeat.
  • Surface failures after retries. Policies rethrow the last error/return the last result; handle it at the call site.
  • Pair with metrics. If a call is frequently retrying, that’s a symptom you might want to address at source.
  • Don’t retry authorization errors. Fix the token/creds upstream.
  • Keep the scope small. Wrap just the flaky operation, not a whole request pipeline.

FAQ

Q: Can I plug these into DI?
A: Yes. Policies are just classes. Register/factory them however you like, or construct ad-hoc at the call site. Example singleton:

services.AddSingleton(provider => new HttpRetryPolicy(
    new[] { TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(3), TimeSpan.FromSeconds(10) },
    provider.GetRequiredService<ILogger<HttpRetryPolicy>>()));

Q: I need exponential backoff/jitter.
A: You can supply any TimeSpan[] pattern you prefer (e.g., precomputed exponential with random jitter).
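
If you want to precompute such a pattern, here's a minimal sketch. BuildBackoff is a hypothetical helper (it is not part of the package), and the base delay, cap, and "full jitter" strategy are illustrative choices rather than recommendations from the library.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of DHI.Services.Policies: precomputes
// exponential delays with "full jitter" and returns them as a TimeSpan[].
static TimeSpan[] BuildBackoff(int retries, TimeSpan baseDelay, TimeSpan maxDelay, Random rng)
{
    return Enumerable.Range(0, retries)
        .Select(attempt =>
        {
            // Exponential ceiling: baseDelay * 2^attempt, capped at maxDelay.
            var ceilingMs = Math.Min(
                baseDelay.TotalMilliseconds * Math.Pow(2, attempt),
                maxDelay.TotalMilliseconds);
            // Full jitter: pick a random delay between 0 and the ceiling.
            return TimeSpan.FromMilliseconds(rng.NextDouble() * ceilingMs);
        })
        .ToArray();
}

var waits = BuildBackoff(
    retries: 4,
    baseDelay: TimeSpan.FromSeconds(1),
    maxDelay: TimeSpan.FromSeconds(30),
    rng: new Random());

Console.WriteLine(string.Join(", ", waits.Select(w => $"{w.TotalSeconds:F2}s")));
```

The resulting array can be passed as waitTimes to any constructor that accepts one. The delays are fixed once computed, so a policy registered as a singleton would reuse the same jittered values on every call; build the array per policy instance if you want fresh jitter.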

Q: Do you retry 5xx responses?
A: Yes. HttpRetryPolicy retries any non-success code except 401.

Q: Where do I get the package?
A: DHI.Services.Policies is published as a NuGet package on our feeds. Add it to your project and import the DHI.Services.Policies namespace.


Reference (constructors & methods)

// AsyncExceptionRetryPolicy
new AsyncExceptionRetryPolicy(Type[] types, ILogger logger)
new AsyncExceptionRetryPolicy(Type[] types, TimeSpan[] waitTimes, ILogger logger)
Task<TResult> ExecuteAsync(Func<Task<TResult>> fn)
Task<TResult> ExecuteAsync(Func<CancellationToken, Task<TResult>> fn, CancellationToken ct)
Task ExecuteAsync(Func<Task> fn)

// ExceptionRetryPolicy
new ExceptionRetryPolicy(Type[] types, ILogger logger)
new ExceptionRetryPolicy(Type[] types, TimeSpan[] waitTimes, ILogger logger[, string tag])
TResult Execute(Func<TResult> fn)
TResult Execute(Func<CancellationToken, TResult> fn, CancellationToken ct)
void Execute(Action action)

// FileAccessRetryPolicy
new FileAccessRetryPolicy(ILogger logger)
new FileAccessRetryPolicy(TimeSpan[] waitTimes, ILogger logger)

// HttpRetryPolicy
new HttpRetryPolicy(ILogger logger)
new HttpRetryPolicy(TimeSpan[] waitTimes, ILogger logger[, string tag])
Task<HttpResponseMessage> ExecuteAsync(Func<Task<HttpResponseMessage>> action)
Task<HttpResponseMessage> Execute(Func<CancellationToken, Task<HttpResponseMessage>> fn, CancellationToken ct)

// ProcessRetryPolicy
new ProcessRetryPolicy(TimeSpan overallTimeout, ILogger logger[, string tag])
new ProcessRetryPolicy(TimeSpan overallTimeout, TimeSpan[] waitTimes, ILogger logger[, string tag])
Task<Process> ExecuteAsync(Process process, Func<Task<Process>> action)

// StreamOperationsRetryPolicy
new StreamOperationsRetryPolicy(ILogger logger)
new StreamOperationsRetryPolicy(TimeSpan[] waitTimes, ILogger logger)

That’s it. Drop the package, pick the policy that matches your use case, and wrap the flaky bit.