Introduction
gRPC channels in .NET manage the underlying HTTP/2 connection to the server. When connectivity issues occur, the channel enters TransientFailure state. Unlike some client libraries, gRPC channels do not always recover automatically, especially if the server restarts, the network changes, or the load balancer drops the connection. Understanding channel state and configuring proper recovery is essential for production gRPC clients.
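The channel states described above can be observed directly at runtime. A minimal sketch of watching state transitions, assuming Grpc.Net.Client 2.x (where `GrpcChannel` exposes `State` and `WaitForStateChangedAsync`; the address is a placeholder):

```csharp
using Grpc.Net.Client;

var channel = GrpcChannel.ForAddress("https://localhost:5001");

// Log every connectivity state transition in the background.
_ = Task.Run(async () =>
{
    var state = channel.State;
    while (true)
    {
        await channel.WaitForStateChangedAsync(state);
        state = channel.State;
        Console.WriteLine($"Channel state changed: {state}");
        // A channel that lingers in TransientFailure here is
        // exactly the failure mode this article addresses.
    }
});
```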
Symptoms
- `RpcException: Status(StatusCode="Unavailable", Detail="Connection refused")`
- Channel state stuck in `TransientFailure`
- Requests fail after a server restart until the client is restarted
- Load balancer health checks pass but gRPC calls fail
- `StatusCode=Unavailable` with an `Error starting gRPC call` message
Example error:
```
Grpc.Core.RpcException: Status(StatusCode="Unavailable",
  Detail="Error starting gRPC call. HttpRequestException:
  Connection refused (127.0.0.1:5001)",
  DebugException="System.Net.Http.HttpRequestException:
  Connection refused (127.0.0.1:5001)")
```
Common Causes
- Server restart or deployment drops existing HTTP/2 connections
- Load balancer terminates HTTP/2 and does not support gRPC
- Keepalive not configured, so idle connections are dropped by a firewall
- DNS changes not picked up by existing channel
- Client does not reconnect after transient network failure
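To narrow down which of these causes applies, it helps to enable client-side gRPC logging before applying any fix. A sketch using the `GrpcChannelOptions.LoggerFactory` hook (the address is a placeholder; assumes the `Microsoft.Extensions.Logging.Console` package):

```csharp
using Grpc.Net.Client;
using Microsoft.Extensions.Logging;

// Debug-level logging makes the gRPC client report connection
// attempts, pings, and transport errors to the console.
var loggerFactory = LoggerFactory.Create(logging =>
{
    logging.AddConsole();
    logging.SetMinimumLevel(LogLevel.Debug);
});

var channel = GrpcChannel.ForAddress("https://api.example.com:5001",
    new GrpcChannelOptions
    {
        LoggerFactory = loggerFactory
    });
```

The resulting logs distinguish, for example, a DNS resolution failure from a connection that was established and then reset by a load balancer.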
Step-by-Step Fix
1. Monitor channel state and reconnect if stuck:

```csharp
public class GrpcChannelMonitor : BackgroundService
{
    private readonly GrpcChannel _channel;
    private readonly ILogger<GrpcChannelMonitor> _logger;

    public GrpcChannelMonitor(GrpcChannel channel, ILogger<GrpcChannelMonitor> logger)
    {
        _channel = channel;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            if (_channel.State == ConnectivityState.TransientFailure)
            {
                _logger.LogWarning("gRPC channel in TransientFailure state");
                try
                {
                    // Trigger reconnection
                    await _channel.ConnectAsync(stoppingToken);
                }
                catch (Exception ex)
                {
                    _logger.LogError(ex, "gRPC channel reconnect attempt failed");
                }
            }

            await Task.Delay(TimeSpan.FromSeconds(10), stoppingToken);
        }
    }
}
```
2. Configure keepalive to prevent idle connection drops:

```csharp
var channel = GrpcChannel.ForAddress("https://api.example.com:5001",
    new GrpcChannelOptions
    {
        HttpHandler = new SocketsHttpHandler
        {
            // Keepalive settings
            KeepAlivePingDelay = TimeSpan.FromSeconds(30),
            KeepAlivePingTimeout = TimeSpan.FromSeconds(10),
            KeepAlivePingPolicy = HttpKeepAlivePingPolicy.WithActiveRequests,
            // Connection settings
            PooledConnectionLifetime = TimeSpan.FromMinutes(5),
            EnableMultipleHttp2Connections = true,
        },
        // Retry throttling is configured via the service config
        ServiceConfig = new ServiceConfig
        {
            RetryThrottling = new RetryThrottlingPolicy
            {
                MaxTokens = 10,
                TokenRatio = 0.1
            }
        }
    });
```
3. Add retry policy for transient failures:

```csharp
builder.Services.AddGrpcClient<Greeter.GreeterClient>(options =>
{
    options.Address = new Uri("https://api.example.com:5001");
})
.ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
{
    KeepAlivePingDelay = TimeSpan.FromSeconds(30),
    KeepAlivePingTimeout = TimeSpan.FromSeconds(10),
    PooledConnectionLifetime = TimeSpan.FromMinutes(5),
})
.AddResilienceHandler("retry", resilienceBuilder =>
{
    resilienceBuilder.AddRetry(new HttpRetryStrategyOptions
    {
        MaxRetryAttempts = 3,
        BackoffType = DelayBackoffType.Exponential,
        UseJitter = true,
        // Only retry status codes that indicate a transient failure
        ShouldHandle = args =>
        {
            if (args.Outcome.Exception is RpcException rpcEx)
            {
                return ValueTask.FromResult(
                    rpcEx.StatusCode == StatusCode.Unavailable ||
                    rpcEx.StatusCode == StatusCode.DeadlineExceeded);
            }
            return ValueTask.FromResult(false);
        }
    });
});
```
4. Implement client-side health check:

```csharp
public class GrpcHealthCheck : IHealthCheck
{
    private readonly Health.HealthClient _healthClient;

    public GrpcHealthCheck(Health.HealthClient healthClient)
    {
        _healthClient = healthClient;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken ct = default)
    {
        try
        {
            var response = await _healthClient.CheckAsync(
                new HealthCheckRequest(),
                deadline: DateTime.UtcNow.AddSeconds(5),
                cancellationToken: ct);

            if (response.Status == HealthCheckResponse.Types.ServingStatus.Serving)
            {
                return HealthCheckResult.Healthy("gRPC service is healthy");
            }

            return HealthCheckResult.Degraded(
                $"gRPC service status: {response.Status}");
        }
        catch (RpcException ex)
        {
            return HealthCheckResult.Unhealthy(
                $"gRPC health check failed: {ex.Status.Detail}");
        }
    }
}
```
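The health check class above still needs to be wired into the application. One possible registration sketch (the check name `"grpc"`, the address, and the `/healthz` route are illustrative choices, not part of the original):

```csharp
// Register the generated Health client so GrpcHealthCheck can be constructed.
builder.Services.AddGrpcClient<Health.HealthClient>(options =>
{
    options.Address = new Uri("https://api.example.com:5001");
});

// Register the check with ASP.NET Core health checks.
builder.Services
    .AddHealthChecks()
    .AddCheck<GrpcHealthCheck>("grpc", failureStatus: HealthStatus.Unhealthy);

var app = builder.Build();

// Expose the aggregate health status over HTTP.
app.MapHealthChecks("/healthz");
```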
Prevention
- Configure keepalive pings to prevent connection drops by firewalls/load balancers
- Use `AddResilienceHandler` with retry for gRPC clients
- Set `PooledConnectionLifetime` to periodically refresh connections
- Enable multiple HTTP/2 connections for better resilience
- Monitor `GrpcChannel.State` in production
- Use the gRPC health checking protocol for service health monitoring
- Configure load balancers with HTTP/2 and gRPC support (not HTTP/1.1 only)