Fast Client

The Fast Client is Venice's high-performance read client. It's partition-aware, routes requests directly to servers (bypassing the Router), and provides lower latency than the Thin Client.

Characteristics

Partition-aware: Maintains metadata to route directly to the correct server
Network hops: 1 (client → server)
Typical latency: < 2ms
Best for: Applications requiring low latency with moderate resource overhead
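
To make "partition-aware" concrete, the sketch below shows the general idea of mapping a key to a partition and then to the replicas hosting it, using metadata cached on the client. It is a minimal illustration only: the class name, the hash function, and the host map are hypothetical and are not Venice's partitioner or metadata classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Venice's partitioner or metadata classes.
final class PartitionRoutingSketch {
    // Maps a serialized key to a partition in [0, partitionCount).
    // Assumption: a simple hash-mod scheme stands in for the real partitioner.
    static int partitionFor(byte[] keyBytes, int partitionCount) {
        return Math.floorMod(Arrays.hashCode(keyBytes), partitionCount);
    }

    // Looks up the replicas hosting the key's partition in locally cached metadata.
    static List<String> replicasFor(
            String key, int partitionCount, Map<Integer, List<String>> partitionToReplicas) {
        int partition = partitionFor(key.getBytes(StandardCharsets.UTF_8), partitionCount);
        return partitionToReplicas.getOrDefault(partition, List.of());
    }
}
```

Because the lookup happens against local metadata, the request can go straight to a server that hosts the partition, which is where the single network hop comes from.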

When to Use

Choose the Fast Client when:

You need lower latency than the Thin Client (< 2ms vs < 10ms)
You can accept some additional memory overhead for metadata caching
Your application makes frequent reads and benefits from direct server routing
You want built-in long-tail retry and load control features

For the lowest latency, consider the Da Vinci Client (0 hops, < 1ms). For simplicity, consider the Thin Client.

Usage

Dependency

Add the Venice client dependency to your project:

dependencies {
  implementation 'com.linkedin.venice:venice-client:<version>'
}

Creating a Client

Use ClientFactory from the fastclient package:

// D2 client for service discovery (required)
D2Client d2Client = // ... your D2 client setup
// R2 client for HTTP transport to the servers (required unless using gRPC)
Client r2Client = // ... your R2 client setup

// Create client configuration
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
    .setStoreName("my-store")
    .setR2Client(r2Client)
    .setD2Client(d2Client)
    .setClusterDiscoveryD2Service("VeniceController")
    .build();

// Create and start the client
AvroGenericStoreClient<String, MyValue> client =
    ClientFactory.getAndStartGenericStoreClient(clientConfig);

Reading Data

The Fast Client implements the same AvroGenericStoreClient interface as the Thin Client:

Single Get

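// Asynchronous single-key lookup: get() on the client returns immediately
CompletableFuture<MyValue> future = client.get("my-key");
// future.get() blocks the calling thread until the lookup completes
MyValue value = future.get();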
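
Since the lookup returns a CompletableFuture, callers on a latency-sensitive path may prefer not to block. The helper below is a sketch using only the standard library (Java 9+ for orTimeout); the class name, the timeout, and the fallback value are hypothetical choices, not part of the Venice API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; works on any CompletableFuture, such as one returned by a lookup.
final class AsyncGetSketch {
    // Attaches a timeout and a fallback value instead of blocking on get().
    static <V> CompletableFuture<V> withTimeoutAndFallback(
            CompletableFuture<V> lookup, long timeoutMs, V fallback) {
        return lookup
            .orTimeout(timeoutMs, TimeUnit.MILLISECONDS) // fails the future if it takes too long
            .exceptionally(error -> fallback);           // maps a timeout or failure to the fallback
    }
}
```

A caller could then chain thenAccept on the returned future to consume the value without tying up a thread.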

Batch Get

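// Set.of requires Java 9 or later
Set<String> keys = Set.of("key1", "key2", "key3");
// batchGet is asynchronous; get() blocks until the results arrive
Map<String, MyValue> results = client.batchGet(keys).get();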
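
A common follow-up to a batch read is finding out which keys came back without a value. The sketch below assumes the result map contains entries only for keys that have a value, which this page does not state explicitly; the helper class is hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, assuming absent keys are simply missing from the result map.
final class BatchGetSketch {
    // Returns the requested keys that have no entry in the batch result.
    static <K, V> Set<K> missingKeys(Set<K> requested, Map<K, V> results) {
        Set<K> missing = new HashSet<>(requested);
        missing.removeAll(results.keySet());
        return missing;
    }
}
```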

Using Specific Records

ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<String, MyValueRecord, MyValueRecord>()
    .setStoreName("my-store")
    .setR2Client(r2Client) // R2 client for HTTP transport, distinct from the D2 client
    .setD2Client(d2Client)
    .setClusterDiscoveryD2Service("VeniceController")
    // Deserialize values into the generated Avro specific record class
    .setSpecificValueClass(MyValueRecord.class)
    .build();

AvroSpecificStoreClient<String, MyValueRecord> client =
    ClientFactory.getAndStartSpecificStoreClient(clientConfig);

Key Features

Long-Tail Retry

The Fast Client can automatically retry a request that has not completed within a configured threshold, which reduces long-tail latency:

ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
    .setStoreName("my-store")
    .setR2Client(r2Client) // R2 client for HTTP transport, distinct from the D2 client
    .setD2Client(d2Client)
    .setClusterDiscoveryD2Service("VeniceController")
    // Enable long-tail retry for single get
    .setLongTailRetryEnabledForSingleGet(true)
    .setLongTailRetryThresholdForSingleGetInMicroSeconds(5000) // 5000 µs = 5 ms
    // Enable long-tail retry for batch get (batch get uses range-based thresholds)
    .setLongTailRetryEnabledForBatchGet(true)
    .build();
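
The general technique behind a long-tail retry can be sketched with the standard library: send the request, and if it is still pending after the threshold, send a second attempt and take whichever succeeds first. This is an illustration of the idea under stated assumptions, not the Fast Client's implementation; the class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch of a threshold-triggered retry; not Venice code.
final class LongTailRetrySketch {
    static <V> CompletableFuture<V> getWithRetry(
            Supplier<CompletableFuture<V>> request,
            long thresholdMicros,
            ScheduledExecutorService scheduler) {
        CompletableFuture<V> result = new CompletableFuture<>();
        CompletableFuture<V> original = request.get();
        // A successful original attempt always wins if it arrives first.
        original.whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            }
        });
        scheduler.schedule(() -> {
            if (result.isDone()) {
                return; // the original answered within the threshold, so no retry is sent
            }
            request.get().whenComplete((value, error) -> {
                if (error == null) {
                    result.complete(value);
                } else {
                    // The retry failed: surface an error only if the original fails as well.
                    original.whenComplete((v, originalError) -> {
                        if (originalError != null) {
                            result.completeExceptionally(originalError);
                        }
                    });
                }
            });
        }, thresholdMicros, TimeUnit.MICROSECONDS);
        return result;
    }
}
```

A lower threshold trims the tail more aggressively but sends more duplicate requests, which is why the threshold is best tuned against observed latency.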

Retry Budget

Cap the retry rate so that retries cannot overload the servers:

ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
    // ... other config ...
    .setRetryBudgetEnabled(true)
    .setRetryBudgetPercentage(10.0) // Allow 10% of requests to retry
    .build();
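
One way to picture a percentage-based retry budget is a pair of counters: retries are granted only while they stay within the configured share of recorded requests. The sketch below is an assumption about the general technique, with hypothetical names; a production version would also reset or decay the counters over a time window.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a percentage-based retry budget; not Venice code.
final class RetryBudgetSketch {
    private final double budgetPercentage;
    private final AtomicLong requests = new AtomicLong();
    private final AtomicLong retries = new AtomicLong();

    RetryBudgetSketch(double budgetPercentage) {
        this.budgetPercentage = budgetPercentage;
    }

    // Call once for every original (non-retry) request.
    void recordRequest() {
        requests.incrementAndGet();
    }

    // Grants a retry only while retries stay within budgetPercentage of recorded requests.
    boolean tryAcquireRetry() {
        long allowed = (long) (requests.get() * budgetPercentage / 100.0);
        while (true) {
            long used = retries.get();
            if (used >= allowed) {
                return false; // budget exhausted
            }
            if (retries.compareAndSet(used, used + 1)) {
                return true;
            }
        }
    }
}
```

With a 10% budget and 100 recorded requests, this sketch grants ten retries and refuses the eleventh.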

Load Control

The Fast Client can automatically reject requests when servers are overloaded:

ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
    // ... other config ...
    .setStoreLoadControllerEnabled(true)
    .setStoreLoadControllerMaxRejectionRatio(0.5) // Max 50% rejection
    .build();
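
The role of a maximum rejection ratio can be illustrated with a small controller that raises the share of rejected requests while overload signals keep arriving and lowers it when they stop. How the Fast Client actually detects overload and adjusts its ratio is not described on this page, so the step size, the signal, and all names below are assumptions.

```java
// Hypothetical sketch of ratio-based load shedding; not Venice code.
final class LoadControlSketch {
    private static final double STEP = 0.1; // assumed adjustment per response
    private final double maxRejectionRatio;
    private double rejectionRatio = 0.0;

    LoadControlSketch(double maxRejectionRatio) {
        this.maxRejectionRatio = maxRejectionRatio;
    }

    // Raises the ratio on an overload signal and lowers it otherwise,
    // always staying within [0, maxRejectionRatio].
    synchronized void onResponse(boolean serverOverloaded) {
        rejectionRatio = serverOverloaded
            ? Math.min(maxRejectionRatio, rejectionRatio + STEP)
            : Math.max(0.0, rejectionRatio - STEP);
    }

    // Decides whether to reject a request; 'sample' is a uniform random value in [0, 1).
    synchronized boolean shouldReject(double sample) {
        return sample < rejectionRatio;
    }

    synchronized double rejectionRatio() {
        return rejectionRatio;
    }
}
```

The cap matters because it guarantees that some traffic always reaches the servers, so the client can observe when they have recovered.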

Dual Read Mode

Run the Fast Client alongside the Thin Client during a gradual migration:

// Create a Thin Client first
AvroGenericStoreClient<String, MyValue> thinClient = // ... create thin client

ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
    // ... other config ...
    .setDualReadEnabled(true)
    .setGenericThinClient(thinClient)
    .build();

// Dual read: Fast Client handles reads, Thin Client used for comparison/fallback
AvroGenericStoreClient<String, MyValue> client =
    ClientFactory.getAndStartGenericStoreClient(clientConfig);
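
The fallback aspect of a dual read can be sketched as combining two futures so that the first successful response wins and an error surfaces only when both fail. Whether the Fast Client combines the two reads exactly this way is an assumption; the helper below uses only the standard library and hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of combining two reads; not Venice code.
final class DualReadSketch {
    // Completes with the first successful result; fails only if both reads fail.
    static <V> CompletableFuture<V> firstSuccessful(
            CompletableFuture<V> fastRead, CompletableFuture<V> thinRead) {
        CompletableFuture<V> result = new CompletableFuture<>();
        AtomicInteger failures = new AtomicInteger();
        for (CompletableFuture<V> read : List.of(fastRead, thinRead)) {
            read.whenComplete((value, error) -> {
                if (error == null) {
                    result.complete(value); // later completions are ignored
                } else if (failures.incrementAndGet() == 2) {
                    result.completeExceptionally(error);
                }
            });
        }
        return result;
    }
}
```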

Configuration Options

Key configuration options for ClientConfig:

Option	Description	Default
`setStoreName(String)`	Store name (required)	-
`setR2Client(Client)`	R2 client for HTTP (required unless using gRPC)	-
`setD2Client(D2Client)`	D2 client for service discovery	-
`setClusterDiscoveryD2Service(String)`	D2 service name for controller	-
`setMetadataRefreshIntervalInSeconds(long)`	Metadata refresh interval	60
`setLongTailRetryEnabledForSingleGet(boolean)`	Enable retry for single get	false
`setLongTailRetryEnabledForBatchGet(boolean)`	Enable retry for batch get	false
`setRetryBudgetEnabled(boolean)`	Enable retry budget	false
`setRetryBudgetPercentage(double)`	Max retry rate percentage	10.0
`setStoreLoadControllerEnabled(boolean)`	Enable load control	false

Routing Strategies

The Fast Client supports different routing strategies:

ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
    // ... other config ...
    .setClientRoutingStrategyType(ClientRoutingStrategyType.LEAST_LOADED)
    .build();

Available strategies:

LEAST_LOADED: Route to the server with the least pending requests
HELIX_ASSISTED: Use Helix routing information
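
As an illustration of the least-loaded idea, the sketch below tracks in-flight requests per replica and picks the one with the fewest. It is a simplified model with hypothetical names, not the Fast Client's routing code.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of least-loaded replica selection; not Venice code.
final class LeastLoadedSketch {
    private final Map<String, AtomicInteger> pending = new ConcurrentHashMap<>();

    // Picks the replica with the fewest in-flight requests (first one wins ties)
    // and counts the new request against it. Returns null for an empty list.
    String select(List<String> replicas) {
        String best = null;
        int bestPending = Integer.MAX_VALUE;
        for (String replica : replicas) {
            int count = pending.computeIfAbsent(replica, k -> new AtomicInteger()).get();
            if (count < bestPending) {
                best = replica;
                bestPending = count;
            }
        }
        if (best != null) {
            pending.get(best).incrementAndGet();
        }
        return best;
    }

    // Call when a request to the replica finishes, successfully or not.
    void onComplete(String replica) {
        AtomicInteger count = pending.get(replica);
        if (count != null) {
            count.decrementAndGet();
        }
    }
}
```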

gRPC Support

The Fast Client supports gRPC transport for potentially lower latency:

GrpcClientConfig grpcConfig = new GrpcClientConfig.Builder()
    .setNettyServerToGrpcAddressMap(serverToGrpcMap)
    .build();

ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
    .setStoreName("my-store")
    .setUseGrpc(true)
    .setGrpcClientConfig(grpcConfig)
    .build();

Health Monitoring

The Fast Client includes instance health monitoring to avoid unhealthy servers:

InstanceHealthMonitorConfig healthConfig = new InstanceHealthMonitorConfig.Builder()
    .setBlockedInstanceMaxBackoffMs(30000)
    .setBlockedInstanceMinBackoffMs(1000)
    .build();

ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
    // ... other config ...
    .setInstanceHealthMonitorConfig(healthConfig)
    .build();
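
The minimum and maximum backoff settings bound how long a blocked instance is avoided. This page does not say how the backoff grows between those bounds; the sketch below assumes a doubling schedule purely for illustration, and the helper name is hypothetical.

```java
// Hypothetical sketch of a bounded backoff schedule; not Venice code.
final class BlockedInstanceBackoffSketch {
    // Doubles the backoff for each consecutive failure, clamped to [minMs, maxMs].
    // Assumption: exponential growth; the real schedule may differ.
    static long backoffMs(int consecutiveFailures, long minMs, long maxMs) {
        int shift = Math.min(Math.max(consecutiveFailures - 1, 0), 30); // cap the shift to avoid overflow
        return Math.min(maxMs, minMs << shift);
    }
}
```

With the values from the example above (1000 ms minimum, 30000 ms maximum), this schedule yields 1000, 2000, 4000, 8000, and 16000 ms, then stays at 30000 ms.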

Best Practices

Configure D2 properly: The Fast Client relies on D2 for metadata discovery
Enable retry for critical paths: Long-tail retry improves P99 latency
Use retry budget: Prevents retry storms under load
Monitor health metrics: The Fast Client exposes metrics for monitoring
Start with conservative settings: Tune retry thresholds based on observed latency

Metrics

The Fast Client exposes metrics through FastClientStats:

Request latency (P50, P95, P99)
Request count by type (single get, batch get, compute)
Retry count and success rate
Instance health status