Fast Client#
The Fast Client is Venice's high-performance read client. It's partition-aware, routes requests directly to servers (bypassing the Router), and provides lower latency than the Thin Client.
Characteristics#
- Partition-aware: Maintains metadata to route directly to the correct server
- Network hops: 1 (client → server)
- Typical latency: < 2ms
- Best for: Applications requiring low latency with moderate resource overhead
When to Use#
Choose the Fast Client when:
- You need lower latency than the Thin Client (< 2ms vs < 10ms)
- You can accept some additional memory overhead for metadata caching
- Your application makes frequent reads and benefits from direct server routing
- You want built-in long-tail retry and load control features
For the lowest latency, consider the Da Vinci Client (0 hops, < 1ms). For simplicity, consider the Thin Client.
Usage#
Dependency#
Add the Venice client dependency to your project:
Creating a Client#
Use ClientFactory from the fastclient package:
// D2 client for service discovery (required)
D2Client d2Client = // ... your D2 client setup
// Create client configuration
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
.setStoreName("my-store")
.setR2Client(d2Client)
.setD2Client(d2Client)
.setClusterDiscoveryD2Service("VeniceController")
.build();
// Create and start the client
AvroGenericStoreClient<String, MyValue> client =
ClientFactory.getAndStartGenericStoreClient(clientConfig);
Reading Data#
The Fast Client implements the same AvroGenericStoreClient interface as the Thin Client:
Single Get#
// Asynchronous single-key lookup
CompletableFuture<MyValue> future = client.get("my-key");
MyValue value = future.get();
Batch Get#
Set<String> keys = Set.of("key1", "key2", "key3");
Map<String, MyValue> results = client.batchGet(keys).get();
Using Specific Records#
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<String, MyValueRecord, MyValueRecord>()
.setStoreName("my-store")
.setR2Client(d2Client)
.setD2Client(d2Client)
.setClusterDiscoveryD2Service("VeniceController")
.setSpecificValueClass(MyValueRecord.class)
.build();
AvroSpecificStoreClient<String, MyValueRecord> client =
ClientFactory.getAndStartSpecificStoreClient(clientConfig);
Key Features#
Long-Tail Retry#
The Fast Client supports automatic retry for slow requests (long-tail latency):
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
.setStoreName("my-store")
.setR2Client(d2Client)
.setD2Client(d2Client)
.setClusterDiscoveryD2Service("VeniceController")
// Enable long-tail retry for single get
.setLongTailRetryEnabledForSingleGet(true)
.setLongTailRetryThresholdForSingleGetInMicroSeconds(5000) // 5ms
// Enable for batch get with range-based thresholds
.setLongTailRetryEnabledForBatchGet(true)
.build();
Retry Budget#
Control retry rate to prevent overloading servers:
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
// ... other config ...
.setRetryBudgetEnabled(true)
.setRetryBudgetPercentage(10.0) // Allow 10% of requests to retry
.build();
Load Control#
Automatic request rejection when servers are overloaded:
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
// ... other config ...
.setStoreLoadControllerEnabled(true)
.setStoreLoadControllerMaxRejectionRatio(0.5) // Max 50% rejection
.build();
Dual Read Mode#
Run Fast Client alongside Thin Client for gradual migration:
// Create a Thin Client first
AvroGenericStoreClient<String, MyValue> thinClient = // ... create thin client
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
// ... other config ...
.setDualReadEnabled(true)
.setGenericThinClient(thinClient)
.build();
// Dual read: Fast Client handles reads, Thin Client used for comparison/fallback
AvroGenericStoreClient<String, MyValue> client =
ClientFactory.getAndStartGenericStoreClient(clientConfig);
Configuration Options#
Key configuration options for ClientConfig:
| Option | Description | Default |
|---|---|---|
setStoreName(String) | Store name (required) | - |
setR2Client(Client) | R2 client for HTTP (required unless using gRPC) | - |
setD2Client(D2Client) | D2 client for service discovery | - |
setClusterDiscoveryD2Service(String) | D2 service name for controller | - |
setMetadataRefreshIntervalInSeconds(long) | Metadata refresh interval | 60 |
setLongTailRetryEnabledForSingleGet(boolean) | Enable retry for single get | false |
setLongTailRetryEnabledForBatchGet(boolean) | Enable retry for batch get | false |
setRetryBudgetEnabled(boolean) | Enable retry budget | false |
setRetryBudgetPercentage(double) | Max retry rate percentage | 10.0 |
setStoreLoadControllerEnabled(boolean) | Enable load control | false |
Routing Strategies#
The Fast Client supports different routing strategies:
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
// ... other config ...
.setClientRoutingStrategyType(ClientRoutingStrategyType.LEAST_LOADED)
.build();
Available strategies:
LEAST_LOADED: Route to the server with the least pending requestsHELIX_ASSISTED: Use Helix routing information
gRPC Support#
The Fast Client supports gRPC transport for potentially lower latency:
GrpcClientConfig grpcConfig = new GrpcClientConfig.Builder()
.setNettyServerToGrpcAddressMap(serverToGrpcMap)
.build();
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
.setStoreName("my-store")
.setUseGrpc(true)
.setGrpcClientConfig(grpcConfig)
.build();
Health Monitoring#
The Fast Client includes instance health monitoring to avoid unhealthy servers:
InstanceHealthMonitorConfig healthConfig = new InstanceHealthMonitorConfig.Builder()
.setBlockedInstanceMaxBackoffMs(30000)
.setBlockedInstanceMinBackoffMs(1000)
.build();
ClientConfig clientConfig = new ClientConfig.ClientConfigBuilder<>()
// ... other config ...
.setInstanceHealthMonitorConfig(healthConfig)
.build();
Best Practices#
- Configure D2 properly: The Fast Client relies on D2 for metadata discovery
- Enable retry for critical paths: Long-tail retry improves P99 latency
- Use retry budget: Prevents retry storms under load
- Monitor health metrics: The Fast Client exposes metrics for monitoring
- Start with conservative settings: Tune retry thresholds based on observed latency
Metrics#
The Fast Client exposes metrics through FastClientStats:
- Request latency (P50, P95, P99)
- Request count by type (single get, batch get, compute)
- Retry count and success rate
- Instance health status