Class PartitionTracker

java.lang.Object
com.linkedin.davinci.validation.PartitionTracker

public class PartitionTracker extends Object
This class maintains state about all the upstream producers for a given partition. It keeps track of the last segment, last sequence number and incrementally computed checksum for each producer (identified by a producer GUID).

This class is thread safe. Locking is at the granularity of producers. Multiple threads can process records from the same partition concurrently.

This class also encapsulates the capability to clear expired state, in the functions which take in the maxAgeInMs parameter:

- clearExpiredStateAndUpdateOffsetRecord(TopicType, OffsetRecord, long) - setPartitionState(TopicType, OffsetRecord, long)

  • Field Details

  • Constructor Details

  • Method Details

    • getPartition

      public int getPartition()
    • getLatestConsumedVtPosition

      public PubSubPosition getLatestConsumedVtPosition()
    • updateLatestConsumedVtPosition

      public void updateLatestConsumedVtPosition(PubSubPosition vtPosition)
    • toString

      public final String toString()
      Overrides:
      toString in class Object
    • clearSegments

      public void clearSegments(PartitionTracker.TopicType type)
    • setPartitionState

      public void setPartitionState(PartitionTracker.TopicType type, OffsetRecord offsetRecord, long maxAgeInMs)
    • setPartitionState

      public void setPartitionState(PartitionTracker.TopicType type, Map<CharSequence,ProducerPartitionState> producerPartitionStateMap, long earliestAllowableTimestamp)
    • getPartitionStates

    • cloneVtProducerStates

      public void cloneVtProducerStates(PartitionTracker destProducerTracker, long maxAgeInMs, long latestMessageTimeInMs, boolean emitLog)
      Clone the vtSegments and LCVP to the destination PartitionTracker. May be called concurrently.
      Parameters:
      latestMessageTimeInMs - the latest producer message timestamp observed so far, used as the data-relative anchor for age-based pruning (mirrors clearExpiredStateAndUpdateOffsetRecord(com.linkedin.davinci.validation.PartitionTracker.TopicType, com.linkedin.venice.offsets.OffsetRecord, long)). When DataIntegrityValidator.DISABLED is passed (e.g. from a fresh OffsetRecord before any messages have been observed), the computed threshold DISABLED - maxAgeInMs is a very large negative value, so no segment is pruned. To disable pruning entirely regardless of timestamps, pass DataIntegrityValidator.DISABLED as maxAgeInMs instead.
    • cloneRtProducerStates

      public void cloneRtProducerStates(PartitionTracker destProducerTracker, String brokerUrl, long maxAgeInMs, long latestMessageTimeInMs)
      Clone the rtSegments to the destination PartitionTracker. Filter by brokerUrl. May be called concurrently.
      Parameters:
      latestMessageTimeInMs - the latest producer message timestamp observed so far, used as the data-relative anchor for age-based pruning (mirrors clearExpiredStateAndUpdateOffsetRecord(com.linkedin.davinci.validation.PartitionTracker.TopicType, com.linkedin.venice.offsets.OffsetRecord, long)). When DataIntegrityValidator.DISABLED is passed (e.g. from a fresh OffsetRecord before any messages have been observed), the computed threshold DISABLED - maxAgeInMs is a very large negative value, so no segment is pruned. To disable pruning entirely regardless of timestamps, pass DataIntegrityValidator.DISABLED as maxAgeInMs instead.
    • setProducerState

      public void setProducerState(OffsetRecord offsetRecord, PartitionTracker.TopicType type, GUID guid, ProducerPartitionState state)
    • updateOffsetRecord

      public void updateOffsetRecord(PartitionTracker.TopicType type, OffsetRecord offsetRecord)
    • validateMessage

      public void validateMessage(PartitionTracker.TopicType type, DefaultPubSubMessage consumerRecord, boolean endOfPushReceived, Lazy<Boolean> tolerateMissingMsgs) throws DataValidationException
      Ensures the integrity of the data by maintaining state about all the data produced by a specific upstream producer:

      1. Segment, which should be equal or greater to the previous segment. 2. Sequence number, which should be exactly one greater than the previous sequence number. 3. Checksum, which is computed incrementally until the end of a segment.

      Parameters:
      consumerRecord - the incoming Kafka message.
      Throws:
      DataValidationException - if the DIV check failed.
    • checkMissingMessage

      public void checkMissingMessage(PartitionTracker.TopicType type, DefaultPubSubMessage consumerRecord, Optional<PartitionTracker.DIVErrorMetricCallback> errorMetricCallback, long kafkaLogCompactionDelayInMs) throws DataValidationException
      Throws:
      DataValidationException
    • removeProducerState

      public void removeProducerState(PartitionTracker.TopicType type, GUID guid, OffsetRecord offsetRecord)