Struct regex::dfa::CacheInner

struct CacheInner {
    compiled: HashMap<State, u32>,
    trans: Transitions,
    states: Vec<State>,
    start_states: Vec<u32>,
    stack: Vec<u32>,
    flush_count: u64,
    size: usize,
}

CacheInner is logically just a part of Cache, but it groups together the fields that aren't passed as function parameters throughout search. (This split is mostly an artifact of the borrow checker, and its cost is happily paid.)

Fields

compiled: HashMap<State, u32>

A cache of pre-compiled DFA states, keyed by the set of NFA states and the set of empty-width flags set at the byte in the input when the state was observed.

A StatePtr (a u32) is effectively a *State, but to avoid the inconveniences that come with pointers, we just pass indexes around manually. The performance impact of this is probably an instruction or two in the inner loop. In exchange, on 64-bit targets each StatePtr is half the size of a *State.
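As a rough illustration of the size argument, the sketch below compares an index-style handle with a raw pointer. The `State` struct here is a hypothetical stand-in, not the crate's actual type, and `handle_sizes` is a name invented for this example.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the DFA state type; the real `State` differs.
#[allow(dead_code)]
struct State {
    data: Box<[u8]>,
}

/// Index-based handle, matching the `u32` shown in the struct definition.
type StatePtr = u32;

/// Returns (size of an index handle, size of a raw pointer) in bytes.
fn handle_sizes() -> (usize, usize) {
    (size_of::<StatePtr>(), size_of::<*const State>())
}
```

On a 64-bit target the second value is twice the first, which is the saving the paragraph above refers to.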

trans: Transitions

The transition table.

The transition table is laid out in row-major order, where states are rows and the transitions for each state are columns. At a high level, given state s and byte b, the next state can be found at index s * 256 + b.
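The high-level layout can be sketched as a flat vector indexed by `s * 256 + b`. This is only a minimal model of the description above: `NaiveTable`, `STRIDE`, and the `UNKNOWN` sentinel are assumptions made for the example, not names taken from the crate.

```rust
/// Number of columns per row when every byte value gets its own column.
const STRIDE: usize = 256;

/// Sentinel for "no transition computed yet" (an assumption for this sketch).
const UNKNOWN: u32 = u32::MAX;

/// A flat, row-major transition table: one row per state, one column per byte.
struct NaiveTable {
    table: Vec<u32>,
}

impl NaiveTable {
    /// Allocates `num_states` rows, all filled with `UNKNOWN`.
    fn new(num_states: usize) -> Self {
        NaiveTable { table: vec![UNKNOWN; num_states * STRIDE] }
    }

    /// Records that state `s` moves to state `to` on byte `b`.
    fn set(&mut self, s: usize, b: u8, to: u32) {
        self.table[s * STRIDE + b as usize] = to;
    }

    /// Looks up the next state for state `s` and byte `b`.
    fn next(&self, s: usize, b: u8) -> u32 {
        self.table[s * STRIDE + b as usize]
    }
}
```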

This is, of course, a lie. A StatePtr is actually a pointer to the start of a row in this table. When indexing in the DFA's inner loop, this removes the need to multiply the StatePtr by the stride. Yes, it matters. Premultiplying reduces the number of states we can store, but the stride is rarely 256, since we define transitions in terms of equivalence classes of bytes. Each class corresponds to a set of bytes that never distinguish one path through the DFA from another.
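A minimal sketch of the premultiplied scheme follows, assuming a byte-to-class map is already available. `ClassTable`, `digit_classes`, and the `UNKNOWN` sentinel are hypothetical names for this example; the crate's `Transitions` type is not shown in this page and may differ.

```rust
/// Sentinel for a transition that has not been computed (assumed here).
const UNKNOWN: u32 = u32::MAX;

struct ClassTable {
    /// Maps each byte to its equivalence class.
    byte_classes: [u8; 256],
    /// Number of distinct classes, which is the row stride.
    num_classes: usize,
    /// Flat row-major table; each row has `num_classes` entries.
    table: Vec<u32>,
}

impl ClassTable {
    fn new(byte_classes: [u8; 256]) -> Self {
        let num_classes = *byte_classes.iter().max().unwrap() as usize + 1;
        ClassTable { byte_classes, num_classes, table: Vec::new() }
    }

    /// Appends a row and returns its premultiplied pointer (the row's offset).
    fn add_state(&mut self) -> u32 {
        let ptr = self.table.len() as u32;
        self.table.extend(std::iter::repeat(UNKNOWN).take(self.num_classes));
        ptr
    }

    /// Records that the state at `ptr` moves to `to` on byte `b`.
    fn set(&mut self, ptr: u32, b: u8, to: u32) {
        let cls = self.byte_classes[b as usize] as usize;
        self.table[ptr as usize + cls] = to;
    }

    /// Inner-loop lookup: one addition, no multiplication by the stride.
    fn next(&self, ptr: u32, b: u8) -> u32 {
        let cls = self.byte_classes[b as usize] as usize;
        self.table[ptr as usize + cls]
    }
}

/// Example class map: class 1 for ASCII digits, class 0 for everything else.
fn digit_classes() -> [u8; 256] {
    let mut classes = [0u8; 256];
    for b in b'0'..=b'9' {
        classes[b as usize] = 1;
    }
    classes
}
```

With two classes the stride is 2 rather than 256, so the second state's pointer is 2, and all digits share a single column.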

states: Vec<State>

Our set of states. Note that StatePtr / num_byte_classes indexes this Vec rather than just a StatePtr.
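The conversion between a premultiplied pointer and a position in the states Vec might look like the sketch below. The function names are invented for this example; only the division by `num_byte_classes` comes from the text above.

```rust
/// Converts a premultiplied state pointer into an index into the states Vec.
fn state_index(ptr: u32, num_byte_classes: usize) -> usize {
    ptr as usize / num_byte_classes
}

/// Inverse mapping: the premultiplied pointer for the state at `index`.
fn state_ptr(index: usize, num_byte_classes: usize) -> u32 {
    (index * num_byte_classes) as u32
}
```

For instance, with 5 byte classes the state at index 3 has pointer 15, and dividing 15 by 5 recovers the index.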

start_states: Vec<u32>

A set of cached start states, which are limited to the number of permutations of flags set just before the initial byte of input. (The index into this vec is an EmptyFlags.)

N.B. A start state can be "dead" (i.e., no possible match), so we represent it with a StatePtr.
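One way to picture a flag-indexed start-state cache is sketched below. The flag names, the bit packing, and the sentinel values are all assumptions made for illustration; the page does not show how the crate encodes its EmptyFlags or which values it reserves.

```rust
/// Sentinel values; the names and values are assumptions for this sketch.
const STATE_UNKNOWN: u32 = u32::MAX;
const STATE_DEAD: u32 = u32::MAX - 1;

/// Hypothetical flag set describing the position just before the first byte.
#[derive(Clone, Copy, Default)]
struct EmptyFlags {
    start: bool,
    end: bool,
    start_line: bool,
    end_line: bool,
}

impl EmptyFlags {
    /// Packs the flags into a small integer usable as a Vec index.
    fn index(self) -> usize {
        (self.start as usize)
            | (self.end as usize) << 1
            | (self.start_line as usize) << 2
            | (self.end_line as usize) << 3
    }
}

/// One slot per flag permutation (2^4 = 16 in this sketch).
struct StartCache {
    start_states: Vec<u32>,
}

impl StartCache {
    fn new() -> Self {
        StartCache { start_states: vec![STATE_UNKNOWN; 16] }
    }

    /// Returns the cached start state, computing it only on first use.
    fn get_or_compute(&mut self, flags: EmptyFlags, compute: impl FnOnce() -> u32) -> u32 {
        let i = flags.index();
        if self.start_states[i] == STATE_UNKNOWN {
            self.start_states[i] = compute();
        }
        self.start_states[i]
    }
}
```

Because a slot holds a plain state pointer, a dead start state is cached like any other value and is not recomputed on later searches.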

stack: Vec<u32>

Stack scratch space used to follow epsilon transitions in the NFA. (This permits us to avoid recursion.)

The maximum stack size is the number of NFA states.
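The idea of following epsilon transitions with a reusable stack can be sketched as follows. The `Inst` enum is a tiny hypothetical instruction set, not the crate's program representation. Each instruction is marked as seen when it is pushed, so it is pushed at most once and the stack never grows beyond the number of NFA states.

```rust
/// A tiny hypothetical NFA instruction set; the real program differs.
#[allow(dead_code)]
enum Inst {
    /// Epsilon transition to two alternatives.
    Split(usize, usize),
    /// Epsilon transition to a single target.
    Jump(usize),
    /// Consumes a byte, then continues at the given instruction.
    Byte(u8, usize),
    /// Reports a match.
    Match,
}

/// Pushes `ip` unless it was already visited.
fn visit(ip: usize, seen: &mut [bool], stack: &mut Vec<usize>) {
    if !seen[ip] {
        seen[ip] = true;
        stack.push(ip);
    }
}

/// Collects the non-epsilon instructions reachable from `start` through
/// epsilon transitions, using caller-provided scratch space, not recursion.
fn epsilon_closure(prog: &[Inst], start: usize, stack: &mut Vec<usize>) -> Vec<usize> {
    let mut seen = vec![false; prog.len()];
    let mut out = Vec::new();
    stack.clear();
    visit(start, &mut seen, stack);
    while let Some(ip) = stack.pop() {
        match prog[ip] {
            Inst::Split(a, b) => {
                // Push `b` first so that `a` is explored first.
                visit(b, &mut seen, stack);
                visit(a, &mut seen, stack);
            }
            Inst::Jump(target) => visit(target, &mut seen, stack),
            Inst::Byte(..) | Inst::Match => out.push(ip),
        }
    }
    out
}
```

Passing the stack in by mutable reference lets one allocation be reused across many closure computations, which is the role the scratch field plays here.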

flush_count: u64

The total number of times this cache has been flushed by the DFA because of space constraints.

size: usize

The total heap size of the DFA's cache. We use this to determine when we should flush the cache.
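A simplified model of size-based flushing is sketched below. `TinyCache`, its `limit` and `fixed_cost` fields, and the per-state cost formula are assumptions for illustration; the crate's actual accounting is not shown in this page.

```rust
use std::collections::HashMap;
use std::mem::size_of;

/// Hypothetical cache that tracks approximate heap use and flushes itself
/// when a budget would be exceeded.
struct TinyCache {
    compiled: HashMap<Vec<u8>, u32>,
    states: Vec<Vec<u8>>,
    flush_count: u64,
    size: usize,
    /// Budget in bytes (an assumed parameter).
    limit: usize,
    /// Costs that survive a flush, such as scratch space (assumed).
    fixed_cost: usize,
}

impl TinyCache {
    fn new(limit: usize, fixed_cost: usize) -> Self {
        TinyCache {
            compiled: HashMap::new(),
            states: Vec::new(),
            flush_count: 0,
            size: fixed_cost,
            limit,
            fixed_cost,
        }
    }

    /// Rough cost of one state: its key is stored twice, plus one pointer.
    fn cost(key: &[u8]) -> usize {
        2 * key.len() + size_of::<u32>()
    }

    /// Returns the pointer for `key`, adding the state if it is new and
    /// flushing first if the addition would exceed the budget.
    fn add_state(&mut self, key: Vec<u8>) -> u32 {
        if let Some(&ptr) = self.compiled.get(&key) {
            return ptr;
        }
        if self.size + Self::cost(&key) > self.limit {
            self.flush();
        }
        let ptr = self.states.len() as u32;
        self.size += Self::cost(&key);
        self.states.push(key.clone());
        self.compiled.insert(key, ptr);
        ptr
    }

    /// Drops every cached state and resets the size to the fixed costs.
    fn flush(&mut self) {
        self.compiled.clear();
        self.states.clear();
        self.size = self.fixed_cost;
        self.flush_count += 1;
    }
}
```

Note that pointers handed out before a flush are invalid afterwards, so a caller in this model must re-add whatever state it is currently in.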

Methods

impl CacheInner

Resets the cache size to account for fixed costs, such as the program and stack sizes.

Trait Implementations

impl Clone for CacheInner

Returns a copy of the value.

Performs copy-assignment from source.

impl Debug for CacheInner

Formats the value using the given formatter.

Auto Trait Implementations

impl Send for CacheInner

impl Sync for CacheInner