Struct regex::dfa::CacheInner
struct CacheInner {
    compiled: HashMap<State, u32>,
    trans: Transitions,
    states: Vec<State>,
    start_states: Vec<u32>,
    stack: Vec<u32>,
    flush_count: u64,
    size: usize,
}
CacheInner is logically just a part of Cache, but groups together fields that aren't passed as function parameters throughout search. (This split is mostly an artifact of the borrow checker, a cost that is happily paid.)
Fields
compiled: HashMap<State, u32>
A cache of pre-compiled DFA states, keyed by the set of NFA states and the set of empty-width flags set at the byte in the input when the state was observed.
A StatePtr is effectively a *State, but to avoid various inconvenient things, we just pass indexes around manually. The performance impact of this is probably an instruction or two in the inner loop. However, on 64-bit targets, each StatePtr is half the size of a *State.
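The size argument can be checked directly. The sketch below assumes StatePtr is an alias for u32 (which matches the u32 fields in the struct definition) and uses a hypothetical stand-in for State, since only the pointer width matters here.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the real `State`; its contents are irrelevant here.
#[allow(dead_code)]
struct State {
    data: Box<[u8]>,
}

// Assumption: StatePtr is a plain u32 index, as the u32 fields above suggest.
type StatePtr = u32;

/// Returns (size of an index-style StatePtr, size of a raw `*const State`).
fn pointer_sizes() -> (usize, usize) {
    (size_of::<StatePtr>(), size_of::<*const State>())
}
```

On a 64-bit target the second value is 8, so a table of StatePtr entries takes half the memory of a table of raw pointers.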
trans: Transitions
The transition table.
The transition table is laid out in row-major order, where states are rows and the transitions for each state are columns. At a high level, given state s and byte b, the next state can be found at index s * 256 + b.
This is, of course, a lie. A StatePtr is actually a pointer to the start of a row in this table. When indexing in the DFA's inner loop, this removes the need to multiply the StatePtr by the stride. Yes, it matters. This reduces the number of states we can store, but: the stride is rarely 256 since we define transitions in terms of equivalence classes of bytes. Each class corresponds to a set of bytes that never discriminate a distinct path through the DFA from each other.
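A minimal sketch of this layout follows. It is not the crate's Transitions type: the Table struct, the UNKNOWN sentinel, and the digit_classes helper are all hypothetical names chosen for illustration. It shows a StatePtr that is already the offset of a row, so a lookup is a single addition plus a byte-class mapping.

```rust
type StatePtr = u32;

// Hypothetical sentinel for "transition not computed yet".
const UNKNOWN: StatePtr = u32::MAX;

/// Simplified, hypothetical transition table in row-major order.
struct Table {
    table: Vec<StatePtr>,
    /// The stride: number of columns per row.
    num_byte_classes: usize,
    /// Maps each input byte to its equivalence class.
    byte_classes: [u8; 256],
}

impl Table {
    fn new(byte_classes: [u8; 256]) -> Table {
        // The stride is one more than the largest class number.
        let max_class = byte_classes.iter().map(|&c| c as usize).max().unwrap_or(0);
        Table { table: Vec::new(), num_byte_classes: max_class + 1, byte_classes }
    }

    /// Appends a row and returns a pointer to its start (already multiplied
    /// by the stride, so lookups need no multiplication).
    fn add_state(&mut self) -> StatePtr {
        let ptr = self.table.len() as StatePtr;
        self.table.extend(std::iter::repeat(UNKNOWN).take(self.num_byte_classes));
        ptr
    }

    fn set_next(&mut self, from: StatePtr, byte: u8, to: StatePtr) {
        let class = self.byte_classes[byte as usize] as usize;
        self.table[from as usize + class] = to;
    }

    fn next(&self, from: StatePtr, byte: u8) -> StatePtr {
        let class = self.byte_classes[byte as usize] as usize;
        self.table[from as usize + class]
    }
}

/// Hypothetical classes: ASCII digits are class 1, every other byte is class 0.
fn digit_classes() -> [u8; 256] {
    let mut classes = [0u8; 256];
    for b in b'0'..=b'9' {
        classes[b as usize] = 1;
    }
    classes
}
```

With these two classes the stride is 2 rather than 256, and setting the transition for one digit sets it for every digit, because all digits share a column.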
states: Vec<State>
Our set of states. Note that StatePtr / num_byte_classes indexes this Vec rather than just a StatePtr.
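Because a StatePtr points at the start of a row, it has to be divided by the stride to recover the position of the state in the Vec. A small sketch, where state_index is a hypothetical helper and not a function from the crate:

```rust
/// Converts a row-start pointer back into an index into the `states` Vec.
/// `num_byte_classes` is the stride of the transition table.
fn state_index(ptr: u32, num_byte_classes: usize) -> usize {
    ptr as usize / num_byte_classes
}

/// Row-start pointers for the first `n` states, given the stride.
fn row_pointers(n: usize, num_byte_classes: usize) -> Vec<u32> {
    (0..n).map(|i| (i * num_byte_classes) as u32).collect()
}
```

With a stride of 3, the pointers 0, 3, and 6 map back to state indexes 0, 1, and 2.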
start_states: Vec<u32>
A set of cached start states, which are limited to the number of permutations of flags set just before the initial byte of input. (The index into this vec is an EmptyFlags.)
N.B. A start state can be "dead" (i.e., no possible match), so we represent it with a StatePtr.
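One way to picture this is a small table with one slot per flag combination. The sketch below is an assumption-heavy illustration: the Flags struct with three fields, the bit packing in index, and the UNKNOWN and DEAD sentinels are all hypothetical and are not taken from the crate.

```rust
type StatePtr = u32;

// Hypothetical sentinels, chosen only for this sketch.
const UNKNOWN: StatePtr = u32::MAX; // start state not computed yet
const DEAD: StatePtr = u32::MAX - 1; // no match is possible from here

/// Hypothetical subset of the flags that can hold before the first byte.
#[derive(Clone, Copy, Default)]
struct Flags {
    start: bool,
    start_line: bool,
    word_boundary: bool,
}

impl Flags {
    /// Packs the flags into a small integer, one bit per flag.
    fn index(self) -> usize {
        (self.start as usize)
            | ((self.start_line as usize) << 1)
            | ((self.word_boundary as usize) << 2)
    }
}

struct StartCache {
    start_states: Vec<StatePtr>,
}

impl StartCache {
    fn new() -> StartCache {
        // Three flags give 2^3 = 8 possible combinations.
        StartCache { start_states: vec![UNKNOWN; 8] }
    }

    /// Returns the cached start state for `flags`, computing it on first use.
    fn get_or_compute<F>(&mut self, flags: Flags, compute: F) -> StatePtr
    where
        F: FnOnce(Flags) -> StatePtr,
    {
        let i = flags.index();
        if self.start_states[i] == UNKNOWN {
            self.start_states[i] = compute(flags);
        }
        self.start_states[i]
    }
}
```

Storing a StatePtr rather than an index into the state list lets a slot hold a sentinel such as DEAD, which is the case the N.B. above describes.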
stack: Vec<u32>
Stack scratch space used to follow epsilon transitions in the NFA. (This permits us to avoid recursion.)
The maximum stack size is the number of NFA states.
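The idea can be sketched as an iterative epsilon closure over a reusable stack. The adjacency-list NFA representation and the epsilon_closure function are hypothetical simplifications, not the crate's code. Marking a state as seen when it is pushed means each state is pushed at most once, so the stack never holds more entries than there are NFA states.

```rust
/// Computes the set of states reachable from `start` via epsilon transitions.
/// `eps[i]` lists the states reachable from state `i` by one epsilon step
/// (a hypothetical, simplified NFA representation).
/// `stack` is caller-provided scratch space that is reused across calls.
fn epsilon_closure(eps: &[Vec<u32>], start: u32, stack: &mut Vec<u32>) -> Vec<u32> {
    let mut seen = vec![false; eps.len()];
    let mut closure = Vec::new();

    stack.clear();
    seen[start as usize] = true;
    stack.push(start);

    // Explicit stack instead of recursion.
    while let Some(s) = stack.pop() {
        closure.push(s);
        for &t in &eps[s as usize] {
            if !seen[t as usize] {
                // Mark at push time so no state is ever pushed twice.
                seen[t as usize] = true;
                stack.push(t);
            }
        }
    }
    closure
}
```

Passing the stack in by mutable reference mirrors the role of the stack field: the allocation is made once and reused on every call.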
flush_count: u64
The total number of times this cache has been flushed by the DFA because of space constraints.
size: usize
The total heap size of the DFA's cache. We use this to determine when we should flush the cache.
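How size and flush_count might interact can be sketched with a toy cache. Everything here is an assumption made for illustration: the TinyCache type, its limit field, the way the fixed cost is estimated, and the flush policy are hypothetical and much simpler than what the DFA does.

```rust
use std::mem::size_of;

/// Toy cache that tracks an approximate heap size and flushes when it grows
/// past a limit. Hypothetical; not the crate's implementation.
struct TinyCache {
    states: Vec<Box<[u8]>>,
    /// Number of NFA states, used to estimate the fixed cost of the stack.
    nfa_states: usize,
    flush_count: u64,
    size: usize,
    /// Hypothetical size limit in bytes.
    limit: usize,
}

impl TinyCache {
    fn new(nfa_states: usize, limit: usize) -> TinyCache {
        let mut cache = TinyCache {
            states: Vec::new(),
            nfa_states,
            flush_count: 0,
            size: 0,
            limit,
        };
        cache.reset_size();
        cache
    }

    /// Resets `size` to the fixed costs only (here: the stack scratch space).
    fn reset_size(&mut self) {
        self.size = self.nfa_states * size_of::<u32>();
    }

    /// Drops every cached state and records that a flush happened.
    fn flush(&mut self) {
        self.states.clear();
        self.flush_count += 1;
        self.reset_size();
    }

    /// Adds a state, flushing first if it would push the cache over the limit.
    fn add_state(&mut self, data: &[u8]) {
        if self.size + data.len() > self.limit && !self.states.is_empty() {
            self.flush();
        }
        self.size += data.len();
        self.states.push(data.into());
    }
}
```

After a flush, size falls back to the fixed costs rather than to zero, which is the job the reset_size method below describes.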
Methods
impl CacheInner
fn reset_size(&mut self)
Resets the cache size to account for fixed costs, such as the program and stack sizes.
Trait Implementations
impl Clone for CacheInner
fn clone(&self) -> CacheInner
Returns a copy of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for CacheInner
Auto Trait Implementations
impl Send for CacheInner
impl Sync for CacheInner