Enum regex::prog::Inst

pub enum Inst {
    Match(usize),
    Save(InstSave),
    Split(InstSplit),
    EmptyLook(InstEmptyLook),
    Char(InstChar),
    Ranges(InstRanges),
    Bytes(InstBytes),
}

Inst is an instruction code in a Regex program.

Regrettably, a regex program either contains Unicode codepoint instructions (Char and Ranges) or it contains byte instructions (Bytes). A regex program can never contain both.

It would be worth investigating splitting this into two distinct types and then figuring out how to make the matching engines polymorphic over those types without sacrificing performance.

Besides moving invariants into the type system, splitting the type would also reduce its size. If we remove the Char and Ranges instructions from the Inst enum, then its size shrinks from 40 bytes to 24 bytes. (This is because of the removal of a Vec in the Ranges variant.) Given that byte-based machines are typically much bigger than their Unicode analogues (because they can decode UTF-8 directly), this ends up being a pretty significant saving.
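The size argument can be checked with a small experiment. The sketch below uses two hypothetical enums, WithUnicode and BytesOnly, whose payloads are simplified stand-ins rather than the crate's actual definitions. The exact byte counts depend on the target and on the compiler's layout choices, so the code only compares the two sizes and does not assume 40 or 24.

```rust
use std::mem::size_of;

// Hypothetical, simplified payloads for illustration only; these are
// NOT the regex crate's actual instruction types.
#[allow(dead_code)]
enum WithUnicode {
    Match(usize),
    Split { goto1: usize, goto2: usize },
    Char { goto: usize, c: char },
    // The Vec is what makes this the largest variant.
    Ranges { goto: usize, ranges: Vec<(char, char)> },
    Bytes { goto: usize, start: u8, end: u8 },
}

// The same enum with the Unicode codepoint variants removed.
#[allow(dead_code)]
enum BytesOnly {
    Match(usize),
    Split { goto1: usize, goto2: usize },
    Bytes { goto: usize, start: u8, end: u8 },
}

/// Returns (size with the Unicode variants, size without), in bytes.
fn inst_sizes() -> (usize, usize) {
    (size_of::<WithUnicode>(), size_of::<BytesOnly>())
}
```

Calling inst_sizes() and printing the pair shows the difference on a given platform. An enum is at least as large as its largest variant, so dropping the Vec-carrying variant is what shrinks the type.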

Variants

Match indicates that the program has reached a match state.

The number in the match corresponds to the Nth logical regular expression in this program. This index is always 0 for normal regex programs. Values greater than 0 appear when compiling regex sets, and each match instruction gets its own unique value. The value corresponds to the Nth regex in the set.
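To illustrate how match indexes identify members of a regex set, here is a minimal sketch that assumes a hypothetical, simplified instruction type named SetInst and a naive explorer that follows every path. It is not the crate's matching engine. The hand-written program encodes the set ["a", "[a-b]"], matched only at the start of the input.

```rust
// Hypothetical, simplified instruction set for illustration only;
// these are not the regex crate's actual types.
#[derive(Clone, Copy, Debug)]
enum SetInst {
    Match(usize),
    Split { goto1: usize, goto2: usize },
    Bytes { goto: usize, start: u8, end: u8 },
}

/// Follows every path from `pc` and marks each regex whose Match is reached.
fn explore(prog: &[SetInst], pc: usize, input: &[u8], at: usize, matched: &mut [bool]) {
    match prog[pc] {
        // The payload says which regex of the set matched.
        SetInst::Match(n) => matched[n] = true,
        SetInst::Split { goto1, goto2 } => {
            // A set wants all matches, so both branches are explored.
            explore(prog, goto1, input, at, matched);
            explore(prog, goto2, input, at, matched);
        }
        SetInst::Bytes { goto, start, end } => {
            if let Some(&b) = input.get(at) {
                if start <= b && b <= end {
                    explore(prog, goto, input, at + 1, matched);
                }
            }
        }
    }
}

/// Runs a hand-written program for the set ["a", "[a-b]"].
fn set_matches(input: &str) -> Vec<bool> {
    let prog = [
        SetInst::Split { goto1: 1, goto2: 3 },
        SetInst::Bytes { goto: 2, start: b'a', end: b'a' },
        SetInst::Match(0), // regex 0: "a"
        SetInst::Bytes { goto: 4, start: b'a', end: b'b' },
        SetInst::Match(1), // regex 1: "[a-b]"
    ];
    let mut matched = vec![false; 2];
    explore(&prog, 0, input.as_bytes(), 0, &mut matched);
    matched
}
```

With this program, set_matches("a") marks both regexes and set_matches("b") marks only the second, because each Match instruction carries its own index.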

Save causes the program to save the current location of the input in the slot indicated by InstSave.

Split causes the program to diverge to one of two paths in the program, preferring goto1 in InstSplit.

EmptyLook represents a zero-width assertion in a regex program. A zero-width assertion does not consume any of the input text.

Char requires the regex program to match the character in InstChar at the current position in the input.

Ranges requires the regex program to match the character at the current position in the input with one of the ranges specified in InstRanges.

Bytes is like Ranges, except it expresses a single byte range. It is used in conjunction with Split instructions to implement multi-byte character classes.
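The way Split and Bytes combine to express a multi-byte character class can be sketched with a toy backtracker. Everything below is an assumption made for illustration: ByteInst, run, class_program and find are hypothetical names, the payloads are simplified, and the crate's real engines work differently. The program matches the class [a-zé] at the start of the input, where é is encoded in UTF-8 as the two bytes 0xC3 0xA9.

```rust
// Hypothetical, simplified byte-oriented instructions; these are NOT
// the regex crate's actual types.
#[derive(Clone, Copy, Debug)]
enum ByteInst {
    Match(usize),
    Save { goto: usize, slot: usize },
    Split { goto1: usize, goto2: usize },
    Bytes { goto: usize, start: u8, end: u8 },
}

/// Naive recursive backtracker. Returns the index carried by the first
/// Match instruction reached, trying goto1 before goto2 at every Split.
fn run(
    prog: &[ByteInst],
    pc: usize,
    input: &[u8],
    at: usize,
    slots: &mut [Option<usize>],
) -> Option<usize> {
    match prog[pc] {
        ByteInst::Match(n) => Some(n),
        ByteInst::Save { goto, slot } => {
            // Record the current input offset; undo it if this path fails.
            let old = slots[slot];
            slots[slot] = Some(at);
            let result = run(prog, goto, input, at, slots);
            if result.is_none() {
                slots[slot] = old;
            }
            result
        }
        ByteInst::Split { goto1, goto2 } => match run(prog, goto1, input, at, slots) {
            Some(n) => Some(n),
            None => run(prog, goto2, input, at, slots),
        },
        ByteInst::Bytes { goto, start, end } => match input.get(at) {
            Some(&b) if start <= b && b <= end => run(prog, goto, input, at + 1, slots),
            _ => None,
        },
    }
}

/// Hand-written program for the class [a-zé].
fn class_program() -> Vec<ByteInst> {
    vec![
        ByteInst::Save { goto: 1, slot: 0 },
        ByteInst::Split { goto1: 2, goto2: 3 },
        ByteInst::Bytes { goto: 5, start: b'a', end: b'z' }, // ASCII branch
        ByteInst::Bytes { goto: 4, start: 0xC3, end: 0xC3 }, // lead byte of é
        ByteInst::Bytes { goto: 5, start: 0xA9, end: 0xA9 }, // continuation byte
        ByteInst::Save { goto: 6, slot: 1 },
        ByteInst::Match(0),
    ]
}

/// Returns the (start, end) byte offsets of a match at the start of `input`.
fn find(input: &str) -> Option<(usize, usize)> {
    let prog = class_program();
    let mut slots: [Option<usize>; 2] = [None; 2];
    run(&prog, 0, input.as_bytes(), 0, &mut slots)?;
    Some((slots[0]?, slots[1]?))
}
```

Each Bytes instruction tests exactly one byte against one range, so a two-byte character needs two chained Bytes instructions, and the Split selects between the one-byte and two-byte encodings. The Save instructions show how slot values end up as the match offsets.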

Methods

impl Inst

pub fn is_match(&self) -> bool

Returns true if and only if this is a match instruction.
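A predicate like this is typically a one-line pattern match. The sketch below assumes a hypothetical stand-in enum named SketchInst with simplified payloads, plus a hypothetical helper match_positions. It shows one plausible shape for such a check, not the crate's actual source.

```rust
// Hypothetical stand-in for the real enum; payload types are simplified.
#[allow(dead_code)]
#[derive(Clone, Debug)]
enum SketchInst {
    Match(usize),
    Split { goto1: usize, goto2: usize },
    Bytes { goto: usize, start: u8, end: u8 },
}

impl SketchInst {
    /// True only for the Match variant, whatever index it carries.
    fn is_match(&self) -> bool {
        matches!(self, SketchInst::Match(_))
    }
}

/// Returns the program positions of all match instructions.
fn match_positions(prog: &[SketchInst]) -> Vec<usize> {
    prog.iter()
        .enumerate()
        .filter(|(_, inst)| inst.is_match())
        .map(|(pc, _)| pc)
        .collect()
}
```

A caller can use such a check to locate match states without inspecting the payload of every variant.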

Trait Implementations

impl Clone for Inst

fn clone(&self) -> Inst

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for Inst

fn fmt(&self, f: &mut Formatter) -> Result

Formats the value using the given formatter.

Auto Trait Implementations

impl Send for Inst

impl Sync for Inst