Class RegularExpression<E>

java.lang.Object
edu.washington.cs.knowitall.regex.RegularExpression<E>
Type Parameters:
E - the type of the sequence elements
All Implemented Interfaces:
com.google.common.base.Predicate<List<E>>, Predicate<List<E>>

public class RegularExpression<E> extends Object implements com.google.common.base.Predicate<List<E>>
A regular expression engine that operates over sequences of user-specified objects.
  • Field Details

  • Constructor Details

    • RegularExpression

      public RegularExpression(List<Expression<E>> expressions)
  • Method Details

    • compile

      public static <E> RegularExpression<E> compile(List<Expression<E>> expressions)
      Create a regular expression without tokenization support.
      Parameters:
      expressions - the expressions that make up the regular expression, in order.
      Returns:
      the compiled regular expression.
    • compile

      public static <E> RegularExpression<E> compile(String expression, com.google.common.base.Function<String,Expression.BaseExpression<E>> factoryDelegate)
      Create a regular expression from the specified string.
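      Parameters:
      expression - the regular expression in string form.
      factoryDelegate - a function that creates an Expression.BaseExpression<E> from a string.
      Returns:
      the compiled regular expression.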
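      The role of a factory delegate can be sketched with standard-library types. This is a minimal illustration, not the library's parser: the pattern syntax is not described in this section, so the sketch assumes whitespace-separated pieces, and FactoryDelegateSketch, its compile method, and the use of java.util.function.Predicate in place of Expression.BaseExpression<E> are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-in for the string-based compile: the caller supplies a
// factory that decides how each piece of the pattern is matched against an element.
class FactoryDelegateSketch {
    // Assumption: pieces are separated by whitespace (the real syntax may differ).
    static <E> List<Predicate<E>> compile(String expression,
                                          Function<String, Predicate<E>> factory) {
        List<Predicate<E>> parts = new ArrayList<>();
        for (String piece : expression.trim().split("\\s+")) {
            parts.add(factory.apply(piece));
        }
        return parts;
    }

    public static void main(String[] args) {
        // The factory turns each piece into a case-insensitive word test.
        Function<String, Predicate<String>> factory =
                piece -> word -> word.equalsIgnoreCase(piece);
        List<Predicate<String>> compiled = compile("the dog", factory);

        System.out.println(compiled.size());              // 2
        System.out.println(compiled.get(0).test("The"));  // true
        System.out.println(compiled.get(1).test("cat"));  // false
    }
}
```

      The point of the design is that the engine never needs to know what E is; only the factory knows how a piece of pattern text relates to an element.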
    • equals

      public boolean equals(Object other)
      Specified by:
      equals in interface com.google.common.base.Predicate<List<E>>
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • build

      public static <E> FiniteAutomaton.Automaton<E> build(List<Expression<E>> exprs)
      Build an NFA from the list of expressions.
      Parameters:
      exprs - the expressions to build the automaton from.
      Returns:
      an NFA built from the expressions.
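      To make the idea of an NFA over arbitrary elements concrete, here is a small self-contained sketch. It is not the construction used by build: TinyTokenNfa and Step are hypothetical names, and the sketch supports only a linear sequence of steps, each optionally repeatable (a Kleene star), simulated by tracking the set of active states.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical illustration of an NFA whose transitions are predicates over elements.
class TinyTokenNfa {
    // One step of the pattern: an element test, optionally repeatable zero or more times.
    static final class Step<E> {
        final Predicate<E> test;
        final boolean star;
        Step(Predicate<E> test, boolean star) { this.test = test; this.star = star; }
    }

    // Epsilon closure: a starred step may be skipped, so state s also activates s + 1.
    static <E> Set<Integer> closure(List<Step<E>> steps, Set<Integer> states) {
        Set<Integer> result = new HashSet<>(states);
        for (int s = 0; s < steps.size(); s++) {
            if (result.contains(s) && steps.get(s).star) {
                result.add(s + 1);
            }
        }
        return result;
    }

    // State i means "steps before i are done"; state steps.size() is accepting.
    static <E> boolean accepts(List<Step<E>> steps, List<E> tokens) {
        Set<Integer> initial = new HashSet<>();
        initial.add(0);
        Set<Integer> current = closure(steps, initial);
        for (E token : tokens) {
            Set<Integer> next = new HashSet<>();
            for (int state : current) {
                if (state < steps.size() && steps.get(state).test.test(token)) {
                    // A starred step stays put; a plain step advances.
                    next.add(steps.get(state).star ? state : state + 1);
                }
            }
            current = closure(steps, next);
        }
        return current.contains(steps.size());
    }

    // Demo pattern, roughly "a b* c" over string tokens.
    static List<Step<String>> demoPattern() {
        return List.of(
                new Step<String>("a"::equals, false),
                new Step<String>("b"::equals, true),
                new Step<String>("c"::equals, false));
    }

    public static void main(String[] args) {
        System.out.println(accepts(demoPattern(), List.of("a", "c")));           // true
        System.out.println(accepts(demoPattern(), List.of("a", "b", "b", "c"))); // true
        System.out.println(accepts(demoPattern(), List.of("a", "b")));           // false
    }
}
```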
    • apply

      public boolean apply(List<E> tokens)
      Apply the expression against a list of tokens.
      Specified by:
      apply in interface com.google.common.base.Predicate<List<E>>
      Returns:
      true iff the expression is found within the tokens.
    • matches

      public boolean matches(List<E> tokens)
      Apply the expression against a list of tokens.
      Returns:
      true iff the expression matches all of the tokens.
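      The difference between apply (found somewhere within the tokens) and matches (covers all of the tokens) can be illustrated without the library. The sketch below is an assumption-laden stand-in: TokenSequenceDemo is a hypothetical class, and its "pattern" is just a fixed sequence of per-token predicates with no quantifiers.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in: a pattern is a fixed sequence of per-token tests.
class TokenSequenceDemo {
    // True if every pattern element accepts the corresponding token, beginning at start.
    static <E> boolean matchesAt(List<Predicate<E>> pattern, List<E> tokens, int start) {
        if (start + pattern.size() > tokens.size()) {
            return false;
        }
        for (int i = 0; i < pattern.size(); i++) {
            if (!pattern.get(i).test(tokens.get(start + i))) {
                return false;
            }
        }
        return true;
    }

    // Analogue of apply: the pattern occurs somewhere within the tokens.
    static <E> boolean foundWithin(List<Predicate<E>> pattern, List<E> tokens) {
        for (int start = 0; start <= tokens.size(); start++) {
            if (matchesAt(pattern, tokens, start)) {
                return true;
            }
        }
        return false;
    }

    // Analogue of matches: the pattern accounts for every token.
    static <E> boolean matchesAll(List<Predicate<E>> pattern, List<E> tokens) {
        return pattern.size() == tokens.size() && matchesAt(pattern, tokens, 0);
    }

    // Demo pattern: the word "the" followed by any word ending in "s".
    static List<Predicate<String>> demoPattern() {
        return List.of("the"::equals, w -> w.endsWith("s"));
    }

    public static void main(String[] args) {
        List<String> sentence = List.of("see", "the", "dogs", "run");
        System.out.println(foundWithin(demoPattern(), sentence));              // true
        System.out.println(matchesAll(demoPattern(), sentence));               // false
        System.out.println(matchesAll(demoPattern(), List.of("the", "dogs"))); // true
    }
}
```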
    • find

      public Match<E> find(List<E> tokens)
      Find the first match of the regular expression against tokens. This method is slightly slower due to additional memory allocations. However, the returned match carries much more detail, which is very useful when writing or debugging regular expressions.
      Parameters:
      tokens - the list of tokens to search.
      Returns:
      an object representing the match, or null if no match is found.
    • find

      public Match<E> find(List<E> tokens, int start)
      Find the first match of the regular expression against tokens, starting at the specified index.
      Parameters:
      tokens - tokens to match against.
      start - index to start looking for a match.
      Returns:
      an object representing the match, or null if no match is found.
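      The "find from a start index, or null" contract resembles java.util.regex.Matcher.find(int). The following sketch uses that standard-library class purely as an analogy over characters rather than list elements; FindFromIndex and firstMatchStart are hypothetical names and say nothing about how Match<E> is structured.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Analogy only: java.util.regex searches characters, not lists of elements.
class FindFromIndex {
    // Returns the offset of the first match at or after start, or null if there is none.
    static Integer firstMatchStart(Pattern pattern, String input, int start) {
        Matcher m = pattern.matcher(input);
        return m.find(start) ? m.start() : null;
    }

    public static void main(String[] args) {
        Pattern ab = Pattern.compile("ab");
        System.out.println(firstMatchStart(ab, "ab__ab", 0)); // 0
        System.out.println(firstMatchStart(ab, "ab__ab", 1)); // 4
        System.out.println(firstMatchStart(ab, "ab__ab", 5)); // null
    }
}
```

      As with the null return documented above, callers need to check the result before using it.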
    • lookingAt

      public Match<E> lookingAt(List<E> tokens)
      Determine if the regular expression matches the beginning of the supplied tokens.
      Parameters:
      tokens - the list of tokens to match.
      Returns:
      an object representing the match, or null if no match is found.
    • lookingAt

      public Match<E> lookingAt(List<E> tokens, int start)
      Determine if the regular expression matches the supplied tokens, starting at the specified index.
      Parameters:
      tokens - the list of tokens to match.
      start - the index where the match should begin.
      Returns:
      an object representing the match, or null if no match is found.
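      lookingAt differs from find in that the match must begin exactly at the given position, and from matches in that it need not reach the end. java.util.regex.Matcher has a method of the same name with this behavior, so the sketch below uses it as a character-level analogy; LookingAtDemo and startsWithMatch are hypothetical names, not part of this class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Analogy only: shows "anchored at the start, free at the end" on characters.
class LookingAtDemo {
    // True if a match begins exactly at start; trailing input is allowed.
    static boolean startsWithMatch(Pattern pattern, String input, int start) {
        Matcher m = pattern.matcher(input);
        m.region(start, input.length()); // lookingAt anchors at the region start
        return m.lookingAt();
    }

    public static void main(String[] args) {
        Pattern ab = Pattern.compile("ab");
        System.out.println(startsWithMatch(ab, "abzz", 0)); // true
        System.out.println(startsWithMatch(ab, "xxab", 0)); // false
        System.out.println(startsWithMatch(ab, "xxab", 2)); // true
    }
}
```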
    • match

      public Match<E> match(List<E> tokens)
    • findAll

      public List<Match<E>> findAll(List<E> tokens)
      Find all non-overlapping matches of the regular expression against tokens.
      Parameters:
      tokens - the list of tokens to search.
      Returns:
      a list of objects representing the matches.
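      "Non-overlapping" means each search resumes where the previous match ended, so the same element is never part of two matches. The sketch below shows that behavior with java.util.regex as a character-level analogy; NonOverlappingDemo and matchStarts are hypothetical names, and the library's own iteration strategy is not shown in this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Analogy only: collects the start offsets of successive non-overlapping matches.
class NonOverlappingDemo {
    static List<Integer> matchStarts(Pattern pattern, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        // Each call to find() resumes after the end of the previous match.
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // "aa" could start at offsets 0, 1, 2, or 3, but only 0 and 2 do not overlap.
        System.out.println(matchStarts(Pattern.compile("aa"), "aaaaa")); // [0, 2]
    }
}
```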
    • main

      public static void main(String[] args)
      An interactive program that compiles a word-based regular expression specified in arg1 and then reads strings from stdin, evaluating them against the regular expression.
      Parameters:
      args - the command-line arguments.