Class DFA

java.lang.Object
org.antlr.analysis.DFA
Direct Known Subclasses:
LL1DFA

public class DFA extends Object
A DFA (converted from a grammar's NFA). DFAs are used as prediction machine for alternative blocks in all kinds of recognizers (lexers, parsers, tree walkers).
  • Field Details

    • REACHABLE_UNKNOWN

      public static final int REACHABLE_UNKNOWN
      See Also:
    • REACHABLE_BUSY

      public static final int REACHABLE_BUSY
      See Also:
    • REACHABLE_NO

      public static final int REACHABLE_NO
      See Also:
    • REACHABLE_YES

      public static final int REACHABLE_YES
      See Also:
    • CYCLIC_UNKNOWN

      public static final int CYCLIC_UNKNOWN
      See Also:
    • CYCLIC_BUSY

      public static final int CYCLIC_BUSY
      See Also:
    • CYCLIC_DONE

      public static final int CYCLIC_DONE
      See Also:
    • MAX_TIME_PER_DFA_CREATION

      public static int MAX_TIME_PER_DFA_CREATION
      Set to 0 to not terminate early (time in ms)
    • MAX_STATE_TRANSITIONS_FOR_TABLE

      public static int MAX_STATE_TRANSITIONS_FOR_TABLE
      How many edges can each DFA state have before a "special" state is created that uses IF expressions instead of a table?
    • startState

      public DFAState startState
      What's the start state for this DFA?
    • decisionNumber

      public int decisionNumber
      This DFA is being built for which decision?
    • decisionNFAStartState

      public NFAState decisionNFAStartState
      From what NFAState did we create the DFA?
    • description

      public String description
      The printable grammar fragment associated with this DFA
    • uniqueStates

      protected Map<DFAState,DFAState> uniqueStates
      A set of all uniquely-numbered DFA states. Maps hash of DFAState to the actual DFAState object. We use this to detect existing DFA states. Map<DFAState,DFAState>. Use Map so we can get old state back (Set only allows you to see if it's there). Not used during fixed k lookahead as it's a waste to fill it with a dup of states array.
    • states

      protected Vector<DFAState> states
      Maps the state number to the actual DFAState. Use a Vector as it grows automatically when I set the ith element. This contains all states, but the states are not unique. s3 might be same as s1 so s3 → s1 in this table. This is how cycles occur. If fixed k, then these states will all be unique as states[i] always points at state i when no cycles exist. This is managed in parallel with uniqueStates and simply provides a way to go from state number to DFAState rather than via a hash lookup.
    • stateCounter

      protected int stateCounter
      Unique state numbers per DFA
    • numberOfStates

      protected int numberOfStates
      count only new states not states that were rejected as already present
    • user_k

      protected int user_k
      User specified max fixed lookahead. If 0, nothing specified. -1 implies we have not looked at the options table yet to set k.
    • max_k

      protected int max_k
      While building the DFA, track max lookahead depth if not cyclic
    • reduced

      protected boolean reduced
      Is this DFA reduced? I.e., can all states lead to an accept state?
    • cyclic

      protected boolean cyclic
      Are there any loops in this DFA? Computed by doesStateReachAcceptState()
    • predicateVisible

      public boolean predicateVisible
      Track whether this DFA has at least one sem/syn pred encountered during a closure operation. This is useful for deciding whether to retry a non-LL(*) with k=1. If no pred, it will not work w/o a pred so don't bother. It would just give another error message.
    • hasPredicateBlockedByAction

      public boolean hasPredicateBlockedByAction
    • unreachableAlts

      protected List<Integer> unreachableAlts
      Each alt in an NFA derived from a grammar must have a DFA state that predicts it lest the parser not know what to do. Nondeterminisms can lead to this situation (assuming no semantic predicates can resolve the problem) and when for some reason, I cannot compute the lookahead (which might arise from an error in the algorithm or from left-recursion etc...). This list starts out with all alts contained and then in method doesStateReachAcceptState() I remove the alts I know to be uniquely predicted.
    • nAlts

      protected int nAlts
    • altToAcceptState

      protected DFAState[] altToAcceptState
      We only want one accept state per predicted alt; track here
    • recursiveAltSet

      public IntSet recursiveAltSet
      Track whether an alt discovers recursion for each alt during NFA to DFA conversion; >1 alt with recursion implies nonregular.
    • nfa

      public NFA nfa
      Which NFA are we converting (well, which piece of the NFA)?
    • nfaConverter

      protected NFAToDFAConverter nfaConverter
    • probe

      public DecisionProbe probe
      This probe tells you a lot about a decision and is useful even when there is no error such as when a syntactic nondeterminism is solved via semantic predicates. Perhaps a GUI would want the ability to show that.
    • edgeTransitionClassMap

      public Map<List<Integer>,Integer> edgeTransitionClassMap
      Map an edge transition table to a unique set number; ordered so we can push into the output template as an ordered list of sets and then ref them from within the transition[][] table. Like this for C# target: public static readonly DFA30_transition0 = new short[] { 46, 46, -1, 46, 46, -1, -1, -1, -1, -1, -1, -1,...}; public static readonly DFA30_transition1 = new short[] { 21 }; public static readonly short[][] DFA30_transition = { DFA30_transition0, DFA30_transition0, DFA30_transition1, ... };
    • edgeTransitionClass

      protected int edgeTransitionClass
      The unique edge transition class number; every time we see a new set of edges emanating from a state, we number it so we can reuse if it's every seen again for another state. For Java grammar, some of the big edge transition tables are seen about 57 times.
    • specialStates

      public List<DFAState> specialStates
      List of special DFAState objects
    • specialStateSTs

      public List<org.stringtemplate.v4.ST> specialStateSTs
      List of ST for special states.
    • accept

      public Vector<Integer> accept
    • eot

      public Vector<Integer> eot
    • eof

      public Vector<Integer> eof
    • min

      public Vector<Integer> min
    • max

      public Vector<Integer> max
    • special

      public Vector<Integer> special
    • transition

      public Vector<Vector<Integer>> transition
    • transitionEdgeTables

      public Vector<Integer> transitionEdgeTables
      just the Vector<Integer> indicating which unique edge table is at position i.
    • uniqueCompressedSpecialStateNum

      protected int uniqueCompressedSpecialStateNum
    • generator

      protected CodeGenerator generator
      Which generator to use if we're building state tables
  • Constructor Details

    • DFA

      protected DFA()
    • DFA

      public DFA(int decisionNumber, NFAState decisionStartState)
  • Method Details

    • resetStateNumbersToBeContiguous

      public void resetStateNumbersToBeContiguous()
      Walk all states and reset their numbers to be a contiguous sequence of integers starting from 0. Only cyclic DFA can have unused positions in states list. State i might be identical to a previous state j and will result in states[i] == states[j]. We don't want to waste a state number on this. Useful mostly for code generation in tables. At the start of this routine, states[i].stateNumber <= i by definition. If states[50].stateNumber is 50 then a cycle during conversion may try to add state 103, but we find that an identical DFA state, named 50, already exists, hence, states[103]==states[50] and both have stateNumber 50 as they point at same object. Afterwards, the set of state numbers from all states should represent a contiguous range from 0..n-1 where n is the number of unique states.
    • getJavaCompressedAccept

      public List<? extends String> getJavaCompressedAccept()
    • getJavaCompressedEOT

      public List<? extends String> getJavaCompressedEOT()
    • getJavaCompressedEOF

      public List<? extends String> getJavaCompressedEOF()
    • getJavaCompressedMin

      public List<? extends String> getJavaCompressedMin()
    • getJavaCompressedMax

      public List<? extends String> getJavaCompressedMax()
    • getJavaCompressedSpecial

      public List<? extends String> getJavaCompressedSpecial()
    • getJavaCompressedTransition

      public List<List<? extends String>> getJavaCompressedTransition()
    • getRunLengthEncoding

      public List<? extends String> getRunLengthEncoding(List<Integer> data)
      Compress the incoming data list so that runs of same number are encoded as number,value pair sequences. 3 -1 -1 -1 28 is encoded as 1 3 3 -1 1 28. I am pretty sure this is the lossless compression that GIF files use. Transition tables are heavily compressed by this technique. I got the idea from JFlex http://jflex.de/ Return List<String> where each string is either \xyz for 8bit char and ￿ for 16bit. Hideous and specific to Java, but it is the only target bad enough to need it.
    • createStateTables

      public void createStateTables(CodeGenerator generator)
    • createMinMaxTables

      protected void createMinMaxTables(DFAState s)
    • createTransitionTableEntryForState

      protected void createTransitionTableEntryForState(DFAState s)
    • createEOTAndEOFTables

      protected void createEOTAndEOFTables(DFAState s)
      Set up the EOT and EOF tables; we cannot put -1 min/max values so we need another way to test that in the DFA transition function.
    • createSpecialTable

      protected void createSpecialTable(DFAState s)
    • predict

      public int predict(org.antlr.runtime.IntStream input)
    • addState

      protected DFAState addState(DFAState d)
      Add a new DFA state to this DFA if not already present. To force an acyclic, fixed maximum depth DFA, just always return the incoming state. By not reusing old states, no cycles can be created. If we're doing fixed k lookahead don't updated uniqueStates, just return incoming state, which indicates it's a new state.
    • removeState

      public void removeState(DFAState d)
    • getUniqueStates

      public Map<DFAState,DFAState> getUniqueStates()
    • getMaxStateNumber

      public int getMaxStateNumber()
      What is the max state number ever created? This may be beyond getNumberOfStates().
    • getState

      public DFAState getState(int stateNumber)
    • setState

      public void setState(int stateNumber, DFAState d)
    • isReduced

      public boolean isReduced()
      Is the DFA reduced? I.e., does every state have a path to an accept state? If not, don't delete as we need to generate an error indicating which paths are "dead ends". Also tracks list of alts with no accept state in the DFA. Must call verify() first before this makes sense.
    • isCyclic

      public boolean isCyclic()
      Is this DFA cyclic? That is, are there any loops? If not, then the DFA is essentially an LL(k) predictor for some fixed, max k value. We can build a series of nested IF statements to match this. In the presence of cycles, we need to build a general DFA and interpret it to distinguish between alternatives.
    • isClassicDFA

      public boolean isClassicDFA()
    • canInlineDecision

      public boolean canInlineDecision()
    • isTokensRuleDecision

      public boolean isTokensRuleDecision()
      Is this DFA derived from the NFA for the Tokens rule?
    • getUserMaxLookahead

      public int getUserMaxLookahead()
      The user may specify a max, acyclic lookahead for any decision. No DFA cycles are created when this value, k, is greater than 0. If this decision has no k lookahead specified, then try the grammar.
    • getAutoBacktrackMode

      public boolean getAutoBacktrackMode()
    • setUserMaxLookahead

      public void setUserMaxLookahead(int k)
    • getMaxLookaheadDepth

      public int getMaxLookaheadDepth()
      Return k if decision is LL(k) for some k else return max int
    • _getMaxLookaheadDepth

      int _getMaxLookaheadDepth(DFAState d, int depth)
    • hasSynPred

      public boolean hasSynPred()
      Count all disambiguating syn preds (ignore synpred tests for gated edges, which occur for nonambig input sequences). E.g., x : (X)=> (X|Y)\n" + | X\n" + ; gives .s0-X->.s1 .s0-Y&&{synpred1_t}?->:s2=>1 .s1-{synpred1_t}?->:s2=>1 .s1-{true}?->:s3=>2
    • getHasSynPred

      public boolean getHasSynPred()
    • _hasSynPred

      boolean _hasSynPred(DFAState d, Set<DFAState> busy)
    • hasSemPred

      public boolean hasSemPred()
    • _hasSemPred

      boolean _hasSemPred(DFAState d, Set<DFAState> busy)
    • hasCycle

      public boolean hasCycle()
      Compute cyclic w/o relying on state computed during analysis. just check.
    • _hasCycle

      boolean _hasCycle(DFAState d, Map<DFAState,Integer> busy)
    • getUnreachableAlts

      public List<Integer> getUnreachableAlts()
      Return a list of Integer alt numbers for which no lookahead could be computed or for which no single DFA accept state predicts those alts. Must call verify() first before this makes sense.
    • verify

      public void verify()
      Once this DFA has been built, need to verify that: 1. it's reduced 2. all alts have an accept state Elsewhere, in the NFA converter, we need to verify that: 3. alts i and j have disjoint lookahead if no sem preds 4. if sem preds, nondeterministic alts must be sufficiently covered This is avoided if analysis bails out for any reason.
    • doesStateReachAcceptState

      protected boolean doesStateReachAcceptState(DFAState d)
      figure out if this state eventually reaches an accept state and modify the instance variable 'reduced' to indicate if we find at least one state that cannot reach an accept state. This implies that the overall DFA is not reduced. This algorithm should be linear in the number of DFA states. The algorithm also tracks which alternatives have no accept state, indicating a nondeterminism. Also computes whether the DFA is cyclic. TODO: I call getUniquelyPredicatedAlt too much; cache predicted alt
    • findAllGatedSynPredsUsedInDFAAcceptStates

      public void findAllGatedSynPredsUsedInDFAAcceptStates()
      Walk all accept states and find the manually-specified synpreds. Gated preds are not always hoisted I used to do this in the code generator, but that is too late. This converter tries to avoid computing DFA for decisions in syntactic predicates that are not ever used such as those created by autobacktrack mode.
    • getNFADecisionStartState

      public NFAState getNFADecisionStartState()
    • getAcceptState

      public DFAState getAcceptState(int alt)
    • setAcceptState

      public void setAcceptState(int alt, DFAState acceptState)
    • getDescription

      public String getDescription()
    • getDecisionNumber

      public int getDecisionNumber()
    • okToRetryDFAWithK1

      public boolean okToRetryDFAWithK1()
      If this DFA failed to finish during construction, we might be able to retry with k=1 but we need to know whether it will potentially succeed. Can only succeed if there is a predicate to resolve the issue. Don't try if k=1 already as it would cycle forever. Timeout can retry with k=1 even if no predicate if k!=1.
    • getReasonForFailure

      public String getReasonForFailure()
    • getDecisionASTNode

      public GrammarAST getDecisionASTNode()
      What GrammarAST node (derived from the grammar) is this DFA associated with? It will point to the start of a block or the loop back of a (...)+ block etc...
    • isGreedy

      public boolean isGreedy()
    • newState

      public DFAState newState()
    • getNumberOfStates

      public int getNumberOfStates()
    • getNumberOfAlts

      public int getNumberOfAlts()
    • initAltRelatedInfo

      protected void initAltRelatedInfo()
    • toString

      public String toString()
      Overrides:
      toString in class Object