// Abstract finite state machine implementation (version 4) /* A finite state machine is defined by: 1. a number of states, with a designated start state and some accept states 2. a number of possible inputs (or a type of input) 3. a table that defines for each state s and input i: i. additional conditions to check at the current state, and possibly actions to take. ii. the next state to transition to. A finite state machine executes on a stream of inputs, starting at the start state and ending when either the end of input is reached or an accept state is reached. In this implementation, states are always numbers and the start state is always 0, but the input can be any type . The table is represented by a 2D array of "actions:" interface action { int act(); } The act function should return the next state to transition to, after performing whatever action is needed. The table of actions is declared as action[][] M; The first index of M represents the state number, and the second index represents the input. The input has to be converted from type T to an integer index: this is done by a method 'getindex'. The entry of each M[state][getindex(input)] is an action. Since the action interface contains a single declaration, it can be instantiated with a labmda expression, such as: action a = () -> 3; // this act always returns 3. M[0][2] = () -> { if (...) return 1; else return 2; }; The basic structure and operations of state machines is defined using an abstract class (FSM4 below). It contains the following abstract methods which must be defined in each concrete subclass: protected abstract void setTable(); // creates the 2D action table protected abstract boolean acceptState(int s); //true if s is accept state protected abstract T nextInput(); //gets next available input, null if none protected abstract int getindex(T x); //return column index for input x These functions define a particular FSM. The acceptState function defines which states are accepting states (there could be zero or more). For example, if states 0 and 2 are the accept states, acceptState can be implemented with { return (s==0 || s==2); } The setTable() function is the most important one that a non-abstract subclass must define. It creates the 2D array for the table and sets the "action" for each entry, specific to the FSM being implemented. The constructor of the subclass should set the numstates and maxinputs variables. *The constructor should also call setTable()*. The purpose of the getindex(T x) function is to associate an array column index with a input x, whatever it may be. This can often be implemented using a hash map, such as java.util.HashMap: this will allow us to map inputs (such as strings) to integers that we can use to index the 2D table. Note that this is NOT the same as just calling .hashCode() on an object, which could give the same hash value for different inputs. We want a different index for each different value, for example, "a":0, "b":1, "c":2, etc... HashMap HM = new HashMap(); // create map HM.put("a",0); // associate value 0 with key "a" in table Integer v = HM.get("a"); // returns value associated with key, null if none. However, instead of declaring this HashMap inside the abstract class, we generalize it to a getindex(T x) function, because it is not always the appropriate way to map input to indices: we may want to map an entire range of inputs into one row in the table. For example, if the input type is String, and we just need to distinguish between upper and lower case alphabetical characters, we can implement getindex as follows: protected int getindex(String x) // better to represent chars as 1-char strings { int v = (int)x.charAt(0); // ascii code for first (and only) char in x if (v>=(int)'A' && v<=(int)'Z') return 0; // upper case map to index 0 if (v>=(int)'a' && v<=(int)'z') return 1; // lower case map to index 1 return -1; // this will be interpreted by FSM as bad input } Clearly a hashmap is not the best choice in all cases, hence the abstraction. To implement nextInput, look at my example: usually the input type is String. There's also a wrapup() function that you can override in the subclass that can perform some additional operations (default does nothing - this is NOT an abstract method so overriding it is optional). ****** In addition, the following boolean flags can be set for each instance of the machine (shown with default values): public boolean trace = false; // for debuggging messages public boolean keeprunning = false; // stops when accept state reached public boolean skipbadinput = true; // invalid inputs are ignored Setting trace to true will print the current state and input for each action. If keeprunning is set to false, then the machine terminates upon reaching the first accept state; if set to true, then it will continue to run until the end of input is reached (when nextInput() returns null). If skipbadinput is set to false, then upon encountering an unrecognized input (empty table entry), the program terminates with an error message; if set to true (default) then the bad input is simply ignored: the machine stays at the same state. The machine is executed by public boolean run(), which returns true if machine terminated in an accept state. See dfa4.java for example. */ interface action // entry of state transition table { int act(); // perform action, return new state } // main class for Abstract Finite State Machine: public abstract class FSM4 // T indicates type of input, likely String { // the following variables need to be initialized in subclass constructor protected int numstates; // number of states, assume 0 is start state. protected int maxinputs; // number of different inputs protected action[][] M; // the state action/transition table. // Methods to be implemented in subclass: **************** protected abstract void setTable(); // sets action table, other inits protected abstract T nextInput(); //gets next available input, null if none protected abstract boolean acceptState(int s); // true if s is accept state protected abstract int getindex(T x); // return row index for input x. ////////// optional parameters with defaults: public boolean trace = false; // for debugging messages public boolean keeprunning = false; // runs after accept state reached public boolean skipbadinput = true; // invalid inputs are ignored ////////// protected int cs=0; // current state - used internally (may make private) // polymorphic transition method, called for each input x public void transition(T x) { if (trace) System.out.println("state "+cs+", input "+x); int i = getindex(x); // get index for input if (i<0 || i>=maxinputs || M[cs][i]==null) //check bad input { if (skipbadinput) return; else throw new RuntimeException("unrecognized input "+x); } cs = M[cs][i].act(); // perform action and transition to new state }//transition public boolean run() // run machine, return true on success { // setTable(); // called from concrete class cs = 0; // set current state to start state T x = nextInput(); // should return null on end of input while (x!=null && (keeprunning || !acceptState(cs))) { transition(x); x = nextInput(); } if (trace) System.out.println("final state "+cs); wrapup(); // in case there's anything else to be done at the end... return acceptState(cs); // returns true if ends in an accept state. } protected void wrapup() {} // default does nothing - can override }//FSM4