======================================== Problem 1. Speech recognition as search ======================================== Note: This is simply one possible example solution. We accepted many variants. Search problem Formulation. --------------------------- Let's assume that the input is a sequence {o1, o2, ...., on} In this problem a state is a sequence of phonemes {p1, p2, p3, ..., pk} that match part of the input sequence {o1, o2, ...ok }, where 1 <= k <= n (from the beginning of the input sequence up to certain point of the sequence). Start state. No phoneme has been matched. State S0 = {} Goal state. The "best" sequence {p1, p2, ...pn} of phonemes that matches the whole input sequence {o1, o2, ...., on} Successor function. The successor function selects the set of possibles states in the next step of the search. Any of the 50 phonemes are possible. Path cost: The path cost helps the search algorithm to decide which phoneme to append to the sequence of phonemes to match. Choosing a path cost of 1 for any step is not a good idea because it gives priority to the shortest sequence of phonemes but never takes into account the quality of the matching. We want to give a high cost to bad matches and a low cost to good matches. The the cost of going from the initial state to the state S = {p1,...,p_k } (the cost of the partial match) should be related to the quality of the match. One way of doing it is assigning to each path a value related to the probability of taking that path given the partial input o1,...,ok. Let P(S) = P(p1,...p_k |o1,...o_k) be that probability. Let's assume (for simplicity of computation) that P(S) depends only on the previous state, then we have P(S) = P(p_k|p_k-1, o_k)*P(pk_1|p_k-2,o_k-1)*....*P(p1 | o1) We cannot take P(S) as the cost of the path from the initial state up to S because it would assign the highest values to the most probable paths and A* is searching for the minimum cost path. One way of solving this problem is defining the cost of the path as: C(S1) = -log( P(S1) ) Why do we use -log instead of log? Speech Recognition Cost. ------------------------ We know: - average English sentence length, in good writing style, is about 15 words. - average of phonemes per word is 7.8 - number of phonemes in English is 50 We do the search in the the space of English sentences. Then we have an average of 15*7.8 = 117 phonemes per sentence. Since each phoneme can take one out of 50 values, the space of English sentences has an average of different sentences of O(50^117). Taking into account sentences of any length from one phoneme to 117 phonemes we have O(50^118) possible sentences The breadth first search algorithm in this space would need an average of O(50^118) nodes (space in memory) and O(50^118) node operations. Obviously it is not practical at all. (Observe that these values are far bigger than those shown in page 74 of the book) Using depth first search can reduce considerably the space requirements, but the time requirements are still the same. Speech Recognition Search Strategies. ------------------------------------- A. An A* admissible heuristic The sum of the minimum costs (phone observation probabilities) across the expected number of phonemes. B. Using beam search means to use only the best w successors (we want w<50). Searching only through w out of 50 possibilities means that the search is not performed over the whole sentences space, then it is possible that the beam search process rules out the optimal solution. Hence optimality is not guaranteed if beam search is used. ============================= Problem 2. Systematic search ============================= Problem 3.9 ------------ a. Missionaries/Cannibals problem What is the information necessary in each state? - number of missionaries and cannibals are at each side of the river. - in what side of the river the boat is. Observe that if we know how many cannibals and missionaries are at one side of the river, we can deduce how many people are at the other side. Then we can represent one state as a vector S with three cells defined as: S[1] : number of missionaries at the right side (from 0 to 3) S[2] : number of cannibals at the right side (from 0 to 3) S[3] : is the boat at the right side? (0 or 1) Also observe that a valid state is an state in which S[1] >= S[2] and S[2] >= S[1] when all the missionaries (or all the cannibals) are at one side Now we can define the initial state as S = [3,3,1] and the goal state as S = [0,0,0] The successor function produces a set of states based in the current state S. The set of states is produced by doing the following process: If S[3] is equal to 1, substract of the current state each of the vectors [1,0,1] [0,1,1] [2,0,1] [0,2,1] and discard the invalid states If S[3] is equal to 0, add to the current state each of the vectors [1,0,1] [0,1,1] [2,0,1] [0,2,1] and discard the invalid states The state-space for the problem, any other state is invalid. [3,3,1] (initial state) | [3,1,0] or [2,2,0] | [3,2,1] | [3,0,0] | [3,1,1] | [1,1,0] | [2,2,1] | [0,2,0] | [0,3,1] | [0,1,0] | [1,1,1] or [0,2,1] | [0,0,0] (goal state) Note: there are many more valid transitions between states, but all of them lead to states that where previously visited. If an algorithm does not check for repeated states it can easily get trapped in infinite loops. (c) Why do you think people have a hard time solving the problem? This question is quite open and allows many possible answers. However, it can be highlighted that - there are many possible invalid states that humans have to check for - there are many loops people can get trapped in Problem 3.8 ------------ (a) State space: 1 ____________ | | 2 3 ______ _______ | | | | 4 5 6 7 ____________________ | | | | | | | | 8 9 10 11 12 13 14 15 (b)The goal state is 11. Nodes visited using: BFS search: 1 2 3 4 5 6 7 8 9 10 11 Depth limited search: 1 2 4 8 9 5 10 11 iterative deepening search: 1 1 2 3 1 2 4 5 3 6 7 1 2 4 8 9 5 10 11 ============================= Problem 3. Game search ============================= (a) Number of possible games. The tic tac toe board has nine cells. Then, assuming that x-player always start, for the first move he/she has nine possibilities. For the o-player there are eight possible movements. For the next movement there are seven possibilities, and so on. Then we now that there at most 9*8*7*4*3*2 = 9! possible games. However, you can notice that we are overcounting, mainly because: - the symmetry of the board reduces the number of different movements - not all the games end after 9 movements If we take into account the symmetry, there only 3 choices for the initial movement (center, corner or side). If the first movement was center, then the next player has only 2 different choices. If the first movement was corner, the next player has five choices. Same if the first movement was side. Now we know that there are at most (2 + 10)*7! possible games. For a complete discussion of counting tic-tac-toe games visit the page: http://www.mathrec.org/old/2002jan/solutions.html Minimax Alpha-Beta Pruning ------------------- depth 0 | [+1] -----------------------------|---------------------------------- | | | . . . x . . . x . . x . [+1] . . . [-1] [-2] . . . . . . best! . . . . . . | | | |-----| |-------|------|------|------| |----|----|-----|----| | | | | | | | | | | | | .o. ..o x.. x.o x.. xo. x.. .x. .x. .x. .x. .xo .x. .x. .o. ... ..o ... ... .o. ... ... ..o ... ... ... ... ... ... ... ..o ... .o. ..o ... ... [+2] [+1] [-1] [0] [+1] [+1] [0] [-2] [0] [-1] [0] [-1]