跳转至

3 Context-free Language and Pushdown Automata

3.1 Context-Free Grammars

A Context-Free Grammar (CFG) is a 4-tuple

\[ G = (V, \Sigma, R, S) \]

where:

  • \(V\) is an alphabet (a finite set of symbols), called the vocabulary or set of variables. 字母表
  • \(\Sigma \subseteq V\) is the set of terminal symbols (the actual characters in the language). 终止符号
  • \(S \in V - \Sigma\) is the start symbol (a special non-terminal from which derivation begins). 起始符号
  • \(R\) is the set of production rules, a finite subset of $(V - \Sigma) \times V^* $.

比如说 \(L = \{a^nb^n|n\ge 0\}, S\rightarrow aSb,S\rightarrow e\)


接下来我们证明所有的正则语言都是 CFL,这里就选择直接构造。对于一个正则语言,其被一个 DFA \(M = (K,\Sigma,\delta,s,F)\)接受。那么我们可以构造如下一个 CFG \(G = (V,\Sigma,R,S)\)

  • 对于每个状态 \(q∈K\) 和每个字符 \(a∈Σ\),如果 DFA 中有转移:\(δ(q,a)=p\),那么在 CFG 中添加规则:\(q→ap\)
  • 对于每个接受状态 \(q∈F\),添加规则:\(q→ε\)

然后我们需要考虑的是如何证明某个语言是 CFL,只需要构造一个 CFG,并说明这两者等价即可

Example

3.2 Parse Tree

A derivation of a CFG \(𝐺 = (𝑆, 𝑉, 𝛴, 𝑅)\) , can be represented with a parse tree

  • Leaves: terminal symbols
  • Root: Start Symbol
  • Node: element in \(V\)

我们也可以发现由同一个 CFG 生成出来的相同的字符串,它们的 parse tree 也可能不一样

Definition: Similarity of derivations

Let \(G = (V, \Sigma, R, S)\) be a CFG.

Let \(D = x_1 \Rightarrow x_2 \Rightarrow x_3 \Rightarrow \cdots \Rightarrow x_n\), where \(x_i \in V^*\), \(x_1 \in V - \Sigma\), \(x_n \in \Sigma^*\)

Let \(D' = x'_1 \Rightarrow x'_2 \Rightarrow x'_3 \Rightarrow \cdots \Rightarrow x'_n\), where \(x'_i \in V^*\), \(x'_1 \in V - \Sigma\), \(x'_n \in \Sigma^*\)


\(D\) precedes \(D'\) (\(D < D'\)) \(\Leftrightarrow \exists 1 \leq k \leq n \text{ such that}\)

  • For all \(i \neq k\), we have \(x_i = x'_i\)
  • \(x_{k-1} = x'_{k-1} = uAvBw\), where \(u, v, w \in V^*\), and \(A, B \in V - \Sigma\)
  • \(x_k = uyvBw\), where \(A \to y \in R\)
  • \(x'_k = uAvzw\), where \(B \to z \in R\)
  • \(x_{k+1} = x'_{k+1} = uyvzw\)

\(D\) and \(D'\) are similar \(\Leftrightarrow(D, D') \text{ belong to the reflexive, symmetric, transitive closure of } <\)

Note

相似性是一种等价关系 (Similarity is an equivalence relation)

每个语法树都包含一个在 < 关系下是极大或极小的推导
(Each parse tree contains a derivation that is maximal or minimal under <)

  • 最左推导(类似地,最右推导)
    (Leftmost derivation (similarly, right most derivation))

  • 最左推导是指:每次总是对最左边的非终结符应用产生式规则
    (A leftmost derivation is a derivation in which a production is always applied to the leftmost symbol)

Theorem: Let \(G = (V, \Sigma, R, S)\) be a CFG, and let \(A \in V - \Sigma\), and \(w \in \Sigma^*\). Then the following statements are equivalent:

  • \(A \Rightarrow^* w\)
  • \(\exists\) a parse tree with root \(A\) and yield \(w\)
  • \(\exists\) a leftmost derivation \(A \xRightarrow{L^*} w\)
  • \(\exists\) a rightmost derivation \(A \xRightarrow{R^*} w\)

也即是说只有存在一条路径可以推导出来,那么就存在最右推导和最左推导。

Note

下面的论述是错误的

  • A string cannot have multiple left and rightmost derivations 【错,一个字符串可以有多个最右推导和最左推导】
  • Let \(D = x_1 \Rightarrow x_2 \Rightarrow x_3 \Rightarrow \cdots \Rightarrow x_n\), where \(x_i \in V^*\), \(x_1 \in V - \Sigma\), and \(x_n \in \Sigma^*\). Let \(D' = x'_1 \Rightarrow x'_2 \Rightarrow x'_3 \Rightarrow \cdots \Rightarrow x'_n\). If \(x_1 = x'_1\) and \(x_n = x'_n\), then \(D\) and \(D'\) must be similar. 【只有初始和末尾相同,并不能说明是相似的】

Definition: A grammar in which some words has two parse trees is said to be ambiguous.

3.3 Pushdown Automata

Pushdown Automata 可以被认为是 FA 加了一个栈,也就是多了一个记忆的功能,使得其功能变得更加强大。每次 PDA 读取磁带上的输入,还要读入栈中的信息。最后当状态是最终状态且磁带读完栈为空时,接受该字符串。

形式化定义 PDA:一个 PDA 是一个六元组:

\[ P = (K, \Sigma, \Gamma, \Delta, s, F) \]

其中各部分含义如下:

  • \(K\):有限状态集合(states)
  • \(\Sigma\):输入字母表(input alphabet)
  • \(\Gamma\):栈字母表(stack alphabet),即栈中可能出现的符号集合
  • \(s \in K\):初始状态(start state)
  • \(F \subseteq K\):接受状态集合(final/accepting states)
  • \(\Delta\):转移关系(transition relation),是一个有限子集:
\[ \Delta \subseteq (K \times (\Sigma \cup \{e\}) \times \Gamma^*) \times (K \times \Gamma^*) \]

每一个 FA 都可以平凡地被看做为一个不对栈进行任何操作的 PDA。

在 PDA 中,同一个状态、输入符号和栈顶符号组合,可能对应多个不同的转移规则。这意味着在某一步,机器可以选择执行其中任意一个规则。因此,这种 PDA 是非确定性的(Nondeterministic PDA, NPDA)。

\((p, x, \alpha) \vdash_M (q, y, \zeta)\) (yield in one step) iff there is some transition \(((p, a, \beta), (q, \gamma)) \in \Delta\) such that

  • \(x = ay\), \(a \in \Sigma \cup \{e\}\)
  • \(\alpha = \beta\eta\)
  • \(\zeta = \gamma\eta\) for some \(\eta \in \Gamma^*\)

Let \(\vdash_M^*\) be the reflexive, transitive closure of \(\vdash_M\).

那么也就是说 PDA \(M\) 所接受的语言就是

\[ L(M) = \{w \mid (s, w, e) \vdash_M^* (p, e, e) \text{ for some state } p \in F\}. \]

Example

实际上由 PDA 所接受的语言就是 CFL,证明思路还是正反两面

  • Each context-free language is accepted by some PDA.
  • If a language is accepted by a PDA, it is a context-free language.

评论区

对你有帮助的话请给我个赞和 star => GitHub stars
欢迎跟我探讨!!!