3 Context-free Language and Pushdown Automata¶

3.1 Context-Free Grammars¶

A Context-Free Grammar (CFG) is a 4-tuple

\[ G = (V, \Sigma, R, S) \]

where:

$V$ is an alphabet (a finite set of symbols), called the vocabulary or set of variables. 字母表
$\Sigma \subseteq V$ is the set of terminal symbols (the actual characters in the language). 终止符号
$S \in V - \Sigma$ is the start symbol (a special non-terminal from which derivation begins). 起始符号
$R$ is the set of production rules, a finite subset of $(V - \Sigma) \times V^* $.

比如说 $L = \{a^nb^n|n\ge 0\}, S\rightarrow aSb,S\rightarrow e$

接下来我们证明所有的正则语言都是 CFL，这里就选择直接构造。对于一个正则语言，其被一个 DFA $M = (K,\Sigma,\delta,s,F)$接受。那么我们可以构造如下一个 CFG $G = (V,\Sigma,R,S)$，

对于每个状态 $q∈K$ 和每个字符 $a∈Σ$，如果 DFA 中有转移：$δ(q,a)=p$，那么在 CFG 中添加规则：$q→ap$
对于每个接受状态 $q∈F$，添加规则：$q→ε$

然后我们需要考虑的是如何证明某个语言是 CFL，只需要构造一个 CFG，并说明这两者等价即可

Example

3.2 Parse Tree¶

A derivation of a CFG $𝐺 = (𝑆, 𝑉, 𝛴, 𝑅)$ , can be represented with a parse tree

Leaves: terminal symbols
Root: Start Symbol
Node: element in $V$

我们也可以发现由同一个 CFG 生成出来的相同的字符串，它们的 parse tree 也可能不一样

Definition: Similarity of derivations

Let $G = (V, \Sigma, R, S)$ be a CFG.

Let $D = x_1 \Rightarrow x_2 \Rightarrow x_3 \Rightarrow \cdots \Rightarrow x_n$, where $x_i \in V^*$, $x_1 \in V - \Sigma$, $x_n \in \Sigma^*$

Let $D' = x'_1 \Rightarrow x'_2 \Rightarrow x'_3 \Rightarrow \cdots \Rightarrow x'_n$, where $x'_i \in V^*$, $x'_1 \in V - \Sigma$, $x'_n \in \Sigma^*$

$D$ precedes $D'$ ($D < D'$) $\Leftrightarrow \exists 1 \leq k \leq n \text{ such that}$

For all $i \neq k$, we have $x_i = x'_i$
$x_{k-1} = x'_{k-1} = uAvBw$, where $u, v, w \in V^*$, and $A, B \in V - \Sigma$
$x_k = uyvBw$, where $A \to y \in R$
$x'_k = uAvzw$, where $B \to z \in R$
$x_{k+1} = x'_{k+1} = uyvzw$

$D$ and $D'$ are similar $\Leftrightarrow(D, D') \text{ belong to the reflexive, symmetric, transitive closure of } <$

Note

相似性是一种等价关系 （Similarity is an equivalence relation）

每个语法树都包含一个在 < 关系下是极大或极小的推导
（Each parse tree contains a derivation that is maximal or minimal under <）

最左推导（类似地，最右推导）
（Leftmost derivation (similarly, right most derivation)）
最左推导是指：每次总是对最左边的非终结符应用产生式规则
（A leftmost derivation is a derivation in which a production is always applied to the leftmost symbol）

Theorem: Let $G = (V, \Sigma, R, S)$ be a CFG, and let $A \in V - \Sigma$, and $w \in \Sigma^*$. Then the following statements are equivalent:

$A \Rightarrow^* w$
$\exists$ a parse tree with root $A$ and yield $w$
$\exists$ a leftmost derivation $A \xRightarrow{L^*} w$
$\exists$ a rightmost derivation $A \xRightarrow{R^*} w$

也即是说只有存在一条路径可以推导出来，那么就存在最右推导和最左推导。

Note

下面的论述是错误的

A string cannot have multiple left and rightmost derivations 【错，一个字符串可以有多个最右推导和最左推导】
Let $D = x_1 \Rightarrow x_2 \Rightarrow x_3 \Rightarrow \cdots \Rightarrow x_n$, where $x_i \in V^*$, $x_1 \in V - \Sigma$, and $x_n \in \Sigma^*$. Let $D' = x'_1 \Rightarrow x'_2 \Rightarrow x'_3 \Rightarrow \cdots \Rightarrow x'_n$. If $x_1 = x'_1$ and $x_n = x'_n$, then $D$ and $D'$ must be similar. 【只有初始和末尾相同，并不能说明是相似的】

Definition: A grammar in which some words has two parse trees is said to be ambiguous.

3.3 Pushdown Automata¶

Pushdown Automata 可以被认为是 FA 加了一个栈，也就是多了一个记忆的功能，使得其功能变得更加强大。每次 PDA 读取磁带上的输入，还要读入栈中的信息。最后当状态是最终状态且磁带读完栈为空时，接受该字符串。

形式化定义 PDA：一个 PDA 是一个六元组：

\[ P = (K, \Sigma, \Gamma, \Delta, s, F) \]

其中各部分含义如下：

$K$：有限状态集合（states）
$\Sigma$：输入字母表（input alphabet）
$\Gamma$：栈字母表（stack alphabet），即栈中可能出现的符号集合
$s \in K$：初始状态（start state）
$F \subseteq K$：接受状态集合（final/accepting states）
$\Delta$：转移关系（transition relation），是一个有限子集：

\[ \Delta \subseteq (K \times (\Sigma \cup \{e\}) \times \Gamma^*) \times (K \times \Gamma^*) \]

每一个 FA 都可以平凡地被看做为一个不对栈进行任何操作的 PDA。

在 PDA 中，同一个状态、输入符号和栈顶符号组合，可能对应多个不同的转移规则。这意味着在某一步，机器可以选择执行其中任意一个规则。因此，这种 PDA 是非确定性的（Nondeterministic PDA, NPDA）。

$(p, x, \alpha) \vdash_M (q, y, \zeta)$ (yield in one step) iff there is some transition $((p, a, \beta), (q, \gamma)) \in \Delta$ such that

$x = ay$, $a \in \Sigma \cup \{e\}$
$\alpha = \beta\eta$
$\zeta = \gamma\eta$ for some $\eta \in \Gamma^*$

Let $\vdash_M^*$ be the reflexive, transitive closure of $\vdash_M$.

那么也就是说 PDA $M$ 所接受的语言就是

\[ L(M) = \{w \mid (s, w, e) \vdash_M^* (p, e, e) \text{ for some state } p \in F\}. \]

Example

实际上由 PDA 所接受的语言就是 CFL，证明思路还是正反两面

Each context-free language is accepted by some PDA.
If a language is accepted by a PDA, it is a context-free language.

评论区

对你有帮助的话请给我个赞和 star =>

欢迎跟我探讨！！！