irregular languages - courses.cs.washington.edu
Irregular Languages
Announcements
(draft) lecture slides are up for the last few lectures.
You’re responsible for everything in deck 26 (except as marked)
There’s a problem on the final that says “you choose which of these to do: prove this language is irregular, prove this set is uncountable.”
You can use late days on HW8 (but they cut into time when the final is available)
Review session Friday at 6:30. Zoom link coming. Will be recorded.
What can’t NFAs/DFAs/Regexes do?
There are languages that can’t be recognized by a DFA.
{0^k 1^k | k ≥ 0} is the classic example.
Let’s prove it.
{0^k 1^k | k ≥ 0} is not regular
Not a proof:
{0^k 1^k | k ≥ 0} is regular if and only if there’s a DFA that recognizes it.
To know if the number of 0’s is equal to the number of 1’s, that DFA is going to have to count that number k.
But that number could be arbitrarily big.
And we have some finite number of states less than that.
So it’s not regular!
“the DFA must work like this” isn’t a rigorous argument.
“The DFA must do this”
Consider “The set of all binary strings, where the number of ‘01’ substrings is equal to the number of ‘10’ substrings.”
The DFA must count the number of “01” and “10” substrings, right?
NO!
Between every two consecutive 01 substrings there’s a 10 substring (and vice versa).
0001111000011101011
You don’t have to count the total, just the difference between them. And that difference never exceeds 1.
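As a sanity check on the claim above, here is a minimal Python sketch, assuming the alphabet is {0, 1}. The function names and the state encoding (first character read, most recent character read) are illustrative choices, not something from the lecture. The idea: since 01 and 10 substrings alternate, the counts are equal exactly when the string is empty or starts and ends with the same character, so a handful of states is enough.

```python
from itertools import product

def count_sub(s, pat):
    """Count (possibly overlapping) occurrences of pat in s."""
    return sum(1 for i in range(len(s) - len(pat) + 1) if s[i:i + len(pat)] == pat)

def dfa_accepts(s):
    """Simulate a small DFA whose state is (first character, last character).

    No counting is needed: the difference between the two counts is 0
    exactly when the string is empty or its first and last characters match.
    """
    state = None                      # start state: nothing read yet
    for ch in s:
        first = ch if state is None else state[0]
        state = (first, ch)           # at most 4 states besides the start state
    return state is None or state[0] == state[1]

def agrees_up_to(n):
    """Compare the DFA with brute-force counting on all strings of length <= n."""
    for length in range(n + 1):
        for bits in product("01", repeat=length):
            s = "".join(bits)
            if dfa_accepts(s) != (count_sub(s, "01") == count_sub(s, "10")):
                return False
    return True
```

Calling `agrees_up_to(10)` compares the two on every binary string of length at most 10. That is evidence, not a proof, but it shows why "the DFA must count" was the wrong intuition here.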
In general…
Try to avoid saying “in order to solve problem X, a machine must do Y.” It’s almost impossible to rigorously justify.
And it’s easy to trick yourself: you’ll have to argue no one is more clever than you are.
The right way to argue around this claim is to directly argue every DFA does the wrong thing (rather than indirectly that they can’t do the right thing).
A Proof Outline
Claim: {0^k 1^k : k ≥ 0} is an irregular language.
Proof:
Suppose, for the sake of contradiction, that {0^k 1^k : k ≥ 0} is regular.
Then there is a DFA M such that M accepts exactly {0^k 1^k : k ≥ 0}.
We want to show M doesn’t actually accept {0^k 1^k : k ≥ 0}.
How do we do that?
Forcing a mistake
We don’t know anything about the DFA.
Well, except that it is a DFA.
So it is deterministic. I.e., if you’re in a particular state and you read a certain character, there’s only one place to go.
It’s also finite. I.e., there are not an infinite number of states.
Forcing a Mistake
What if…
We could find a set of strings that all must be in different states.
Too many of them. Enough that two must be in the same state.
How would we know two strings have to be in different states?
How many strings is enough? Infinitely many.
Forcing A Mistake
How do we know x, y must be in different states?
Well, if one would be accepted and the other rejected, that would be a clear sign.
Or if there’s some string z where xz is accepted but yz is rejected (or vice versa).
The machine is deterministic! If x and y take you to the same state, then xz and yz are also in the same state!
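To make the determinism argument concrete, here is a minimal Python sketch. The three-state DFA `DELTA` (tracking the number of 0s mod 3) and the helper names are hypothetical, chosen only for illustration. It checks that once two strings x and y reach the same state, appending any short suffix z keeps them in the same state.

```python
from itertools import product

def run(delta, state, s):
    """Follow transitions from `state` while reading s; return the final state."""
    for ch in s:
        state = delta[(state, ch)]    # deterministic: exactly one next state
    return state

# Hypothetical example DFA: states 0, 1, 2 track (number of 0s read) mod 3.
DELTA = {(q, "0"): (q + 1) % 3 for q in range(3)}
DELTA.update({(q, "1"): q for q in range(3)})
START = 0

def same_state_forever(x, y, max_len=6):
    """If x and y reach the same state, check xz and yz do too for all short z.

    Returns None when x and y reach different states (nothing to check).
    """
    if run(DELTA, START, x) != run(DELTA, START, y):
        return None
    return all(
        run(DELTA, START, x + "".join(z)) == run(DELTA, START, y + "".join(z))
        for n in range(max_len + 1)
        for z in product("01", repeat=n)
    )
```

For instance, "0" and "0000" both land in state 1 of this DFA, so `same_state_forever("0", "0000")` checks every suffix of length at most 6. The underlying reason is that reading xz is the same as reading x and then continuing from wherever x ended.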
A Proof Outline
Claim: {0^k 1^k : k ≥ 0} is an irregular language.
Proof:
Suppose, for the sake of contradiction, that {0^k 1^k : k ≥ 0} is regular.
Then there is a DFA M such that M accepts exactly {0^k 1^k : k ≥ 0}.
Let S = [TODO]. S is an infinite set of strings.
Because the DFA is finite, there are two (different) strings x, y in S such that x and y go to the same state. (We don’t get to choose x, y.)
Consider the string z = [TODO]. (We do get to choose z, depending on x, y.)
A Proof Outline
Claim: {0^k 1^k : k ≥ 0} is an irregular language.
…
Let S = [TODO]. S is an infinite set of strings.
Because the DFA is finite, there are two (different) strings x, y in S such that x and y go to the same state. (We don’t get to choose x, y.)
Consider the string z = [TODO]. (We do get to choose z, depending on x, y.)
Since x, y led to the same state and M is deterministic, xz and yz will also lead to the same state q in M. Observe that xz ∈ {0^k 1^k : k ≥ 0} but yz ∉ {0^k 1^k : k ≥ 0}. Since q is either accepting or rejecting, M gives the wrong answer on one of the two strings. So M does not actually recognize {0^k 1^k : k ≥ 0}. That’s a contradiction!
Therefore, {0^k 1^k : k ≥ 0} is an irregular language.
Filling in the blanks
Need S such that
1. S is infinite
2. For every pair of strings x, y in S, there is a suffix z such that one of xz, yz is in the language and the other is not in the language.
Let S = {0^k : k ≥ 0}
x = 0^a, y = 0^b with a ≠ b. Take z = 1^a.
Claim: {0^k 1^k : k ≥ 0} is an irregular language.
Proof:
Suppose, for the sake of contradiction, that {0^k 1^k : k ≥ 0} is regular.
Then there is a DFA M such that M accepts exactly {0^k 1^k : k ≥ 0}.
Let S = {0^k : k ≥ 0}.
Because the DFA is finite and S is infinite, there are two (different) strings x, y in S such that x and y go to the same state when read by M. Since both are in S, x = 0^a for some integer a, and y = 0^b for some integer b, with a ≠ b.
Consider the string z = 1^a. xz = 0^a 1^a ∈ {0^k 1^k : k ≥ 0} but yz = 0^b 1^a ∉ {0^k 1^k : k ≥ 0}.
Since x, y both end up in the same state, and we appended the same z, both xz and yz end up in the same state of M.
Since xz ∈ {0^k 1^k : k ≥ 0} and yz ∉ {0^k 1^k : k ≥ 0}, M does not recognize {0^k 1^k : k ≥ 0}. But that’s a contradiction!
So {0^k 1^k : k ≥ 0} must be an irregular language.
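The proof above is constructive enough to run. The following is a minimal Python sketch, assuming a DFA over {0, 1} is given as a total transition dictionary `delta`, a start state, a set of accepting states, and its number of states; that representation and the name `find_mistake` are assumptions for illustration. It uses the pigeonhole step (two of the strings 0^0, ..., 0^n must share a state) and then returns a string the DFA gets wrong.

```python
def in_language(s):
    """Membership in {0^k 1^k : k >= 0}."""
    k = len(s) // 2
    return s == "0" * k + "1" * k

def run(delta, start, s):
    """Return the state the DFA is in after reading s from the start state."""
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state

def find_mistake(delta, start, accepting, num_states):
    """Return a string that the given DFA classifies incorrectly.

    Among the num_states + 1 strings 0^0, ..., 0^num_states, two must reach
    the same state (pigeonhole). Call them x = 0^a and y = 0^b and take z = 1^a.
    """
    seen = {}                                  # state -> exponent that reached it
    for b in range(num_states + 1):
        q = run(delta, start, "0" * b)
        if q in seen:
            a = seen[q]
            z = "1" * a
            xz, yz = "0" * a + z, "0" * b + z  # xz is in the language, yz is not
            # xz and yz end in the same state, so the DFA gives both one verdict.
            verdict = run(delta, start, xz) in accepting
            return yz if verdict else xz
        seen[q] = b
    raise AssertionError("unreachable when delta is total on num_states states")
```

Whatever DFA is passed in, the returned string is either a member of the language that the DFA rejects or a non-member that it accepts. The code only ever needs finitely many strings from S because a concrete DFA has a known number of states; the proof uses an infinite S so that it works for every DFA at once.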
Full outline
1. Suppose for the sake of contradiction that L is regular. Then there is some DFA M that recognizes L.
2. Let S be [fill in with an infinite set of prefixes].
3. Because the DFA is finite and S is infinite, there are two (different) strings x, y in S such that x and y go to the same state when read by M [you don’t get to control x, y other than having them not equal and in S].
4. Consider the string z [argue exactly one of xz, yz will be in L].
5. Since x, y both end up in the same state, and we appended the same z, both xz and yz end up in the same state of M. Since xz ∈ L and yz ∉ L, M does not recognize L. But that’s a contradiction!
6. So L must be an irregular language.
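Before writing up steps 2 and 4, it can help to test a candidate S and suffix rule on a finite sample. Below is a minimal Python sketch of such a check; the names `check_prefix_set` and `choose_suffix` are hypothetical helpers, and passing the check on a sample is only a sanity test, not a substitute for the argument in step 4.

```python
from itertools import combinations

def distinguishes(in_lang, x, y, z):
    """True when exactly one of xz, yz is in the language."""
    return in_lang(x + z) != in_lang(y + z)

def check_prefix_set(in_lang, prefixes, choose_suffix):
    """Return the first pair (x, y) the chosen suffix fails to separate, or None."""
    for x, y in combinations(prefixes, 2):
        if not distinguishes(in_lang, x, y, choose_suffix(x, y)):
            return (x, y)
    return None

# Example from the lecture: L = {0^k 1^k}, S = {0^k}, z = 1^a where x = 0^a.
def in_0k1k(s):
    k = len(s) // 2
    return s == "0" * k + "1" * k

sample = ["0" * k for k in range(20)]
print(check_prefix_set(in_0k1k, sample, lambda x, y: "1" * len(x)))  # prints None
```

A result of `None` means every sampled pair was separated. A returned pair points at strings your suffix rule does not separate, which usually means either S or z needs rethinking.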
Practical Tips
When you’re choosing the set S, think about what the DFA would “have to count.”
That is fundamentally why a language is irregular. The set S is the way we prove it! Whatever we “need to remember,” it’s different for every element of S.
If your strings have an “obvious middle” (like between the 0’s and 1’s), that’s a good place to start.
Let’s Try Another
The set of strings with balanced parentheses is not regular.
What do you want S to be? What would you have to count?
The number of unclosed parentheses.
Want S to be a set with infinitely many strings with different numbers of unclosed parentheses.
Let S = (*
Full outline
1. Suppose for the sake of contradiction that L is regular. Then there is some DFA M that recognizes L.
2. Let S be (*
3. Because the DFA is finite and S is infinite, there are two (different) strings x, y in S such that x and y go to the same state when read by M. Observe that x = (^a for some integer a, y = (^b for some integer b with a ≠ b.
4. Consider the string z [argue exactly one of xz, yz will be in L].
5. Since x, y both end up in the same state, and we appended the same z, both xz and yz end up in the same state of M. Since xz ∈ L and yz ∉ L, M does not recognize L. But that’s a contradiction!
6. So L must be an irregular language.
Full outline
1. Suppose for the sake of contradiction that L is regular. Then there is some DFA M that recognizes L.
2. Let S be (*
3. Because the DFA is finite and S is infinite, there are two (different) strings x, y in S such that x and y go to the same state when read by M. Observe that x = (^a for some integer a, y = (^b for some integer b with a ≠ b.
4. Consider the string z = )^a. xz is a balanced set of parentheses (since there are the same number of each and all the open-parentheses come before the close parentheses). But yz is not balanced because a ≠ b.
5. Since x, y both end up in the same state, and we appended the same z, both xz and yz end up in the same state of M. Since xz ∈ L and yz ∉ L, M does not recognize L. But that’s a contradiction!
6. So L must be an irregular language.
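Step 4 of this proof can be spot-checked with a few lines of Python. This is a minimal sketch, assuming strings contain only the characters "(" and ")"; the helper names are illustrative. It confirms that with x = (^a, y = (^b, and z = )^a, the string xz is balanced while yz is not, whichever of a and b is larger.

```python
def is_balanced(s):
    """Balanced-parentheses check using a running count of unclosed parentheses."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:                 # a ")" appeared with nothing left to close
            return False
    return depth == 0                 # nothing may remain unclosed

def separates(a, b):
    """With x = (^a, y = (^b, z = )^a: is xz balanced while yz is not?"""
    x, y, z = "(" * a, "(" * b, ")" * a
    return is_balanced(x + z) and not is_balanced(y + z)
```

Note that `is_balanced` itself uses an unbounded counter (`depth`), which is exactly the thing a DFA with finitely many states cannot store.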
One more, just the key steps
What about {a^k b^k c^k : k ≥ 0}?
S = {a^k : k ≥ 0} or S = {a^k b^k : k ≥ 0} are equally good choices.
Your suffix is b^i c^i for the first and c^i for the second (where i is the number of a’s in x).
S = {a^k b : k ≥ 0} also works. z = b^{i-1} c^i is your suffix. The proof is a little more tedious, but you can make it through.
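The three choices of S above can be compared on a finite sample. This is a minimal Python sketch under the assumption that the suffixes are indexed by the exponent i of the chosen prefix; the dictionary `choices` and the helper `all_pairs_separated` are hypothetical names. For each pair it tries the suffix built from either string, since the proof is free to pick whichever works.

```python
from itertools import combinations

def in_lang(s):
    """Membership in {a^k b^k c^k : k >= 0}."""
    k = len(s) // 3
    return s == "a" * k + "b" * k + "c" * k

def all_pairs_separated(prefix, suffix, limit=12):
    """Check every pair of sample prefixes, trying the suffix built from either one."""
    for i, j in combinations(range(limit), 2):
        x, y = prefix(i), prefix(j)
        if not any(in_lang(x + z) != in_lang(y + z) for z in (suffix(i), suffix(j))):
            return False
    return True

choices = {
    "a^k":     (lambda k: "a" * k,           lambda i: "b" * i + "c" * i),
    "a^k b^k": (lambda k: "a" * k + "b" * k, lambda i: "c" * i),
    # For a^k b, the suffix b^(i-1) c^i only makes sense for i >= 1. When i = 0
    # the check falls back to the other string's suffix, which is one reason
    # this version of the proof is more tedious.
    "a^k b":   (lambda k: "a" * k + "b",     lambda i: "b" * (i - 1) + "c" * i),
}
```

Running `all_pairs_separated(*choices[name])` for each name checks the first 12 prefixes of that set. As before, this is a sample-based check rather than a proof.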
One Last Fun Fact
On HW8, the extra credit is to use the “pumping lemma” to prove a language is irregular.
The method from lecture “always works,” i.e., for every non-regular language, you can write a proof like this.
For the pumping lemma that isn’t true: there are some non-regular languages that satisfy the pumping lemma, which is why we don’t focus on it.
The final common way to show languages are regular or not regular is “closure properties”: operations that (when applied to regular languages) always give you another regular language. Or vice versa.