Please give me some sense of regular languages and regular expressions

Started by qlmi, June 10, 2006, 07:43:16 AM


qlmi


tenkey

It's from parsing (compiler) theory.

Regular expressions are one way of describing regular languages. Regular languages can also be described with traditional BNF-style grammars.
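
For a concrete toy example (my own, in Python, not from any particular compiler): the language of optionally signed decimal integers can be written either as a regular expression or as a BNF-style grammar.

import re

# Regular expression: an optional sign followed by one or more digits.
INTEGER = re.compile(r"[+-]?[0-9]+")

# The same regular language described as a BNF-style grammar:
#   <integer> ::= <sign> <digits> | <digits>
#   <sign>    ::= "+" | "-"
#   <digits>  ::= <digit> | <digit> <digits>
#   <digit>   ::= "0" | "1" | ... | "9"

for s in ["42", "-7", "+", "3.14"]:
    print(s, bool(INTEGER.fullmatch(s)))
# 42 True, -7 True, + False, 3.14 False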
A programming language is low level when its programs require attention to the irrelevant.
Alan Perlis, Epigram #8

qlmi

OK! I see.
But would you please tell me what a regular language is?

Ossa

Hmm... well, let me start with the Chomsky hierarchy (quoting from Wikipedia):

Quote

  • Type-0 grammars (unrestricted grammars) include all formal grammars. They generate exactly all languages that can be recognized by a Turing machine. These languages are also known as the recursively enumerable languages. Note that this is different from the recursive languages which can be decided by an always-halting Turing machine.
  • Type-1 grammars (context-sensitive grammars) generate the context-sensitive languages. These grammars have rules of the form αAβ → αγβ with A a nonterminal and α, β and γ strings of terminals and nonterminals. The strings α and β may be empty, but γ must be nonempty. The rule S → ε is allowed if S does not appear on the right side of any rule. The languages described by these grammars are exactly all languages that can be recognized by a linear bounded automaton (a Turing machine whose tape is bounded by a constant times the length of the input).
  • Type-2 grammars (context-free grammars) generate the context-free languages. These are defined by rules of the form A → γ with A a nonterminal and γ a string of terminals and nonterminals. These languages are exactly all languages that can be recognized by a non-deterministic pushdown automaton. Context-free languages are the theoretical basis for the syntax of most programming languages.
  • Type-3 grammars (regular grammars) generate the regular languages. Such a grammar restricts its rules to a single nonterminal on the left-hand side and a right-hand side consisting of a single terminal, possibly followed (or preceded, but not both in the same grammar) by a single nonterminal. The rule S → ε is also allowed here if S does not appear on the right side of any rule. These languages are exactly all languages that can be decided by a finite state automaton. Additionally, this family of formal languages can be obtained by regular expressions. Regular languages are commonly used to define search patterns and the lexical structure of programming languages.
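
To make the Type-3 case a little more concrete, here is a tiny sketch of a finite state automaton (my own toy example in Python, not from Wikipedia) that decides the same language as the regular expression (ab)* - notice that it only ever needs a fixed, finite amount of state, which is exactly what "regular" buys you.

# DFA for the regular language (ab)*: accepts "", "ab", "abab", ...
# States: 0 = expecting 'a' (accepting), 1 = expecting 'b', 2 = dead state.
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 2, (1, "b"): 0,
    (2, "a"): 2, (2, "b"): 2,
}

def accepts(s):
    state = 0
    for ch in s:
        state = TRANSITIONS.get((state, ch), 2)  # any other character -> dead state
    return state == 0  # only state 0 is accepting

print(accepts(""), accepts("ab"), accepts("abab"), accepts("aba"))
# True True True False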

Really the best way to understand this is either to read around Wikipedia (see links below) or to get yourself a copy of "the dragon book" (or a similar book).

http://en.wikipedia.org/wiki/Chomsky_hierarchy
http://en.wikipedia.org/wiki/Regular_grammar
http://en.wikipedia.org/wiki/Regular_language
http://en.wikipedia.org/wiki/Deterministic_finite_state_machine

I know that probably didn't help, but this is a very general topic - I can't compress it down to a single post.

Ossa
Website (very old): ossa.the-wot.co.uk

tenkey

To get an idea of regular languages, here is one characteristic that regular languages do not exhibit: bracketing.

The grammars for arithmetic expressions are not grammars for regular languages.

Arithmetic expressions allow arbitrarily deep nesting of paired brackets, something that cannot be expressed with a regular grammar.

a * ((b + c) * (d + e)) * f
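
Here is a rough sketch (my own, in Python, nothing standard) of why: checking that the brackets in such an expression are balanced needs a counter that can grow with the input, which is more memory than any finite state automaton has, so no regular expression can handle arbitrary nesting depth.

def parens_balanced(expr):
    # Needs an unbounded counter -- that is exactly the step beyond finite-state power.
    depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing bracket with no matching opener
                return False
    return depth == 0

print(parens_balanced("a * ((b + c) * (d + e)) * f"))  # True
print(parens_balanced("a * ((b + c) * (d + e) * f"))   # False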
A programming language is low level when its programs require attention to the irrelevant.
Alan Perlis, Epigram #8