The Packrat Parsing and Parsing Expression Grammars Page

packrat  peg 

Оригинал статьи:


Parsing expression grammars (PEGs) are an alternative to context free grammars for formally specifying syntax, and packrat parsers are parsers for PEGs that operate in guaranteed linear time through the use of memoization. For a brief technical summary see the Wikipedia entry on PEGs. For more in-depth descriptions see the original PEG paper and packrat parsing paper, and related papers below. Bryan Ford, the maintainer of this site, coined the modern terms PEGs and packrat parsing, but much of their formal theory existed earlier and all the more recent work on this topic is by others.

Mailing List

The following mailing list is open to anyone wishing to announce software or papers related to PEGs and packrat parsing, or discuss ideas and issues:

The PEG Mailing List

PEG/Packrat Parsing Bibliography

The following research papers develop and explore PEGs and packrat parsing in detail; most were written by various authors working independently. The papers are listed chronologically starting from most recent. Some of the papers have associated code available online.

PEG-related Projects

The following projects have implemented PEG parsers, parser generators, and/or combinator libraries in a variety of languages:

  • Multi-language:
    • ANTLR, a well-established parser generator by Terence Parr, supports extensive PEG features and combines packrat parsing with LL parsing techniques.
    • Waxeye by Orlando Hill, a PEG-based parser generator supporting C, Java, JavaScript, Python, Ruby, and Scheme.
    • parboiled is a parser library for Java and Scala that interprets parsers expressed "in-line" in Java/Scala code.
    • vembyr supports C++, Python, and Ruby.
  • Java:
  • Python:
    • The pyparsing monadic parsing combinator library by Paul McGuire now supports packrat parsing.
    • Packrat parsing support has also been incorporated into PyPy.
    • pyPEG is a PEG parser interpreter for Python by Volker Birk.
  • Haskell:
    • Frisby by John Meacham is a monadic packrat parser library for Haskell that uses advanced Haskell type system features to support dynamic specification of composable parsers.
    • Pappy by Bryan Ford is a simple prototype packrat parser generator for Haskell. Not actively supported.
  • C, C++:
    • The PEG Template Library for C++0x by Colin Hirsch expresses grammars via C++ templates.
    • The peg/leg parser generator emphasizes convenience for those already familiar with lex/yacc.
    • chilon is a template-based parser library that builds ASTs from the same PEG spec.
  • C#:
    • NPEG is a library providing objects to build PEGs incrementally in C#.
    • IronMeta is a C# implementation of OMeta.
  • JavaScript:
    • OMeta supports PEG-based pattern matching extended to handle arbitrary kinds of data.
    • PEG.js is a parser generator that produces fast parsers with excellent error reporting.
  • Tcl: The new grammar::peg module supports grammar definition and parsing using PEGs.
  • SmalltalkOMeta provides PEG-based pattern matching for the Squeak programming environment.
  • Scheme:
  • Common Lisp:
    • CL-peg by John Leuner supports PEGs and packrat parsing in Common Lisp.
    • cl-peg, for Common Lisp.
  • Lua: Roberto Ierusalimschy has provided the LPeg pattern-matching library.
  • Ruby now has the Treetop grammar description and packrat parsing language.
  • Dylanpeg-parser library by Dustin Voss.

Sample PEGs

This section contains pointers to some "nontrivial" grammars expressed as PEGs or PEG-like languages:

  • xtc contains modular Rats! grammars for C and Java.
  • Mouse includes grammars for Java and C.
  • parboiled includes a grammar for Java.
  • PEG.js includes a grammar for JavaScript/ECMAScript.