This is the "pgen" module.  It is a set of bindings to pgen, the parser
generator that is used to build Python's compiler.  Included is a sample
application, python.py, which constructs relatively compact ASTs from
Python source code.

The module contains two functions:

    parser_from_file   -- creates a grammar object from a grammar
                          description file.

    parser_from_string -- creates a grammar object from a grammar
                          description contained in a string.

The grammar object has two methods:

    parse_file   -- creates a parse tree from a file.

    parse_string -- creates a parse tree from a string.

(Both parse_file and parser_from_file require a file name to be given.
This could easily (and should) be changed to work with anything that has
a fileno method.)

The grammar object supports a "reduction function" dictionary.  This
dictionary is accessed through the "reductions" attribute.  After the
actual parsing is completed, a bottom-up reduction pass is made over the
parse tree (someday I would like to eliminate the initial construction of
the parse tree).  This means that for each terminal and non-terminal in
the parse tree, the reduction function dictionary is checked for an entry
whose name matches that of the terminal or non-terminal.  If one is
found, the function is called.  The result of the call is used to replace
the current node in the parse tree.  The reduction process continues
until all nodes in the tree have been reduced.

This mechanism is intended to be a convenient way to remove unnecessary
nodes from the parse tree.  Consider this example grammar:

    list: '[' value (',' value)* [','] ']'
    value: NAME | STRING

Each punctuation mark ('[', ',', and ']') will show up in the resulting
parse tree as a separate node.  It is very unlikely that this will be
useful to the application.  The following reduction function will clean
things up.

    def list(g, tok, where, children):
        # Collect the value nodes; the punctuation nodes are dropped.
        l = []
        for child in children:
            if child[0] == g.NT.value:
                # Keep the value node itself, not just its token.
                l.append(child)
        return (tok, l)

It replaces the current "list" node with a tuple containing only the
token g.NT.list and the list of values, represented as a regular old
Python list.

Take a look at example.py for a larger example.

--
Donald Beaudry                                     Silicon Graphics
Compilers/MTI                                      1 Cabot Road
donb@sgi.com                                       Hudson, MA 01749
                  ...So much code, so little time...
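
Appendix: a stand-alone sketch of the reduction pass

The bottom-up reduction pass described above can be modelled in a few
lines of plain Python.  This is only an illustration and does not use the
pgen module: the (name, children) node layout, the reduce_tree helper,
the two-argument reduction functions, and the hand-built parse tree are
all assumptions made for the sketch, not the module's actual API.

```python
# Illustrative only: a stand-alone model of a bottom-up reduction pass.
# The (name, children) node layout and the reduce_tree helper are
# assumptions made for this sketch; they are not the pgen module's API.

def reduce_tree(node, reductions):
    """Reduce the children first, then the node itself (bottom-up)."""
    name, children = node
    if isinstance(children, list):
        # Non-terminal: reduce every child before looking at this node.
        children = [reduce_tree(child, reductions) for child in children]
    func = reductions.get(name)
    if func is None:
        # No entry for this name: the node is left as it is.
        return (name, children)
    # The function's result replaces the current node.
    return func(name, children)


def reduce_list(name, children):
    # Drop the punctuation nodes; keep only the text of each value.
    return (name, [child[1] for child in children if child[0] == "value"])


def reduce_value(name, children):
    # A value has a single NAME or STRING child; keep just its text.
    return (name, children[0][1])


# Hypothetical reduction dictionary, keyed by terminal/non-terminal name.
reductions = {"list": reduce_list, "value": reduce_value}

# Hand-built parse tree for the input: [a, "b"]
tree = ("list", [
    ("[", "["),
    ("value", [("NAME", "a")]),
    (",", ","),
    ("value", [("STRING", '"b"')]),
    ("]", "]"),
])

result = reduce_tree(tree, reductions)
print(result)
```

Because the children are reduced before their parent, reduce_list sees
"value" nodes that have already been collapsed to their text, so it only
has to filter out the punctuation.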