5 Best Ways to Evaluate S-Expressions as Strings in Python

πŸ’‘ Problem Formulation: In this article, we explore the nuances of evaluating s-expressions, a notation commonly used in programming languages like Lisp. Each s-expression is a string representing a structured list, and evaluating it means parsing and executing it to obtain a result. For instance, given the input "(add 1 (multiply 2 3))", we aim to parse this s-expression and output its evaluation, which would be 7 in this example.

Method 1: Recursive Descent Parser

A Recursive Descent Parser is a straightforward method to evaluate s-expressions by breaking down the problem into smaller subproblems. It involves writing functions that conform to the grammar of s-expressions and recursively parse the expression part by part.

Here’s an example:

def evaluate_expression(expr):
    expr = expr.strip()
    if expr.isdigit():
        return int(expr)
    expr = expr[1:-1].strip()
    operator, rest = expr[0], expr[1:]
    if operator == 'add':
        return sum(evaluate_expression(part) for part in rest.split())
    elif operator == 'multiply':
        result = 1
        for part in rest.split():
            result *= evaluate_expression(part)
        return result

print(evaluate_expression("(add 1 (multiply 2 3))"))

The output will be: 7

This code snippet defines a function evaluate_expression that takes an s-expression as input and recursively evaluates it. The function checks whether the input is a digit, and if so, returns it as an integer. Otherwise, it splits the expression into its operator and operands and evaluates each operand recursively.

Method 2: Using Abstract Syntax Trees (AST)

Abstract Syntax Trees (AST) provide a way to process structured code such as s-expressions. By using Python’s ast library, we can parse the string and convert it into an AST, then execute the AST to obtain the result.

Here’s an example:

import ast
import operator as op

def evaluate_ast(expr):
    expr = expr.replace('add', '+').replace('multiply', '*')
    tree = ast.parse(expr, mode='eval')
    func = {
        ast.Add: op.add,
        ast.Mult: op.mul
    }
    
    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        elif isinstance(node, ast.Num):
            return node.n
        elif isinstance(node, ast.BinOp):
            return func[type(node.op)](evaluate(node.left), evaluate(node.right))

    return evaluate(tree.body)

print(evaluate_ast("(add 1 (multiply 2 3))"))

The output is: 7

This example converts the s-expression into a Python expression using string replacements and then uses the ast module to parse and evaluate the expression. The evaluate function traverses the AST and computes the result based on the operation nodes encountered.

Method 3: Stack-Based Evaluation

Stack-based evaluation is an iterative method that processes each token in the s-expression. It maintains a stack to evaluate the expression as each token is read.

Here’s an example:

def evaluate_stack(expr):
    stack, token = [], ''
    for char in expr:
        if char in ' ()':
            if token:
                stack.append(int(token) if token.isdigit() else token)
                token = ''
            if char == ')':
                args = []
                while isinstance(stack[-1], int):
                    args.append(stack.pop())
                op = stack.pop()
                stack.append(sum(args) if op == 'add' else prod(args))
        else:
            token += char
    return stack[0]

print(evaluate_stack("(add 1 (multiply 2 3))"))

The output will be: 7

In this example, the evaluate_stack function utilizes a stack to parse and evaluate the s-expression. Operands are pushed onto the stack, and when a closing parenthesis is encountered, the function computes the result of the sub-expression and pushes the result back onto the stack.

Method 4: Tokenizing and Parsing

Tokenizing and parsing build upon separating the s-expression into recognizable pieces (tokens) and constructing a meaningful result from these tokens using a parser function.

Here’s an example:

def tokenize(expr):
    return expr.replace('(', ' ( ').replace(')', ' ) ').split()

def parse(tokens):
    if not tokens:
        raise SyntaxError('Unexpected EOF')
    token = tokens.pop(0)
    if token == '(':
        sub_expr = []
        while tokens[0] != ')':
            sub_expr.append(parse(tokens))
        tokens.pop(0)  # Remove ')'
        return sub_expr
    elif token == ')':
        raise SyntaxError('Unexpected )')
    else:
        return int(token)

def eval_parsed(parsed):
    if isinstance(parsed, int):
        return parsed
    else:
        op = parsed[0]
        args = parsed[1:]
        if op == 'add':
            return sum(eval_parsed(arg) for arg in args)
        elif op == 'multiply':
            result = 1
            for arg in args:
                result *= eval_parsed(arg)
            return result

tokens = tokenize("(add 1 (multiply 2 3))")
parsed = parse(tokens)
print(eval_parsed(parsed))

The output will be: 7

The code snippet first tokenizes the s-expression into individual components and then uses a recursive parsing function to build a nested list structure representing the parsed expression. The eval_parsed function then recursively evaluates this structure.

Bonus One-Liner Method 5: Using Eval with Preprocessing

Given the simple nature of the s-expression and a well-known set of functions, we can even use Python’s built-in eval function with preprocessing of the input string as a quick-and-dirty one-liner solution.

Here’s an example:

expr = "(add 1 (multiply 2 3))"
print(eval(expr.replace('add', '+').replace('multiply', '*')))

The output is: 7

The one-liner uses string replacement to turn the s-expression into a Python expression that can be evaluated with eval.

Summary/Discussion

  • Method 1: Recursive Descent Parser. Strengths: Simplicity, no external libraries needed. Weaknesses: Handling complex expressions may become cumbersome.
  • Method 2: Using AST. Strengths: Robust and secure way to evaluate expressions programmatically. Weaknesses: Might be slower due to the parsing and execution of the AST.
  • Method 3: Stack-Based Evaluation. Strengths: Efficient iterative solution that doesn’t require recursion. Weaknesses: Parsing can be complex and error-prone with more complicated grammar.
  • Method 4: Tokenizing and Parsing. Strengths: Clear separation of concerns between lexing and parsing. Weaknesses: More verbose and requires careful management of tokens.
  • Method 5: Using Eval with Preprocessing. Strengths: Simple and concise. Weaknesses: Security risks if expressions are not properly sanitized, limited to simple transformations.