DEV Community

Tramposo

Building a Pawn to Python Compiler in PHP

When we think of PHP, we often associate it with web development. But what happens when we push PHP beyond its usual boundaries? In this article, we'll explore an unconventional use of PHP: building a compiler that translates Pawn code to Python. This project not only demonstrates PHP's versatility but also provides insights into the basics of compiler design.

Pawn to Python

Our goal was to create a compiler that takes code written in Pawn (a scripting language with C-like syntax) and converts it into equivalent Python code. This task involves three key steps, all implemented in PHP: tokenization, parsing, and code generation.

Key Components of Our Compiler

1. Tokenization

The first step in our compiler is breaking down the input Pawn code into tokens. Here's how we approached it:

private function tokenize($input)
{
    $pattern = '/("[^"]*"|\s+|[{}();=]|\b\w+\b|.)/';
    preg_match_all($pattern, $input, $matches);
    $tokens = array_values(array_filter($matches[0], function ($token) {
        return $token !== '' && !ctype_space($token);
    }));
    return $tokens;
}

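This function uses a regular expression whose alternatives are tried in order: double-quoted string literals, runs of whitespace, the punctuation characters { } ( ) ; and =, whole words (identifiers, keywords, and numbers), and finally any other single character. Whitespace is matched only so that it can be filtered out afterwards; it never becomes a token.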
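To see what the tokenizer produces, here is a minimal standalone sketch. It reuses the regex shown above unchanged; the function name tokenizePawn() is hypothetical and exists only so the snippet can run outside the compiler class.

```php
<?php
// Standalone copy of the tokenizer, for experimentation only.
// tokenizePawn() is a hypothetical name; the regex is the one from the article.
function tokenizePawn(string $input): array
{
    $pattern = '/("[^"]*"|\s+|[{}();=]|\b\w+\b|.)/';
    preg_match_all($pattern, $input, $matches);

    // Drop whitespace-only matches; everything else is a token.
    return array_values(array_filter($matches[0], function ($token) {
        return $token !== '' && !ctype_space($token);
    }));
}

$tokens = tokenizePawn('main() { print("Hello, World!"); return 0; }');
echo implode(' | ', $tokens), PHP_EOL;
// main | ( | ) | { | print | ( | "Hello, World!" | ) | ; | return | 0 | ; | }
```

One limitation worth knowing: the string alternative "[^"]*" stops at the first closing quote, so a string containing an escaped quote would be split into several tokens.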

2. Parsing and Compilation

The heart of our compiler lies in the compile method and its supporting functions. Here's a simplified version of the main compilation loop:

public function compile()
{
    // Walk the token stream until it is exhausted.
    while (($token = $this->peekNextToken()) !== null) {
        if ($token === 'main') {
            $this->compileMainFunction();
        } else {
            $this->addError("Unexpected token outside of main function: '$token'");
            // Consume the offending token; peeking alone never advances,
            // so without this the loop would not terminate.
            $this->getNextToken();
        }
    }
    return $this->outputBuffer;
}

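This method iterates through the tokens and looks for top-level structures. In this simplified form it recognizes only the main function, which it hands to compileMainFunction; any other token is recorded as an error. Specialized methods then compile the different parts of the code.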
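The loop relies on two helpers, peekNextToken and getNextToken, which the article names but does not show. The following is a sketch of how such a token cursor could work, written as a separate, hypothetical TokenStream class (the actual compiler keeps these methods on the compiler object, and its internals may differ). It assumes PHP 7.4 or later for typed properties.

```php
<?php
// Hypothetical token cursor. The method names come from the article;
// the class name and the implementation are assumptions.
final class TokenStream
{
    private array $tokens;
    private int $position = 0;

    public function __construct(array $tokens)
    {
        $this->tokens = array_values($tokens);
    }

    // Look at the next token without consuming it; null at end of input.
    public function peekNextToken(): ?string
    {
        return $this->tokens[$this->position] ?? null;
    }

    // Return the next token and advance past it; null at end of input.
    public function getNextToken(): ?string
    {
        $token = $this->peekNextToken();
        if ($token !== null) {
            $this->position++;
        }
        return $token;
    }
}

$stream = new TokenStream(['main', '(', ')']);
echo $stream->peekNextToken(), PHP_EOL; // main (still unconsumed)
echo $stream->getNextToken(), PHP_EOL;  // main (now consumed)
echo $stream->getNextToken(), PHP_EOL;  // (
```

Separating "peek" from "get" is what lets the compile loop decide which specialized method to call before any token is consumed.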

3. Type Handling

One of the interesting challenges was dealing with Pawn's type system. We implemented basic type checking and default value assignment:

private function compileVariableDeclaration($indentation)
{
    $type = $this->getNextToken();
    $name = $this->getNextToken();
    // Remember the declared type so later code can look it up by name.
    $this->variables[$name] = $type;

    // Start empty so the append below is safe on every branch.
    $pythonDeclaration = '';

    if ($this->peekNextToken() === '=') {
        // Handle initialization
    } else {
        // No initializer: emit an assignment with the type's default value.
        $defaultValue = $this->getDefaultValueForType($type);
        $pythonDeclaration = str_repeat('    ', $indentation) . "$name = $defaultValue\n";
    }
    $this->outputBuffer .= $pythonDeclaration;
}

This function handles variable declarations, assigning default values based on the variable type when no initial value is provided.
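The article does not show getDefaultValueForType, so the following is only a sketch of one way it could look. The type names (int, float, bool, string), the chosen defaults, and the emitDeclaration() helper are all assumptions made for illustration.

```php
<?php
// Hypothetical mapping from declared type names to Python literals.
// Both the type names and the defaults are assumptions, not the
// compiler's actual table.
function getDefaultValueForType(string $type): string
{
    $defaults = [
        'int'    => '0',
        'float'  => '0.0',
        'bool'   => 'False',
        'string' => '""',
    ];

    // Fall back to None for types the table does not know about.
    return $defaults[$type] ?? 'None';
}

// Hypothetical helper: build the Python assignment for a declaration
// that has no initializer, indented by four spaces per level.
function emitDeclaration(string $type, string $name, int $indentation): string
{
    return str_repeat('    ', $indentation)
        . $name . ' = ' . getDefaultValueForType($type) . "\n";
}

echo emitDeclaration('int', 'score', 1);   // "    score = 0"
echo emitDeclaration('float', 'ratio', 1); // "    ratio = 0.0"
```

A lookup table keeps the mapping in one place, so supporting another type is a one-line change.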

Challenges and Learnings

  1. Regular Expressions in PHP: Crafting the right regex for tokenization was crucial. PHP's preg_match_all proved suitable for this task.

  2. State Management: Keeping track of the current compilation state (like indentation level and declared variables) was essential. Storing that state in properties of the compiler object kept it manageable.

  3. Error Handling: Implementing robust error checking and reporting was vital for creating a usable compiler. We used a simple array to collect and report errors.

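  4. Type Conversion: Bridging the gap between Pawn's declared variables and Python's dynamic typing required careful consideration. Python has no declarations, so each Pawn declaration has to become an assignment with a sensible initial value.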
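The state management and error handling points above can be sketched in a few lines. The CompilerState class below is hypothetical; only addError and the idea of collecting errors in a plain array come from the article, and the real compiler keeps this state on the compiler object itself.

```php
<?php
// Hypothetical holder for compiler state: collected errors and the
// current indentation level. Names other than addError are assumptions.
final class CompilerState
{
    /** @var string[] */
    private array $errors = [];
    private int $indentation = 0;

    public function addError(string $message): void
    {
        $this->errors[] = $message;
    }

    public function hasErrors(): bool
    {
        return $this->errors !== [];
    }

    public function getErrors(): array
    {
        return $this->errors;
    }

    public function indent(): void
    {
        $this->indentation++;
    }

    public function dedent(): void
    {
        // Never drop below column zero.
        $this->indentation = max(0, $this->indentation - 1);
    }

    // Format one line of Python at the current indentation level.
    public function line(string $code): string
    {
        return str_repeat('    ', $this->indentation) . $code . "\n";
    }
}

$state = new CompilerState();
$out = $state->line('def main():');
$state->indent();
$out .= $state->line('print("Hello, World!")');
$state->dedent();
$state->addError("Unexpected token outside of main function: 'foo'");

echo $out;
foreach ($state->getErrors() as $error) {
    echo "error: $error", PHP_EOL;
}
```

Collecting errors instead of stopping at the first one lets the compiler report several problems in a single run.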

Preview: From Pawn to Python

To give you a concrete idea of what our compiler does, let's look at a simple "Hello, World!" program in Pawn and its equivalent Python output after compilation.

Pawn Input:

main()
{
    print("Hello, World!");
    return 0;
}

Python Output:

# Compiled from Pawn to Python

def main():
    print("Hello, World!")

if __name__ == "__main__":
    main()

This simple example demonstrates several key aspects of our compiler:

  1. Print Call: The print call is carried over almost unchanged, as both languages use the same call syntax for this basic operation; only the trailing semicolon is dropped.

  2. Return Statement Handling: The return 0; statement from Pawn is omitted in the Python output, because the generated script calls main() without using its return value.

  3. Python Idiom Addition: The compiler adds the if __name__ == "__main__": idiom, which is a common Python pattern for making scripts both importable and executable.

While this is just a basic example, it showcases the fundamental operation of our compiler: taking Pawn code and producing equivalent, executable Python code.
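The shape of that output can be reproduced with a small code-generation helper. This is a sketch only: emitPythonProgram() is a hypothetical function, and it assumes the body of main has already been translated into Python lines.

```php
<?php
// Hypothetical helper that wraps already-translated body lines in the
// Python scaffolding shown in the example output above.
function emitPythonProgram(array $bodyLines): string
{
    $out  = "# Compiled from Pawn to Python\n\n";
    $out .= "def main():\n";

    // Python requires a non-empty block, so fall back to `pass`.
    foreach ($bodyLines ?: ['pass'] as $line) {
        $out .= '    ' . $line . "\n";
    }

    // Entry-point idiom: run main() only when executed as a script.
    $out .= "\nif __name__ == \"__main__\":\n";
    $out .= "    main()\n";

    return $out;
}

echo emitPythonProgram(['print("Hello, World!")']);
```

Called with the single translated line from the example, this prints the same Python program as shown above.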

Conclusion

Building a Pawn to Python compiler in PHP was an exciting exploration of the language's capabilities. It showcases PHP's versatility and suggests that, with some creativity, PHP can be pushed well beyond its typical use cases.

Whether you're a PHP enthusiast looking to expand your capabilities or a programmer interested in compiler design, experiments like these open up new perspectives on what's possible with the tools we use every day.
