DEV Community

Denzyl Dick


Write your own static analyzer for PHP.

I keep reading that Rust is fast, has memory-safety guardrails, and is one of the most loved languages lately. After reading the book and dreaming about understanding the language, I decided to build a static analyzer for PHP, the language I've been using for over a decade. I will be naming it Phanalist, and the code is available at
https://github.com/denzyldick/phanalist.

To see if this idea would work, I made a list of four common bad practices that I often see. The class below contains all of them, and I think we can all agree that it doesn't make anyone happy.

 <?php

 class uTesting extends FakeClass
  {
    const I_ = null;
    const hello = null;
    $no_= null, $no_modifier = null;

    public function __construct($o)
    {
      $this->fake_variable = 'hellworld';
    }

    function test($a){
      return 1;
    }
  }

To keep things simple, I will start with these four mistakes, which are easy to spot. The list will grow in the future.

  • Misplaced opening tag
  • Class name that starts with a lowercase letter
  • Lowercase constants
  • Parameters defined without a type

The static analyzer should be beginner friendly. Instead of making the developer's life harder, it should explain each error in a way the developer can understand. But first, let us focus on the first steps.

I will use PHP-Parser, a Rust library that parses PHP code and generates an AST (abstract syntax tree). The parser's output for the PHP code above is a vector (Vec<Statement>) containing all the statements.

   [
    FullOpeningTag(
        Span {
            line: 1,
            column: 1,
            position: 0,
        },
    ),
    Class(
        ClassStatement {
            attributes: [],
            modifiers: ClassModifierGroup {
                modifiers: [],
            },
            class: Span {
                line: 2,
                column: 2,
                position: 7,
            },
            name: SimpleIdentifier {
                span: Span {
                    line: 2,
                    column: 8,
                    position: 13,
                },
                value: "uTesting",
            },
            extends: Some(
                ClassExtends {
                    extends: Span {
                        line: 2,
                        column: 17,
                        position: 22,
                    },
                    parent: SimpleIdentifier {
                        span: Span {
                            line: 2,
                            column: 25,
                            position: 30,
                        },
                        value: "FakeClass",
                    },
                },
            ),
            implements: None,
            body: ClassBody {
                left_brace: Span {
                    line: 3,
                    column: 3,
                    position: 42,
                },
                members: [
                    Constant(
                        ClassishConstant {
                            comments: CommentGroup {
                                comments: [],
                            },
                            attributes: [],
                            modifiers: ConstantModifierGroup {
                                modifiers: [],
                            },
                            const: Span {
                                line: 4,
                                column: 5,
                                position: 48,
                            },
                            entries: [
                                ConstantEntry {
                                    name: SimpleIdentifier {
                                        span: Span {
                                            line: 4,
                                            column: 11,
                                            position: 54,
                                        },
                                        value: "I_",
                                    },
                                    equals: Span {
                                        line: 4,
                                        column: 14,
                                        position: 57,
                                    },
                                    value: Null,
                                },
                            ],
                            semicolon: Span {
                                line: 4,
                                column: 20,
                                position: 63,
                            },
                        },
                    ),
                    Constant(
                        ClassishConstant {
                            comments: CommentGroup {
                                comments: [],
                            },
                            attributes: [],
                            modifiers: ConstantModifierGroup {
                                modifiers: [],
                            },
                            const: Span {
                                line: 5,
                                column: 5,
                                position: 69,
                            },
                            entries: [
                                ConstantEntry {
                                    name: SimpleIdentifier {
                                        span: Span {
                                            line: 5,
                                            column: 11,
                                            position: 75,
                                        },
                                        value: "hello",
                                    },
                                    equals: Span {
                                        line: 5,
                                        column: 17,
                                        position: 81,
                                    },
                                    value: Null,
                                },
                            ],
                            semicolon: Span {
                                line: 5,
                                column: 23,
                                position: 87,
                            },
                        },
                    ),
                    Property(
                        Property {
                            attributes: [],
                            modifiers: PropertyModifierGroup {
                                modifiers: [],
                            },
                            type: None,
                            entries: [
                                Initialized {
                                    variable: SimpleVariable {
                                        span: Span {
                                            line: 6,
                                            column: 5,
                                            position: 93,
                                        },
                                        name: "$no_",
                                    },
                                    equals: Span {
                                        line: 6,
                                        column: 9,
                                        position: 97,
                                    },
                                    value: Null,
                                },
                                Initialized {
                                    variable: SimpleVariable {
                                        span: Span {
                                            line: 6,
                                            column: 17,
                                            position: 105,
                                        },
                                        name: "$no_modifier",
                                    },
                                    equals: Span {
                                        line: 6,
                                        column: 30,
                                        position: 118,
                                    },
                                    value: Null,
                                },
                            ],
                            end: Span {
                                line: 6,
                                column: 36,
                                position: 124,
                            },
                        },
                    ),
                    ConcreteConstructor(
                        ConcreteConstructor {
                            comments: CommentGroup {
                                comments: [],
                            },
                            attributes: [],
                            modifiers: MethodModifierGroup {
                                modifiers: [
                                    Public(
                                        Span {
                                            line: 8,
                                            column: 5,
                                            position: 131,
                                        },
                                    ),
                                ],
                            },
                            function: Span {
                                line: 8,
                                column: 12,
                                position: 138,
                            },
                            ampersand: None,
                            name: SimpleIdentifier {
                                span: Span {
                                    line: 8,
                                    column: 21,
                                    position: 147,
                                },
                                value: "__construct",
                            },
                            parameters: ConstructorParameterList {
                                comments: CommentGroup {
                                    comments: [],
                                },
                                left_parenthesis: Span {
                                    line: 8,
                                    column: 32,
                                    position: 158,
                                },
                                parameters: CommaSeparated {
                                    inner: [
                                        ConstructorParameter {
                                            attributes: [],
                                            comments: CommentGroup {
                                                comments: [],
                                            },
                                            ampersand: None,
                                            name: SimpleVariable {
                                                span: Span {
                                                    line: 8,
                                                    column: 33,
                                                    position: 159,
                                                },
                                                name: "$o",
                                            },
                                            data_type: None,
                                            ellipsis: None,
                                            default: None,
                                            modifiers: PromotedPropertyModifierGroup {
                                                modifiers: [],
                                            },
                                        },
                                    ],
                                    commas: [],
                                },
                                right_parenthesis: Span {
                                    line: 8,
                                    column: 35,
                                    position: 161,
                                },
                            },
                            body: MethodBody {
                                comments: CommentGroup {
                                    comments: [],
                                },
                                left_brace: Span {
                                    line: 9,
                                    column: 5,
                                    position: 167,
                                },
                                statements: [
                                    Expression(
                                        ExpressionStatement {
                                            expression: AssignmentOperation(
                                                Assign {
                                                    left: PropertyFetch {
                                                        target: Variable(
                                                            SimpleVariable(
                                                                SimpleVariable {
                                                                    span: Span {
                                                                        line: 10,
                                                                        column: 7,
                                                                        position: 175,
                                                                    },
                                                                    name: "$this",
                                                                },
                                                            ),
                                                        ),
                                                        arrow: Span {
                                                            line: 10,
                                                            column: 12,
                                                            position: 180,
                                                        },
                                                        property: Identifier(
                                                            SimpleIdentifier(
                                                                SimpleIdentifier {
                                                                    span: Span {
                                                                        line: 10,
                                                                        column: 14,
                                                                        position: 182,
                                                                    },
                                                                    value: "fake_variable",
                                                                },
                                                            ),
                                                        ),
                                                    },
                                                    equals: Span {
                                                        line: 10,
                                                        column: 28,
                                                        position: 196,
                                                    },
                                                    right: Literal(
                                                        String(
                                                            LiteralString {
                                                                value: "'hellworld'",
                                                                span: Span {
                                                                    line: 10,
                                                                    column: 30,
                                                                    position: 198,
                                                                },
                                                            },
                                                        ),
                                                    ),
                                                },
                                            ),
                                            ending: Semicolon(
                                                Span {
                                                    line: 10,
                                                    column: 41,
                                                    position: 209,
                                                },
                                            ),
                                        },
                                    ),
                                ],
                                right_brace: Span {
                                    line: 11,
                                    column: 5,
                                    position: 215,
                                },
                            },
                        },
                    ),
                    ConcreteMethod(
                        ConcreteMethod {
                            comments: CommentGroup {
                                comments: [],
                            },
                            attributes: [],
                            modifiers: MethodModifierGroup {
                                modifiers: [],
                            },
                            function: Span {
                                line: 13,
                                column: 5,
                                position: 222,
                            },
                            ampersand: None,
                            name: SimpleIdentifier {
                                span: Span {
                                    line: 13,
                                    column: 14,
                                    position: 231,
                                },
                                value: "test",
                            },
                            parameters: FunctionParameterList {
                                comments: CommentGroup {
                                    comments: [],
                                },
                                left_parenthesis: Span {
                                    line: 13,
                                    column: 18,
                                    position: 235,
                                },
                                parameters: CommaSeparated {
                                    inner: [
                                        FunctionParameter {
                                            comments: CommentGroup {
                                                comments: [],
                                            },
                                            name: SimpleVariable {
                                                span: Span {
                                                    line: 13,
                                                    column: 19,
                                                    position: 236,
                                                },
                                                name: "$a",
                                            },
                                            attributes: [],
                                            data_type: None,
                                            ellipsis: None,
                                            default: None,
                                            ampersand: None,
                                        },
                                    ],
                                    commas: [],
                                },
                                right_parenthesis: Span {
                                    line: 13,
                                    column: 21,
                                    position: 238,
                                },
                            },
                            return_type: None,
                            body: MethodBody {
                                comments: CommentGroup {
                                    comments: [],
                                },
                                left_brace: Span {
                                    line: 13,
                                    column: 22,
                                    position: 239,
                                },
                                statements: [
                                    Return(
                                        ReturnStatement {
                                            return: Span {
                                                line: 14,
                                                column: 7,
                                                position: 247,
                                            },
                                            value: Some(
                                                Literal(
                                                    Integer(
                                                        LiteralInteger {
                                                            value: "1",
                                                            span: Span {
                                                                line: 14,
                                                                column: 14,
                                                                position: 254,
                                                            },
                                                        },
                                                    ),
                                                ),
                                            ),
                                            ending: Semicolon(
                                                Span {
                                                    line: 14,
                                                    column: 15,
                                                    position: 255,
                                                },
                                            ),
                                        },
                                    ),
                                ],
                                right_brace: Span {
                                    line: 15,
                                    column: 5,
                                    position: 261,
                                },
                            },
                        },
                    ),
                ],
                right_brace: Span {
                    line: 17,
                    column: 3,
                    position: 266,
                },
            },
        },
    ),
]

We will find the four mistakes listed earlier by navigating this tree. Let's start with the first one:

 <?php

According to the PSR-2 coding standard, the PHP opening tag should always be at the beginning of the file. If it isn't, any whitespace before the tag is sent to the client before your PHP code runs, which can result in the "headers already sent" error. This mistake is easy to find in the AST. We can iterate through the items in the vector and use pattern matching to find the FullOpeningTag statement.

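    for statement in statements {
        match statement {
            // The tag's Span sits directly inside the variant, as the AST above shows.
            Statement::FullOpeningTag(span) => {
                project.opening_tag(span, file);
            }
            // Ignore every other statement for now.
            _ => {}
        }
    }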
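
The same loop can be tried without pulling in the parser. The sketch below is a minimal, self-contained version. Its Span and Statement types are stand-ins that only mimic the shape of the AST shown above (they are assumptions, not the parser's real definitions), and opening_tag_spans is a hypothetical helper name.

```rust
// Stand-in types for illustration only; the real parser's AST is far richer.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub line: usize,
    pub column: usize,
    pub position: usize,
}

#[derive(Debug)]
pub enum Statement {
    FullOpeningTag(Span),
    Class { name: String },
    Other,
}

/// Hypothetical helper: collects the span of every top-level opening tag.
pub fn opening_tag_spans(statements: &[Statement]) -> Vec<Span> {
    let mut spans = Vec::new();
    for statement in statements {
        match statement {
            Statement::FullOpeningTag(span) => spans.push(*span),
            // A match must cover every variant, so the rest fall through here.
            _ => {}
        }
    }
    spans
}
```

The catch-all arm is what lets the match compile while only one kind of statement is handled; more arms can be added as more checks are written.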

As you can see in the AST, the information we need is stored in the struct Span.

 FullOpeningTag(
        Span {
            line: 1,
            column: 1,
            position: 0,
        },
    ),

We only need to check whether the line or column field is greater than 1, because both are 1-based. If either is, the opening tag is not in the correct position, and we push a suggestion into the suggestions field of the file parameter that is passed to the opening_tag function.

    pub fn opening_tag(&mut self, span: Span, file: &mut File) -> &mut Project {
        // Span lines and columns are 1-based, so anything above 1 is misplaced.
        if span.line > 1 {
            file.suggestions.push(Suggestion::from(
                "The opening tag <?php is not on the right line. This should always be the first line in a PHP file.".to_string(),
                span,
            ));
        }

        if span.column > 1 {
            file.suggestions.push(Suggestion::from(
                format!(
                    "The opening tag doesn't start at the right column: {}.",
                    span.column
                ),
                span,
            ));
        }
        self
    }

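
One way to make this check easy to unit test is to keep it free of Project and File and simply return the messages. The following is only a sketch under that assumption: Span is a stand-in struct with the same three fields as in the AST, and check_opening_tag is a hypothetical name, not part of Phanalist.

```rust
// Stand-in for the parser's Span; only the fields visible in the AST are modelled.
#[derive(Debug, Clone, Copy)]
pub struct Span {
    pub line: usize,
    pub column: usize,
    pub position: usize,
}

/// Hypothetical pure variant of the check: returns one message per problem.
pub fn check_opening_tag(span: Span) -> Vec<String> {
    let mut messages = Vec::new();
    // Lines and columns are 1-based, so the correct position is line 1, column 1.
    if span.line > 1 {
        messages.push(format!(
            "The opening tag <?php is on line {}; it should be on the first line.",
            span.line
        ));
    }
    if span.column > 1 {
        messages.push(format!(
            "The opening tag starts at column {}; it should start at column 1.",
            span.column
        ));
    }
    messages
}
```

A caller could then turn each returned message into a Suggestion and push it onto the file, which keeps the rule itself independent of how results are stored.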

I won't explain how to navigate to the correct statement in the AST for the rest of the mistakes.

Class name that starts in lowercase.

class uTesting extends FakeClass
Enter fullscreen mode Exit fullscreen mode

When you do this, it's harder to distinguish a class from a variable or a method.

I need to find out if the first letter of the class name is capitalized. The String type has a chars() method that returns an iterator over the characters of the string. You can grab the first character with next(). The char type has some useful methods; the one we need is is_uppercase().

pub fn has_capitalized_name(name: String, span: Span) -> Option<Suggestion> {
    // unwrap() assumes the class name is never empty.
    if !name.chars().next().unwrap().is_uppercase() {
        // Return early with a suggestion for the developer.
        return Some(Suggestion::from(
            format!("The class name {} is not capitalized. The first letter of the name of the class should be in uppercase.", name),
            span,
        ));
    }

    None
}
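
Calling unwrap() on the first character panics if the name is ever empty. A small sketch of a variant that avoids this is shown below; starts_with_uppercase is a hypothetical helper name, and treating an empty name as "not capitalized" is an assumption made for the example.

```rust
/// Hypothetical helper: true when the first character is an uppercase letter.
/// An empty name yields false instead of panicking.
pub fn starts_with_uppercase(name: &str) -> bool {
    name.chars()
        .next()
        .map_or(false, |first| first.is_uppercase())
}
```

Because next() returns an Option, map_or lets the empty case be handled in the same expression as the uppercase test.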

Lowercase constants

    const I_ = null;
    const hello = null;

Similar to the capitalized class name, it's easier to distinguish a normal variable from a constant if the constant is in uppercase.

I assume the constant is already uppercase, so I initialize the variable is_uppercase to true. While iterating through the letters, if I see a letter that is not uppercase, I set is_uppercase to false.

pub fn uppercased_constant_name(entry: ConstantEntry) -> bool {
    // Only the name matters here; `..` skips the equals and value fields.
    let ConstantEntry { name, .. } = entry;

    // Assume the name is uppercase until a lowercase letter proves otherwise.
    let mut is_uppercase = true;
    for l in name.value.to_string().chars() {
        // Non-alphabetic characters such as `_` are ignored.
        if l.is_alphabetic() && !l.is_uppercase() {
            is_uppercase = false;
        }
    }

    is_uppercase
}
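
The same rule can also be written with iterator adapters instead of a mutable flag. This is only a sketch on a plain &str (is_uppercase_constant is a hypothetical name), so it can be run without the parser's ConstantEntry type.

```rust
/// Hypothetical helper: true when every alphabetic character is uppercase.
/// Characters such as `_` or digits are skipped, so "I_" counts as uppercase.
pub fn is_uppercase_constant(name: &str) -> bool {
    name.chars()
        .filter(|c| c.is_alphabetic())
        .all(|c| c.is_uppercase())
}
```

Unlike the loop, all() stops at the first lowercase letter it meets. For a name with no letters at all it returns true, which matches the "assume uppercase" starting point described above.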

Defining parameters without a type.

I strongly advocate always defining the type for every parameter you declare. Without a type, you open the gates to misinterpretation of your code and to unnecessary bugs. In the AST, the FunctionParameter contains the parameter. We can pattern match on the data_type field: the None arm returns true, and the Some(_) arm returns false.

pub fn function_parameter_without_type(parameter: FunctionParameter) -> bool {
    // Only data_type matters; the other fields of the parameter are ignored.
    match parameter.data_type {
        // No type declared: this is the mistake we are looking for.
        None => true,
        Some(_) => false,
    }
}
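
To report which parameters are missing a type, the check can be applied to a whole parameter list. The sketch below assumes a simplified stand-in Parameter struct (the real FunctionParameter has more fields, and its data_type is not a String) and a hypothetical helper untyped_parameter_names.

```rust
// Stand-in for the parser's FunctionParameter; only two fields are modelled.
pub struct Parameter {
    pub name: String,
    pub data_type: Option<String>,
}

/// Hypothetical helper: names of all parameters declared without a type.
pub fn untyped_parameter_names(parameters: &[Parameter]) -> Vec<&str> {
    parameters
        .iter()
        .filter(|parameter| parameter.data_type.is_none())
        .map(|parameter| parameter.name.as_str())
        .collect()
}
```

Returning the names rather than a single bool makes it possible to tell the developer exactly which parameter needs a type.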

Conclusion

In part 2, I will show how to calculate the cyclomatic complexity of a piece of example code.

If you are a PHP developer, you know that a developer can make many more mistakes in PHP. In the future, this list will continue to grow into something more useful.

Thanks for reading!

Contributions are always welcome if you have a mistake you would like to add to Phanalist.
