ayou

Implementing Webpack from Scratch, But in Rust - [1] Parsing and Modifying JS Code Using Oxc

Using mini-webpack as a reference, I implemented a simple webpack from scratch in Rust. Doing so gave me a deeper understanding of webpack and improved my Rust skills along the way. It's a win-win situation!

Code repository: https://github.com/ParadeTo/rs-webpack

This article corresponds to Pull Request

To implement a simple webpack, the first problem to solve is parsing JavaScript code. Building a JavaScript parser from scratch is a monumental task, so it's better to pick an existing tool. Here, I chose oxc, which has the endorsement of Evan You.

Although oxc's documentation is not as detailed as Babel's, the usage patterns are similar. First, we use oxc_parser to parse the JS code into an AST (Abstract Syntax Tree):

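use std::{env, path::Path, sync::Arc};

use oxc_allocator::Allocator;
use oxc_parser::Parser;
use oxc_span::SourceType;

// Read the file named by the first CLI argument, defaulting to test.js
let name = env::args().nth(1).unwrap_or_else(|| "test.js".to_string());
let path = Path::new(&name);
let source_text = Arc::new(std::fs::read_to_string(path)?);
// Infer the source type (JS/TS, module/script, JSX) from the file extension
let source_type = SourceType::from_path(path).unwrap();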

// Memory arena where Semantic and Parser allocate objects
let allocator = Allocator::default();

// 1 Parse the source text into an AST
let parser_ret = Parser::new(&allocator, &source_text, source_type).parse();
let mut program = parser_ret.program;

println!("Parse result");
println!("{}", serde_json::to_string_pretty(&program).unwrap());

The content of test.js is as follows:

const b = require('./b.js')

The parsed AST looks like this:

{
  "type": "Program",
  "start": 0,
  "end": 28,
  "sourceType": {
    "language": "javascript",
    "moduleKind": "module",
    "variant": "jsx"
  },
  "hashbang": null,
  "directives": [],
  "body": [
    {
      "type": "VariableDeclaration",
      "start": 0,
      "end": 27,
      "kind": "const",
      "declarations": [
        {
          "type": "VariableDeclarator",
          "start": 6,
          "end": 27,
          "id": {
            "type": "Identifier",
            "start": 6,
            "end": 7,
            "name": "b",
            "typeAnnotation": null,
            "optional": false
          },
          "init": {
            "type": "CallExpression",
            "start": 10,
            "end": 27,
            "callee": {
              "type": "Identifier",
              "start": 10,
              "end": 17,
              "name": "require"
            },
            "typeParameters": null,
            "arguments": [
              {
                "type": "StringLiteral",
                "start": 18,
                "end": 26,
                "value": "./b.js"
              }
            ],
            "optional": false
          },
          "definite": false
        }
      ],
      "declare": false
    }
  ]
}
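
The start and end fields in the dump are byte offsets into the source text, with end exclusive. As a quick sanity check, here is a minimal sketch that slices the test.js source with the offsets shown above. The span_text helper is hypothetical and only uses the standard library, not oxc:

```rust
/// Returns the part of `source` covered by the half-open byte range `[start, end)`.
/// Hypothetical helper for illustration; it is not part of oxc.
fn span_text(source: &str, start: usize, end: usize) -> &str {
    &source[start..end]
}

fn main() {
    let source = "const b = require('./b.js')";

    // Offsets copied from the AST dump above.
    println!("{}", span_text(source, 10, 17)); // callee identifier: require
    println!("{}", span_text(source, 18, 26)); // string literal, quotes included: './b.js'
    println!("{}", span_text(source, 10, 27)); // whole call: require('./b.js')
}
```

Note that the StringLiteral span covers the quotes, while its value field holds the unquoted text ./b.js.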

As webpack users know, during bundling, we need to replace require with __webpack_require__ and replace the relative path ./b.js with the full path. To achieve this, we need to modify the original code using oxc_traverse, which allows us to traverse the nodes in the AST and perform operations on the nodes we are interested in.

From the AST result above, we can see that the node of interest is CallExpression. Therefore, we can implement a Transform to modify this node as follows:

struct MyTransform;

impl<'a> Traverse<'a> for MyTransform {
    fn enter_call_expression(&mut self, node: &mut CallExpression<'a>, _ctx: &mut TraverseCtx<'a>) {
        // Only rewrite calls of the form require("...")
        if !node.is_require_call() {
            return;
        }

        // Rename the callee: require -> __webpack_require__
        if let Expression::Identifier(identifier_reference) = &mut node.callee {
            identifier_reference.name = Atom::from("__webpack_require__");
        }

        // Replace the module path with the full path (a hard-coded placeholder for now)
        if let Argument::StringLiteral(string_literal) = &mut node.arguments.deref_mut()[0] {
            string_literal.value = Atom::from("full_path_of_b");
        }
    }
}
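
To make the "enter hook" idea concrete without depending on oxc, here is a minimal sketch of the same pattern on a toy AST. Everything in it (Expr, Visitor, walk, RewriteRequire) is a hypothetical stand-in and much simpler than oxc's real types; it only shows how a traversal calls a hook on each call expression and how the hook mutates the node in place:

```rust
// A deliberately tiny AST. These are NOT oxc's types, just enough to show the pattern.
#[derive(Debug, PartialEq)]
enum Expr {
    Ident(String),
    Str(String),
    Call { callee: Box<Expr>, args: Vec<Expr> },
}

/// Hypothetical visitor with a single "enter" hook, loosely mirroring enter_call_expression.
trait Visitor {
    fn enter_call(&mut self, callee: &mut Expr, args: &mut Vec<Expr>);
}

/// Depth-first walk that invokes the hook before descending into children.
fn walk<V: Visitor>(expr: &mut Expr, v: &mut V) {
    if let Expr::Call { callee, args } = expr {
        v.enter_call(callee, args);
        walk(callee, v);
        for arg in args.iter_mut() {
            walk(arg, v);
        }
    }
}

struct RewriteRequire;

impl Visitor for RewriteRequire {
    fn enter_call(&mut self, callee: &mut Expr, args: &mut Vec<Expr>) {
        // Match only `require(<one string literal>)`.
        let is_require = matches!(&*callee, Expr::Ident(name) if name == "require");
        if !is_require || args.len() != 1 {
            return;
        }
        if let Expr::Str(path) = &mut args[0] {
            *callee = Expr::Ident("__webpack_require__".to_string());
            // Placeholder, as in the article; a real bundler would resolve the path.
            *path = "full_path_of_b".to_string();
        }
    }
}

fn main() {
    let mut ast = Expr::Call {
        callee: Box::new(Expr::Ident("require".to_string())),
        args: vec![Expr::Str("./b.js".to_string())],
    };
    walk(&mut ast, &mut RewriteRequire);
    println!("{:?}", ast);
}
```

The real oxc traversal covers every node type and also carries a context object, but the control flow is the same: the library walks the tree and our code only reacts to the nodes it cares about.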

You can use this Transform as follows:

// 2 Semantic analysis
let semantic = SemanticBuilder::new(&source_text)
    .build_module_record(path, &program)
    // Enable additional syntax checks not performed by the parser
    .with_check_syntax_error(true)
    .build(&program);
let (symbols, scopes) = semantic.semantic.into_symbol_table_and_scope_tree();

// 3 Transform
let t = &mut MyTransform;
traverse_mut(t, &allocator, &mut program, symbols, scopes);

Note that, unlike Babel, we first need to run semantic analysis with oxc_semantic to obtain the symbols and scopes, which are then passed to traverse_mut.

Finally, we use oxc_codegen to regenerate the code:

// 4 Generate Code
let new_code = CodeGenerator::new()
    .with_options(CodegenOptions::default())
    .build(&program)
    .code;

println!("{}", new_code);

The resulting code will be:

const b = __webpack_require__('full_path_of_b')
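
The full_path_of_b string is only a placeholder hard-coded in the transform. How rs-webpack actually resolves paths is not covered in this article, but as a rough sketch, a relative request such as ./b.js could be resolved against the directory of the importing file like this. The resolve_request helper is hypothetical, assumes the importer path is absolute, and works purely lexically (no file system access, no node_modules lookup, no extension guessing):

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves `request` (e.g. "./b.js") against the directory of `importer`,
/// collapsing `.` and `..` lexically without touching the file system.
/// Hypothetical helper for illustration only.
fn resolve_request(importer: &Path, request: &str) -> PathBuf {
    // Directory containing the importing file.
    let base = importer.parent().unwrap_or_else(|| Path::new(""));

    let mut resolved = PathBuf::new();
    for component in base.join(request).components() {
        match component {
            // "." contributes nothing.
            Component::CurDir => {}
            // ".." removes the last pushed segment.
            Component::ParentDir => {
                resolved.pop();
            }
            // Root, prefix, and normal segments are kept as-is.
            other => resolved.push(other.as_os_str()),
        }
    }
    resolved
}

fn main() {
    let importer = Path::new("/project/src/index.js");
    println!("{}", resolve_request(importer, "./b.js").display());
    println!("{}", resolve_request(importer, "../lib/c.js").display());
}
```

The resolved path could then be written into the string literal in place of the placeholder.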

Please kindly give me a star!

Top comments (4)

Ed Ward Coding

Reimplementing existing software is one of the best ways to learn the actual software and also improve your language skills. I recently started taking this approach.

ayou

I couldn't agree more with what you said

Xuan

Great article! I recently started learning Rust and Bundler, and I can’t wait to read more of your articles!

ayou

Thanks!