Generating JavaScript from SVG, an Intro to Code Generation — Part 1
Working with htmlparser2, abstract syntax trees, and Babel to generate code.
Plenty of posts talk about code generation at a high level: what it is and why you’d want to use it. This is a specific example of exercising some code generation concepts and using some code generation tools. When we’re done, we’ll have a working command line tool that will automatically generate JavaScript code from an SVG file. In later parts, we’ll add complexity and sophistication to that tool.
Custom Drawing in Max
In Max (our visual programming language for multimedia manipulation), you build programs by connecting together objects with things called patch cords. A patch that adds two numbers together could look like this:
Under the hood, so to speak, this “+” object is just a bit of C code that calls an “add” function. Most Max objects have pre-defined behavior. If you want to write a new object, you have to write that object in C or C++.
UI objects are an interesting exception to this though. You can change the appearance of objects like sliders, dials, buttons and number boxes by attaching a JavaScript file to the jspainterfile property of that object. When you specify such a file, rather than using its built-in C drawing implementation, the object will use your custom JavaScript implementation to render itself instead.
So if for some reason you wanted a toggle object to always appear brown, you just create a paint function and bind that to your toggle. On the left, you’ve got a regular toggle. In the middle, the code for turning a toggle brown. On the right, you’ve got yourself a brown toggle.
The challenge
In principle this is amazing: it means that you can change the look of any Max UI object in any way you like. On the other hand, you have to do all your drawing using our mgraphics object. If you want to tweak your custom UI, perhaps changing a color or resizing an object, you have to do it in JavaScript. If you design your custom UI object in a drawing program, you have to translate your design to JavaScript by hand. Wouldn’t it be nice if you could design your custom UI in another application, say Adobe Illustrator, and then export your SVG file to a jspainter compatible file automatically? In other words, can we generate JavaScript from an SVG file?
Generating code this way might seem daunting, but it’s much less difficult than it might first appear. There are also great resources for getting started: for example, this blog post, https://lihautan.com/manipulating-ast-with-javascript/, which describes at a high level how code generation works, and AST Explorer, https://astexplorer.net/, for viewing and understanding syntax trees. What we’re going to do now is see how to put these ideas into practice with our own custom code generation tool.
Getting set up
First, check out the repository containing the source code.
This repository breaks this post down into steps. It starts at step 0, which you can check out by running git checkout tags/step-0. If you want to follow along with this tutorial, you can check this tag out to a new branch, with a command like git checkout -b my-branch tags/step-0.
Then if you get stuck, you can always go back to the master branch to move to the next step.
If you’re on step 0, you’ll see that this just has a bunch of boilerplate in place. We’re using TypeScript, so rather than running node index.js to execute our script, we use ts-node to run index.ts. We’re also using commander to parse command line arguments. So to parse an input file in input/01-input.svg and write the result to a file out.js, simply run:
npm run start -- -i input/01-input.svg -o out.js
Parsing SVG (as if it were HTML)
The first step in doing any code generation is to go from text to an Abstract Syntax Tree. For an overview of just what an AST is, I highly recommend https://lihautan.com/manipulating-ast-with-javascript/. Once we have that tree, we can walk through the tree to continue our code generation algorithm.
For us, the first step is to get an AST for our SVG document. We’ll be using a library called htmlparser2, which actually treats the SVG document as XML. As we walk through that tree, we’ll print out lines of JavaScript based on what we encounter in the SVG document.
To start, let’s keep things simple and focus on just one kind of drawing: drawings containing black rectangles only. Yes, there are many beautiful things in this world that are not black rectangles, but we’ll worry about those things in a later post. Here’s the SVG file that generated this image:
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 177.3 307">
  <g id="Layer_2" data-name="Layer 2">
    <g id="Layer_1-2" data-name="Layer 1">
      <rect y="120" width="29" height="187" />
      <rect x="60.97" y="68" width="37.06" height="239" />
      <rect x="129.7" width="47.61" height="307" />
    </g>
  </g>
</svg>
As you can see, it’s pretty much what you’d expect. Within the svg tag, there are three rect tags that define the position of the three rectangles. It’s kind of interesting that there’s no mention of color: since black is the default, the SVG file is free to omit that information. The first rectangle also has no attribute for x and the last rectangle has no attribute for y. This is for a similar reason: zero is the default value for both, so for a shape at the left or top of the viewBox, the attribute can simply be left out.
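To make the handling of those defaults concrete, here is a minimal sketch of reading a rect’s geometry from its attributes. The helper name readRectGeometry and the RectGeometry type are hypothetical, introduced only for illustration; the tutorial’s own code does the same fallback inline later on.

```typescript
// Hypothetical helper (not from the tutorial's repository): read a rect's
// geometry from its attribute map, falling back to 0 when an attribute is absent.
interface RectGeometry {
  x: number;
  y: number;
  width: number;
  height: number;
}

function readRectGeometry(attribs: { [name: string]: string }): RectGeometry {
  // A missing attribute is treated as "0", which is what the example file relies on.
  const num = (name: string): number => parseFloat(attribs[name] || "0");
  return { x: num("x"), y: num("y"), width: num("width"), height: num("height") };
}

// The first rect in the example omits x; the last one omits y.
const firstRect = readRectGeometry({ y: "120", width: "29", height: "187" });
const lastRect = readRectGeometry({ x: "129.7", width: "47.61", height: "307" });
```

Here firstRect.x and lastRect.y both come out as 0, even though neither attribute appears in the file.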
Before we go about translating this into JavaScript that will draw this in Max, it’s worth thinking about what the code generator should be doing. In plain English, it should more or less find every rect tag in the SVG, and for each such tag it should call mgraphics.rectangle with the correct origin and size. Unsurprisingly, that’s exactly what our code does. We’ll start by using a library called htmlparser2 to parse the SVG.
npm i --save htmlparser2
The htmlparser2 tool can handle HTML and XML, so it’s perfectly suited for walking through an SVG file. To use the htmlparser2 library, we first define a parser. That parser should have properties onopentag and onclosetag that define what to do when the parser encounters each such tag. For example:
import * as fs from "fs";
import * as html2 from "htmlparser2";
// ...
const parser = new html2.Parser({
  onopentag(name: string, attribs: {[s: string]: string}) {
    if (name === "svg")
      console.log("entering an svg tag");
  },
  onclosetag(name: string) {
    if (name === "svg")
      console.log("exiting an svg tag");
  }
});
// ...
function translateSource(data: string, outPath: string) {
  parser.parseComplete(data);
  // For now, the input is written to the output file unchanged
  fs.writeFile(outPath, data, { encoding: "utf8" }, () => {
    console.log(`Wrote output to ${outPath}`);
  });
}
This will print “entering an svg tag” when it reaches the opening svg tag on the first line of our SVG file, and will print “exiting an svg tag” when it reaches the closing tag on the last line. With just this and nothing else, it’s already possible to imagine building a simplistic code generator that appends a bit of templated text to an output string whenever it encounters a “rect” tag.
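A minimal sketch of such a string-based generator follows. The names makeRectPrinter and finish are hypothetical, and to keep the example self-contained the three rect tags are fed in by hand rather than through htmlparser2.

```typescript
// Shape of the attribute map that the onopentag callback receives.
type Attribs = { [name: string]: string };

// Hypothetical helper: returns an onopentag callback that appends one pair of
// mgraphics calls per <rect>, plus a function that wraps them in paint().
function makeRectPrinter() {
  let outputString = "";
  return {
    onopentag(name: string, attribs: Attribs): void {
      if (name !== "rect") return;
      // Missing attributes fall back to 0, as in the example SVG.
      const x = parseFloat(attribs.x || "0");
      const y = parseFloat(attribs.y || "0");
      const w = parseFloat(attribs.width || "0");
      const h = parseFloat(attribs.height || "0");
      outputString += `\tmgraphics.rectangle(${x}, ${y}, ${w}, ${h});\n\tmgraphics.fill();\n`;
    },
    // Wrap the collected lines in the paint function.
    finish(): string {
      return `function paint() {\n${outputString}}\n`;
    }
  };
}

// Stand-in for the parser: call the handler once per tag in the example file.
const printer = makeRectPrinter();
printer.onopentag("svg", { viewBox: "0 0 177.3 307" });
printer.onopentag("rect", { y: "120", width: "29", height: "187" });
printer.onopentag("rect", { x: "60.97", y: "68", width: "37.06", height: "239" });
printer.onopentag("rect", { x: "129.7", width: "47.61", height: "307" });
const paintSource = printer.finish();
```

The onopentag callback has the same shape as the handler shown earlier, so it could presumably be handed to the htmlparser2 Parser, with finish() called after parseComplete; that wiring is assumed here rather than shown.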
Already this is good enough to produce working JavaScript output:
function paint() {
  mgraphics.rectangle(0, 120, 29, 187);
  mgraphics.fill();
  mgraphics.rectangle(60.97, 68, 37.06, 239);
  mgraphics.fill();
  mgraphics.rectangle(129.7, 0, 47.61, 307);
  mgraphics.fill();
}
And this works! If we create a UI object in Max and set its jspainterfile to our output, then we can see the rectangles as expected.
Building an AST
Okay, let’s take a look at this line:
outputString += `\tmgraphics.rectangle(${x}, ${y}, ${w}, ${h});\n\tmgraphics.fill();\n`;
As far as formatting and writing the output is concerned, this is by far the most important line. However, it’s far from perfect. First of all, it’s hard to read. Quick: did we remember to include a semicolon at the end of the first line? Also, even this simplistic version, which can only draw a black rectangle, is already getting a little complex. What if we want to draw a rectangle with a gradient, or a dotted border? What if we want to draw one with a 2D transform applied? Clearly, this line can quickly become unreadable and unwieldy.
The way out is to take a big step back and to refuse to handle the code printing at all. Instead of worrying about semicolons and indentation, let’s leave all that to a library. That lets us focus on syntax: what function is being called with what arguments. For this we’re going to use Babel, specifically @babel/generator, @babel/types and @babel/template. We can get all of these by installing @babel/core:
npm i --save @babel/core @types/babel__core
Here’s the main idea: rather than building up a function definition by gluing together a long string of semicolon-separated lines, we’re going to build a function definition by creating an array of statements. Using AST Explorer, you can get a general sense of what’s going on. If you ask AST Explorer to parse a function like this:
function paint() {
  let x = 10;
  x += 20;
  console.log(x);
}
You’ll see something like this in the tree view:
The important bit is the “body” of the BlockStatement, which as you can see is just an array of statements. Armed with this knowledge, we can start to build a statement array to implement our function. The pseudocode is something like this:
function translateSource(data: string, outPath: string) {
  let paintFunctionStatements: Statement[] = [];
  const parser = new html2.Parser({
    onopentag(name: string, attribs: {[s: string]: string}) {
      if (name === "rect") {
        // One or more statements per rect, appended in document order
        const statements = makeStatementsFromRectAttribs(attribs);
        paintFunctionStatements = paintFunctionStatements.concat(statements);
      }
    }
  });
  parser.parseComplete(data);
  const program: Program = makePaintProgramForStatements(paintFunctionStatements);
  // generate returns an object; the printed source is in its code property
  const outputString = generate(program).code;
  fs.writeFile(outPath, outputString, { encoding: "utf8" }, () => {
    console.log(`Wrote output to ${outPath}`);
  });
}
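To see why a statement array is easier to work with than a string, here is a toy sketch that needs no libraries. The node types below are simplified stand-ins loosely modeled on what AST Explorer shows, not Babel’s actual definitions, and printFunction and call are hypothetical helpers. The point is only that once statements are data, a single printer can own every semicolon and every bit of indentation.

```typescript
// Toy node types for illustration only. In a real Babel AST the callee
// would be a MemberExpression node rather than a plain string.
type Expression =
  | { type: "NumericLiteral"; value: number }
  | { type: "CallExpression"; callee: string; arguments: Expression[] };

interface ExpressionStatement {
  type: "ExpressionStatement";
  expression: Expression;
}

interface FunctionDeclaration {
  type: "FunctionDeclaration";
  id: string;
  body: ExpressionStatement[]; // the "array of statements"
}

function printExpression(node: Expression): string {
  switch (node.type) {
    case "NumericLiteral":
      return String(node.value);
    case "CallExpression":
      return `${node.callee}(${node.arguments.map(printExpression).join(", ")})`;
  }
}

// The printer owns indentation and semicolons, so callers never touch them.
function printFunction(fn: FunctionDeclaration): string {
  const lines = fn.body.map(s => `  ${printExpression(s.expression)};`);
  return [`function ${fn.id}() {`].concat(lines, ["}"]).join("\n");
}

// Hypothetical builder: a call with numeric arguments, wrapped as a statement.
function call(callee: string, ...args: number[]): ExpressionStatement {
  return {
    type: "ExpressionStatement",
    expression: {
      type: "CallExpression",
      callee,
      arguments: args.map((value): Expression => ({ type: "NumericLiteral", value }))
    }
  };
}

let toyStatements: ExpressionStatement[] = [];
toyStatements = toyStatements.concat([
  call("mgraphics.rectangle", 0, 120, 29, 187),
  call("mgraphics.fill")
]);
const toySource = printFunction({ type: "FunctionDeclaration", id: "paint", body: toyStatements });
```

Babel plays the role of both halves here: @babel/types supplies the node builders and @babel/generator supplies the printer, with far more complete coverage of the language than this toy.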
Better Living with Babel Templates
So you want to build a statement. Well, @babel/types exports builder functions for the various syntactic nodes of an Abstract Syntax Tree. You can for example write:
// This implements var y = x + 5;
let statement = t.variableDeclaration("var", [
  t.variableDeclarator(
    t.identifier("y"),
    t.binaryExpression("+", t.identifier("x"), t.numericLiteral(5))
  )
]);
As you can see, building an AST like this can be verbose and hard to read. The thing is, we already have a great, human-readable representation for programming languages: the code itself. This is why it’s so nice to use Babel templates to generate reusable AST-fragments. Instead of the above, we could write something like:
const makeAddFive = template(`
  var %%newVariable%% = %%oldVariable%% + 5;
`);
// Produces the AST for var y = x + 5; string replacements become identifiers
const addFiveStatement = makeAddFive({
  newVariable: "y",
  oldVariable: "x"
});
This creates a function makeAddFive that will return part of an AST, as generated using the code fragment template we provided. So now we’re ready to make some serious progress on our code generator.
import template from "@babel/template";
// ...
const makePaintFunction = template(`
  function paint() {
    %%statements%%
  }
`);
const makeRectDrawStatements = template(`
  mgraphics.rectangle(%%x%%, %%y%%, %%w%%, %%h%%);
  mgraphics.fill();
`);
/**
 * This is where the interesting stuff happens. We'll read in the source data, build an AST,
 * create another AST, and then use that to generate an output in a new language.
 * @param data - Input data in SVG format
 * @param outPath - Path to which we'd like our result to be written
 */
function translateSource(data: string, outPath: string) {
  let paintStatements: t.Statement[] = [];
  const parser = new html2.Parser({
    onopentag(name: string, attribs: {[s: string]: string}) {
      if (name === "rect") {
        let x = Number.parseFloat(attribs.x || "0");
        let y = Number.parseFloat(attribs.y || "0");
        let w = Number.parseFloat(attribs.width || "0");
        let h = Number.parseFloat(attribs.height || "0");
        const rectStatements = makeRectDrawStatements({ x, y, w, h });
        paintStatements = paintStatements.concat(rectStatements);
      }
    }
  });
  parser.parseComplete(data);
  const paintFunction = ([] as t.Statement[]).concat(makePaintFunction({ statements: paintStatements }));
  const programAST = t.program(paintFunction);
  // Somehow turn our program AST into an outputString
  fs.writeFile(outPath, outputString, { encoding: "utf8" }, () => {
    console.log(`Wrote output to ${outPath}`);
  });
}
I’ve left out the last bit, where we actually print a string, but you can see the magic happening in the call to makeRectDrawStatements, where we build a branch of our AST that will draw a single rectangle. Then, in the two statements just before the placeholder comment, we create our full paint function and generate the root AST that implements our full program. There’s a tiny annoyance in the first of those, where we concat the result of makePaintFunction to an empty array. This simply guarantees that we pass an array of statements to t.program, since a template can return either a single statement or an array of them. Actually generating code from the AST is very easy: we just use @babel/generator.
import generate from "@babel/generator";
// ...
const outputString = generate(programAST).code;
Putting it all together gives us something like this:
import * as html2 from "htmlparser2";
import * as fs from "fs";
import * as t from "@babel/types";
import template from "@babel/template";
import generate from "@babel/generator";
import { program } from "commander";
program.version("0.1.0");
program
  .requiredOption("-i, --input <input>", "input file", "input/01-input.svg")
  .requiredOption("-o, --output <output>", "output file", "out.js");
program.parse(process.argv);
// opts() returns the parsed option values
const options = program.opts();
fs.readFile(options.input, { encoding: "utf8" }, (err, data) => {
  if (err) {
    console.error(err);
  } else {
    translateSource(data, options.output);
  }
});
const makePaintFunction = template(`
  function paint() {
    %%statements%%
  }
`);
const makeRectDrawStatements = template(`
  mgraphics.rectangle(%%x%%, %%y%%, %%w%%, %%h%%);
  mgraphics.fill();
`);
/**
 * This is where the interesting stuff happens. We'll read in the source data, build an AST,
 * create another AST, and then use that to generate an output in a new language.
 * @param data - Input data in SVG format
 * @param outPath - Path to which we'd like our result to be written
 */
function translateSource(data: string, outPath: string) {
  let paintStatements: t.Statement[] = [];
  const parser = new html2.Parser({
    onopentag(name: string, attribs: {[s: string]: string}) {
      if (name === "rect") {
        let x = Number.parseFloat(attribs.x || "0");
        let y = Number.parseFloat(attribs.y || "0");
        let w = Number.parseFloat(attribs.width || "0");
        let h = Number.parseFloat(attribs.height || "0");
        const rectStatements = makeRectDrawStatements({ x, y, w, h });
        paintStatements = paintStatements.concat(rectStatements);
      }
    }
  });
  parser.parseComplete(data);
  const paintFunction = ([] as t.Statement[]).concat(makePaintFunction({ statements: paintStatements }));
  const programAST = t.program(paintFunction);
  // Use @babel/generator to turn our program AST into an outputString
  const outputString = generate(programAST).code;
  fs.writeFile(outPath, outputString, { encoding: "utf8" }, () => {
    console.log(`Wrote output to ${outPath}`);
  });
}
Okay! Let’s run it and see the output! Oh… what happened?
/Users/name/c74/codegen-blog/node_modules/@babel/types/lib/definitions/utils.js:132
throw new TypeError(`Property ${key} of ${node.type} expected node to be of a type ${JSON.stringify(types)} but instead got ${JSON.stringify(val == null ? void 0 : val.type)}`);
^
TypeError: @babel/template placeholder "h": Property arguments[3] of CallExpression expected node to be of a type ["Expression","SpreadElement","JSXNamespacedName","ArgumentPlaceholder"] but instead got undefined
That’s a bit cryptic. I’ve been a bit sloppy: it turns out that the values you pass to our AST-generating functions makeRectDrawStatements and makePaintFunction have to be AST nodes. They can’t just be regular JavaScript values like numbers. So you have to write something like this:
const parser = new html2.Parser({
  onopentag(name: string, attribs: {[s: string]: string}) {
    if (name === "rect") {
      let x = t.numericLiteral(Number.parseFloat(attribs.x || "0"));
      let y = t.numericLiteral(Number.parseFloat(attribs.y || "0"));
      let w = t.numericLiteral(Number.parseFloat(attribs.width || "0"));
      let h = t.numericLiteral(Number.parseFloat(attribs.height || "0"));
      const rectStatements = makeRectDrawStatements({ x, y, w, h });
      paintStatements = paintStatements.concat(rectStatements);
    }
  }
});
This uses the function numericLiteral to build an AST node of type NumericLiteral from whatever number you provide. With this tweak, we can run our code and everything works as expected. The output looks the same as it did before (which is good!), only now we’re using Babel to build an AST and using generate to print that to JavaScript automatically.
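The “but instead got undefined” at the end of that error message makes more sense once you notice that the check reports the type property of whatever it was given, and a plain number has no such property. The sketch below is a simplified model of that behavior for illustration, not Babel’s actual implementation; nodeTypeOf and numericLiteralLike are hypothetical names.

```typescript
// Minimal shape shared by AST nodes: every node carries a type name.
interface NodeLike {
  type: string;
}

// Returns the node's type name, or undefined when the value is not node-shaped.
// A plain number has no type property, hence "instead got undefined".
function nodeTypeOf(value: unknown): string | undefined {
  if (typeof value === "object" && value !== null && "type" in value) {
    return String((value as NodeLike).type);
  }
  return undefined;
}

// Hypothetical stand-in for t.numericLiteral: wraps a number in a node-shaped object.
function numericLiteralLike(value: number): NodeLike & { value: number } {
  return { type: "NumericLiteral", value };
}

// A raw number is not a node; the wrapped number is.
const rawType = nodeTypeOf(187);
const wrappedType = nodeTypeOf(numericLiteralLike(187));
```

In this model rawType is undefined while wrappedType is "NumericLiteral", which mirrors the difference between the failing and the working versions of the handler.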
Next up
So far we’ve drawn black rectangles. Surely we can do more? In the next part we’ll look at black rectangles that stretch, and we’ll dream of a world where rectangles can have any color they want.