Static Analysis With DBOS Cloud: Behind The Scenes

Static TypeScript code analysis to protect DBOS Cloud apps from cyberattacks and workflow errors
Caspian Ahlberg
October 17, 2024

Introduction 

Static analysis plays an important role in the deployment of applications to DBOS Cloud. After compilation, DBOS automatically analyzes your code to help detect and prevent various workflow errors and security vulnerabilities such as SQL injections. I spent a good part of my summer internship at DBOS writing the newest iteration of the plugin! See here for more documentation on it. This blog post covers its internals.

  • The first section covers the history of the plugin and its original design, and how I set out to upgrade its design and functionality to offer DBOS users stronger correctness guarantees.
  • I then discuss the initial technical hurdles I had to overcome to make more advanced analysis possible, and show some of the code involved in that.
  • After that, I talk about what error checker functions are, which ones I wrote, and how they are called over your DBOS code.
  • I then talk about the different types of error checkers that I implemented.
  • I follow this with a deep dive into the implementation of each error checker.
  • The second-to-last section discusses the difficulty of testing a plugin like this, and the strategies that I employed in doing so.

History And Goals

DBOS uses an ESLint plugin for detecting general programming errors, along with errors that are specific to DBOS applications. ESLint is a tool used for the static analysis of JavaScript and TypeScript code. By configuring JS/TS projects with ESLint, you can run a series of plugins over a codebase that expose critical application errors before your code even runs. 

The original version of the plugin (written by Chuck Bear) depended more on the core functionality provided by ESLint. It used selectors to match against different parts of a DBOS application's abstract syntax tree (AST) to detect things like nondeterministic function calls, unexpected I/O, and more. The plugin includes many other ESLint rules that detect issues like missing semicolons, floating promises, and more. These rules still exist in the plugin today. Chuck's plugin was written purely in JavaScript, with his last version being 0.0.6.

This version was limited because it could not check for things like determinism or safety from SQL injection. I entered my internship ready to overhaul the plugin, so that DBOS applications could finally have these guarantees. My first step was to remake this into a TypeScript project. From there, I set out to implement things like global variable mutation detection in workflows, SQL injection in transactions, and more.

From JavaScript to TypeScript with ts-morph

I decided to use ts-morph for the plugin. ts-morph is a wrapper around the TypeScript compiler API that makes it more typesafe, less verbose, and much easier to use overall. The obvious alternative would be to use ESLint directly. Simple things like detecting banned function calls are easy to do with ESLint's selectors, but other things, like detecting modification of a variable defined outside of a function's scope, are significantly harder. 

Additionally, getting access to an AST node’s children, and traversing those children recursively, is impossible out of the box. The selector interface provided by ESLint essentially just allows you to pattern-match against different AST forms, which stops you from implementing more complicated matching and traversal schemes. 

And finally, the ESLint custom plugin tooling itself is not very strongly typed, so quite a few subtle errors can sneak up on you there. All of these drawbacks to using ESLint alone meant that I decided to work with ts-morph.
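To make the traversal gap concrete, here is a minimal sketch of the recursive descendant walk that ts-morph gives you (via forEachDescendant) but that ESLint selectors cannot express. The AstNode shape here is hypothetical, not the real ESTree or ts-morph type:

```typescript
// Hypothetical, simplified AST node shape (not the real ESTree/ts-morph types).
interface AstNode {
    kind: string;
    children: AstNode[];
}

// Visit every descendant of a node recursively, in the spirit of ts-morph's
// `forEachDescendant`. ESLint selectors can only pattern-match against node
// shapes; they cannot drive a custom traversal like this.
function forEachDescendant(node: AstNode, visit: (n: AstNode) => void): void {
    for (const child of node.children) {
        visit(child);
        forEachDescendant(child, visit);
    }
}

// Example: count the identifiers nested anywhere below a function node.
const tree: AstNode = {
    kind: "FunctionDeclaration",
    children: [
        { kind: "Identifier", children: [] },
        { kind: "Block", children: [{ kind: "Identifier", children: [] }] }
    ]
};

let identifierCount = 0;
forEachDescendant(tree, (n) => {
    if (n.kind === "Identifier") identifierCount += 1;
});
console.log(identifierCount); // 2
```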

Tree Construction and Conversion in 5 Milliseconds

I converted the original plugin into a TypeScript project, and decided to use ts-morph to traverse the AST to implement my custom rules. There was a major problem though: the ASTs that ESLint and ts-morph use are different! This required me to come up with a way to convert between the two AST types. The actual steps behind the scenes to convert your TypeScript code into a ts-morph compatible form through ESLint are quite complicated, but necessary because the naive solution is slow. In this section, I'll explain how it’s done.

The default way to initialize a ts-morph project is through its Project class. You pass it a list of TypeScript files to parse, and it then gives you an AST back. There are two problems: I don't have files to work with (everything stays in memory), and I don't have plain TypeScript code either (the TypeScript code has already been parsed into an AST). The naive solution is to serialize my AST into its original form again, write this to a temporary file, and then read this file in through the Project class. Unfortunately, re-parsing the code through this class takes 1-2 seconds on an average-sized file. The major bottleneck is that all TypeScript configuration files pertaining to that file must be found and parsed, and then the code has to be fully recompiled. For a larger project, this is prohibitively expensive.

To solve this problem, I developed a three-step AST conversion process: the plugin first converts your TypeScript code into an ESTree-compatible AST, then it converts that AST to the format provided by the TypeScript compiler API, and finally converts that to the format used by ts-morph. This process takes ~5 milliseconds on an equivalently-sized file. Problem solved!

Let’s first talk about how ESLint builds the ESTree AST for you. To use TypeScript with ESLint, you use the typescript-eslint package. This builds an AST from your TypeScript code that ESLint can work with. Behind the scenes, typescript-eslint uses typescript-estree to convert your TypeScript code into an ESTree-compatible tree (ESTree is considered the standard AST for representing JavaScript code). In summary, the first part of the tree conversion step goes like this:

  1. Your TypeScript code is passed to typescript-eslint.
  2. typescript-eslint uses typescript-estree to make an ESTree-compatible tree.
  3. This tree is then passed to your plugin (note that type and line number information is retained from your original TypeScript code).

The next step, done in the plugin itself, is converting from the ESTree structure to the one that ts-morph uses. There is no direct mapping from ESTree to ts-morph, but for TypeScript code, they share the same underlying tree structure (the tree provided by the TSC API). typescript-eslint's utils subpackage includes a set of parser services that, among other functionality, allow you to convert from ESTree to TSC cheaply (and vice versa), specifically via the esTreeNodeToTSNodeMap and tsNodeToESTreeNodeMap objects.

The final step is to convert from TSC to the structure needed by ts-morph. ts-morph provides a function createWrappedNode that does this. The resulting tree is a bit limited in terms of its functionality though. ts-morph nodes created through the Project class have access to TypeScript’s language services, which provide nice utilities like finding all the references to a variable. Because I worked through these limited ts-morph nodes, I had no language service access, and I had to build my own tables for node reference tracking (still, some analysis was simply not possible on a practical level without the language services).

These functions do the conversions (permalink):

// This function makes a ts-morph node from an ESLint/ESTree node
function makeTsMorphNode(eslintNode: EslintNode): Node {
       const parserServices = GLOBAL_TOOLS!.parserServices;

       // The `esTreeNodeToTSNodeMap` object provides a nice mapping from ESTree nodes to TSC nodes
       const compilerNode = parserServices.esTreeNodeToTSNodeMap.get(eslintNode);
       
       /* These options ensure that the right compiler options and type checker are used.
       I pass the source file just for the sake of thoroughness (it can technically be excluded). */
       const options = {
              compilerOptions: parserServices.program.getCompilerOptions(),
              sourceFile: compilerNode.getSourceFile(),
              typeChecker: GLOBAL_TOOLS!.typeChecker
       };
       
       return createWrappedNode(compilerNode, options);
}

// This function makes an ESLint/ESTree node from a ts-morph node.
function makeEslintNode(tsMorphNode: Node): EslintNode {
       const compilerNode = tsMorphNode.compilerNode;
       
       const eslintNode =
              GLOBAL_TOOLS!.parserServices.tsNodeToESTreeNodeMap.get(compilerNode)
              ?? panic("Couldn't find the corresponding ESLint node!");
              
       return eslintNode;
}

This function builds a table that maps symbols to every node with that symbol, which is used for things like finding all references to an identifier (permalink). Normally, we would be able to use something like ts-morph’s findReferences function, but we never generated a Project, so that’s not possible.

// symRef = symbol reference
function buildSymRefMap(root: Node): Map<Symbol, Node[]> {
       const map = new Map<Symbol, Node[]>();

       // Loop over every descendant node from the root
       root.forEachDescendant((descendant) => {
              // Skip nodes without symbols
              const symbol = descendant.getSymbol();
              if (symbol === Nothing) return;

              // Add to the reference list stored in the map
              const refList = map.get(symbol);
              if (refList === Nothing) map.set(symbol, [descendant]);
              else refList.push(descendant);
       });

       return map;
}
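The grouping pattern behind buildSymRefMap can be shown with plain data. This toy version keys on strings instead of ts-morph Symbols (all names here are illustrative, not from the plugin):

```typescript
// Toy analogue of `buildSymRefMap`: group each occurrence of a name by the
// name itself, mirroring how the real map groups nodes by their symbol.
function buildRefMap(names: string[]): Map<string, number[]> {
    const map = new Map<string, number[]>();

    names.forEach((name, index) => {
        const refs = map.get(name);
        if (refs === undefined) map.set(name, [index]);
        else refs.push(index);
    });

    return map;
}

// "a" appears at positions 0 and 2, so looking it up yields both references.
const refs = buildRefMap(["a", "b", "a"]);
console.log(refs.get("a")); // [0, 2]
```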

Here is how the DBOS static analysis rule is defined (permalink): a single ESTree selector calls the function analyzeRootNode, which statically analyzes your ESLint node (really an ESTree node under the hood). The rule is packaged and exported like this, if you're curious.

export const dbosStaticAnalysisRule = createRule({
       create: (context: EslintContext) => {
              return {
                     // This is our one selector that we use, which just calls `analyzeRootNode`.  
                     Program(node: EslintNode) {
                            analyzeRootNode(node, context);
                     }
              }
       },
       name: "dbos-static-analysis",
       meta: {
              schema: [],
              type: "suggestion",		
       docs: { description: "Analyze DBOS applications to make sure they run reliably (e.g. determinism checking, SQL injection detection, etc.)" },
              
              // This object maps error message IDs to actual messages.
              messages: Object.fromEntries(errorMessages)
       },
       defaultOptions: []
});

The function analyzeRootNode first sets up some global tools used by various functions, then makes a ts-morph node from the ESTree node. Diagnostics are emitted for local unit tests (to ensure that the test code is well-formed, because semantically incorrect TypeScript code can often still be compiled to JS). From there, every function and class in the source file is analyzed (permalink):

// This performs analysis on the root ESLint/EStree node, and is passed an ESLint context as well.
function analyzeRootNode(eslintNode: EslintNode, eslintContext: EslintContext) {
       const parserServices = ESLintUtils.getParserServices(eslintContext, false);	

       /* We use these tools everywhere in the project. Normally, using globals like this wouldn't be the
       best idea, but the fields are never modified after this point, and passing all of the fields around
       would make everything needlessly verbose. */
       GLOBAL_TOOLS = {
              eslintContext: eslintContext,
              rootEslintNode: eslintNode,
              parserServices: parserServices,
              typeChecker: parserServices.program.getTypeChecker(),
              symRefMap: new Map()
       };
       
       const tsMorphNode = makeTsMorphNode(eslintNode);
       
       /* The linter can technically run on a TypeScript project that does not compile via `tsc`! I'm pretty
       sure that ESLint does the bare minimum to compile it, since all it needs is just a syntax tree (no
       semantic analysis needed). I don't run `checkDiagnostics` in production because it recompiles the
       project from scratch with ts-morph's `Project` class, which is prohibitively slow. */
       
       if (testValidityOfTestsLocally) checkDiagnostics(tsMorphNode);
       GLOBAL_TOOLS!.symRefMap = buildSymRefMap(tsMorphNode);
       
       try {
              if (Node.isStatemented(tsMorphNode)) {
                     tsMorphNode.getFunctions().forEach(analyzeFunction);
                     tsMorphNode.getClasses().forEach(analyzeClass);
              }
              else {
                     /* This branch is hit whenever there are conflicting TypeScript parser versions in the
                     dependencies. Most times, the node type here is the mysterious `UnparsedPrologue`
                     type, which is undocumented in the TypeScript compiler API. */

                     const dependencyMessage = "This may be due to a dependency issue; make sure that the " +
                            "same version of typescript-eslint is being used everywhere.";

                     panic(`Was expecting a statemented root node! Got this kind instead: ` +
                            `${tsMorphNode.getKindName()}. ${dependencyMessage}\n`);
              }
       }  	
       finally {
              // Not keeping the tools around after being done with them
              GLOBAL_TOOLS = Nothing;
       }
}

And that concludes how the plugin is initialized. The plugin is now ready to analyze your TypeScript code with the help of ts-morph.

Building and Calling DBOS Error Checkers in the ESLint Plugin

An error checker is a function that takes in a piece of code (a node in the ts-morph AST, specifically), and returns an error message ID if it found an error. It can also return an ID paired with formatting data for the error message, if so desired. These IDs are turned into actual error messages once the error is reported.

Here’s how they’re defined in the code (permalink):

type ErrorMessageIdWithFormatData = [string, Record<string, unknown>];
type ErrorCheckerResult = Maybe<string | ErrorMessageIdWithFormatData>;
type ErrorChecker = (node: Node, fnDecl: FnDecl, isLocal: (symbol: Symbol) => boolean) => ErrorCheckerResult;

Here are the current error checkers that run in the plugin right now:

1. mutatesGlobalVariable, callsBannedFunction, awaitsOnNotAllowedType

These detect nondeterminism in workflows. Workflows are expected to be deterministic; otherwise, they may not run as expected. These checkers help you make sure that your DBOS applications run reliably.

2. isSqlInjection, transactionIsMalformed

These check that your transactions are safe and well-formed. The first rule warns you about unvalidated user data being appended to the query parameter of a raw SQL callsite. The second rule makes sure that you use the client field within your transaction (otherwise it wouldn’t be a useful transaction) and it also makes sure that the database client you’re using is supported by DBOS.

We only want to apply error checkers to DBOS methods with certain decorators. Given this, we construct a mapping from decorators to a list of error checkers (permalink):

const decoratorSetErrorCheckerMapping: [Maybe<Set<string>>, ErrorChecker[]][] = [
       [new Set(["Transaction"]), [isSqlInjection, transactionIsMalformed]],
       [new Set(["Workflow"]), [mutatesGlobalVariable, callsBannedFunction, awaitsOnNotAllowedType]]
];

Next, we need to make a function that runs these error checkers over a node if the node is inside a method declaration with an applicable decorator. In runErrorCheckers below (permalink), if an error was found, the error is reported. The message ID and any possible format data, along with a reconstructed ESLint/ESTree node, are passed to the report function.

function runErrorCheckers(node: Node, fnDecl: FnDecl, isLocal: (symbol: Symbol) => boolean) {
       for (const [decoratorSet, errorCheckers] of decoratorSetErrorCheckerMapping) {	
              // If the decorator set is `Nothing`, that means that it should be applied regardless of any decorator set inclusion (this feature is not used at the moment though).
              if (decoratorSet === Nothing || functionHasDecoratorInSet(fnDecl, decoratorSet)) {
                     // Run every error checker!  			
                     for (const errorChecker of errorCheckers) {   
                            const response = errorChecker(node, fnDecl, isLocal);   				
                            
                            if (response !== Nothing) {
                                   // Make some empty format data if just a plain error message ID was returned.      					
                                   const [messageId, formatData] = typeof response === "string" ? [response, {}] : response;					
                                    // Report the error via the globally stored ESLint context.
                                   GLOBAL_TOOLS!.eslintContext.report({ node: makeEslintNode(node), messageId: messageId, data: formatData });
                            }
                     }
              }
       }
}

Now, we need to find out when to call runErrorCheckers. As such, we construct two functions: one named analyzeClass, and one named analyzeFunction. analyzeClass analyzes every component of a class (permalink):

function analyzeClass(theClass: ClassDeclaration) {
       theClass.getConstructors().forEach(analyzeFunction);
       theClass.getMethods().forEach(analyzeFunction);
}

And analyzeFunction analyzes every component of a function/method/constructor (permalink). It starts off by checking if the declaration has a body. If not (for example with a forward declaration), it skips the analysis:

function analyzeFunction(fnDecl: FnDecl) {
       const body = fnDecl.getBody();
       if (body === Nothing) return;

It then sets up some utilities for maintaining a stack per declaration. This is primarily used by the error checker mutatesGlobalVariable, which determines whether a symbol defined outside the method was modified inside it. Note that the stack is defined only per function/method/constructor, not for the whole program. Also, a 'frame' here is not really a stack frame; it's more like a block scope.

       const stack: Set<Symbol>[] = [new Set()];
       const getCurrentFrame = () => stack[stack.length - 1];
       const pushFrame = () => stack.push(new Set());
       const popFrame = () => stack.pop();
       const isLocal = (symbol: Symbol) => stack.some((frame) => frame.has(symbol));

The next step is to run the error checkers over the function/method/constructor declaration itself, and then analyze each stack frame for all of the declaration body’s child nodes.

       runErrorCheckers(fnDecl, fnDecl, isLocal);
       body.forEachChild(analyzeFrame);

This brings us to the main logic within analyzeFunction, which is the inner function analyzeFrame.

       function analyzeFrame(node: Node) {	
              // If we find a class declaration, mutually recur with `analyzeClass`.
              if (Node.isClassDeclaration(node)) {
                     analyzeClass(node); 
                     return;	
              }		
              // Do the same for `analyzeFunction`.		
              else if (Node.isFunctionDeclaration(node)) {
                     analyzeFunction(node); 
                     return;		
              }	
              // Upon finding a new block, make a new frame, analyze that frame's children, and then remove that frame.
              else if (Node.isBlock(node)) { 
                     pushFrame(); 
                     node.forEachChild(analyzeFrame); 
                     popFrame(); 
                     return;
              }		
              // Add the declared variable to the current frame.		
              else if (Node.isVariableDeclaration(node)) {
                     const symbol = getSymbol(node); 
                     if (symbol !== Nothing) getCurrentFrame().add(symbol);
              }
              else {	
                     // For all other types of nodes, run the error checkers. 
                     runErrorCheckers(node, fnDecl, isLocal);		
              }
       
              // Then, recur on the current node's children.		
              node.forEachChild(analyzeFrame);
       }
}

And that’s how error checkers are called. It took a bit of work to set up the recursion, but it’s not too bad.

The Plugin’s Error Checkers: A Deep Dive

This next section delves into the implementation details of each error checker.

Global Variable Detection

This is the first determinism-related rule, and it's pretty simple (permalink). Global mutations are a sign that a workflow may not always behave deterministically (especially if the workflow depends on the mutated global), and it's also a best practice to keep our functions as self-contained as possible. If developers want to modify some application state in a workflow, they should let DBOS keep that state in the database, rather than performing such mutations.

const mutatesGlobalVariable: ErrorChecker = (node, _fnDecl, isLocal) => {	
       // Only match against binary expressions (e.g. `a *= 5`).
       if (!Node.isBinaryExpression(node)) return;
       
       // Only match against assignment tokens (e.g. `-` in `5 - 3` would not match).
       const operatorKind = node.getOperatorToken().getKind();
       if (!assignmentTokenKinds.has(operatorKind)) return;
       
       // Get the leftmost token (so reduce from `a.b.c` to `a`).
       const lhs = reduceNodeToLeftmostLeaf(node.getLeft());
       
       // Only match against allowed lvalues.
       if (!isAllowedLValue(lhs)) return;
       
       const lhsSymbol = getSymbol(lhs);
       
       // Return an error when the leftmost symbol is not defined in the scope of the function declaration
       if (lhsSymbol !== Nothing && !isLocal(lhsSymbol)) {
              return "globalMutation";
       }
};
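As a quick illustration, here is the kind of code the checker flags. The function below is a stand-in for a workflow body (decorators omitted, names hypothetical):

```typescript
// A module-level variable, declared outside any function.
let greeting = "hello";

function exampleWorkflow(): string {
    greeting = "hi";          // flagged: `greeting` is not local to this function
    let localCopy = greeting; // fine: `localCopy` is added to the current frame
    localCopy += "!";         // fine: mutating a local is allowed
    return localCopy;
}

console.log(exampleWorkflow()); // "hi!"
```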

Banned Function Calls

This is the second determinism-related rule, and it is also pretty simple. It uses the constant bannedFunctionsWithArgCountRanges, which maps banned functions to the number of arguments they are allowed to have, expressed as a range (permalink):

const bannedFunctionsWithArgCountRanges: Map<string, {min: number, max: number}> = new Map([
       ["Date",       		{min: 0, max: 0}],
       ["Date.now",   		{min: 0, max: 0}],
       ["Math.random",		{min: 0, max: 0}],
       ["console.log",		{min: 0, max: Number.MAX_SAFE_INTEGER}],
       ["setTimeout", 		{min: 1, max: Number.MAX_SAFE_INTEGER}],
       ["bcrypt.hash",		{min: 3, max: 3}],
       ["bcrypt.compare",	{min: 3, max: 3}]
]);

  • Date(), new Date(), Date.now(), and Math.random() are banned because they return different results every time.
  • console.log(...) is banned because it’s more idiomatic to use the logger housed in the workflow context.
  • setTimeout(...) is banned because it is nondeterministic (its timing delay is not 100% exact). Imagine that you called setTimeout(...) many times with many different callbacks: the order in which they execute might be different each time.
  • bcrypt.hash(...) and bcrypt.compare(...) are banned because they contain native code, and because bcrypt.hash(...) is nondeterministic (due to the randomized salt in its hashing).

Note that these functions are only banned in workflows. We encourage you to use them in communicators instead.
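The argument-count range logic can be sketched in isolation. This mirrors the table above, but it is not the plugin's actual code path:

```typescript
// A sketch of the range check, using the same shape as the plugin's
// `bannedFunctionsWithArgCountRanges` table (entries abbreviated).
const ranges: Map<string, { min: number; max: number }> = new Map([
    ["Date", { min: 0, max: 0 }],
    ["setTimeout", { min: 1, max: Number.MAX_SAFE_INTEGER }]
]);

function isBannedCall(name: string, argCount: number): boolean {
    const range = ranges.get(name);
    if (range === undefined) return false;
    return argCount >= range.min && argCount <= range.max;
}

console.log(isBannedCall("Date", 0));       // true: `Date()` reads the current time
console.log(isBannedCall("Date", 3));       // false: `new Date(2024, 0, 1)` is deterministic
console.log(isBannedCall("setTimeout", 2)); // true: any arity of `setTimeout` is banned
```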

The function below (permalink) matches any case where a call expression (like Date()) or a new expression (like new Date()) is encountered. An error is reported when a banned function is called and its argument count falls within the range in the table above.

const callsBannedFunction: ErrorChecker = (node, _fnDecl, _isLocal) => {	
       // Only match against expressions like `foo()` and `new Foo()`
       if (Node.isCallExpression(node) || Node.isNewExpression(node)) {	
              /* Attempting to stringify the function call. I am doing this, rather than
              just calling `.getText()`, since that may include code comments in it. */
              const expr = node.getExpression();
              const kids = expr.getChildren(); // This might not work for nodes with more child layers...		
              const text = (kids.length === 0) ? expr.getText() : kids.map((node) => node.getText()).join("");
              
              const argCountRange = bannedFunctionsWithArgCountRanges.get(text);
              
              if (argCountRange !== Nothing) { 
                     const argCount = node.getArguments().length;
                     
                     /* We do this range check because some function calls might match,	
                     but another argument count might make them deterministic (for example, 
                     the variants of `Date` with more than zero arguments return an output 
                     that is not based on the current time, but rather only based on their inputs). */  			
                     if (argCount >= argCountRange.min && argCount <= argCountRange.max) {
                            return text; // Returning the function name key as an error message ID
                     }
              }
       }
};

Awaiting On Not-Allowed Types

This is the third determinism-related rule (permalink). It ensures that you never await anything that is not a WorkflowContext in a workflow. Awaiting in a method usually indicates I/O, which in turn may be nondeterministic. An example would be the call await fetch(url) for some URL: it could return different results each time you call it (note that I/O like await fetch(url) should be put in a communicator; it can then be called safely from the workflow).

Awaiting on a leftmost WorkflowContext is allowed (in an expression like await ctxt.invoke(BankTransactionHistory)....), since it’ll behave in a predictable manner under DBOS’ workflow constraints.

Here are the only two possible cases where you can have an await in a Workflow:

  1. When the leftmost identifier is a WorkflowContext (for example, this).
  2. When you await upon an arbitrary function call, and you pass your WorkflowContext into that function (it is assumed that this helper function uses the WorkflowContext).

const awaitsOnNotAllowedType: ErrorChecker = (node, _fnDecl, _isLocal) => {
       
       // This is a utility function used to check if a function call's parameters has an acceptable type in it.
       function validTypeExistsInFunctionCallParams(functionCall: CallExpression, validTypes: Set<string>): boolean {
              const argTypes = functionCall.getArguments().map(getTypeName);
              return argTypes.some((argType) => validTypes.has(argType));
       }
       
       // Match against expressions like `await foo()`, or `await obj.foo()`.
       if (Node.isAwaitExpression(node)) {
              // The expression here is the right-side function call.
              const functionCall = node.getExpression();
              if (!Node.isCallExpression(functionCall)) return;
              
              const lhs = reduceNodeToLeftmostLeaf(functionCall);
              
              if (!Node.isIdentifier(lhs) && !Node.isThisExpression(lhs)) {
                     return;
              }
              
              //////////
              
              // You may only await on a `WorkflowContext`, as of release 3.3.3.
              const awaitingOnAllowedType = awaitableTypes.has(getTypeName(lhs));
              
              if (!awaitingOnAllowedType) {
                      /* If the above requirement fails, check that a `WorkflowContext` is one of
                      the parameters, but only if the `ignoreAwaitsForCallsWithAContextParam` flag is set. */
                     
                      if (ignoreAwaitsForCallsWithAContextParam &&
                             validTypeExistsInFunctionCallParams(functionCall, awaitableTypes)) {
                             return;
                      }
                     return "awaitingOnNotAllowedType";
              }
       }
};
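To make the two allowed cases concrete, here is a hedged sketch. WorkflowContext here is a local stand-in interface, not the real DBOS import, and the workflow bodies omit their decorators:

```typescript
// Hypothetical stand-in for DBOS's WorkflowContext (not the real type).
interface WorkflowContext {
    invoke(target: string): Promise<string>;
}

// Allowed: the leftmost identifier of the awaited expression is the context.
async function goodWorkflow(ctxt: WorkflowContext): Promise<string> {
    return await ctxt.invoke("BankTransactionHistory");
}

// Also allowed when `ignoreAwaitsForCallsWithAContextParam` is set: the
// context is passed into the awaited helper, which is assumed to use it.
async function helper(ctxt: WorkflowContext): Promise<string> {
    return await ctxt.invoke("helper-step");
}
async function alsoAllowedWorkflow(ctxt: WorkflowContext): Promise<string> {
    return await helper(ctxt);
}

// Flagged: awaiting arbitrary I/O, e.g. `await fetch(url)`, where neither
// the leftmost identifier nor any argument is a WorkflowContext.
```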

Malformed Transactions

This is the first rule for validating transactions (permalink). The first three parts of this checker make sure that transactions have the expected parameters, that the database client type is supported, and that the transaction is well-formed in general. In retrospect, I should have implemented validation like this for all DBOS methods, but unfortunately I didn't have enough time. The fourth part checks that the transaction context's client is always used. This matters because if the client is never used, the transaction still consumes database resources and can hold unnecessary database locks.

const transactionIsMalformed: ErrorChecker = (node, fnDecl, _isLocal) => {

The first part checks that the transaction has the expected TransactionContext<T> parameter.

       const params = fnDecl.getParameters();
       if (params.length === 0) return "transactionHasNoParameters";

The second part checks that a valid client type argument was supplied.

       const transactionContext = params[0];
       const typeArgs = transactionContext.getType().getTypeArguments();
       if (typeArgs.length === 0) return "transactionContextHasNoTypeArguments";

The third part checks that the supplied type argument is in the set of allowed database clients (defined outside this function).

       const clientType = getTypeName(typeArgs[0]);
       
       if (!ormClientInfoForRawSqlQueries.has(clientType)) {
              return ["transactionContextHasInvalidClientType", {clientType: clientType}];
       }

And the fourth part checks if the transaction context was never used. It does this by checking every reference to the passed-in transaction context. In the first case, if the client field of the context is ever used, the database is considered ‘used’. In the second case, if the transaction context is ever passed into a helper function, then the database is considered ‘used’ as well. It is then assumed that this helper function uses this transaction context’s client.

       const transactionContextSymbol = getSymbol(transactionContext);
       
       if (transactionContextSymbol === Nothing) {
              debugLog("No symbol was ever found for the transaction context!");
              return;  	
       }
       
       let foundDatabaseUsage = false;
       
       // Look at all the references to the transaction context
       for (const ref of getRefsToNodeOrSymbol(transactionContextSymbol)) {
              if (ref === transactionContext) continue; // Don't look at the original usage of it though
              
              const parent = ref.getParentOrThrow("Expected a parent node to exist!");
              
              // We hit this branch if the usage of the transaction context is within a `ctxt.client` access.
              if (Node.isPropertyAccessExpression(parent) && parent.getChildCount() >= 3) {	
                     // Index 1 here would be the dot token between `ctxt` and `client`
                     const left = parent.getChildAtIndex(0), right = parent.getChildAtIndex(2);
                     
                     // Database usage found!
                     if (getSymbol(left) === transactionContextSymbol && right.getText() === "client") {
                            foundDatabaseUsage = true;
                            break;
                     }
              }
              else {
                     const parentCall = ref.getFirstAncestorByKind(SyntaxKind.CallExpression);
                     if (parentCall === Nothing) continue;
                     
                     // We hit this branch if the usage of the transaction context is as a helper function parameter.
                     if (parentCall.getArguments().some((arg) => getSymbol(arg) === transactionContextSymbol)) {
                            foundDatabaseUsage = true;
                            break;
                     }
              }
       }
       
       if (!foundDatabaseUsage) return "transactionDoesntUseTheDatabase";
}
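To make the two cases concrete, here is a minimal, self-contained sketch of the decision logic. The real checker walks ts-morph references as shown above; this toy version models each reference as a plain object, and the type and function names here are hypothetical:

```typescript
// Each reference to the transaction context is modeled as a plain object
// (hypothetical names; the real plugin inspects ts-morph AST nodes).
type CtxtRef =
  | { kind: "propertyAccess"; property: string } // e.g. `ctxt.client` or `ctxt.logger`
  | { kind: "callArgument" };                    // e.g. `helper(ctxt)`

function transactionUsesDatabase(refs: CtxtRef[]): boolean {
  for (const ref of refs) {
    // Case 1: the context's `client` field is accessed directly.
    if (ref.kind === "propertyAccess" && ref.property === "client") return true;
    // Case 2: the context is forwarded to a helper function, which is
    // then assumed to use the context's client.
    if (ref.kind === "callArgument") return true;
  }
  return false;
}

// `ctxt.logger.info(...)` alone does not count as database usage:
console.log(transactionUsesDatabase([{ kind: "propertyAccess", property: "logger" }])); // false
console.log(transactionUsesDatabase([{ kind: "propertyAccess", property: "client" }])); // true
```

If no reference satisfies either case, the checker reports the `transactionDoesntUseTheDatabase` error, as in the code above.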

SQL Injection Detection

I would have liked to include a more elaborate section on this in the blog post, but I think that detailing how this code works would involve another blog post altogether. Here’s the permalink for the SQL injection detection rule, if you want to check it out. I’ll give a little idea on how I do the detection, though.

I consider a potential SQL injection vulnerability to be when you provide non-LR data to the query parameter of a raw SQL query callsite (like knex.raw, for example). Now, you may ask, what does LR mean? It’s a term I invented that stands for literal-reducible. An LR-value is either a literal value, or a variable that reduces down to a literal value. Some examples of literal values are literal strings, literal numbers, bigints, and enums.

Here's what's allowed for SQL string parameters (from a supported callsite):

1. An LR

2. LRs concatenated with other LRs

3. Variables that reduce down to LRs concatenated with other LRs

An LR-value is not flagged for SQL injection, since injection typically happens when you take some non-literal value, cast it to a string, and concatenate it with your SQL query string. As long as the final value passed to the callsite is built up only from LRs, the resulting string should be safe.
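As a concrete illustration of the three allowed forms, here is some plain TypeScript with hypothetical variable names; the comments reflect how the rule would classify each value:

```typescript
// LR: literal strings.
const table = "employees";
const clause = "WHERE role != 'admin'";

// LR: LRs (and variables that reduce to LRs) concatenated with other LRs.
const query = "SELECT * FROM " + table + " " + clause;
console.log(query); // SELECT * FROM employees WHERE role != 'admin'

// Not LR: `userInput` is a function parameter, so it does not reduce to a
// literal, and this concatenation would be flagged at a raw-SQL callsite.
function buildUnsafeQuery(userInput: string): string {
  return "SELECT * FROM employees WHERE name = '" + userInput + "'";
}
```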

This table maps each database client to its raw-query callsites (permalink):

const ormClientInfoForRawSqlQueries: Map<string, string[]> = new Map([
       ["Knex", ["raw"]], // Knex
       ["PrismaClient", ["$queryRawUnsafe", "$executeRawUnsafe"]], // Prisma
       ["EntityManager", ["query"]], // TypeORM
       ["PoolClient", ["query"]], // Postgres
]);
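A simplified sketch of how a raw-SQL callsite could be recognized from this table: look up the receiver's client type, then check the invoked method name. The real rule resolves the type through the TypeScript compiler; this standalone version (with a hypothetical function name) just takes the names directly:

```typescript
// Same table as above: client type name -> raw-SQL method names.
const ormClientInfoForRawSqlQueries: Map<string, string[]> = new Map([
  ["Knex", ["raw"]], // Knex
  ["PrismaClient", ["$queryRawUnsafe", "$executeRawUnsafe"]], // Prisma
  ["EntityManager", ["query"]], // TypeORM
  ["PoolClient", ["query"]], // Postgres
]);

// Hypothetical helper: does `clientTypeName.methodName(...)` count as a
// raw-SQL callsite that the rule should analyze?
function isRawSqlCallsite(clientTypeName: string, methodName: string): boolean {
  const methods = ormClientInfoForRawSqlQueries.get(clientTypeName);
  return methods !== undefined && methods.includes(methodName);
}

console.log(isRawSqlCallsite("Knex", "raw"));               // true
console.log(isRawSqlCallsite("PrismaClient", "$queryRaw")); // false (the tagged, parameterized variant is not flagged)
```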

If a call to one of these clients’ methods is found, the isSqlInjection rule will run over the query argument you passed it. Let’s say you’re using Knex, and your transaction looks like this:

@Transaction()
static async myTransaction(ctxt: TransactionContext<Knex>) {
       await ctxt.client.raw(x); // `x` could be many different things!
}

  1. If the parameter to ctxt.client.raw were a literal string, like "SELECT * FROM employees WHERE employee != 'fred'", that would then be classified as LR. The full list of allowed literal values is here.
  2. If the parameter were an allowed lvalue (e.g. an identifier), the rule would start here, and then check that every assignment to that allowed lvalue is also LR (via the function getAssignmentsToLValue).
  3. If the allowed lvalue is not an identifier, but an element access expression (like a["b"]), the code then checks that every assignment involving the identifier a consists entirely of LR-values on the righthand side.
  4. If the allowed lvalue is a property access expression (like a.b), the code then checks that every assignment to a.b is LR. A current limitation of my logic here is that only single-layer property access is supported: any multilayered access like a["b"]["c"]["q"] or a.b.c.q is reduced down to a["b"] or a.b respectively. This has a nasty effect: if I modify property a.b.c to something non-LR, and then pass a.b.d as a raw SQL query argument, both reduce down to a.b, and there is a false positive for SQL injection! Part of the reason for this is that, while a.b.c and a.b.d have distinct symbols, the nodes c and d in the object literal where a is initially declared and assigned (say, let a = {b: {c: {q: 1}, d: 2}}) are not symbol-equivalent to the c and d in later accesses like a.b.c and a.b.d. This means that I can’t easily find the initial values of a.b.c and a.b.d, since I can’t match against the needed symbols with my function getRefsToNodeOrSymbol. I have no idea why there’s no symbol equivalency here; perhaps it’s a bug in the TypeScript compiler API! Or maybe someone will claim it’s a feature. Because of this lack of symbol equivalency under object declarations, I dumb down the property-access logic to permit more false positives, rather than missing out on these possible errors altogether.
  5. The rest of the LR-analysis is much simpler, consisting of things like checking that both branches of a ternary expression are LR (permalink), that all of an array literal’s elements are LR (permalink), and plenty more.
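The recursive shape of this analysis can be sketched with a toy expression model. The real rule operates on ts-morph AST nodes; everything below, including the Expr type, is a simplification for illustration:

```typescript
// Toy expression model (hypothetical; the real rule inspects TypeScript AST nodes).
type Expr =
  | { kind: "literal"; value: string | number }
  | { kind: "concat"; left: Expr; right: Expr }          // `a + b`
  | { kind: "ternary"; whenTrue: Expr; whenFalse: Expr } // both branches must be LR
  | { kind: "identifier"; assignments: Expr[] }          // every assignment must be LR
  | { kind: "external" };                                // e.g. user input: never LR

function isLR(expr: Expr): boolean {
  switch (expr.kind) {
    case "literal":    return true;
    case "concat":     return isLR(expr.left) && isLR(expr.right);
    case "ternary":    return isLR(expr.whenTrue) && isLR(expr.whenFalse);
    case "identifier": return expr.assignments.every(isLR);
    case "external":   return false;
  }
}

// `cond ? "a" : "b" + userInput` — the false branch is tainted, so not LR:
const tainted: Expr = {
  kind: "ternary",
  whenTrue: { kind: "literal", value: "a" },
  whenFalse: {
    kind: "concat",
    left: { kind: "literal", value: "b" },
    right: { kind: "external" },
  },
};
console.log(isLR(tainted)); // false
```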

I do not trace external function calls in my LR-analysis because that would be a whole other endeavor.

The major trouble with the isSqlInjection rule is that there are many different types of expressions in TypeScript. I had to find reasonable ways to deconstruct as many as possible, and determine for each whether it could be influenced by bad actors trying to hack into your database. While TypeScript isn’t horribly complicated, its long list of syntactic constructs all had to be accounted for to effectively determine whether a value was, in fact, LR.

Testing

Testing a linter plugin like this is very, very hard. Or rather, writing tests is easy: just write some cases that should and shouldn’t be flagged, and check that they behave as expected. But that’s deceptive, since there’s so much code out there that uses odd syntactic constructs you never thought other developers would write.

When writing tests, you’ll normally have a limited set of inputs that your functions should expect. There are an infinite number of variations of TypeScript code that could be passed into the plugin, so it is very difficult to know that the written error checkers will behave in a way that is expected. For example, in the SQL injection detection code, there are tons of different types of values that should be classified as literal-reducible. Plain literal types, regular expressions, function declarations/expressions, class declarations/expressions, arrow functions, template expressions with literal-reducible embedded expressions, object-literal expressions with access to fields that are literal-reducible, and so many more cases had to be handled. I kept on finding more cases to cover by writing a huge number of unit tests, and running the linter over the various DBOS repositories that use this plugin.

It’s possible that I’ve missed some cases here, and in other error checkers too: for example, it’s very hard to ensure on a practical level that workflows are deterministic when they depend on values not passed directly as parameters, so many cases there had to be thought through carefully to catch as much nondeterminism as possible. Through all of my extensive testing, I feel that I’ve reached an asymptote for the conditions I’m looking for. I’m confident that the plugin covers 99% of my analysis goals, but I’m afraid I’ll never reach 100%.

Running the plugin over the demo apps actually resulted in some neat discoveries. When the bank demo in the dbos-demo-apps repository was in early development, the plugin managed to catch a few cases where there were asynchronous calls to Amazon S3 directly in workflows (this code should have been in a communicator). The plugin also caught some cases of banned function calls in the bank demo (these calls were later moved to communicators as well). I hope that developers using DBOS will find this plugin to be a valuable tool in their development process.

Conclusion

And that summarizes how the error checkers in my linter plugin work! I had an awesome summer at DBOS. Programming language theory is definitely one of my favorite topics in the realm of computer science, so being able to apply what I already knew about language parsing and analysis resulted in a great internship.

© DBOS, Inc. 2024