Architecting Automation

One of the most valuable features of a returns, exchanges, and warranties platform is automations. When automations are implemented throughout a platform, merchants get fine-grained control over the entire lifecycle of a claim. From the moment a customer lands on a returns page, the brand gets to guide and direct the process through to resolution.

When software tries to implement specific logic for every single possible use case and scenario, it's a recipe for complication as well as pages and pages of settings. Automations can be an alternative to this complication, offering a structure where users build flows to meet their needs directly.

We knew that something as important as automations couldn't be an afterthought. We are building the best post-purchase platform in e-commerce, and we made automations a part of Crew from the very beginning.

An automation primer

Software automations are nothing new, of course. Zapier is perhaps one of the most well-known "automation engines", and was founded in 2011. IFTTT (If This Then That) is a similar platform that was founded about a year earlier. These types of tools target the tech-savvy general public and are geared towards automations between or inside of different platforms that expose an API.

Well before these general-purpose automation tools, the enterprise had business rules managers. If the phrase "Enterprise Business Rule Management System" sparks dread when you hear it, then you probably have a good idea of what to expect. Oracle and SAP both have implementations of business rules managers, and there are more dedicated offerings such as Pega and InRule.

In fact, all the way back in 1982 the scientific journal "Artificial Intelligence" published Rete: A fast algorithm for the many pattern/many object pattern match problem, which underpins many of the approaches taken by rules systems even today. Abstracting and standardizing rule management has been an area of ongoing research for almost as long as general computing has been around.

Considering our approach

With this understanding, we knew that there was a lot of groundwork laid by existing automation systems.

As we examined the different solutions available, there were a variety of candidates. We immediately discarded all of the costly and complicated enterprise plays, and determined that we didn't want the external dependencies that would come with a hosted rules engine.

With these constraints, we knew we needed something that could run internal to our infrastructure in some way. Some of the solutions that fit this requirement take a server-oriented approach, where the question of "Here's the situation, what should I do?" is handled remotely - typically via a REST API. The tools that take this kind of approach include Drools, Camunda, and GoRules.

There are also various library-based approaches for different languages and environments. These include RulesEngine in the .NET world, json-rules-engine for Node/JS/TS, and durable_rules for Python. These libraries take a pre-defined rule and the various inputs for that rule, and process them directly within the application to reach a decision. The rule will need to be fetched from somewhere such as a database and fed to the engine for the decision, but the processing happens without needing to reach out to a different server.

How do rules engines actually work?

Rules engines are effectively an abstraction that allows logic to be stored outside of the primary application code. This allows "logic blocks" to be built separately and then evaluated when needed, and it lets end-users define the logic rather than requiring an engineer to write custom code.

One approach used with rules engines is to implement a decision table. Think of a decision table as a big set of if/then conditions, or a single case statement. Using a contrived example, let's suppose a fashion brand wants to automate whether a return should be refunded, exchanged, or given a gift card. The brand wants to control this based on 1) the value of the item, and 2) whether it has been worn or not.

A decision table for this scenario might look something like the following:

Value of the product | Was it worn? | Action to take
Less than $20 | Yes | Refund
Between $20 and $100 | Yes | Gift Card
More than $100 | Yes | Gift Card
Less than $20 | No | Refund
Between $20 and $100 | No | Gift Card
More than $100 | No | Exchange

This decision table will accept the two items as input, and return an action to take. The table data itself can be stored in the format preferred by your engine (often XML), but can be constructed and represented however you like. For example, the table above could be surfaced with the following UI:

This mockup would likely get you fired from a design team, but it shows how the structure of our example decision table could be presented to users in the context they need.
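To make the idea concrete, here is a minimal Python sketch of the same decision table as a lookup. The names value_band, DECISION_TABLE, and decide are hypothetical, and the sketch assumes that exactly $20 and exactly $100 both fall in the middle band, which the table above leaves open.

```python
def value_band(value):
    """Map a dollar value onto one of the three bands used by the table."""
    if value < 20:
        return "under_20"
    if value <= 100:  # assumption: $20 and $100 both belong to the middle band
        return "20_to_100"
    return "over_100"


# One entry per row of the table: (value band, was it worn?) -> action to take.
DECISION_TABLE = {
    ("under_20", True): "Refund",
    ("20_to_100", True): "Gift Card",
    ("over_100", True): "Gift Card",
    ("under_20", False): "Refund",
    ("20_to_100", False): "Gift Card",
    ("over_100", False): "Exchange",
}


def decide(value, worn):
    """Accept the two inputs and return the action from the matching row."""
    return DECISION_TABLE[(value_band(value), worn)]


print(decide(150.00, False))
```

Because every combination of inputs has exactly one row, the lookup never needs a fallback. That uniform shape is what makes a decision table easy to present as a grid.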

In contrast to a decision table, some rules engines take a more code-oriented approach. Many modern rules engines store their rules as JSON and don't use a table structure at all, which is useful when the rules you need to implement won't have a uniform shape. If you want to allow an arbitrary number of conditions rather than having a defined set of columns, JSON will work better for representing and storing those types of rules.

Some rules engines can even include expressions directly:

[
  {
    "WorkflowName": "DetermineReturnResolution",
    "Rules": [
      {
        "RuleName": "GiveGiftCard",
        "Expression": "item.value >= 10.00 AND item.isWorn == true"
      },
      {
        "RuleName": "GiveRefund",
        "Expression": "item.value < 10.00 AND item.isWorn == true"
      },
      {
        "RuleName": "GiveExchange",
        "Expression": "item.isWorn == false"
      }
    ]
  }
] 

Here our rule definition says "If the item hasn't been worn, then offer an exchange. If the item has been worn and costs $10.00 or more then give a gift card; if it has been worn but costs less than $10.00 then give a refund".

This approach comes with a lot of flexibility, at the cost of needing to think a lot more about how the rules are presented in the UI. This especially brings challenges when allowing users to define their own rules. Since there can be any combination of comparisons and logical operators, building a UI that represents those expressions in an intuitive way can require a lot of effort.
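As a rough illustration of how an engine might process expression rules like these, here is a toy Python evaluator. It is not how RulesEngine or any other real library works: it only understands clauses of the form item.field, a comparison, and a literal, joined by AND. The names clause_matches and first_matching_rule are hypothetical.

```python
import operator
import re

WORKFLOW = {
    "WorkflowName": "DetermineReturnResolution",
    "Rules": [
        {"RuleName": "GiveGiftCard",
         "Expression": "item.value >= 10.00 AND item.isWorn == true"},
        {"RuleName": "GiveRefund",
         "Expression": "item.value < 10.00 AND item.isWorn == true"},
        {"RuleName": "GiveExchange",
         "Expression": "item.isWorn == false"},
    ],
}

COMPARATORS = {
    ">=": operator.ge, "<=": operator.le, "==": operator.eq,
    "<": operator.lt, ">": operator.gt,
}
# Two-character comparisons are listed first so ">=" is not read as ">".
CLAUSE = re.compile(r"^item\.(\w+)\s*(>=|<=|==|<|>)\s*(\S+)$")


def parse_literal(text):
    """Turn 'true'/'false' into booleans and anything else into a float."""
    if text in ("true", "false"):
        return text == "true"
    return float(text)


def clause_matches(clause, item):
    """Evaluate one 'item.field <op> literal' clause against the item."""
    match = CLAUSE.match(clause.strip())
    if match is None:
        raise ValueError(f"Unsupported clause: {clause!r}")
    field, symbol, literal = match.groups()
    return COMPARATORS[symbol](item[field], parse_literal(literal))


def first_matching_rule(workflow, item):
    """Return the name of the first rule whose clauses all hold, else None."""
    for rule in workflow["Rules"]:
        clauses = rule["Expression"].split(" AND ")
        if all(clause_matches(clause, item) for clause in clauses):
            return rule["RuleName"]
    return None


print(first_matching_rule(WORKFLOW, {"value": 25.00, "isWorn": True}))
```

Even this small subset needs a parser, and a real expression language would also need OR, parentheses, and type checking.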

The Corso way

In software design there is often a balance between usability and flexibility. Often the more flexible and powerful your software, the more complex it is for users. Conversely, if you want your software to be simple to use then you may need to give up some of that power and flexibility. This is a generalization of course, and a well-designed UX can make complex software easier to use - while poorly-designed UX can make simple software terrible to use.

For our users, we have implemented an approach that we feel strikes the right balance between flexibility and usability. In the Crew rules engine, logic is constrained to a predefined set of expressions and operators, all of which are built into the engine. We don't support arbitrary expressions, but the tradeoff is that building an intuitive UI becomes much easier as a result.

An example

In the spirit of showing rather than telling, let's go through a real-world example of a Crew rule.

In Crew there is a merchant-wide setting for how long after an order the customer can request a return. However, merchants may want to override this setting for different reasons. They might want to implement a special holiday policy that allows a longer return window for anything bought between Thanksgiving and Christmas, for example.

Or, in the example here, they might want to allow a longer return window of 90 days for orders tagged as pre-orders:

{
    "conditions": {
        "all": [
            {
                "fact": "order/tags",
                "value": [
                    "Pre-order"
                ],
                "operator": "in"
            },
            {
                "fact": "order/created",
                "value": "2024-04-02T06:00:00.000Z",
                "operator": "lessThan"
            }
        ]
    },
    "event": {
        "type": "modifyResolutionWindow",
        "params": {
            "kind": "eligibilityDays",
            "refund": 90,
            "exchange": 0,
            "giftCard": 90,
            "warrantyReview": 0
        }
    }
}

This rule says, "If an order is tagged with 'Pre-order', and the order was created before April 2, 2024, then allow 90 days of eligibility on refunds and gift cards", meeting the need of the merchant.

To use this rule we will need to pass both the rule itself and a set of facts to the engine for evaluation. The facts object might look something like the following, and could be constructed from the information in the order that the customer looks up:

{
    "order/tags": ["Pre-order", "New Album", "2024 release"],
    "order/created": "2024-03-05T01:21:05.000Z"
}

Note how the names of the object properties match the "fact" names in the rule. These conditions will each be evaluated as true in this example, so the engine will evaluate the overall rule as true and return the defined event body. With this response from the engine, the application can allow the customer to move forward with a refund or gift card when requested on, say, May 16th - outside the standard 30-day return window but before the close of the pre-order window of 90 days.
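The evaluation step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Crew's actual engine: the function names are hypothetical, "in" is assumed to pass when every value listed in the rule appears in the fact's list, and "lessThan" is assumed to compare ISO 8601 timestamps.

```python
from datetime import datetime


def parse_timestamp(text):
    """Parse an ISO 8601 timestamp, rewriting a trailing 'Z' as a UTC offset."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def op_in(fact_value, rule_value):
    # Assumption: every value named in the rule must be present in the fact.
    return all(value in fact_value for value in rule_value)


def op_less_than(fact_value, rule_value):
    # Assumption: both sides are ISO 8601 timestamps.
    return parse_timestamp(fact_value) < parse_timestamp(rule_value)


# The fixed set of operators this sketch knows about.
OPERATORS = {"in": op_in, "lessThan": op_less_than}


def evaluate(rule, facts):
    """Return the rule's event if every condition under "all" holds, else None."""
    for condition in rule["conditions"]["all"]:
        check = OPERATORS[condition["operator"]]
        if not check(facts[condition["fact"]], condition["value"]):
            return None
    return rule["event"]


RULE = {
    "conditions": {"all": [
        {"fact": "order/tags", "value": ["Pre-order"], "operator": "in"},
        {"fact": "order/created", "value": "2024-04-02T06:00:00.000Z",
         "operator": "lessThan"},
    ]},
    "event": {"type": "modifyResolutionWindow", "params": {
        "kind": "eligibilityDays", "refund": 90, "exchange": 0,
        "giftCard": 90, "warrantyReview": 0}},
}
FACTS = {
    "order/tags": ["Pre-order", "New Album", "2024 release"],
    "order/created": "2024-03-05T01:21:05.000Z",
}

event = evaluate(RULE, FACTS)
print(event["type"] if event else "no match")
```

Keeping the operators in a lookup table is what makes the set of supported comparisons explicit: a rule that names an operator outside the table fails immediately instead of being interpreted loosely.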

No merchant is ever going to be writing JSON out to define their rules - but luckily for us this well-defined structure allows us to generate that rule body from within the UI of the application with ease.

To set the conditions, this is our UI. You can see the structure of the JSON displayed here, with a tag condition and a date condition combined with an AND:

With this UI the merchant can adjust or remove existing conditions, add new ones, and build out exactly what they need for custom management of their returns windows.
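Going from form input to a rule body is mostly mechanical, as this Python sketch suggests. The helper names build_condition and build_window_rule are hypothetical, and the allow-list holds only the two operators that appear in the example rule above; the real set of supported operators is not covered here.

```python
import json

# Assumption: only the operators seen in the example rule are allowed.
ALLOWED_OPERATORS = {"in", "lessThan"}


def build_condition(fact, operator, value):
    """Build one condition, rejecting operators the engine does not know."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"Unsupported operator: {operator}")
    return {"fact": fact, "value": value, "operator": operator}


def build_window_rule(conditions, days_by_resolution):
    """Combine (fact, operator, value) rows with AND and attach the event."""
    return {
        "conditions": {
            "all": [build_condition(*row) for row in conditions],
        },
        "event": {
            "type": "modifyResolutionWindow",
            "params": {"kind": "eligibilityDays", **days_by_resolution},
        },
    }


# Each tuple stands in for one condition row filled out in the form.
rule = build_window_rule(
    conditions=[
        ("order/tags", "in", ["Pre-order"]),
        ("order/created", "lessThan", "2024-04-02T06:00:00.000Z"),
    ],
    days_by_resolution={
        "refund": 90, "exchange": 0, "giftCard": 90, "warrantyReview": 0,
    },
)
print(json.dumps(rule, indent=4))
```

Since the builder only accepts known operators, anything the form produces is something the engine can evaluate.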

We can also define the actions we want, represented in the "event" in the JSON rule above. The rule can apply to any combination of the refund, gift card, variant exchange, and warranty review resolutions. It also allows the merchant to set a defined date, or to use a relative number of days from the purchase date.

Now every time Crew needs to evaluate whether a specific return is inside or outside the return window, any return window rules can be retrieved and processed - and whatever the merchant has defined will be honored.
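The window check itself comes down to date arithmetic, sketched below in Python. The function names are hypothetical and two assumptions are baked in: a value of 0 in the event params means "no override for this resolution", and only the relative-days kind of event is handled. The 30-day default is the standard window from the example above.

```python
from datetime import date, timedelta

DEFAULT_WINDOW_DAYS = 30  # the standard window from the example


def eligible_days(resolution, events, default_days=DEFAULT_WINDOW_DAYS):
    """Pick the first positive override for a resolution, else the default."""
    for event in events:
        if event.get("type") != "modifyResolutionWindow":
            continue
        # Assumption: 0 (or a missing key) means this resolution is not overridden.
        override = event["params"].get(resolution, 0)
        if override > 0:
            return override
    return default_days


def within_window(order_date, request_date, days):
    """True when the request falls on or before the last eligible day."""
    return request_date <= order_date + timedelta(days=days)


EVENT = {
    "type": "modifyResolutionWindow",
    "params": {"kind": "eligibilityDays", "refund": 90, "exchange": 0,
               "giftCard": 90, "warrantyReview": 0},
}

ordered = date(2024, 3, 5)
requested = date(2024, 5, 16)

days = eligible_days("refund", [EVENT])
print(days, within_window(ordered, requested, days))
```

With the pre-order event applied, a refund requested on May 16th falls inside the 90-day window. Without it, the same request falls outside the 30-day default.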

From here

Rules engines are powerful tools to keep in your toolbelt. They come with some initial development overhead as you implement them, and care must be taken to architect them in a way that doesn't introduce more complications than they solve. When they are implemented successfully, though, that effort pays off: your users can define how they want the software to work, in ways that would otherwise require extra development.

If you are considering adding a rules engine to your software, there are a lot of options and opinions about how (or even if) it should be approached. We hope we've given a good high-level overview, and pointed you to enough additional resources here to get you started.

For Crew it was a clear win - our customers love the automation functionality and we loved building it.