Banner

Thursday, July 7, 2016

When Not to Use RxJava

Reactive programming is a game-changing technology. If you are using it correctly, it should change how you approach programming entirely. Over a year ago, I researched it hoping to find a better way to manage and compose UI events (and consequently took ownership of RxJavaFX). But I quickly learned it accomplishes much more than that. It changes the approach to almost every aspect of programming, from concurrency and IO to logic and algorithms.

In my enthusiasm, I started using RxJava for pretty much everything. Taking this ground-up approach to reactive programming greatly benefited the quality of my applications. For me, making everything reactive was probably the most effective way to learn RxJava too.

But after a year, I did find a few cases where reactive may not necessarily be a good fit. Every application I write now will be reactive. However, there may be a few places in the codebase I strategically choose to not make reactive. This might have partially been due to my switch to Kotlin, which made functional programming convenient whether it is push-based or pull-based, making Rx only one of several functional tools in my belt. But I digress. This article is merely my observations when using an RxJava Observable might not be optimal.

Keep in mind it is easy to turn pretty much anything into an Observable, including collections. Therefore, this post is about when it is appropriate to allow non-Observable items to be returned from your API. The clients of the API can always turn these items into Observables if they choose.

Case #1 Small, Constant, and Unchanging Data Sets

Here is the simplest case where an Observable might not be appropriate. Let's say you have a simple enum type for three EmployeeType categories.

public enum EmployeeType {
    FULL_TIME,
    CONTRACTOR,
    INTERN
}

If you need to iterate through this enumerable, does it always make sense to turn it into an Observable for the sake of?

Observable<EmployeeType> employeeTypes = Observable.from(Employee.values());

Perhaps it might make sense to turn EmployeeType into an Observable if you are already deep in an Observable chain of operators, and it simply follows the push-based flow to turn it into an Observable. But I would argue this is the exception rather than the norm.

Data sets that are small and do not change are probably not good candidates to turn into an Observable. From an API perspective, leave it as a traditional collection and allow it to be converted to an Observable when it makes sense. This does not apply to just enumerables, and perhaps this was an extreme example. But any small, static data set could fall into this case.

Case #2 Expensive, Cached Objects

Let's say you have a class called ActionQualifier which has some expensive regular expression fields. In case you didn't know, regular expressions are text wildcards on steroids. They are very powerful in finding text patterns but they are very expensive to compile at runtime. So creating an instance of this ActionQualifier could be very costly on performance if done redundantly.

public final class ActionQualifier {

    private final Pattern codeRegexPattern;
    private final int actionNumber;

    ActionQualifier(String codeRegex, int actionNumber) {
        this.codeRegexPattern = Pattern.compile(codeRegex);
        this.actionNumber = actionNumber;
    }

    public boolean qualify(String code) {
        return codeRegexPattern.matcher(code).find();
    }
    public int getActionCode() {
        return actionNumber;
    }
}

If you had an Observable that imported ActionQualifier instances from a database using RxJava-JDBC, it could be very expensive if it has several subscribers (since every subscription results in a re-query). For every subscription, it would rebuild ALL of them each time.

Observable<ActionQualifier> actionQualifiers = db
    .select("SELECT CODE_REGEX, ACTION_NUMBER FROM ACTION_MAPPING")
    .get(rs -> new ActionQualifier(rs.getString("CODE_REGEX"), rs.getInt("ACTION_NUMBER")));

You could use a cache() operator to hold onto them and "replay" them to each subscriber, and this is valid. But the actionQualifiers may become stale as cache() would hold onto them indefinitely.

 Observable<ActionQualifier> actionQualifiers = db
    .select("SELECT CODE_REGEX, ACTION_NUMBER FROM ACTION_MAPPING")
    .get(rs -> new ActionQualifier(rs.getString("CODE_REGEX"), rs.getInt("ACTION_NUMBER")))
    .cache();

Dave Moten has created a clever solution that expires the cache and re-subscribes to the source. But ultimately you got to ask if it is easier to simply hold the actionQualifiers in a List, and refresh them in a manual way. You might as well not be bound to a monad anymore.

List<ActionQualifier> actionQualifiers = db
    .select("SELECT CODE_REGEX, ACTION_NUMBER FROM ACTION_MAPPING")
    .get(rs -> new ActionQualifier(rs.getString("CODE_REGEX"), rs.getInt("ACTION_NUMBER")))
    .toList().toBlocking().first();

Then you can always turn the saved List into an Observable at any time, if you in fact want to use it as an Observable.

 Observable.from(actionQualifiers).filter(aq -> aq.qualify("TXB.*"));

Either way, collections of expensive items can be challenging to work with. In some situations, it is easier to manage them statefully rather than functionally. You can probably get creative and find ways to maintain a reactive nature depending on your situation, but be mindful to not over-complicate it. However, if you have a very large, memory-intensive collection then maybe an Observable is valid to prevent caching from taking up memory. It really depends on what "expensive" means.

Case #3 Simple "Lookups" and Single-Step Monads

What is great about RxJava is its ability to compose multiple steps. Take these, then filter that, map to this, and reduce to that. You can create a long, elaborate chain of operations doing tons of work with little code.

Observable<Product> products = ...

Observable<Int> totalSoldUnits = products
    .filter(pd -> pd.getCategoryId() == 3)
    .map(pd -> pd.getSoldUnits())
    .reduce(0,(x,y) -> x + y)

But what if you are only interested in doing one step, like looking up a single value with an ID?

Observable<Category> category = Category.forId(263);

Is an Observable overkill in this case? Maybe. It might just be simpler to return the Category without being emitted through an Observable.

Category category = Category.forId(263);

Maybe an Observable is warranted if you expect more than one Category to be emitted, or you want to return an empty Observable rather than a null value if no Category is found for that ID. But as we will discover next in Case #4, this excessive use of Observable might create more boilerplate code rather than reduce it. But for now note if you are expecting a single value that is simply a "look up", or it requires only one step, consider not using an Observable.

Besides, if you really want to utilize an Observable for a given usage, you can always convert it to one later. You can even filter out any null value to make it empty.

Observable<Category> category = Observable.just(Category.forId(263))
    .filter(c -> c != null);

Case #4 Frequently Qualified Properties

Let's expand on Case #3 to make another point. Say you have this Product class.

public final class Product { 
    private final int id;
    private final String description;
    private final int categoryId;

    public Product(int id, String description, int categoryId) { 
        this.id = id;
        this.description = description;
        this.categoryId = categoryId;
    }
    public Observable<Category> getCategory() { 
        return Category.forId(categoryId);
    }
}

Notice the getCategory() method returns an Observable<Category>. If we frequently qualify on the Category, this could be pretty messy. Suppose each Category has a getGroup() returning an int, and we want to filter out an Observable<Product> for only products where the category's group is 5.

Observable<Product> products = ...

Observable<Product> productsInGroup5 = 
    products.flatMap(p -> p.getCategory().filter(c -> c.getGroup() == 5).map(p -> c));

For a simple task, this requires a lot of FlatMap Kung-Fu. We flatMap() each Product to its Category, then filter for Categories where the getGroup() is 5, and then map the Category back to the Product. If we take the Product class and make its getCategory() a non-Observable, this would be a lot simpler.

public Category getCategory() { 
    return Category.forId(categoryId);
}
Observable<Product> productsInGroup5 = 
    products.filter(p -> p.getCategory().getGroup() == 5);

In summary, if you have fields on a class that are frequently filtered/qualified on, you might want to consider not making it an Observable to avoid complexity. This is especially true if the property returns a single value and not a sequence of values.

Case #5 Capturing State

RxJava is very anti-state. This is a good thing. It allows us to compose a series of actions and behaviors rather than manually manage a series of states. This allows operations to be agnostic to threads, and you can compose concurrency at any time with ease into any Observable chain.

But I have noticed with complex business applications (which definitely benefit from reactive programming), you sometimes want to capture state especially when trying to gather context of how your algorithm came up with an action. This is critical for business reporting. The simplest example I can think of is holding on to a snapshot of history.

public final class PricePoint { 
    private final int id;
    private final int productId;
    private final BigDecimal price;
    private final ImmutableList<BigDecimal> historicalPricePoints;

    public PricePoint(int id, int productId, BigDecimal price) { 
        this.id = id;
        this.productId = productId;
        this.price = price;
        historicalPricePoints = HistoricalPricePoints.forProductId(productId);
    }
    public ImmutableList<BigDecimal> getHistoricalPricePoints() { 
        return historicalPricePoints;
    }
}

You could retrieve the historical price points reactively, and this is fine if the operation is not expensive.

public final class PricePoint { 
    private final int id;
    private final int productId;
    private final BigDecimal price;

    public PricePoint(int id, int productId, BigDecimal price) { 
        this.id = id;
        this.productId = productId;
        this.price = price;
    }
    public Observable<BigDecimal> getHistoricalPricePoints() { 
        return HistoricalPricePoints.forProductId(productId);
    }
}

But if it is expensive, this goes back to Case #2. Also, if you want to capture the historical price points at the time the PricePoint is constructed, an Observable might not be optimal either. You do in fact want to hold on to state, and an anti-state solution like RxJava might undermine this.

If you want to gather and retain different states for different properties so you can hold onto a context snapshot, you might want to use a traditional pull-based solution to build all that.

Summary

Reactive programming is definitely a game-changer and you should use it liberally. But be aware that RxJava tackles moderate to high complexity problems, and not necessarily simple ones. Some of these cases I identified above may be obvious to Rx veterans. But to newbies encouraged to build applications "reactive from the ground-up", it may not be so obvious.

Again, these are just my observations. Please comment below if you have any questions, additional cases, affirmations, disagreements, or thoughts.

1 comment: