Spring Archives - Lightrun

Spring Transaction Debugging in Production with Lightrun

Lightrun Marketing — Mon, 18 Apr 2022 18:23:26 +0000

Spring makes building a reliable application much easier thanks to its declarative transaction management. It also supports programmatic transaction management, but that’s not as common. In this article, I want to focus on the declarative transaction management angle, since it seems much harder to debug compared to the programmatic approach.

This is partially true. We can’t put a breakpoint on a transactional annotation. But I’m getting ahead of myself.

What is Spring’s Method Declarative Transaction Management?

When writing a Spring method or class, we can use annotations to declare that a method or a bean (class) is transactional. The annotation’s attributes let us tune the transactional semantics and define behavior such as:

Transaction isolation levels – lets us address issues such as dirty reads, non-repeatable reads, phantom reads, etc.
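Transaction manager – lets us pick which transaction manager handles the method when more than one is configured
Propagation behavior – we can define whether the transaction is mandatory, required, etc. This determines whether the method expects to receive an existing transaction and how it behaves if there is none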
readOnly attribute – the DB does not always support a read-only transaction. But when it is supported, it’s an excellent performance/reliability tuning feature

And much more.

Isn’t the Transaction Related to the Database Driver?

The concept of transactional methods is very confusing to new Spring developers. Transactions are a feature of the database driver/JDBC Connection, not of a method. Why declare them on the method?

There’s more to it. Other resources, such as message queues, are also transactional. We might work with multiple databases. In those cases, if one transaction is rolled back, we need to roll back all the underlying transactions. As a result, we do the transaction management in user code and Spring seamlessly propagates it to the various underlying transactional resources.

How can we Write Programmatic Transaction Management if we don’t use the Database API?

Spring includes a transaction manager that exposes the APIs we typically expect to see: begin, commit and rollback. This manager includes all the logic to orchestrate the various resources.

You can inject that manager into a typical Spring class, but it’s much easier to just write declarative transaction management like this Java code:

@Transactional
public void myMethod() {
    // ...
}

I used the annotation on the method level, but I could have placed it on the class level. The class defines the default and the method can override it.

This allows for extreme flexibility and is great for separating business code from low level JDBC transaction details.

Dynamic Proxy, Aspect Oriented Programming and Annotations

The key to debugging transactions is the way Spring implements this logic. Spring uses a proxy mechanism to implement its declarative, aspect-oriented capabilities. Effectively, this means that when you invoke myMethod on a MyClass bean, Spring generates a proxy class and places a proxy object instance between the caller and your object.

Spring routes your invocation through the proxy types which implement all the declarative annotations. As such, a transactional proxy takes care of validating the transaction status and enforcing it.
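To make the proxy idea concrete, here is a minimal sketch that uses the JDK’s own java.lang.reflect.Proxy. It is not Spring’s implementation, and every name in it (AccountService, transactional, LOG) is hypothetical. It only shows how a generated proxy can sit between the caller and the target and wrap each call in begin/commit/rollback steps:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is NOT how Spring is implemented.
class TxProxySketch {
    interface AccountService {
        void transfer(int amount);
    }

    // Records what the pretend transaction did so we can inspect it afterwards.
    static final List<String> LOG = new ArrayList<>();

    static class AccountServiceImpl implements AccountService {
        @Override
        public void transfer(int amount) {
            if (amount < 0) {
                throw new IllegalArgumentException("negative amount");
            }
            LOG.add("transfer " + amount);
        }
    }

    // Wraps the target so every call runs between begin and commit/rollback.
    static AccountService transactional(AccountService target) {
        InvocationHandler handler = (proxy, method, args) -> {
            LOG.add("begin");
            try {
                Object result = method.invoke(target, args);
                LOG.add("commit");
                return result;
            } catch (InvocationTargetException e) {
                LOG.add("rollback");
                throw e.getCause(); // rethrow the exception raised by the target
            }
        };
        return (AccountService) Proxy.newProxyInstance(
                AccountService.class.getClassLoader(),
                new Class<?>[] {AccountService.class},
                handler);
    }

    public static void main(String[] args) {
        AccountService service = transactional(new AccountServiceImpl());
        service.transfer(10);
        try {
            service.transfer(-1);
        } catch (IllegalArgumentException expected) {
            // the proxy rolled back and rethrew the original exception
        }
        System.out.println(LOG);
    }
}
```

Notice that the handler is ordinary code with a source file, even though the proxy class itself is generated at runtime. That is the same reason we can bind a snapshot to the generic code a proxy invokes.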

Debugging Spring Transaction Management using Lightrun

IMPORTANT: I assume you’re familiar with Lightrun basics. If not, please read this.

Programmatic transaction management is trivial. We can just place a snapshot where it begins or is rolled back to get the status.

But if an annotation fails, the method won’t be invoked and we won’t get a callback.

Annotations aren’t magic, though. Spring uses a proxy object, as we discussed above. That proxy mechanism invokes generic code, which we can use to bind a snapshot. Once we bind a snapshot there, we can detect the proxy types in the stack. Unfortunately, debugging proxying mechanisms is problematic since there’s no physical code to debug. Everything in proxying mechanisms is generated dynamically at runtime. Fortunately, this isn’t a big deal. We have enough hooks for debugging without this.

Finding the Actual Transaction Class

The first thing we need to do is look for the class that implements transaction functionality. Opening the IntelliJ IDEA class search (Command-O on macOS, Ctrl+N on Windows and Linux with the default keymap) lets us locate a class by name. Typing in “Transaction” resulted in the following view:

This might seem like a lot, but we need a concrete public class. So annotations and interfaces can be ignored. Since we only care about Spring classes, we can ignore other packages. Still, the class we are looking for was relatively low in the list, so it took me some time to find it.

In this case, the interesting class is TransactionAspectSupport. Once we open the class, we need to select the option to download the class source code.

Once this is done, we can look for an applicable public method. getTransactionManager seemed perfect, but it’s a bit too bare. Placing a snapshot there provided me a hint:

I don’t have much information here but the invokeWithinTransaction method up the stack is perfect!

Moving on to that method, I would like to track information specific to a transaction on the findById method:

To limit the scope only to findById we add the condition:

method.getName().equals("findById")

Once the method is hit, we can see the details of the transaction in the stack.

If you scroll further in the method, you can see ideal locations to set snapshots in case of an exception in thread, etc. This is a great central point to debug transaction failures.

One of the nice things with snapshots is that they can easily debug concurrent transactions. Their non-blocking nature makes them the ideal tool for that.

Summary

Declarative configuration in Spring makes transactional operations much easier. This significantly simplifies the development of applications and separates the object logic from low level transactional behavior details.

Spring uses class-based proxies to implement annotations. Because they are generated, we can’t really debug them directly, but we can debug the classes they use internally. TransactionAspectSupport is a great example.

An immense advantage of Lightrun is that it doesn’t suspend the current thread. This means issues related to concurrency can be reproduced in Lightrun.

You can start using Lightrun today, or request a demo to learn more.

The post Spring Transaction Debugging in Production with Lightrun appeared first on Lightrun.

Spring Boot Performance Workshop with Vlad Mihalcea

Lightrun Marketing — Wed, 08 Jun 2022 15:07:49 +0000

A couple of weeks ago, we had a great time hosting the workshop you can see below with Vlad Mihalcea. It was loads of fun and I hope to do this again soon!

In this workshop we focused on Spring Boot performance but most importantly on Hibernate performance, which is a common issue in production environments. It’s especially hard to track since issues related to data are often hard to perceive when debugging locally. When we have “real world” data at scale, they suddenly balloon and become major issues.

I’ll start this post by recapping many of the highlights in the talk and conclude by answering some questions we missed. We plan to do a second part of this talk because there were so many things we never got around to covering!

The Problem with show-sql

After the brief introduction, we dove right into the problem with show-sql. It’s pretty common for developers to enable the spring.jpa.show-sql setting in the configuration file. By setting this to true, we will see all SQL statements performed by Hibernate printed on the console. This is very helpful for debugging performance issues, as we can see exactly what’s going on in the database.

But it doesn’t log the SQL query. It prints it on the console!

Why do we Use Loggers?

This triggered the question to the audience: why does it matter if we use a logger and not System.out?

Common answers in the chat included:

System.out is slow – it has a performance overhead. But so does logging
System.out is blocking – so are most logging implementations but yes you could use an asynchronous logger
No persistence – you can redirect the output of a process to a file

The reason is the fine grained control and metadata that loggers provide. Loggers let us filter logs based on log level, packages, etc.

They let us attach metadata to a request using tools like MDC, which are absolutely amazing. You can also pipe logs to multiple destinations and output them in ingestible formats such as JSON, so they include proper metadata when you view all the logs from all the servers (e.g. on Elastic).
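As a small illustration of that fine grained control, here is a sketch that uses only java.util.logging from the JDK. The logger names (demo.sql, demo.web) and the CollectingHandler class are made up for this example. A real application would more likely use SLF4J with Logback or Log4j, but the idea is the same: each logger has its own level, and a handler decides where the records go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class LoggerFilteringSketch {
    // Collects records in memory; a real setup would ship them to a file or a log aggregator.
    static class CollectingHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                lines.add(record.getLevel() + " " + record.getLoggerName() + " " + record.getMessage());
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    static List<String> run() {
        Logger sqlLogger = Logger.getLogger("demo.sql"); // hypothetical logger names
        Logger webLogger = Logger.getLogger("demo.web");
        CollectingHandler handler = new CollectingHandler();
        handler.setLevel(Level.ALL);
        for (Logger logger : new Logger[] {sqlLogger, webLogger}) {
            logger.setUseParentHandlers(false); // keep the demo output off the console
            logger.addHandler(handler);
        }
        sqlLogger.setLevel(Level.FINE);    // verbose for the package we are investigating
        webLogger.setLevel(Level.WARNING); // quiet everywhere else

        sqlLogger.fine("select * from owners where id = 42");
        webLogger.info("request handled"); // dropped: below this logger's level
        webLogger.warning("slow response");
        return handler.lines;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

System.out gives us none of this: there is no level to filter on, no logger name to route by and no handler to swap.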

Show-sql is Just System Output

It includes no context. It’s possible it won’t get into your Elastic output, and even if it does, you will have no context. It will be impossible to tell if a query was triggered because of request X or Y.

Another problem here is the question marks in the SQL. There’s very limited context to work with. We want to see the bound variable values, not question marks.

Adding a Log with Lightrun

Lightrun lets you add a new log to a production application without changing the source code. We can just open the Hibernate file “Loader.java” and add a new log to executeQueryStatement.

We can fill out the log statements in the dialog that prompts us. Notice we can use curly braces to write Java expressions, e.g. variable names, method calls, etc.

These expressions execute in a sandbox which guarantees that they will not affect the application state. The sandbox guarantees read only state!

Once we click OK, we can see the log appear in the IDE. Notice that no code changed, but this will act as if you wrote a logger statement in that line. So logs will be integrated with other logs.

Notice that we print both the statement and the arguments so the log output will include everything we need. You might be concerned that this weighs too heavily on the CPU and you would be right. Lightrun detects overuse of the CPU and suspends expensive operations temporarily to keep execution time in check. This prevents you from accidentally performing an overly expensive operation.

You can see the log was printed with the full content on top but then suspended to prevent CPU overhead. This means you won’t have a performance problem when investigating performance issues…

You still get to see the query, and values sent to the database server.

Log Piping

One of the biggest benefits of Lightrun’s logging capability is its ability to integrate with other log statements written in the code. When you look at the log file, the statements added through Lightrun will appear “in-order” with the log statements written in code, as if you wrote the statement, recompiled and uploaded a new version.

But this isn’t what you want in all cases.

If there are many people working on the source code and you want to investigate an issue, logging might be an issue. You might not want to pollute the main log file with your “debug prints”. This is the case for which we have Log Piping.

Log piping lets us determine where we want the log to go. We can choose to pipe logs to the plugin and in such a case, the log won’t appear with the other application logs. This way, a developer can track an issue without polluting the sanctity of the log.

Spring Boot Connection Acquisition

Ideally, we should acquire the relational database connection at the very last moment and release it as soon as possible to increase database throughput. In JDBC, a connection is in auto-commit mode by default, and this doesn’t work well with the JPA transactions in Spring Boot.

Unfortunately, we have a chicken-and-egg problem. Spring Boot needs to disable auto-commit. In order to do that, it needs a database connection. So it needs to connect to the database just to turn off a flag that should have been off to begin with.

This can seriously affect performance and throughput, as some requests might be blocked waiting for a database connection from the pool.

If this log is printed, we have a problem in our auto-commit configuration. Once we know that, the rest is pretty easy. We need to add two settings: one that disables auto-commit and one that tells Hibernate we disabled it. Once those are set, performance should improve.
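For reference, this is how that pair of settings is commonly written in application.properties. Treat it as a sketch under two assumptions that the talk doesn’t spell out here: the pool is HikariCP (Spring Boot’s default) and the JPA provider is Hibernate.

```properties
# Assumption: HikariCP is the connection pool, so the pool hands out connections with auto-commit off
spring.datasource.hikari.auto-commit=false
# Assumption: Hibernate is the JPA provider; this tells it the pool already disabled auto-commit
spring.jpa.properties.hibernate.connection.provider_disables_autocommit=true
```

Only set the second property if the first one really is in effect. Otherwise Hibernate will skip the check while connections are still in auto-commit mode.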

Query Plan Cache

Compiling JPQL to native SQL code takes time. Hibernate caches the results to save CPU time.

A cache miss in this case has an enormous impact on performance, as evidenced by the chart below:

This can seriously affect the query execution time and the response time of the whole service.
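To show why a miss is so costly, here is a minimal sketch of a plan cache with hit and miss counters. This is not Hibernate’s implementation. PlanCacheSketch and its compiler function are hypothetical stand-ins, and the “compilation” is just string concatenation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class PlanCacheSketch {
    private final Function<String, String> compiler;
    private final Map<String, String> plans;
    long hits;
    long misses;

    PlanCacheSketch(int maxSize, Function<String, String> compiler) {
        this.compiler = compiler;
        // An access-ordered LinkedHashMap evicts the least recently used plan.
        this.plans = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxSize;
            }
        };
    }

    String plan(String jpql) {
        String cached = plans.get(jpql);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        String compiled = compiler.apply(jpql); // the expensive step we want to avoid
        plans.put(jpql, compiled);
        return compiled;
    }

    public static void main(String[] args) {
        PlanCacheSketch cache = new PlanCacheSketch(2, jpql -> "SQL for: " + jpql);
        cache.plan("select o from Owner o");
        cache.plan("select o from Owner o"); // hit
        cache.plan("select p from Pet p");
        cache.plan("select v from Visit v"); // the cache is full, so the Owner plan is evicted
        cache.plan("select o from Owner o"); // miss again: recompiled from scratch
        System.out.println("hits=" + cache.hits + " misses=" + cache.misses); // hits=1 misses=4
    }
}
```

With a cache that is too small for the number of distinct queries, the same statement keeps getting recompiled, which is exactly the pattern the statistics help us spot.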

Hibernate has a statistics class which collects all of this information. We can use it to detect problematic areas and, in this case, add a snapshot into the class.

Snapshots

A Snapshot (AKA Non-breaking breakpoint or Capture) is a breakpoint that doesn’t stop the program execution. It includes the stack trace, variable values in every stack frame, etc. It then presents these details to us in a UI very similar to the IDE breakpoint UI.

We can traverse the source code by clicking the stack frames and see the variable values. We can add watch entries and most importantly: we can create conditional snapshots (this also applies to logs and metrics).

Conditional snapshots let us trigger the snapshot only if a particular condition is met. A common problem is when a bug in a system is experienced by a specific user only. We can use a conditional snapshot to get stack information only for that specific user.
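Conceptually, a conditional snapshot behaves a bit like the guarded capture below. This is only an analogy written in plain Java and not how the Lightrun agent works. The user ID, the Capture class and handleRequest are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class ConditionalCaptureSketch {
    // What a capture might hold: the values of interest plus the call stack.
    static class Capture {
        final String userId;
        final StackTraceElement[] stack;

        Capture(String userId, StackTraceElement[] stack) {
            this.userId = userId;
            this.stack = stack;
        }
    }

    static final List<Capture> CAPTURES = new ArrayList<>();

    // Hypothetical condition: only the user who reported the bug.
    static final Predicate<String> CONDITION = "user-42"::equals;

    static String handleRequest(String userId) {
        if (CONDITION.test(userId)) {
            // Record the state and keep going; the thread is never suspended.
            CAPTURES.add(new Capture(userId, Thread.currentThread().getStackTrace()));
        }
        return "ok:" + userId;
    }

    public static void main(String[] args) {
        handleRequest("user-1");
        handleRequest("user-42");
        handleRequest("user-7");
        System.out.println("captures: " + CAPTURES.size()); // captures: 1
    }
}
```

The difference in practice is that we add the condition at runtime from the IDE, without touching or redeploying the code.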

Eager Fetch

When we look at logs for SQL queries, we can often see that the database fetches a lot more than what we initially asked for. That’s because the default fetch type for @ManyToOne and @OneToOne relations in JPA is EAGER. This is a problem in the specification itself. We can achieve significant performance improvement by explicitly setting the fetch type to LAZY.

We can detect these problems by placing a snapshot in the loadFromDatasource() method of DefaultLoadEventListener.

In this case, we use a conditional snapshot with the condition: event.isAssociationFetch().

As a result, the snapshot will only trigger when we have an eager association, which is usually a bug. It means we forgot to include the LAZY argument to the annotation.

As you can see, this got triggered with a full stack trace and the information about the entity that has such a relation.

You can use this approach to detect incorrect lazy fetches as well. Multiple lazy fetches can be worse than a single eager fetch, so we need to be vigilant.
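To see why many lazy fetches can lose to one eager fetch, we can simply count queries. The sketch below uses a fake in-memory “database” (FakeDb, a made-up class) instead of JPA, so the numbers illustrate the shape of the problem and nothing more.

```java
import java.util.ArrayList;
import java.util.List;

class FetchCountSketch {
    // Stand-in for a database that counts the queries it receives.
    static class FakeDb {
        int queries;

        List<Integer> loadOwnerIds() {
            queries++;
            return List.of(1, 2, 3);
        }

        String loadPetsFor(int ownerId) {
            queries++;
            return "pets-of-" + ownerId;
        }

        List<String> loadOwnersWithPets() {
            queries++; // a single query that joins owners and pets
            return List.of("pets-of-1", "pets-of-2", "pets-of-3");
        }
    }

    // One query for the owners, then one more per owner when the lazy relation is touched.
    static int lazyQueries() {
        FakeDb db = new FakeDb();
        List<String> pets = new ArrayList<>();
        for (int id : db.loadOwnerIds()) {
            pets.add(db.loadPetsFor(id));
        }
        return db.queries;
    }

    // Everything arrives in one round trip.
    static int joinedQueries() {
        FakeDb db = new FakeDb();
        db.loadOwnersWithPets();
        return db.queries;
    }

    public static void main(String[] args) {
        System.out.println("lazy=" + lazyQueries() + " joined=" + joinedQueries()); // lazy=4 joined=1
    }
}
```

With three owners the gap is small, but it grows linearly with the number of rows, which is why it tends to show up only with production-scale data.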

Open Session in View Anti-Pattern

On the surface, it doesn’t seem like we’re doing anything wrong. We’re just fetching data from the database and returning it to the client. But the transaction context finished when the post controller returned and as a result we’re fetching from the database all over again. We need to do an additional query as data might be stale. Isolation level might be broken and many bugs other than performance might arise.

This creates an N+1 problem of unnecessary queries!

We can detect this problem by placing a snapshot on the onInitializeCollection call and seeing the open session:

Now that we see the problem is happening, we can solve it by defining spring.jpa.open-in-view=false

It will block you from using this approach.

Q&A

There were many brilliant questions as part of the session. Here are the answers.

Could you please describe a little bit about Lightrun?

Lightrun is a developer observability platform. As such, it lets you debug production safely and securely while keeping a tight lid on CPU usage. It includes the following pieces:

Client – IDE Plugin/Command Line
Management Server
Agent – running on your server to enable the capabilities

I wrote about it in depth here.

Could Lightrun Work Offline?

Since you’re debugging production, we assume your server isn’t offline.

However, Lightrun can be deployed on-premise, which removes the need for an environment that is open to the Internet.

Wondering about this sample, will this be available for our reference?

The code is all here.

As the Instrumentation/manipulation happens via a Server, given that I do not host the instrumentation server myself, what kind and what amount of data is being transmitted? Is the data secured or encrypted in any way?

The instrumentation happens on your server using the agent.

The Lightrun server has no access to your source code or bytecode!

Source code or bytecode never goes on the wire at any stage and Lightrun is never exposed to it.

All transmissions are secured and encrypted. Certificates are pinned to avoid a man in the middle attack. The Lightrun architecture received multiple rounds of deep security reviews and is running in multiple Fortune 500 companies.

Finally, all operations in Lightrun are logged in an administrator log, which means you can track every operation that was performed and have a full post mortem trail.

You can read more about Lightrun security here.

As mentioned, these logs are aged out in 1 hr. Is it possible to save those and re-use them for later use rather than creating log entries manually every time?

Lightrun actions default to expire after 1 hour to remove any potential unintentional overhead. You can set this number much higher, which is useful for hard to reproduce bugs.

Notice that when an action is expired, you can just click it and re-create it. It will appear in red within the IDE and can still be used for reference.

Is IntelliJ IDEA the only way to add breakpoints/logging? Or how is debugging with Lightrun done in production?

You can use IntelliJ (also PyCharm and WebStorm) as well as VSCode, VSCode.dev and the command line.

These connect to production through the Lightrun server. The goal is to make you feel as if you’re debugging a local app while extracting production data, without the implied risks.

Is there any case where eager loading should be configured always for One-to-Many or Many-to-Many or Many-to-One relations? I always configure lazy loading for the above relations. Is it okay?

Yes. If you see that you keep fetching the other entity, then eager loading for this case makes sense. Having eagerness as the default makes little sense for most cases.

Do we need to restart an application with the javaagent?

The agent would run in the background constantly. It’s secure and doesn’t have overhead when it isn’t used.

If we are using other instrumentation tools like say AppDynamics or dynatrace …… does this work alongside?

This varies based on the tool. Most APMs work fine alongside Lightrun because they hook up to different capabilities of the JVM.

Does this work with GraalVM?

Not at this time since GraalVM doesn’t support the javaagent argument. We’re looking for alternative approaches, but hopefully the GraalVM team will have some solutions.

Is it free to use?

Using Lightrun comes at a cost, but a free trial is available to everyone.

Does it impact app performance?

Yes, but it’s minimal. Under 0.5% when no actions are used, and under 8% with multiple actions. Notice you can tune the amount of overhead in the agent configuration.

Does it work for Scala and Kotlin?

Yes.

How to use it in production without IDE?

The IDE will work even for production, since you don’t connect directly to the production servers and don’t have access to them. The IDE connects to the Lightrun management server only. This lets your production servers remain segregated.

Having said that, you can still use the command-line interface to get all the features discussed here and much more.

Apart from injecting loggers, what other stuff can we do?

The snapshot lets you get full stack traces with the values of all the variables in the stack and object instance state. You can also include custom watch expressions as part of the snapshot.

Metrics let you add counters (how many times did we reach this line), tictocs (how much time did it take to perform this block), method duration (similar to tictocs but for the whole method) and custom metrics.

You can also add conditions to each one of those to narrowly segment the data.
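If you have never hand-rolled these, here is roughly what a counter and a tictoc amount to when written directly in code. This is a sketch with made-up names (MetricsSketch, work), not what the agent injects. The point of Lightrun metrics is that you get the same data without adding or deploying any of this.

```java
import java.util.concurrent.atomic.AtomicLong;

class MetricsSketch {
    static final AtomicLong COUNTER = new AtomicLong();     // how many times we reached the line
    static final AtomicLong TOTAL_NANOS = new AtomicLong(); // accumulated time spent in the block

    static int work(int n) {
        COUNTER.incrementAndGet();
        long tic = System.nanoTime(); // tic: start of the measured block
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        TOTAL_NANOS.addAndGet(System.nanoTime() - tic); // toc: end of the measured block
        return sum;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            work(1_000);
        }
        System.out.println("calls=" + COUNTER.get() + " totalNanos=" + TOTAL_NANOS.get());
    }
}
```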

How do we hide sensitive properties from beans? Say Credit card number of user?

Lightrun supports PII Redaction, which lets you define a mask (e.g. for credit card numbers) whose matches are removed before going into the logs. This lets you block sensitive data from inadvertently leaking into the logs.

It also supports blocklists, which let you block a file/class/group from actions. This means a developer won’t be able to place a log or snapshot there.
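To give a feel for what a mask does, here is a sketch of pattern-based redaction in plain Java. The regular expression is a deliberately simple, hypothetical one (13 to 16 digits, optionally separated by spaces or dashes) and is not the pattern Lightrun ships. In Lightrun you configure the patterns rather than writing code like this.

```java
import java.util.regex.Pattern;

class RedactionSketch {
    // Hypothetical pattern: 13 to 16 digits, optionally separated by single spaces or dashes.
    private static final Pattern CARD = Pattern.compile("\\b\\d(?:[ -]?\\d){12,15}\\b");

    static String redact(String message) {
        return CARD.matcher(message).replaceAll("[REDACTED]");
    }

    public static void main(String[] args) {
        System.out.println(redact("card=4111 1111 1111 1111 user=7")); // card=[REDACTED] user=7
        System.out.println(redact("order 12345 shipped"));             // order 12345 shipped
    }
}
```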

How can we use it for performance testing?

I made a tutorial on this here.

When working air gapped on prem is required, how do you provide the Server, as a jar or docker…?

This is something our team helps you set up.

Will it consume much more memory if we run with the Lightrun agent?

This is minimal. Running the petclinic demo on my Mac with no agent produces this in the system monitor:

With the agent, we have this:

At these scales, a difference of 17 MB is practically within the margin of error. It’s unclear what memory overhead the agent has, if any.

Finally

This has been so much fun and we can’t wait to do it again. Please follow Vlad, Tom, and myself for updates on all of this.

There are so many things we didn’t have time to cover that go well beyond slow queries and spring data nuances. We had a really cool demo of piping metrics to Grafana that we’d love to show you next time around.

The post Spring Boot Performance Workshop with Vlad Mihalcea appeared first on Lightrun.

Debugging the Java Message Service (JMS) API using Lightrun

Lightrun Marketing — Mon, 25 Apr 2022 10:24:14 +0000

The Java Message Service API (JMS) was developed by Sun Microsystems in the days of Java EE. The JMS API provides us with simple messaging abstractions including Message Producer, Message Consumer, etc. Messaging APIs let us place a message on a “queue” and consume messages placed into said queue. This is immensely useful for high throughput systems – instead of wasting user time by performing a slow operation in real-time, an enterprise application can send a message. This non-blocking approach enables extremely high throughput, while maintaining reliability at scale.

The message carries a transactional context which provides some guarantees on deliverability and reliability. As a result, we can post a message in a method and then just return, which provides similar guarantees to the ones we have when writing to an ACID database.

We can think of messaging somewhat like a community mailing list. You send a message to an email address which represents a specific list. Everyone who subscribes to that list receives that message. In this case, the message topic represents the community mailing list address. You can post a message to it, and the Java Message Service handler can use a message listener to receive said event.

It’s important to note that there are two messaging models in JMS: the publish-and-subscribe model (which we discussed here) and also point-to-point messaging, which lets you send a message to a specific destination.
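The difference between the two models is easy to see in a toy, in-memory sketch. This is not the JMS API. Topic and PointToPointQueue are made-up classes, everything is synchronous, and there are no delivery guarantees. It only shows who receives what.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

class MessagingModelsSketch {
    // Publish-and-subscribe: every subscriber receives every message.
    static class Topic {
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        void subscribe(Consumer<String> listener) {
            subscribers.add(listener);
        }

        void publish(String message) {
            subscribers.forEach(s -> s.accept(message));
        }
    }

    // Point-to-point: each message is taken by exactly one receiver.
    static class PointToPointQueue {
        private final Queue<String> messages = new ArrayDeque<>();

        void send(String message) {
            messages.add(message);
        }

        String receive() {
            return messages.poll(); // null when the queue is empty
        }
    }

    public static void main(String[] args) {
        Topic topic = new Topic();
        List<String> first = new ArrayList<>();
        List<String> second = new ArrayList<>();
        topic.subscribe(first::add);
        topic.subscribe(second::add);
        topic.publish("event-1");
        System.out.println(first + " " + second); // [event-1] [event-1]

        PointToPointQueue queue = new PointToPointQueue();
        queue.send("job-1");
        System.out.println(queue.receive() + " " + queue.receive()); // job-1 null
    }
}
```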

Let’s go over a quick demo.

A Simple Demo

In order to debug the Java Message Service calls, I’ve created a simple demo application, whose source code can be found here.

This JMS demo is a simple database log API – it’s a microservice which you can use to post a log entry, which is then written to the database asynchronously. RESTful applications can then use this database log API to add a database log entry without the overhead of database access.

This code implements the main web service:

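import com.squareup.moshi.Moshi;
import java.util.List;
import lombok.RequiredArgsConstructor;
import org.springframework.jms.core.JmsTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// EventDTO and EventService are classes from the demo project
@RestController
@RequiredArgsConstructor
public class EventRequest {
   private final JmsTemplate jmsTemplate;
   private final EventService eventService;
   private final Moshi moshi = new Moshi.Builder().build();

   // Serializes the event to JSON and posts it to the "event" destination
   @PostMapping("/add")
   public void event(@RequestBody EventDTO event) {
       String json = moshi.adapter(EventDTO.class).toJson(event);
       jmsTemplate.send("event", session ->
               session.createTextMessage(json));
   }

   // Reads the stored events back directly from the service
   @GetMapping("/list")
   public List listEvents() {
       return eventService.listEvents();
   }
}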

Notice the event() method that posts a message to the event topic. I didn’t discuss message bodies before to keep things simple, but note that in this case I just pass a JSON string as the body. While JMS supports object serialization, using that capability has its own complexities and I want to keep the code simple.

To complement the main web service, we’d need to build a listener that handles the incoming message:

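import com.squareup.moshi.Moshi;
import java.io.IOException;
import lombok.RequiredArgsConstructor;
import org.springframework.jms.annotation.JmsListener;
import org.springframework.stereotype.Component;

@Component
@RequiredArgsConstructor
public class EventListener {
   private final EventService eventService;

   private final Moshi moshi = new Moshi.Builder().build();

   // Invoked for every message that arrives on the "event" destination
   @JmsListener(destination = "event")
   public void handleMessage(String eventDTOJSON) throws IOException {
       eventService.storeEvent(moshi.adapter(EventDTO.class).fromJson(eventDTOJSON));
   }
}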

The listener is invoked with the JSON string from the message body, which we parse and pass on to the service.

Debugging the Hidden Code

The great thing about abstractions like Spring and JMS is that you don’t need to write a lot of boilerplate code. Unfortunately, message-oriented middleware of this type hides a lot of fragile implementation details that can fail along the way.

This is especially painful in a production scenario where it’s hard to know whether the problem occurred because a message wasn’t sent properly. This is where Lightrun comes in.

You can place Lightrun actions (snapshots, logs etc.) directly into the platform APIs and implementations of messaging services. This lets us determine if message selectors are working as expected and whether the message listener is indeed triggered.

With Spring’s JMS support, as shown above, we can open the JmsTemplate class and add a snapshot to the execute method:

As you can see, the action is invoked when sending to a topic. We can review the stack frame to see the topic that receives the message and use conditions to narrow down the right handler for messages.

We can place a matching snapshot at the source of the message so we can track the flow. E.g. a snapshot in EventRequest can provide us with some insight. We can dig in the other direction too.

In the stack above, you can see that the execute method is invoked by the send method at line 584. The execute method wraps the caller’s callback: it obtains a JMS session and runs the callback inside it. We can go further down the stack by going to the closure and placing a snapshot there:
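The shape of that call chain is the classic template and callback pattern. Here is a toy version of it, with hypothetical Session, execute and send definitions that only mimic the structure and are not the JmsTemplate source:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class TemplateCallbackSketch {
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical stand-in for a messaging session.
    static class Session {
        void sendText(String destination, String body) {
            EVENTS.add("send " + destination + ": " + body);
        }
    }

    // Acquires the resource, runs the caller's callback, then releases the resource.
    static <T> T execute(Function<Session, T> action) {
        EVENTS.add("open session");
        try {
            return action.apply(new Session());
        } finally {
            EVENTS.add("close session");
        }
    }

    // The public entry point only builds the closure and hands it to execute.
    static void send(String destination, String body) {
        execute(session -> {
            session.sendText(destination, body); // a snapshot here sees the destination
            return null;
        });
    }

    public static void main(String[] args) {
        send("event", "{}");
        System.out.println(EVENTS); // [open session, send event: {}, close session]
    }
}
```

A snapshot in execute sees every operation that goes through the template, while a snapshot inside the closure sees only sends and has the destination in scope.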

Notice that here we can place a condition on the specific topic and narrow things down.

Summary

We pick messaging systems to make our application reliable. However, enterprise messaging systems are very hard to debug in production, which works against that reliability. We can see logs in the target of messages, but what happens if we did not reach it?

With Lightrun, we can place actions in all the different layers of messaging-based applications. This helps us narrow down the problem regardless of the messaging standard or platform.

The post Debugging the Java Message Service (JMS) API using Lightrun appeared first on Lightrun.