
Yang Fang


An Investigation into Poor UX in IntelliJ IDEA on Windows with Windows Subsystem for Linux

I recently installed IntelliJ IDEA 2022.2.5 on a Windows 11 machine. I am fairly familiar with this IDE, having used it on Windows frequently from 2014 until 2018, then on Linux occasionally until 2022. I'm also quite familiar with Android Studio, its sister product, which shares the same source code. I followed my usual steps to install the IDE: downloading the zip, extracting it to a suitable location, and running the idea.bat file.

I expected the process to be a breeze, taking no more than 10 minutes. But to my surprise, after a brief setup dialog, the IDE hung after I clicked "New Project" in the main project dialog. After digging deeper and deeper, what should have been 10 minutes ended up taking nearly 4 hours.

Symptoms

The bug manifests itself on a new installation of IntelliJ on a Windows 11 machine that has never run any version of IntelliJ before. WSL2 is installed with Debian 11. After clicking the "New Project" button, the IDE shows a modal "Detecting JDK" window with a looping progress bar and a cancel button. The user cannot interact with any UI element other than the modal window. But this window does not change for more than 10 minutes and appears to make no progress. Clicking the cancel button disables the button itself, but does nothing to make the window disappear or to make any other UI element interactable. From here, the only recourse is to kill the process in Windows Task Manager. Restarting IntelliJ hits the same issue again.

Video:

I've also tried a few other versions of IntelliJ including 2023.2, 2022.1, and 2021.2 with the same result, indicating a problem with multiple versions.

After doing some preliminary research, I found out that this issue has been reported by many users over the past 2 years on YouTrack (JetBrains' bug tracker), Reddit, and Stack Overflow. Here is a non-exhaustive list of reports:

A few comments mentioned Windows Defender, so I tried completely disabling Windows Defender's real-time protection. This did not fix the issue for me. The IDE continued to hang for 10+ minutes.

Another person on Stack Overflow mentioned that it has to do with Podman on WSL. Coincidentally, I do use Podman on WSL2. I cannot conclusively say whether Podman is a necessary condition for triggering this bug. I would guess not, given the number of people reporting this issue and the fact that Podman is not a very common tool on a WSL2 installation.

Some of these bug reports do not have complete information (such as logs and dumps), which is common for reports submitted by the public. But the sheer number of reports with the same symptoms suggests a common underlying cause.

What surprised me though, is that JetBrains engineers could not isolate a major UX breaking issue and fix it for good after 2+ years. Many users shared their frustration on https://youtrack.jetbrains.com/issue/IDEA-293604.

UX Issues

Before we go into the cause of the slowness, we should take a step back and ask what is wrong. Sometimes what is wrong with the User Experience does not have to do with the root cause of the issue, as shown in this case.

The first principle of UX is to always show your users a "live" UI where the user is in control. Short of a complete fatal crash of the program, in no circumstance should a "hanging" UI where the user cannot do anything be shown.

Imagine if you're dialing the emergency number, but your phone decides to show you a spinning circle. Clicking cancel doesn't change anything. In the meantime, the only thing you can do is to watch the spinning circle spin. Whether the underlying issue is weak signal, phone overheat, or low battery, the UI itself is an issue.

Here is a list of improvements that can improve the UX without having to solve the root cause:

  1. For a long blocking process (if "detecting JDK" is determined to be one), show an actual progress bar rather than a looping one that indicates no progress.
  2. The cancel button should actually cancel the process.
  3. Consider moving "detecting JDK" to the background if it is not blocking.
  4. Set a timeout on "detecting JDK" if it is blocking but a partial result is enough to unblock the user.
  5. Consider running "detecting JDK" in steps, where a partial result is acquired initially and the full result is made available later.
  6. Show a warning to the user if a process is expected to take a long time. (Another example: if the user opens a project deemed too big to load in IntelliJ, the user should be warned "you're opening a huge project with xx number of files, it may take some time" rather than being faced with a non-responsive UI without knowing what the problem is.)
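Improvements 2, 4, and 5 can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not IntelliJ's implementation: the class `TimeBoxedScan`, the method `findJdks`, and the `bin/java` check are hypothetical names and heuristics chosen for the example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeBoxedScan {
    /**
     * Checks each candidate directory on a background thread and returns
     * whatever was found when the time budget runs out.
     * Hypothetical helper for illustration, not IntelliJ code.
     */
    public static List<Path> findJdks(List<Path> candidates, long budgetMillis) {
        List<Path> found = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "jdk-scan");
            t.setDaemon(true); // a stuck scan must never keep the JVM alive
            return t;
        });
        Future<?> scan = pool.submit(() -> {
            for (Path home : candidates) {
                if (Thread.currentThread().isInterrupted()) {
                    return; // honor cancellation between file system calls
                }
                if (Files.exists(home.resolve("bin").resolve("java"))
                        || Files.exists(home.resolve("bin").resolve("java.exe"))) {
                    found.add(home);
                }
            }
        });
        try {
            scan.get(budgetMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            scan.cancel(true); // the same call a working Cancel button could make
        } catch (InterruptedException e) {
            scan.cancel(true);
            Thread.currentThread().interrupt();
        } catch (ExecutionException e) {
            // a failed scan still yields the partial result collected so far
        } finally {
            pool.shutdownNow();
        }
        return new ArrayList<>(found);
    }

    public static void main(String[] args) {
        List<Path> candidates = new ArrayList<>();
        for (String arg : args) {
            candidates.add(Path.of(arg));
        }
        System.out.println(findJdks(candidates, 2_000));
    }
}
```

One limitation of this sketch: a thread that is blocked inside a single slow native file system call cannot be interrupted, so cancellation only takes effect between calls. The caller is still unblocked once the budget expires, which is the point of the UX argument.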

Another UX principle this issue violates is to not make all of your users suffer for features that would only benefit some of your users. After some deeper investigation, the issue indeed has to do with IntelliJ's WSL2 support code path. WSL2 can be used to do a lot of things (perhaps I just want to play SuperTuxKart on WSL2 and nothing else). But the feature is enabled for all IntelliJ users whether the user wants to be using IntelliJ with WSL or not. There is no switch to turn off WSL2 integration in IntelliJ settings, properties, or experiments (yes I tried, including a few things that normal people wouldn't try). In that process, IntelliJ serves a poor user experience to all users with WSL2 when only some of these users are interested in using WSL2 inside IntelliJ. The improvements:

  1. Offer an option to turn off WSL2 integration in IntelliJ settings. (It can be default ON if JetBrains wants to push WSL2 integration, though it should be a dialog asking to enable itself if you ask me)

These 2 UX principles should be kept in mind regardless of what the issue is. No slowness should lead to an unresponsive UI, and no feature with performance degradation should be forced on every user when only a portion would benefit.

Root Cause of Slowness

The following is the relevant stack trace. It is available either in a dump (a few versions of IntelliJ automatically generate and send a dump when a hang is detected) or by attaching a performance monitoring tool to IntelliJ's JVM process.

Stack trace:
java.base@11.0.15/sun.nio.fs.WindowsNativeDispatcher.OpenNtQueryDirectoryInformation0(Native Method)
java.base@11.0.15/sun.nio.fs.WindowsNativeDispatcher.OpenNtQueryDirectoryInformation(WindowsNativeDispatcher.java:284)
java.base@11.0.15/sun.nio.fs.WindowsDirectoryStream.<init>(WindowsDirectoryStream.java:72)
java.base@11.0.15/sun.nio.fs.WindowsFileSystemProvider.checkReadAccess(WindowsFileSystemProvider.java:344)
java.base@11.0.15/sun.nio.fs.WindowsFileSystemProvider.checkAccess(WindowsFileSystemProvider.java:371)
java.base@11.0.15/sun.nio.fs.AbstractFileSystemProvider.exists(AbstractFileSystemProvider.java:151)
java.base@11.0.15/java.nio.file.Files.exists(Files.java:2510)
com.intellij.openapi.projectRoots.JdkUtil.checkForJdk(JdkUtil.java:87)
com.intellij.openapi.projectRoots.impl.JavaHomeFinderBasic.scanFolder(JavaHomeFinderBasic.java:194)
com.intellij.openapi.projectRoots.impl.JavaHomeFinderBasic.scanAll(JavaHomeFinderBasic.java:188)
com.intellij.openapi.projectRoots.impl.JavaHomeFinderBasic.findInSpecifiedPaths(JavaHomeFinderBasic.java:87)
com.intellij.openapi.projectRoots.impl.JavaHomeFinderBasic$$Lambda$1632/0x00000001011f7040.get(Unknown Source)
com.intellij.openapi.projectRoots.impl.JavaHomeFinderBasic.findExistingJdks(JavaHomeFinderBasic.java:100)
com.intellij.openapi.projectRoots.impl.JavaHomeFinderWindows$5.get(JavaHomeFinderWindows.kt:57)
com.intellij.openapi.projectRoots.impl.JavaHomeFinderWindows$5.get(JavaHomeFinderWindows.kt:17)
com.intellij.openapi.projectRoots.impl.JavaHomeFinderBasic.findExistingJdks(JavaHomeFinderBasic.java:100)
com.intellij.openapi.projectRoots.impl.JavaHomeFinder.suggestHomePaths(JavaHomeFinder.java:73)
com.intellij.openapi.projectRoots.impl.JavaHomeFinder.suggestHomePaths(JavaHomeFinder.java:61)
com.intellij.openapi.projectRoots.impl.JavaSdkImpl.suggestHomePaths(JavaSdkImpl.java:199)
com.intellij.openapi.roots.ui.configuration.SdkDetector.detect(SdkDetector.java:149)
com.intellij.openapi.roots.ui.configuration.SdkDetector$2.run(SdkDetector.java:208)
com.intellij.openapi.progress.impl.CoreProgressManager.startTask(CoreProgressManager.java:442)
com.intellij.openapi.progress.impl.ProgressManagerImpl.startTask(ProgressManagerImpl.java:114)
com.intellij.openapi.progress.impl.CoreProgressManager.lambda$runProcessWithProgressAsynchronously$5(CoreProgressManager.java:493)
com.intellij.openapi.progress.impl.CoreProgressManager$$Lambda$1617/0x00000001011f0040.apply(Unknown Source)
com.intellij.openapi.progress.impl.ProgressRunner.lambda$submit$3(ProgressRunner.java:252)
com.intellij.openapi.progress.impl.ProgressRunner$$Lambda$1498/0x0000000101118c40.run(Unknown Source)
com.intellij.openapi.progress.impl.CoreProgressManager.lambda$runProcess$2(CoreProgressManager.java:188)
com.intellij.openapi.progress.impl.CoreProgressManager$$Lambda$712/0x000000010087dc40.run(Unknown Source)
com.intellij.openapi.progress.impl.CoreProgressManager.lambda$executeProcessUnderProgress$12(CoreProgressManager.java:608)
com.intellij.openapi.progress.impl.CoreProgressManager$$Lambda$509/0x000000010025e040.compute(Unknown Source)
com.intellij.openapi.progress.impl.CoreProgressManager.registerIndicatorAndRun(CoreProgressManager.java:683)
com.intellij.openapi.progress.impl.CoreProgressManager.computeUnderProgress(CoreProgressManager.java:639)
com.intellij.openapi.progress.impl.CoreProgressManager.executeProcessUnderProgress(CoreProgressManager.java:607)
com.intellij.openapi.progress.impl.ProgressManagerImpl.executeProcessUnderProgress(ProgressManagerImpl.java:60)
com.intellij.openapi.progress.impl.CoreProgressManager.runProcess(CoreProgressManager.java:175)
com.intellij.openapi.progress.impl.ProgressRunner.lambda$submit$4(ProgressRunner.java:252)
com.intellij.openapi.progress.impl.ProgressRunner$$Lambda$1492/0x0000000101102440.get(Unknown Source)
java.base@11.0.15/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@11.0.15/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:668)
java.base@11.0.15/java.util.concurrent.Executors$PrivilegedThreadFactory$1$1.run(Executors.java:665)
java.base@11.0.15/java.security.AccessController.doPrivileged(Native Method)
java.base@11.0.15/java.util.concurrent.Executors$PrivilegedThreadFactory$1.run(Executors.java:665)
java.base@11.0.15/java.lang.Thread.run(Thread.java:829)

This stack trace tells us the cause of the hang is inside the JDK detection routine JdkUtil.checkForJdk (so at least the "detecting JDK" window isn't lying). The Java process is using the Windows native file access method OpenNtQueryDirectoryInformation0. Across successive dumps, the thread moves between CreateFile0(Native Method) and WindowsNativeDispatcher.OpenNtQueryDirectoryInformation0(Native Method). That tells us the process is moving forward, just incredibly slowly, so this isn't a deadlock or a livelock.
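The reasoning above (a thread whose top frame changes between samples is progressing, not locked) can be sketched as a tiny in-process sampler. This is a minimal sketch that assumes the code runs inside the same JVM as the thread being inspected; `StackSampler` and `sampleTopFrames` are hypothetical names. For an external process such as the IDE, a tool like `jstack` run several times gives the same kind of evidence.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class StackSampler {
    /**
     * Samples the top stack frame of a thread several times and returns the
     * distinct frames seen. More than one distinct frame suggests the thread
     * is moving; a single frame is consistent with a block or one very slow call.
     */
    public static Set<String> sampleTopFrames(Thread target, int samples, long intervalMillis)
            throws InterruptedException {
        Set<String> frames = new LinkedHashSet<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                frames.add(stack[0].getClassName() + "." + stack[0].getMethodName());
            }
            Thread.sleep(intervalMillis);
        }
        return frames;
    }
}
```

This is only a heuristic: a thread spinning in a tight loop can also show a single frame, so the result should be read together with the full stack trace.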

One suspect that immediately came to mind was WSL2 (although nothing in this stack trace mentions WSL). Recall that WSL2 runs on an underlying Hyper-V virtual machine with a native Linux filesystem and a Windows filesystem translation layer. This means improved IO speed inside Linux but decreased IO speed when Windows accesses Linux files. Here is the WSL2 documentation from Microsoft describing the issue: "File performance across the Windows and Linux operating systems is faster in WSL 1 than WSL 2, so if you are using Windows applications to access Linux files, you will currently achieve faster performance with WSL 1".

Following my instinct, I opened Process Explorer in Windows to take a peek at the file handles opened by IntelliJ's process, but that did not show any suspicious file paths linked to WSL. Still, I was not ready to give up my suspicion of IntelliJ's WSL integration.

The next hint came from IntelliJ logs in %LOCALAPPDATA%\JetBrains\IdeaIC2022.2\log\. There aren't any useful logs directly related to "detecting JDK". The last thing IntelliJ printed was the following:

2023-xx-xx xx:xx:xx,xxx [  xxxxx]   INFO - #c.i.e.w.WslDistributionManager - Fetched WSL distributions: [(Debian, version=2)] ("C:\Windows\system32\wsl.exe --list --verbose" done in 94 ms)

So immediately before detecting JDK, IntelliJ pulled a list of WSL installations. This is certainly suspicious. The theory I rested on is that the WSL routine found the list of WSL installations, and handed the paths to the JDK detection routine to check for JDK installations. By using Windows file system access methods to check inside a WSL2 installation, translation is happening at an extremely slow pace and "hangs" the JDK detection routine. And because of poor UX designs, the user cannot stop or turn off this process, leading to a major UX issue.
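One way to test this theory outside the IDE is to time the same call that appears in the stack trace, `Files.exists`, against a local directory and against a directory inside the WSL2 file system. The sketch below is an assumption-laden experiment, not something taken from IntelliJ: `ExistsTimer` and `timeExists` are hypothetical names, and the paths to compare are supplied by the reader.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class ExistsTimer {
    /** Returns the wall-clock milliseconds spent on `repeats` calls to Files.exists. */
    public static long timeExists(Path path, int repeats) {
        long start = System.nanoTime();
        for (int i = 0; i < repeats; i++) {
            Files.exists(path);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Pass one local path and one WSL path as arguments, for example a
        // directory on C: and a directory under the \\wsl$ network share
        // (the share name depends on the installed distribution).
        for (String arg : args) {
            System.out.println(arg + ": " + timeExists(Path.of(arg), 100) + " ms");
        }
    }
}
```

If the theory holds, the WSL path should be slower by a wide margin on an affected machine. I have not included expected numbers because they depend entirely on the system.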

Is Windows Defender involved? In a few comments and support articles (such as here and here), JetBrains employees recommended adding a list of exclusions. IntelliJ even went as far as implementing a feature to prompt the user to "add Windows Defender exclusions for improved IDE performance".

For this, I'll give a brief overview of antivirus software in general (I'm hardly an expert myself). Antivirus software, including Windows Defender, keeps a list of binary patterns of known malicious programs in its definition file. It checks each file not previously known to it by examining the file's content against the list of binary patterns using some kind of string matching algorithm. For compressed files (which jars - Java archives - are), it extracts the files using the decompression algorithm and then does the string matching in memory. This check is done periodically (scheduled scans) and just in time when a program is being started (for example, when the user has freshly downloaded the files). Decompression and string comparison are known to take time to execute. And to prevent malicious programs from running, the antivirus stalls file access until it has finished checking the file. The slow scanning algorithms and the intentional stalling of file access are what could cause a slowdown. But after the antivirus has checked a file once and the file isn't modified, it saves a hash of the file into its "scanned" cache to avoid checking it again, much like incremental compilation skips an unmodified source file. So this would only affect IntelliJ on its first launch.

However, this particular problem is not caused by Windows Defender (at most, Defender only exacerbates what is already a severe issue). For one, if the user downloads IntelliJ 2020.3, which does not contain WSL support, the JDK detection process is immediate, even with Defender on. Further evidence is that some users, including myself, are hit with this issue even after disabling Defender. So disabling Windows Defender should not be treated as a silver bullet for all IntelliJ slowness issues.

Workaround - Turning off WSL2 Integration in IntelliJ

Although the above theory is unproven at this point (mainly because I've never worked on IntelliJ's codebase and am not familiar with JDK detection), I'm eager to use it as a basis for experiments to work around the issue.

The following workaround turns off WSL2 support in IntelliJ by modifying the IntelliJ source code. This workaround is useful for people who are using IntelliJ with Windows JDKs and are not interested in Linux JDKs inside WSL2. (For people who want to use Linux JDKs inside WSL2, do not use this workaround. Run IntelliJ inside WSL2 and access the IDE through the web frontend until a proper fix is in place.)

Since the IntelliJ Community source code is available, I found the class responsible for the log line: com.intellij.execution.wsl.WslDistributionManager (GitHub). The compiled file resides in <intellij-dir>\lib\app.jar for versions 2022.2 and under, and in <intellij-dir>\lib\util-8.jar for versions 2023.2 and higher (I didn't bother looking at versions between 2022.3 and 2023.1; it would be one or the other). In theory, we can modify the source code to always return "WSL not supported" and "no WSL installation is found" and recompile the program.

The methods of interest are WslDistributionManager.getInstalledDistributions() and WSLUtil.isSystemCompatible(). Making getInstalledDistributions return an empty list and isSystemCompatible always return false does the trick. WslDistributionManager.getInstance() is also necessary because it is referenced by other classes. So here is the code you need to write:

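// File: com/intellij/execution/wsl/WslDistributionManager.java
package com.intellij.execution.wsl;

import java.util.List;

// Stub that reports no WSL distributions. WSLDistribution is the existing
// IntelliJ class in the same package, so it needs no import.
public class WslDistributionManager {
    // Referenced by other classes, so it must exist.
    public static WslDistributionManager getInstance() {
        return new WslDistributionManager();
    }

    public List<WSLDistribution> getInstalledDistributions() {
        return List.of();
    }
}

// File: com/intellij/execution/wsl/WSLUtil.java
package com.intellij.execution.wsl;

// Stub that always reports WSL as unavailable.
public class WSLUtil {
    public static boolean isSystemCompatible() {
        return false;
    }
}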

Ideally, we clone the IntelliJ GitHub repo, replace the above files, then do a recompilation of the whole program. But since I'm short on time, we'll just inject the compiled code for the above classes directly into the java archives (this would cause breakage if you didn't implement all the methods referenced, so I cannot promise it won't break in some codepath). To compile, call javac directly on the .java source files, or use IntelliJ 2020.3 to create a new project and compile the files. The output should be .class files.

Extract app.jar, replace the files WslDistributionManager.class and WSLUtil.class in com\intellij\execution\wsl\. Then compress the extracted directory as a ZIP. Rename it to app.jar, and place it back in IntelliJ's installation directory (keep the original copy by renaming its extension to .bak). Check to make sure the file permissions are the same as the old file permissions.
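The extract, replace, and re-zip steps above can also be done in place with Java's built-in zip file system, which avoids re-compressing the whole archive by hand. This is a minimal sketch, not a tested patching tool for any specific IntelliJ version: `JarPatcher` and `replaceEntry` are hypothetical names, and the backup is written as `<name>.bak` next to the jar (for example `app.jar.bak`), a slight variation on the renaming described above.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class JarPatcher {
    /** Backs up the jar next to itself, then overwrites one entry inside it. */
    public static void replaceEntry(Path jar, String entryName, Path replacement)
            throws IOException {
        Path backup = jar.resolveSibling(jar.getFileName() + ".bak");
        if (Files.notExists(backup)) {
            Files.copy(jar, backup); // keep the original so the change can be undone
        }
        // The zip file system provider allows editing the archive in place.
        // The cast avoids an ambiguous overload on newer JDKs.
        try (FileSystem zip = FileSystems.newFileSystem(jar, (ClassLoader) null)) {
            Path entry = zip.getPath(entryName);
            if (entry.getParent() != null) {
                Files.createDirectories(entry.getParent());
            }
            Files.copy(replacement, entry, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: JarPatcher <jar> <entry name inside the jar> <replacement .class file>
        // Entry names inside a jar use forward slashes, for example
        // com/intellij/execution/wsl/WSLUtil.class
        replaceEntry(Path.of(args[0]), args[1], Path.of(args[2]));
    }
}
```

Run it once per replaced class file while the IDE is closed, so the jar is not locked by the running process.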

Then start IntelliJ again, and you should be able to proceed as normal, albeit now IntelliJ won't see any of your WSL2 installations. This means the theory is proven - the slowness is indeed caused by the JDK detection routine interacting with WSL2.

How JetBrains Engineers can Fix the Problem

The problem seems to be extreme slowness when accessing the WSL2 file system through Windows file system APIs using the compatibility layer built by Microsoft. Detecting JDK is one such user of this API.

According to engineer Serge Baranov in this comment, JetBrains is aware of the technical differences between having a server running in WSL2 (also the approach VSCode takes) compared to accessing WSL2 through the Windows filesystem API. The server approach bypasses the Windows filesystem API and accesses the files inside Linux directly. The processed data is then passed from Linux to Windows over a network connection. This would be the ultimate solution for people who are interested in development inside WSL2, obsoleting the Windows API codepath.

But if JetBrains engineers are not ready to abandon WSL2 integration in the Windows version of IntelliJ, then an alternative can be taken to improve JDK detection performance. Rather than using Windows APIs from Java to detect JDKs, a Linux shell script can be written and run inside Linux. The output of the script can be written as a file into IntelliJ's local app data directory on Windows. By doing JDK detection inside Linux, native IO performance can be obtained, and this issue should no longer manifest itself.
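To make the proposal concrete, here is a rough sketch of running the detection inside Linux and reading the result back on the Windows side. Everything in it is an assumption for illustration: the class `WslJdkProbe`, its methods, the `/usr/lib/jvm` location (where Debian's packages usually put JDKs), and the use of a temp file instead of IntelliJ's app data directory. It relies on `wsl.exe` accepting `-d <distribution>` and `--exec`, and I have not tested how every `wsl.exe` version parses the quoted script argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class WslJdkProbe {
    private static final String SUFFIX = "/bin/java";

    /** Builds a wsl.exe command that lists java launchers inside one distribution. */
    public static List<String> buildCommand(String distribution) {
        // Assumption: JDKs are installed under /usr/lib/jvm.
        return List.of("wsl.exe", "-d", distribution, "--exec",
                "sh", "-c", "ls -d /usr/lib/jvm/*/bin/java");
    }

    /** Turns the listing into JDK home paths, ignoring anything that is not a match. */
    public static List<String> parseOutput(String output) {
        List<String> homes = new ArrayList<>();
        for (String line : output.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("/") && trimmed.endsWith(SUFFIX)) {
                homes.add(trimmed.substring(0, trimmed.length() - SUFFIX.length()));
            }
        }
        return homes;
    }

    /** Runs the probe with a time limit; returns an empty list on timeout. */
    public static List<String> detect(String distribution, long timeoutSeconds)
            throws IOException, InterruptedException {
        Path out = Files.createTempFile("wsl-jdks", ".txt");
        try {
            Process process = new ProcessBuilder(buildCommand(distribution))
                    .redirectOutput(out.toFile())
                    .redirectError(ProcessBuilder.Redirect.DISCARD)
                    .start();
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly(); // never block the caller indefinitely
                return List.of();
            }
            return parseOutput(Files.readString(out));
        } finally {
            Files.deleteIfExists(out);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(detect(args.length > 0 ? args[0] : "Debian", 10));
    }
}
```

The directory walk happens entirely inside Linux, so only a few lines of text cross the Windows and Linux boundary instead of many individual file system calls.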

Wrap

IntelliJ versions 2021.1 - 2023.2 have WSL2 support, but this negatively impacts certain system configurations due to poor cross-filesystem performance between WSL2 and Windows. There are a few UX improvements to be made, as well as a deeper fix for the root issue. Meanwhile, there is a hack end users can apply to disable WSL2 support in IntelliJ and work around the issue.
