Date: 2025-10-02 13:00:16

by Eladrin
Hello everybody!
Shadows of the Shroud and the Stellaris 4.1 ‘Lyra’ update have been released, and today we’re going to go over some of our performance investigations and next steps moving forward.
The 4.1.4 Open Beta uncovered some critical issues, so 4.1.4 will not be released this week. We hope to have a 4.1.5 Open Beta with some more fixes up before the weekend.
4.1.4 has a large number of fixes:
Improvements
Balance
Bugfix
UI
Stability
Post-release support will continue with more bug fixes and balance updates, with the next patch expected about a week from now.
Performance remains the highest priority of the internal Stellaris team. A portion of the team will be fixing bugs, balance, and stability issues (especially multiplayer stability), but a dedicated subteam from the Custodians exists that is exclusively focusing on performance.
Today I’m going to go over some of our internal investigations and leads. It’s our goal to make steady progress on performance, but this is going to be a long haul and won’t be a “magic patch” that fixes everything. We have several angles to pursue, however. How modifiers are applied, fleet counts, UI impacts, and economic balance are four different major targets under investigation.
While investigating some 4.0 saves from the bug report forum, I noticed an unusually large number of ships in the late-game saves (over 24000 ships!), and had the analytics team dig up some information. I posted the resulting comparison on our forums earlier this week; it shows the average number of ships in all “default empires” across a month each of 3.14 and 4.0 games:

The more powerful economies of 4.0 are causing us to reach what used to be end-game ship counts a hundred years earlier. Some less rigorous checks of saves from our overnights and those provided in the bug-report forum also indicate that the 4.0 AI is much better at filling out its naval capacity and expanding it, but not nearly to the degree that players are. The AI is not as adept at stacking non-linear bonuses the way you guys are, so even if it’s several times as effective as it used to be, if you’re ten times as strong, it feels weaker overall.
In my next investigation, I took a Huge Grand Admiral game 200 years into the game and ran a series of tests, exploring how the number of ships in the galaxy affected the save. During these trials I also found that there was a substantial difference based on what the camera was looking at. Viewing Havenstar, the capital system of one of the largest empires, generally performed worse than viewing the entire galaxy map as an observer, and that itself was worse than staring at nothingness in an exploded azilash.

Huge Galaxy Grand Admiral stats in the 4.1.3 release build, 2400.01.01 - 2401.01.01.
Results from the above performance tests on 3.1.1. What you’re looking at in-game can affect one_year times.
We’re also seeing 4.1 performance being marginally better than 3.14 until a little bit after 2300, after which the two suddenly diverge. We’re currently in the process of generating more saves and analyzing the data to get a better understanding of exactly what’s happening here.
3.14 (blue) vs 4.1 (red) one_year times at various dates throughout the game
These tests give us several directions to pursue - reducing the number of ships in the game through a combination of economic balancing and mechanical changes looks like it has some promise, but we also clearly have client-side improvements to investigate.
4.0.23 (blue) vs 4.1 (red) micro tick times at various dates throughout the game
4.1’s release day announcement was later than normal because we were concerned about whether 4.1 would harm performance. This was one of the performance graphs that contributed to the final approval for release.
I’ll now turn this over to Gosia to discuss the methodology and process.
Hello everyone,
My name is Gosia, and I work as a QA Analyst on Stellaris. I’ve been with Paradox and Studio Green for some time, and I’m excited to introduce myself to you. o/
First and foremost, I would like to thank you all for your commitment, continuous support, and dedication to reporting all the issues Stellaris has. You have played a major role in fixing thousands of bugs throughout the years, and I can tell you, although you might not believe me, that we see you, we hear you, and we appreciate you. You’ve been an enormous help to the QA team - you have no idea!
I was thinking about how to approach this topic in the dev diary and I think it would be best to start from scratch: do they even test it? Do they even care? What’s the process?
The answer to all of those questions is: yes. Performance has had our attention even before 4.0 was released, and since then, we have been continuously improving our workflows and adapting the methodology to make every pop in the Galaxy happy.
At Paradox we have machines ranging from low-spec to high-spec that we use to track performance 24/7. Both low and mid-spec machines have older generations of CPUs and GPUs, along with other constraints such as less RAM. This lets us monitor how the game behaves on different desktops and investigate differences and issues in performance, which ultimately can lead to the fixes we all want.
Each week, our QA team runs performance sessions, rotating the responsibility among ourselves.
Once the results are in, we review them with our game design and programming colleagues to ensure a well-rounded evaluation and the best outcome.
Let me dive into some details - this could be particularly interesting for those of you who are curious and like experimenting with performance testing yourselves.
1. Preparing the performance run
First, we need to make sure that this launch argument is set:

This keeps the random seed (all the random events in the Galaxy) the same on the save you made.
Because our game is so complex, with so many different systems, different crises or situations might impact the performance and, as a natural consequence, invalidate the testing results. War in Heaven may not distress the game as heavily as Cetana, for example, and vice versa.
That is why having the same saves when executing the comparison between versions is crucial.
Of course, sometimes, due to the new content we release, this is not possible. Only then do we allow ourselves to run different saves, BUT we try to make the settings and the save as identical as possible. Keeping the same random seed will significantly reduce the divergence of the two galaxies.
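To illustrate why a fixed seed matters, here is a minimal Python sketch. It is not Stellaris code: the event names and the `simulate_events` helper are hypothetical, and a real galaxy has far more sources of divergence than a single random generator. It only shows the general principle that two runs drawing from the same seed see the same sequence of random outcomes.

```python
import random

# Hypothetical event pool, for illustration only.
EVENTS = ["war_in_heaven", "cetana", "cosmic_storm", "quiet_year"]


def simulate_events(seed: int, years: int) -> list[str]:
    """Draw one pseudo-random event per year from a seeded generator."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.choice(EVENTS) for _ in range(years)]


run_a = simulate_events(seed=42, years=50)
run_b = simulate_events(seed=42, years=50)
run_c = simulate_events(seed=43, years=50)

print(run_a == run_b)  # same seed, so the sequences match
print(run_a == run_c)  # a different seed will almost certainly diverge
```

If the two compared versions experienced different crises, a timing difference could come from the crisis rather than from the code change, which is the situation the fixed seed is meant to avoid.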
From there, the only debug command that should ever be used is human_ai and nothing else.
Any other command may falsify the results - e.g. fast_forward might generate more issues that we are not aware of. The more natural, the better.
Once the performance machines are done, we gather the necessary files generated by the game and, using a special tool, generate graphs, as presented below.
For example, here, you can see that both versions, 4.0.17 and 4.0.22, run quite similarly, which means that we neither introduced new issues nor improved.
2. one_year, thirty_year commands
Once we have graphs, regardless of the results (improved or not), we load the save and do in-game checks.
One of these, well known to you all, is the one_year debug command, which checks how fast the game can get through X years. We run thirty_years, either from the last save made or starting a completely new game and running in parallel on different PCs with different versions of the game.
Think 2200 to 2230, compare, then again 2230 to 2260, compare until - if time allows - the end-game.
This type of test has exposed in a couple of runs that 4.1 generally is faster in the early game, but slows down heavily around mid-game.
It is essential to highlight here that it matters where you are looking - if it’s a system or galaxy view. If you run it only in the system view filled with cosmic storms or ships, the game will run slower.
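The window-by-window comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `windows` and `compare` helpers are hypothetical, and the timings are made-up placeholder numbers, not measurements from any build.

```python
def windows(start: int, end: int, step: int = 30) -> list[tuple[int, int]]:
    """Split [start, end] into consecutive (from_year, to_year) windows."""
    return [(y, min(y + step, end)) for y in range(start, end, step)]


def compare(old: dict, new: dict) -> dict:
    """Percent change of `new` over `old` per window; positive means slower."""
    return {w: round(100.0 * (new[w] - old[w]) / old[w], 1) for w in old if w in new}


# Placeholder wall-clock seconds per window (NOT real data).
old_build = {(2200, 2230): 100.0, (2230, 2260): 150.0, (2260, 2290): 200.0}
new_build = {(2200, 2230): 90.0, (2230, 2260): 150.0, (2260, 2290): 260.0}

change = compare(old_build, new_build)
for window in windows(2200, 2290):
    print(window, f"{change[window]:+.1f}%")
```

Comparing per window rather than over the whole run makes it easier to see when in the game two versions start to differ, provided both runs used the same camera view.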
3. Script profiler
This is one of the most important checks, often exposing the biggest offenders of our performance.
It shows all the checks that we do in the script and code.
The values above show the total time, the number of hits (checks), and the average time to perform one check.
As presented in the screenshot, the red square shows that the event.bio.755 has a very high total time value, with 1845 hits on a monthly tick. The average is also high if you look very closely.
One could suspect that BioGenesis and the aforementioned event are causing the distress, but if you expand bio.755, you will see that, in fact, the culprit is UpdateModifiers (modifier threading) - as it is in many other entries, by the way.
This check is the most extensive, and sometimes, we sit there for hours, dwelling on the results and observing.
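As a rough illustration of how total time, hits, and average relate, and of why expanding an entry matters, here is a small Python sketch. The `ProfileEntry` structure and `heaviest_child_path` helper are hypothetical and do not reflect the profiler's actual data format; the entry names and the 1845 hits come from the example above, while the timings and the `other_checks` child are invented placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileEntry:
    name: str
    total_time: float  # total time spent, in arbitrary units
    hits: int          # number of times the check ran
    children: list["ProfileEntry"] = field(default_factory=list)

    @property
    def average(self) -> float:
        """Average time per hit; 0.0 when the entry was never hit."""
        return self.total_time / self.hits if self.hits else 0.0


def heaviest_child_path(entry: ProfileEntry) -> list[str]:
    """Follow the most expensive child at each level down to a leaf."""
    path = [entry.name]
    while entry.children:
        entry = max(entry.children, key=lambda c: c.total_time)
        path.append(entry.name)
    return path


# Placeholder timings (NOT real profiler output).
event = ProfileEntry(
    "event.bio.755", 9.0, 1845,
    children=[
        ProfileEntry("UpdateModifiers", 8.0, 1845),
        ProfileEntry("other_checks", 1.0, 1845),  # hypothetical sibling
    ],
)

print(heaviest_child_path(event))
print(event.average)
```

The top-level entry looks expensive on its own, but walking down the tree shows where the time is actually spent.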
4. Overnight debug command
Overnight is not commonly used for performance tests, as it might easily generate wrong results, but what is good about overnight is its speed and the number of saves it generates. Essentially, we can quickly get to 2500 and from there check saves made every 50 years to see what is happening in the Galaxy. It’s a good way of testing performance issues if we notice that there is a breaking point in the graphs.
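One simple way to look for such a breaking point across saves taken every 50 years is sketched below in Python. The `find_breaking_point` helper and the 1.5x threshold are assumptions made for illustration, and the timings are placeholder numbers rather than measured results.

```python
from typing import Optional


def find_breaking_point(
    times: dict[int, float], factor: float = 1.5
) -> Optional[tuple[int, int]]:
    """Return the first (earlier_year, later_year) pair of consecutive saves
    where the later time exceeds `factor` times the earlier one, else None."""
    years = sorted(times)
    for prev, cur in zip(years, years[1:]):
        if times[cur] > factor * times[prev]:
            return (prev, cur)
    return None


# Placeholder one-year times in seconds per save date (NOT real data).
sample = {2200: 20.0, 2250: 24.0, 2300: 30.0, 2350: 55.0, 2400: 70.0}

print(find_breaking_point(sample))
```

The pair it returns narrows down which two saves to load and inspect more closely with the other checks.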
Currently known issues:
Back to Eladrin:
Modifiers:
As Gosia mentioned, the Custodians on the Performance Team are currently rewriting how Modifiers are calculated and improving their threading. We consider this a “potentially dangerous change” but it should have some positive (but hardware-dependent) impact, with greater impact on the late game.
Due to modifiers being a core system of the game and touching literally everything, we’re going to have to be very careful with deploying any changes here.
Ship Counts:
One of the common suggestions we see is reducing the number of ships and fleets while increasing their relative power. A mod was posted this week that happens to do precisely this - multiplying ship costs and power by ten. This is somewhat similar in nature to one of my internal design experiments, though my variant is based partially on how Hearts of Iron handles advanced vehicle design.
In Hearts of Iron, as you progress through the technology tree, you unlock progressively more advanced vehicles. In the case of planes, for instance, you’ll go from propeller-based Inter-war Airframes with pretty mediocre stats, up through Basic, Improved, Advanced, and Modern Airframes.
Our experiment with this includes advanced variants of the basic hull types which use more Naval Capacity than their base variants (but not Command Limit) and cost more to build and upkeep, but are significantly more powerful and have inherent shield and armor hardening. In our current experiment, the Mk.3 hulls were given an additional Auxiliary slot on the Stern ship section.
This change reduced mid and end-game ship counts by approximately 25% with some of the benefits expected from that, but the inherent hardening and change in the relative value of ships also has an interesting effect on making bypass weapons less oppressive. The next iteration of this experiment will be more extreme and increase the difference between the tiers - I quite like the effect it has of making individual ships and fleets more important.
We’ve also investigated a system where enough of a specific type of ship would group together into a “squadron”, but our early experiments didn’t prove especially promising. This might get revived again later if ship counts remain high, or if there are other combat changes planned.
Economic and Modifier Balance:
The biggest design-side task is to analyze and curb some of the non-linear growth in the post-4.0 economy. While we want planet design and the decisions you make to remain rewarding, currently there are some elements that create excessive positive feedback loops or are otherwise above the desired curve. Like the Technology Open Beta, we need to bring some things back to a proper baseline.
Open Betas:
I expect that any of these will need to go through an Open Beta process with the intention of reducing risk, gathering feedback, and helping refine things.
We are aware that there are stability issues with multiplayer, and need your help to fix them.
Reproducing desyncs in multiplayer is much harder than finding regular bugs and fixing them, as there are many different ways and reasons the game can desync. Often one desync “hides” under another, so even if the patch notes say we’ve fixed one, it can often still remain an issue in a different form. Saves that reliably desync 100% of the time are pure gold from our perspective.
How does Stellaris multiplayer actually work?
Stellaris uses lock-step multiplayer. Essentially, a day is split up into 10 “turns”, and for each turn the client sends the host two things: the instructions (commands) the player executed on their computer, and a list of the relevant values for each command (such as country_resources, ship_count, etc.). The host then verifies the validity of that player’s instructions, executes them on the galaxy, generates its own list of values, and compares the two lists. If everything goes as intended, and all the instructions are passed to the host, the values will be the same and the game will then continue to the next tick. If there is a mismatch between the two lists of values, the game declares a desync and generates an OOS report that is saved locally on both the client’s and the host’s computers.
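The turn described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the game's networking code: commands are simplified to hypothetical `(key, delta)` pairs, the "relevant values" are just the resulting state, and only the names country_resources and ship_count are taken from the description.

```python
def apply_commands(state: dict[str, int], commands: list[tuple[str, int]]) -> dict[str, int]:
    """Apply (key, delta) commands to a copy of the state and return it."""
    new_state = dict(state)
    for key, delta in commands:
        new_state[key] = new_state.get(key, 0) + delta
    return new_state


def run_turn(host_state, client_state, commands) -> list[str]:
    """One lock-step turn: the client executes its commands and reports values,
    the host executes the same commands and compares its own values.

    Returns the mismatched value names; an empty list means still in sync.
    """
    client_values = apply_commands(client_state, commands)  # sent to the host
    host_values = apply_commands(host_state, commands)      # computed by the host
    names = sorted(set(host_values) | set(client_values))
    return [n for n in names if host_values.get(n) != client_values.get(n)]


commands = [("ship_count", 1), ("country_resources", -50)]
start = {"country_resources": 100, "ship_count": 10}

in_sync = run_turn(start, dict(start), commands)

# A client whose state has silently drifted from the host's.
drifted = {"country_resources": 100, "ship_count": 11}
out_of_sync = run_turn(start, drifted, commands)

print(in_sync)      # no mismatches
print(out_of_sync)  # names of the values that differ
```

Note that the mismatch only tells you which values differ at the moment of comparison, not what caused the client's state to drift in the first place.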
Why don’t you fix the desyncs?
We do actively fix desyncs with just about every version. Unfortunately, we also sometimes introduce new desyncs that look the same as the old desyncs, or uncover new desyncs that were “hiding” under the old desyncs. A desync is identified by a set of names (like RANDOM_COUNT, NUM_POP_GROUPS or ECONOMY_OVERLORD) that tell which part of the gamestate was misaligned: you can think about the desync name as the “symptoms”.
How do you fix desyncs?
When we get a bug report that has the two OOS reports (host and desynced client), our devs will compare the OOS reports and find the differences between them. They will then trace back through the code and find what didn’t synchronize between the host and client. Sometimes this is a relatively easy fix, and other times just finding the culprit can take days.
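The first step, finding the differences between the two reports, might look roughly like the Python sketch below. The real OOS report format is not described here, so this assumes each report has already been reduced to a flat mapping of names to values; the `diff_reports` helper is hypothetical, the names are the desync names mentioned earlier, and the numbers are placeholders.

```python
from typing import Optional


def diff_reports(
    host: dict[str, int], client: dict[str, int]
) -> dict[str, tuple[Optional[int], Optional[int]]]:
    """Return {name: (host_value, client_value)} for every entry that differs.

    A name missing from one side shows up with None for that side.
    """
    names = sorted(set(host) | set(client))
    return {
        n: (host.get(n), client.get(n))
        for n in names
        if host.get(n) != client.get(n)
    }


# Placeholder values (NOT taken from a real OOS report).
host_report = {"RANDOM_COUNT": 1200, "NUM_POP_GROUPS": 340, "ECONOMY_OVERLORD": 7}
client_report = {"RANDOM_COUNT": 1200, "NUM_POP_GROUPS": 341, "ECONOMY_OVERLORD": 7}

print(diff_reports(host_report, client_report))
```

The output identifies the “symptom”; tracing it back to the code that failed to synchronize is the part that can take days.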
I’ve experienced a desync, how do I report it?
Report your desync on the Bug Report Forums. In your bug report, please include an OOS report from the host and from a client that desynced. If you have an autosave from the month before the desync, that will also help, since sometimes these autosaves will desync every time. That helps us immensely when trying to reproduce desyncs on our end, and it also helps us test that the desync is actually fixed.
Next week we’ll likely be going over either the preliminary release notes or the actual release notes for 4.1.5, depending on how the timing and patching works out.
If there are any major breakthroughs regarding anything in this dev diary we’ll also share more about them, but otherwise we’ll be going “headphones on” as we focus on these issues. I’ll likely elaborate more on some of the questions I expect this dev diary will create.
See you then, and thank you for playing Stellaris!