Sunday, February 25, 2024

Month 3 - Week 3 - Fixing the Event System and Continuing with Audio Fix

Last week we went into the discussion of all the different things that were being pushed for merging, different issues encountered when refactoring the audio code for the bug fix, and then ending it with the return of the event system issue that was starting to plague throughout the Debian build of the library. This week, we will discuss the changes to the event reporting system and what I did to fix this bug.

This was a rather frustrating issue to tackle as I ended it on the complicated matter of figuring out how to locate a way to implement the fix. I was originally working on the idea that since it was X11 causing the problems, I could check which version of X11 was running and then build a solution for that version and the ones occurring after it. However, this solution wasn’t a clean one. What if the check fails at some point? What if a later version fixes the problem that was being encountered? Lastly, is my solution a topic of over-engineering?

I started to do some minor research into the topic, and the code itself. I looked into the code that specifically gave us the property return value. I figured—maybe—it could shine a light on the matter and help me understand in better detail what could be happening. Looking into some of the scarce documentation on `XGetWindowProperty`, I found something rather interesting of note. The property return isn’t exactly a single value. It can contain a single value—but is capable of sending more than just one return code. So, the property return is treated more like an array rather than a singular value. I had already known this due to the code below, however, I did not consider that there may be something more to this.

Furthermore, it is noted by other developers that the function that gives us the property return values is known to have—in the only words I can describe—undefined behavior. The return can move, reorder, position, and even be absent of values in events. It’s not entirely known why it does this, but depending on the event you are reading, they are things to take careful consideration of.

I started working on this by creating a little logging of the property event codes, this was put inside of a while loop that would cycle through all the codes in the event being fired. Whenever I would maximize the window, or minimize the window, I took note of all the events it would fire. The thing I found interesting about the maximize event is it would report a vertical and horizontal maximize property return.  However, with the minimize, I found that it would report the same two property returns, along with a hidden property return. And much like what was mentioned previously, on Arch, the order of events was different compared to Debian, thus explaining why my initial fix wasn’t working.

I decided that instead of using bitmasks to try and fix the event reporting problem as I did initially, I would instead use these multiple return values to properly determine if a maximize or a minimize event was happening. The first step was the logic of the loop I made for logging. I rearranged the code and proceeded to check for the specific conditions to be met. If the property return value was vertical max, toggle the vertical maximize flag, if it was horizontal, toggle the horizontal flag, and so on. The exception to this was the hidden flag. If the hidden flag was present, we want to toggle the other two off, this is because if the other two are active, the minimize event could be mistaken as a maximize since the maximize only reports the vertical and horizontal property return values.

The next thing needed was to simply form the condition of this check. We can determine a maximize event is occurring fairly quickly, we just need to confirm that 1) the window is the same, 2) the vertical and horizontal flags are true, and 3) the previous event is not a maximize (this will prevent spamming of the event). If we meet all these requirements, then a maximize event has occurred. The minimize event is much the same, but it is looking for the hidden flag being raised over the vertical and horizontal flags.

After we’ve gotten these conditions settled, the replacement of the bitmask was very straightforward. With this replacement, we now report the maximize and minimize of the window without any extra hassle. Furthermore, this also fixes the other bug that was occurring last week where the event test case in the unit test was failing due to a bitmask overflow occurring which resulted in the events inversing, showing minimize on maximize and maximize on minimize.

Lastly, the final change I made was removing this code. I chose to do this as the if chain it was connected to has an else block that will return a maximize immediately, I figured that removing this would improve the performance of the code by minimizing some of the conditional checking it has to do. Something else to add to this is that with the addition of my code, there is a chance this code will no longer run since my code can pick up maximize events very quickly just by reading the multiple return values. So, I removed this and tested and found all the tests worked as expected, and my manual testing also passed, on both Debian and Arch. So, with that, this code was now effectively fixed.

Of course, there was more work this week. Since this bug was fixed rather early in the week, and was considered a high-priority issue, I found myself open to working on the audio issue once more.

I found myself mostly refactoring a lot of the audio code. It was written back when C++98 was used for the library—at least that’s what I’ve been told—so a majority of the code reflected that. Usages of new and delete, the use of NULL over nullptr, and general low-level management that doesn’t reflect the more modern usages seen in C++11. A lot of my changes focused on trying to modernize the code and get it into a more functional state while also removing the rather tricky memory leaks that were spread throughout it. Almost all of them looked to be small oversights that were not noticed when initially written. The way I fixed this problem was by using `unique_ptr`. The fix wasn’t nearly as easy though. Some of the changes required making custom delete functions for the pointers since the pointers were managed by PulseAudio. These had to be handled in a very particular way since the library reference counts these allocations. So, neglecting to delete them properly results in the leakage of memory or worse, runtime crashes.

Just by a surface look at the code, it is clear the kind of work that has to be done to effectively ensure these pointers are managed. There may have been more elegant solutions for this, but this is the one I found to be most readable and straightforward from an outsider's perspective. You may be wondering how these functions get passed into the unique pointer. That, in the end, is handled on the assignment of the pointer.

Almost all of the pointers that need these delete functions are handled like this. This code, in my complete opinion, is messy, however, I can’t use the modern luxuries of `std::make_unique` since it wasn’t introduced until C++17, even then, `std::make_unique` doesn’t support setting the custom delete function we need, so this is our best option. Throughout all the code, we replace the initialization and usages of these variables with the unique pointer equivalent, while also fixing instances of new and delete with other local variables.

This will ensure that no possible memory leaks can occur. While this example is rather clear that no memory leak would happen, what about this one?

We can’t be sure, but I know that `CreateFilePath` performs a memory allocation, and I know that it isn’t freed before that reallocation occurs, so in this instance, a memory leak occurred.


Now in these changes, we know for certain that the memory is no longer leaking. We are returning a unique pointer and instead letting it maintain the memory being used and letting it deallocate using RAII over user control. These little changes just save the confusion and make it clear for new developers what to use without them having to mentally track every allocation and deallocation they make. While I can do it, and many others can, others may have a harder time with that. Especially with classes like this one which do a lot.

After performing these changes and testing them, I continued to do minor refactoring while also including another bug to close the merge request. This other bug simply remarked fixing memory leaks found with the audio class. Since I just made this rather large change to the code, I added it as it will more than likely be fixed along with the initial bug.

Now, to close the topic for this week, I have created two new issues that will have to be tracked. Firstly, recent PulseAudio versions have started crashing during 3D tests. The crash is happening from an assertion failure within the internal code for PulseAudio, within the module `pulsecore`. This is something that will have to be worked on just to ensure that newer versions are supported, as my unit testing is currently broken right now because of it. Lastly, the other issue is that Linux users using PipeWire will suffer from distorted, fast-forwarded, skipping, and corrupted audio. I was using PipeWire initially without realizing it and found out that PipeWire with `pipewire-pulse` does not like or play with the GSound and GMusic code all that well. The workaround for this is using PulseAudio directly, but that introduces the other issue with the new versions not liking GMusic3D. So, in the end, both of these issues are related, and both will need to be fixed and closed at some point soon. I may work towards trying to fix this issue before tackling the ghost window that Lari wants me to work on. Leaving GAudio in the state it is in is not sitting right with me, and the more I tinker with it and use it, the more I find pockets of issues that need to be sorted out. I’d like to work on these and fix them before I feel comfortable enough to move on to the next issue, which would be ghost windows on macOS.

No comments:

Post a Comment