Sunday, March 31, 2024

Month 4 - Week 4 - WAV File Reader

Last week I went over all the changes with the audio that were done. There was a lot that happened, and a lot of different alterations were made, albeit small just to add finalizing touches. With quite some relief, the changes were merged and are now a part of the official release. However, some more modifications can be done, one, in particular, is the implementation of the WAV file reader. In its current state, there are over six individual implementations. Two for each platform (Windows, UWP, and Linux). I want to implement a universal form of this WAV reader and get it synced, so there is only one implementation that handles everything needed. That way, if we support more WAV features, it will sync across all platforms without us having to fight over all 8 implementations just to get it working as intended. We want easy usability for audio, the less complex, the better. Especially since we want to focus on the APIs we use to make the audio happen rather than the small stuff like WAV file reading implementation.

Linux was rather easy since I had just been working on it for the last month and a half, leading on to two months now. There was a lot that I focused on since Linux was probably the most low-level implementation of this WAV file reader. It had to introduce many particular structures to work the same way as Windows. The total implementation once Linux was finished was ~600 lines of code. This was the main chunk of code needed to get the base implementation down.


This is a very small glimpse into the WavReader class, but it shows perfectly what this class aims to do. You can how a lot of the code is condensed and built around making the WAV file reading as simple as possible. The code itself is fast and straightforward, and while I would enjoy simply using `constexpr` for a lot of what this class does, I sadly don't have access to that feature quite yet (however, when we update to C++17, I would like to make this class more optimized). So far, everything is looking good and the implementation across GSound and GMusic both show massive improvements in file size.



Now, we move to Windows. Windows has a lot of architecture built into it by default, so there are small improvements we can make if we are compiling for it. In particular, we can replace some of the data types we have to manually make in Linux.


As you can see, most of the code changes to WavReader from here on will simply modify basic data representation, with most of the functionality staying exactly the same. This allows us to focus more on getting WavReader properly hooked into all GSound and GMusic implementations with little confusion or frustration on how to get it done.

Same as before as well, we see a massive deduction in code once WavReader is implemented for both GSound and GMusic on Windows.



Lastly, UWP. I have just started trying to get its implementation working correctly. I had hoped that GFile and the previous data representation would work for it, however, GFile fails to work correctly for this type of use case. What has to be done now is for me to implement the low-level file handling that WinRT provides so I can read the WAV file data correctly. The reason for these issues, I suspect, is due to UWP using UTF-8/UTF-16 to read file data, making the generic GFile fail when attempting to read the files we need. I'm not sure if this is intended behavior for GFile, but I will be discussing it during the next meeting on Wednesday.

Overall, the UWP changes will be more extensive than the others but should be able to pass once I'm finished updating WavReader to support it. After that, the rest should be pretty straightforward. And with hope, this should be merged by Wednesday. For now, this is all that has been done.

Sunday, March 24, 2024

Month 4 - Week 3 - Finalization of the Audio

 Last week we had discussed closely about the audio and the eventual finalization process that would take place to merge the large amount of changes that were done. This week, these changes have been made. The theory I mentioned last week was tested and sadly bared no fruit towards fixing the popping issue. So, instead of focusing more on attempting to fix the problem, I chose to move forward with finalizing the changes and merging those as it still provides fixes to over three issues on the board, each of which was much anticipated. We will quickly go over all three issues that were fixed.

Newer PulseAudio versions results in GMusic3D crashing (#255)

This issue was a tricky one. On the surface it looked to be a change to the library that altered the original functionality, however, this issue was much easier to fix than expected. Instead of internal changes, I inspected the error message that had been printing.


This assertion was hard to break down initially, but it made much more sense over time. The assertion was telling the developer that the thread's pool stat exceeded the allotted size of the stream's buffer. To make it even more simple to state, the stream's buffer reached its maximum allowed limit, and we were attempting to write more than what it could handle on the server side. Because of this deduction of the assertion, I inspected the code deeper and found we weren't polling at all for the server on how much data we were allowed to write. Because of this, I added the functionality, which was much easier than expected.


This little change fixed the issue entirely, removing any need to worry about GMusic failing. This also fixed PipeWire's distortion of music as well. This is primarily because PipeWire, instead of asserting for this situation, it will just overwrite data to fit the new data, which results in the audio sounding distorted or broken to the end user. If we poll like this, the audio sounds completely normal as we expect it to.

Pulseaudio holds onto memory after cleanup (Linux) (#17)

This is an old issue but related heavily to memory leaks that were present in the audio before I started my fixes. This issue was spotted by using Valgrind (a popular memory leak detection tool). It detected several memory leaks in the audio, which resulted in me stepping in with a keen interest in what could be causing it.


I fancy myself as a memory developer. I enjoy these kinds of issues because memory leaks are fun to tackle and understand. They are tricky, but that's why I like them in the end. Due to how sneaky they can be sometimes, I enjoy breaking them into sections and really approach how to fix them. However, this change was a little intense to fix the problem. A lot of the code internally was C++98, because of this, there was a lot of code that was old and not set to modern C++ standards. I decided the fix for this would be replacing the manual `new` and `delete` allocations with the more modern `unique_ptr` usage. This will handle the actual allocation while also managing the deallocation without us worrying about anything slipping through the cracks. The implementation was rather extensive but improved the overall code massively.



There are of course many more changes to the code, but this is just a brief glimpse into the changes made. Overall, I was happy with the results, mainly because of this; if the memory leak is still present, then it is internal to PulseAudio and not with the implementation code.

Pipewire support for PulseAudio is skipping, distorted, and sometimes fails working (#256)

Lastly, we have this behemoth.


There is not one change that fixed this issue, instead, it was a collection of different changes made to PulseAudio that improved the usage which eventually fixed this problem. The implementation in its current state is a 1:1 ratio, meaning that PulseAudio and PipeWire are acting the same. This is good because that means any outside issues are now fixed across both subsystems. Besides the popping on both subsystems, everything is working as expected now.

Final Notes

With everything done, there were over 1,600 lines of code changed. About 780 lines were added with 820 removed, that's a lot of changes. The review took a minute, there were over 103 commits to comb through and lots of changes to go through, but in the end, it was merged successfully. So, what now?

Well, there is one last change I'm doing for both Windows and Linux, a universal WAV file reader. This will lessen the amount of implementations that exist for reading WAV files. I'm mostly done with it now, just have to get the GMusic implementation to work correctly, and then we'll be golden to replace full implementations. I would provide screenshots for this, but I forgot to actually push these changes to GitLab, so they're still on my local Linux sadly, and I'm writing this blog post on my Windows. I will be providing the solution I have written up next week however once everything is fully completed, which should be by tomorrow (I would've had this done sooner, but personal life took away my ability to write it up quick enough).

Beyond that, the next issue will be the ghost window issue for Mac which will more than likely take all my focus for the rest of this month and next month. I'm hoping learning Objective-C won't be too painful, but I know XCode will have its quirks that I'll have to get used to.

Overall, this was everything done over the last month and a half, and I'm looking forward to tackling some new issues that are unrelated to audio, as, if I'm being honest, audio has been tiring. Doing a different issue would be like tasting a different ice cream flavor, and I'm looking forward to that, but I have a job to do, and I plan to at least do it right. With that, this is all I have to say for now. Till next week.

Sunday, March 17, 2024

Month 4 - Week 2 - Health Issues and Finalizing the Audio Fix

 This merge request has been a beast, and sadly, the ending is not the one I would've preferred. However, a lot of time was spent on this, and a lot of effort was poured into it, so even though I couldn't get everything I wanted, the outcome is still substantially better than before.

Let's provide a little preface before we continue. My blog post this week will be very brief compared to my traditional blog posts. I've been dealing with some pesky health issues this week that have affected a lot of things for me, from food consumption to digestion issues to general body weakness, it's been a lot. To add onto this, it has also spurred a rather severe form of insomnia so my ability to work effectively has taken a massive toll as compared to normal. You can see this by the timestamps on my commits.


Safe to say, I'm in a bit of a rut right now, but with hope, I'll be back to normal soon and will be able to pour my best effort into continuing to fix bugs and get some of the issues present on the board taken down.

Now, onto the changes with this merge request, as the outcome of it has changed. Sadly, two issues will still be present after this issue closes. Over the last couple of weeks, I've been pouring in constant effort to attempt to fix this popping audio bug, however, any fix I provide doesn't get it. I've done many different solutions, from new drain flags to partitioning how much data is sent to the server, all the way to even corking the stream before flushing the buffer in an attempt to rid it of popping when the audio stops. Even with all of these changes, nothing seems to work, and the issue still persists. Even more frustrating is that the PipeWire support is affected by some of these edits, particularly the drain flag and corking. Both "fixes" result in the audio cutting off early or sounding more distorted for PipeWire, which leads me to believe that those solutions aren't correct and won't help the overall PipeWire/PulseAudio compatibility I'm seeking for the library.

So, with that, those changes have been reverted.



The last attempt I could throw is writing over what the polling write size gives for the server side of PulseAudio. This could fix it because this is exactly what GMusic does. Since GMusic relies on the streaming buffer size, PulseAudio has to support this buffer size on the server side. Something else to note is when I was making PulseAudio print out the write size it allowed initially, it supported only up to the sample rate. If you need to get more familiar with audio, the sample rate is how many samples make up a single second in audio. So PulseAudio was only allowing 1 second of audio to be written. This is a little problematic as writing just a second could cause skipping or more issues since the code to write to the stream itself could take longer than a second. GMusic doesn't suffer from this, however, since it writes the streaming buffer size, which is around ~65,000 bytes. This gives GMusic more than enough time to process and calculate what it needs to before it needs to write to the stream again. Pretty useful.

Possibly, as a theory, this is what GSound is suffering from. However, I won't be sure until I test it.

That will probably be my last attempt to fix this issue. By Monday, I will be finalizing the refactoring I've done, and the fixes I have made, before merging it completely. Even though I could fix the issue initially targeted, I was able to greatly modernize the code and bring it up to speed with C++11. I also improved much of the memory safety with the usages of `unique_ptr`, though this safety to performance won't be felt until C++17 for library developers, since `constexpr` is not introduced for `unique_ptr`s until C++14/17. Thankfully, those using the library can enjoy high performance and memory safety since this is a single header library, thus it will only compile when the user builds their application (which tends to default to C++14/17 now).

In total, there were over 93 commits, and almost 1,000 lines of code changed.


This took a lot of effort and focus to get all the changes implemented, and overall, I'm happy with the results. Though, as I mentioned, I would've liked to have seen this popping issue fixed. I have that last theory I can test, but if it doesn't fix the issue, it will have to sit unsolved for a little while longer.

While this blog post doesn't go over any problems I faced and acts more as an overview of the last month and a half of work I've done on this single interface, I feel it is worth writing all of this. Not just for me, but for others who may have to approach my code in the future. This way they know what I've tried and tested. It will help check off a lot of things from the list, and help them with diagnosing and coming up with solutions. I'm fine with that.

So what am I doing after this merge request? Well, I'm going to be focusing on implementing a universal WAV file reader, that incorporates a more C++-compliant structure to everything, as well as providing proper safety checks for everything. There were several safety checks that the old system didn't have.


There are also a lot of odd choices in terms of how the file is read for each implementation, particularly with GSound and GMusic. One will read all the data, and the other will read the file twice (if I'm not mistaken). Once for the header information, and another time to find the stream data. I find this odd and feel there could be a class for this, something that you can control to simply load all the data for you, or get a nice index/pointer to the data.

Besides that, the next issue to focus on after the quick implementation of the WAV file reader will be the long-awaited ghost windows that are plaguing Mac. I'm not sure how well I'll do tackling it, as I have no experience in Objective-C, but I should get the hang of everything relatively quick. At the end of the day, any C language compiles to the same assembly, so I should be okay.

With that, this is all I have to give for the blog post this week. I wish I could've done more, but my health has been in constant waves this entire week, so I'm having to balance everything lightly. I hope this provides some good insight into everything that has been done.

Sunday, March 10, 2024

Month 4 - Week 1 - Diagnosing The Popping Sound

I will take the time to preface this blog post by saying there will not be a lot of information within it. This is mainly due to some personal health issues that suddenly sprang up earlier in the week, so I was a little tight on time in terms of what kind of work I could do. However, even with this limitation, I still want to be able to do some work and try to figure out some of the issues that seem to be occurring with PulseAudio.

The first thing I will do is a brief breakdown of the issue. This is pretty easy. When the audio is playing, there are occasional popping sounds that can be heard when the audio ends. On PipeWire, the audio sounds a little worse as a static sound can form within the audio occasionally. PipeWire's PulseAudio support is pretty great and isn't known for having issues like these, so this points to our implementation being error-prone or heavily misused from how PulseAudio should be used.

Unlike my other posts, I don't have a fix for this currently, so instead I will be listing the fixes that I have tried thus far and show what has been done in an attempt to get this issue solved.

Volume Misuse

One theory I had was possible volume issues as some of the tests that caused popping manipulated the volume, this includes some of the surround sound tests and all of the 3D audio tests. However, I can't be certain of this because all of the math involved with it looks to be correct. Though, there could be a misuse of functions inside the volume controls. I may look into this further.



Multiple Drain Calls

Another possible issue is that `pa_stream_drain()` is called repeatedly whenever GSound reaches the end of the buffer. This function is inherently supposed to only be called once. So, maybe the repeated calls to it are caused by the popping sound. I made a simple fix for this, but the results showed no difference. However, I may still implement the change just to have cleaner interfacing with PulseAudio. The fix simply incorporates an atomic flag for draining to prevent it from being called repeatedly.



Not Corking Stream Before Flushing

This is more of a critique of my own fix. As we discussed last week, there were some changes I made to fix some issues I found in the code. One of those is the erroneous use of `pa_stream_cancel_write()`. The fix for this was using `pa_stream_flush()` instead, but this could also cause a possible popping sound. This is because when you call the function, it flushes the buffer and empties it entirely. What this means is that the buffer will effectively lose all the data it has. So, if by some means, it was halfway through a sample when it was flushed, it could cause a popping sound. The fix for this would be corking the stream before flushing it. I tried this fix and did notice a difference, but found the result to sound a little weird. Some samples would end a little earlier than expected, and the popping, most of all, was still present.



Incomplete Samples Written to the Buffer

I won't be spending a lot of time explaining this. The basic idea is that when we write to the stream's buffer, we may be writing an incomplete sample due to not sizing based on how large the samples are. This, however, is not how the buffer works. The stream's buffer should never reach the end where incomplete samples are, solely because the buffer is treated like an hourglass. Imagine the buffer, with all of the data, as an hourglass. You have the sand at the top, our data, and the middle where the glass is pinched is what's playing, with the bottom being the already played data. The sand is going to slowly drain into the bottom at the pace of the sample rate. So when the sand at the top starts getting low, we can just add more sand to the top so it can continue playing. This might be a bit of an oversimplification, but this is generally how most audio libraries handle stream data. Because of this, this "issue" isn't really a problem. As long as we are adding sand at the correct pace, there shouldn't be any popping or skipping of the audio.


Conclusion

I wish I could have included more detail or even a fix this week, but this problem has proved to be a rather large headache that is difficult to locate. I'm starting to believe that all of these different issues that I've spotted may be connected to each other thus making the popping sound happen. So possibly fixing all of them could fix the popping issue itself. I will be spending next week making major changes to the code to get it more structured and grounded for PulseAudio to effectively work with it. There is also the topic of making a universal WAV file reader implementation that I still have to do, however, I have been a little distracted trying to get this problem sorted out. This is primarily because if I fix this issue, then PulseAudio popping is fixed and PipeWire support is fully functional, meaning two issues are handled in one. The problem is just trying to get it fixed.

With that, these are all the changes I have made so far, I will hopefully be providing a larger update next week showing what solution I came up with. And that will also include fewer health issues I hope.

Sunday, March 3, 2024

Month 3 - Week 4 - Fixes to Many Bugs in Audio

Last week we touched upon the audio system. There was a lot that needed to be focused on, and a lot that happened after the unique pointer change. One of the main adjustments was further refactoring. I wanted to set a key point in trying to modernize the code somewhat, as a lot of it was written around the time the library used C++98. For this blog post, however, I'm not going to be focusing on all of the changes I did. I'm instead going to be a little bit more targeted and focus specifically on certain issues that I was working on. Some of these involved library code changes, other bits involved fixing issues that were found present in the code, and a lot of it involved research into understanding what some of these functions were doing in more detail. So let's start with the easiest.

Fixing Erroneous Function Calls with PulseAudio

One of the first things I really started focusing on was researching in more detail what these function calls were actually doing. I had an idea of what the underlying PulseAudio system was like, that the library itself communicated with the PulseAudio server that ran on the Linux operating system. This is rather easy to understand about Linux if you've worked in the space long enough. It's not as simple as how Windows works, where you're more or less interfacing with a driver, this is a little bit more direct in that it doesn't communicate with the drivers but more or less handles how the audio will be outputted to the drivers. With this distinction in mind, we have to imagine that we are a client that is communicating with the server whenever we are handling audio. Thus all of these function calls have to send a packet to the server, and then receive that packet with additional information on what's happening. I wanted to understand how some functions worked, particularly, a function that I had a curiosity about. This function was `pa_stream_cancel_write`.

Whenever you call to write data to the stream on the server, you have a lot of control over how this data can be written, transferred, and even used. Now what this function is supposed to do is that it is supposed to cancel the write operation. However, from reading the documentation, this function has to be called before the write is fully called. This function is only used with a particular function, `pa_stream_begin_write`. There is a problem with this, we never call that function. So what does the cancel operation do? Well, it does nothing. So when we call to stop the audio, we actually don't stop it at all, we just call to a function that doesn't do anything.

I looked deeper into this, and then started sourcing alternatives, eventually landing on the function `pa_stream_flush` which will stop the playback and empty the buffer fully. This might be what stop is expected to do, however, if not, we can use the cork function to instead pause the playback. The change for this was rather important and definitely made calls to stop much clearer.


Fixing Looping Failure Inside GMusic

This issue was a little bit more involved. There are normally two models you can roll with when doing write operations to PulseAudio. The first is asynchronous, and the second is polling. There was a problem with GMusic, it didn't use either of these models. So what would happen to cause the looping failure was that GMusic was constantly writing to the PulseAudio stream buffer, on the server side. Eventually, due to the sample song being large enough, it crossed the maximum allocation limit allowed for the stream, and the server would assert a failure that the client would receive. This was the failure that was being reported out of `pulsecore` which I mentioned last week. The fix is pretty clear, we just need to adapt one of these models for GMusic. The easiest, since we are already threading, would be polling. All we need is to know the size that's available to write to in the PulseAudio server. We can then wait until the server has a size available, and then write data to it. It doesn't have to be exact, it just needs enough room that it can write the data effectively.

This fix was surprisingly very easy.


Just with these small additions, now GMusic can play audio without overloading the buffer. This also had another benefit. Due to this change, PipeWire can now play the audio perfectly fine. There is still static and popping, but that is a result of another issue that has to be worked on. What this means, however, is that PipeWire support may be entirely possible here soon. However, I'm going to need more time to implement the changes needed for this.

Furthermore, due to both of these fixes, I've also included in the merge request that both of the new issues made around the audio will be fixed upon merge. This means that this merge request will now close a total of 4 issues in one merge, which is something I'm rather looking forward to.

Beyond these two major fixes, there was a plethora of refactoring changes that were done. Most of these work to make the code more readable and understandable. An example of this is the picture below.


Changes like these will not only make the code easier to get into but also make reading it more pleasant. This change was larger than my previous refactors, but it sought to fix the amount of nesting and confusing branch control calls (`break`, `continue`, etc.) that were being done.

Most of this week was devoted entirely to researching and understanding the PulseAudio library. There was a key focus on understanding this before I made any major functional changes. I wanted to ensure that my fixes wouldn't only stand the test of time afterwards, but also be reliable overall for users that are working on the Linux side of code.

With that, this is what I've done this week. It wasn't as intense as the previous weeks, but I found the progress to be more than fruitful for the overall stability of the audio system on Linux.