Wednesday, August 14, 2019





Making Sense of a Cross-Platform Codebase Before Development 
And Other Hilarious Jokes You Can Tell Yourself 

Author: Jacob Pendleton


  Whether you found this post before you started development, three months in while searching for a reason not to give up, or somewhere in between, there are a few mistakes you must get through and a few epiphanies you must experience to be productive in a cross-platform codebase. I am mid-development on the GAudio libraries, and here are a few things I wish I had known before starting.

1. How the whole thing compiles on three different OSes


  Before Gateware I had never heard of CMake and hardly knew what went into the build process for any project. So here it is. CMake is a program that, once configured, generates an IDE-specific solution so that you, the developer, can open the solution, hit the build button, and have it just work regardless of which OS you're compiling this jumble of cpps, hpps, mms, hs, or whatever other file types you are using for development.

  Under the hood, this is just the CMake application inside the CMake folder (gateware.git.0/CMake) being run from the command line through a batch file or the OS equivalent. CMake then recursively constructs a tree structure for the project using the CMakeLists.txt files found in each directory and creates a solution for Visual Studio on Windows, Code::Blocks on Linux, and Xcode on Mac. This is how the Mac-specific hpps and mms are only included in the Xcode solution, and so on.

  In files that are shared by every platform, you can wrap Mac-specific code in #ifdef __APPLE__, for example, or use #ifndef __linux__ for code that should not be compiled on Linux. These preprocessor directives are also what link the end user's Gateware interfaces to the correct source code.
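  As a minimal sketch of that kind of preprocessor branching (this is not Gateware's actual code): the function names below are made up for illustration, and _WIN32 is the macro compilers predefine when targeting Windows, which the text above does not mention.

```cpp
#include <cassert>
#include <cstring>

// Returns a label for the OS this file was compiled on.
// The preprocessor keeps exactly one of these branches.
const char* PlatformName()
{
#if defined(_WIN32)
    return "Windows";
#elif defined(__APPLE__)
    return "Mac";
#elif defined(__linux__)
    return "Linux";
#else
    return "Unknown";
#endif
}

// Hypothetical startup check: some branch must have been selected.
void CheckPlatformName()
{
    assert(std::strlen(PlatformName()) > 0);
}
```

  The other branches never reach the compiler, which is why platform-specific calls can sit side by side in one file.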

  This solution is placed in a directory called GW_OUTPUT, which sits outside the repository but references it. This makes GW_OUTPUT completely disposable: you can always generate a new solution by rerunning the setup batch file, and the source code you wrote is preserved.

2. How the documentation works


  Gateware, like most libraries, is built with a back end and a front end: the source code and the interface respectively. The interface is what end users have access to in their project if they #include a Gateware library, but they may not know how to use every function they have access to. This is of course what documentation is there for. Gateware uses software called Doxygen to turn the raw text documentation found in the interface header files into beautiful, professional HTML and LaTeX documentation like what you would see on MSDN or similar sites. To open the Gateware documentation, just open Documentation/html/index.html with your preferred browser and voilà.

  Doxygen accomplishes this by finding specially formatted comments in the Gateware interface .h files. The comments appear in a comment block ( /* */ ), begin with a bang ( ! ) denoting the heading, and use asterisks ( * ) for each behavior. Examples can be found in the interface, online, and in the Gateware docs.
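  For a rough idea of what such a comment looks like, here is a sketch in Doxygen's bang style. ClampVolume is a hypothetical function invented for this example, not something from the Gateware interface, and the exact layout Gateware uses may differ. The \param and \return commands are standard Doxygen.

```cpp
//! Clamps a volume value to the range [0, 1].
/*!
*   Values below 0 are raised to 0 and values above 1 are lowered to 1.
*
*   \param[in] _volume The requested volume.
*
*   \return The clamped volume.
*/
inline float ClampVolume(float _volume)
{
    if (_volume < 0.0f) return 0.0f;
    if (_volume > 1.0f) return 1.0f;
    return _volume;
}
```

  The compiler ignores all of this, since it is only a comment; Doxygen reads it and attaches the text to the function that follows.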

3. The unit tests are your friend—Write them well and use them


  Gateware uses a test-driven development model (TDD) where developers first create an interface, instantiate that interface in each unit test, and require each test to pass for a single function. If you got none of that, don't worry; I'll break it down.

  Unit tests are bits of code that you run to reveal whether your code did what you wanted it to. They are a great tool for finding bugs, and they can also be used to prevent bugs from ever existing if you use them correctly. Gateware's unit tests use the Catch library and its REQUIRE and CHECK macros. Both macros evaluate the passed expression and record the result. If an exception is thrown, it is caught, reported, and counted as a failure. These are the macros you will use most of the time. REQUIRE is what we use for the final assertion: if a REQUIRE fails, the test is aborted, whereas a failed CHECK is recorded and the test keeps running.

  If you are working in an already existing interface, say you're fixing bugs or adding features, the first place to go is the unit test cpp file for that interface. Here you can create a test that has one assertion for the behavior you are trying to create. This test should fail of course, as you have not made the function do anything yet in the source code. Once you run the test and it fails, you can then move on to your function and add the minimum amount of code to make the test pass.

  The unit tests should be written so that each test stands on its own and contains little to no reference to other tests; if you comment out a previous test, it should not break any later tests. Each test follows the template:
  TEST_CASE("unique title", "[function tested]")

  Each test should ideally test exactly one behavior. Unit tests have three steps: Arrange, Act, and Assert. During the arrange phase, you create any objects and parameters needed to test the behavior. You then Act by calling the method you are testing and save its return value. You then finally Assert on that value to test that behavior. In C++, the final step is to clean up any memory that was allocated dynamically to prevent any leaks.
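  The three steps can be sketched as follows. To keep the example self-contained it uses plain assert instead of Catch; in a real test file the body would sit inside a TEST_CASE and the assert would be a REQUIRE. VolumeControl is a hypothetical class made up for this example.

```cpp
#include <cassert>

// Hypothetical class under test; not part of Gateware.
class VolumeControl
{
    float volume = 1.0f;
public:
    // Stores the volume, clamped to [0, 1], and returns the stored value.
    float SetVolume(float _volume)
    {
        if (_volume < 0.0f) _volume = 0.0f;
        if (_volume > 1.0f) _volume = 1.0f;
        volume = _volume;
        return volume;
    }
};

// One behavior per test: volumes above 1 are clamped to 1.
void Test_SetVolume_ClampsAboveOne()
{
    // Arrange: create the object and the input.
    VolumeControl* control = new VolumeControl();
    const float requested = 2.5f;

    // Act: call the method under test and keep its return value.
    const float stored = control->SetVolume(requested);

    // Assert: check exactly one behavior.
    assert(stored == 1.0f);

    // Clean up dynamically allocated memory.
    delete control;
}
```

  Clamping below zero is a separate behavior, so it would get its own test rather than a second assertion here.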

  Code coverage is a term that refers to the amount of code executed over all of your unit tests. Because each test only tests for one behavior, any function with branching logic needs multiple tests, one for each branch. For example: if my constructor checks whether nullptr was passed as one of the parameters, then even if all it does is report an invalid argument error, we need to test that it successfully fails. This is called a negative test case and can be added with a:
  REQUIRE(createObject(nullptr) == GW::INVALID_ARGUMENT);

  With this, we run the branch in our function where nullptr is passed as an argument. In a perfect world, our unit tests reach 100% code coverage. That is to say, each branch in every function has its own test. This ideal is not always reached, but it is common to be held to the standard of 80% code coverage.
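  Here is a small sketch of a function with two branches and the two tests that cover it. The Result enum, Object, and CreateObject are stand-ins invented for this example (Gateware's own return codes and factories will look different), and plain assert stands in for REQUIRE.

```cpp
#include <cassert>

// Hypothetical stand-ins for a library's return codes and object type.
enum class Result { SUCCESS, INVALID_ARGUMENT };
struct Object { int value = 0; };

// Two branches, so full coverage needs at least two tests.
Result CreateObject(Object** _outObject)
{
    if (_outObject == nullptr)
        return Result::INVALID_ARGUMENT; // covered by the negative test
    *_outObject = new Object();
    return Result::SUCCESS;              // covered by the positive test
}

// Negative test case: the function must fail the way we expect.
void Test_CreateObject_RejectsNull()
{
    const Result result = CreateObject(nullptr);
    assert(result == Result::INVALID_ARGUMENT);
}

// Positive test case: the normal path.
void Test_CreateObject_Succeeds()
{
    Object* object = nullptr;
    const Result result = CreateObject(&object);
    assert(result == Result::SUCCESS);
    assert(object != nullptr);
    delete object;
}
```

  Remove either test and one of the two return statements is never executed, which is exactly what a coverage report would flag.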

  Once you have all your tests written, they not only become one of your most valuable debugging tools but also allow you to write less code and program faster as you add features and functions. This is because TDD mandates you write the minimum amount of code to make your assertions pass, and stops you from getting distracted adding unnecessary infrastructure where bugs tend to fester.

4. How the architecture works: COM and UUIDs


  Interfaces are one of the most fundamental ideas of object-oriented programming. You may have used them in several languages and learned the various syntaxes used in higher-level languages; you may have used them in C++ without even realizing it, or perhaps you're an interface pro. Regardless, interfaces at their core are a way of referring to an object as something interactable without exposing all of its members or functions. Each Gateware library follows this principle, and because of this, the same Gateware code will compile and run on any of the three platforms without any changes.

  The GAudio interface, for example, could be referring to a WindowAppAudio object, a MacAppAudio object, or a LinuxAppAudio object, but a GAudio object always has the same functions available to it, no matter how much you change its various implementations. In many programming languages, GAudio would be defined as interface GAudio; C++, however, has no such keyword. Instead, it is defined as class GAudio with all of its functions pure virtual.

  This means a GAudio cannot be instantiated directly and must be created by a factory method. Something like the following does not work:
  GAudio * audio = new GAudio(); // will not compile: GAudio is abstract
Because GAudio has pure virtual functions (and no actual implementation code), we defer the creation of a GAudio to a platform-agnostic CreateGAudio(GAudio **_outAudio), like so:
  GAudio * audio = nullptr;
  CreateGAudio(&audio);
On Windows, this creates a new WindowAppAudio and points audio to it; on Mac and Linux, it creates a MacAppAudio and a LinuxAppAudio respectively.
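  To make the pattern concrete, here is a heavily simplified sketch of an interface, its per-platform implementations, and the factory that picks one. The class names come from the text above, but everything inside them is an assumption: BackendName is a hypothetical function, the bool return type is chosen for brevity, and _WIN32 is the compiler-defined Windows macro.

```cpp
// Simplified stand-in for an interface: every function is pure virtual.
class GAudio
{
public:
    virtual ~GAudio() {}
    virtual const char* BackendName() const = 0; // hypothetical function
};

// Simplified per-platform implementations.
class WindowAppAudio : public GAudio
{
public:
    const char* BackendName() const override { return "Windows"; }
};

class MacAppAudio : public GAudio
{
public:
    const char* BackendName() const override { return "Mac"; }
};

class LinuxAppAudio : public GAudio
{
public:
    const char* BackendName() const override { return "Linux"; }
};

// Factory: the only place that knows which concrete class gets built.
bool CreateGAudio(GAudio** _outAudio)
{
    if (_outAudio == nullptr) return false;
    GAudio* created = nullptr;
#if defined(_WIN32)
    created = new WindowAppAudio();
#elif defined(__APPLE__)
    created = new MacAppAudio();
#elif defined(__linux__)
    created = new LinuxAppAudio();
#endif
    if (created == nullptr) return false; // unsupported platform
    *_outAudio = created;
    return true;
}
```

  The calling code only ever names GAudio and CreateGAudio, which is why it compiles unchanged on all three platforms.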

  At the beginning of each Gateware interface, there is a serial number-looking thing defined as a GUUIID (Gateware Universally Unique Interface ID). These perform an important function for Gateware end-users when Gateware releases a new version, as the GUUIID will be different for every new version of an interface. Note that the GUUIID only changes if the interface behavior changes. This means it will not be updated if the Doxygen comments are changed, or any other non-behavioral change.

  A GUUIID is just Gateware's name for a UUID (Universally Unique ID), also called a GUID (Globally Unique ID). These have been around a while, and because they are so long (128 bits), a newly generated one is unique for all practical purposes. Here's GAudio's current GUUIID: 82DE61C1-C47A-41E5-90BE-C31604DF1140.

  Using a UUID for each interface has been around since Microsoft introduced COM (the Component Object Model) in 1993. The one other COM-like approach Gateware uses is the COM inheritance structure for interfaces. In Microsoft's inheritance structure, all interfaces derive from the IUnknown interface. This guarantees that certain functionality is available on every interface in the architecture.

  Gateware's base interface is called GInterface. It is GInterface that contains the request interface function which can be used by the end-user to query for a new interface using the GUUIID of the interface they want, and thereby implement an interface update.
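  The idea of querying by ID can be sketched like this. Nothing here is Gateware's real signature: the string-based IDs, the bool return type, the GAudio2 interface, and its placeholder ID are all assumptions made to keep the example short. Only GAudio's ID is taken from the text above.

```cpp
#include <string>

// IDs stored as strings for brevity. The second one is a made-up placeholder.
const std::string GAudioUUIID  = "82DE61C1-C47A-41E5-90BE-C31604DF1140";
const std::string GAudio2UUIID = "00000000-0000-0000-0000-000000000002";

class GInterface
{
public:
    virtual ~GInterface() {}
    // Hands back this object through the interface named by _interfaceID.
    virtual bool RequestInterface(const std::string& _interfaceID,
                                  void** _outInterface) = 0;
};

class GAudio : public GInterface
{
public:
    virtual void PlayAll() = 0;
};

// Hypothetical later version of the interface that adds one function.
class GAudio2 : public GAudio
{
public:
    virtual void SetMasterVolume(float _volume) = 0;
};

class AppAudio : public GAudio2
{
    float masterVolume = 1.0f;
    bool playing = false;
public:
    void PlayAll() override { playing = true; }
    void SetMasterVolume(float _volume) override { masterVolume = _volume; }

    bool RequestInterface(const std::string& _interfaceID,
                          void** _outInterface) override
    {
        if (_outInterface == nullptr) return false;
        if (_interfaceID == GAudioUUIID)
        {
            *_outInterface = static_cast<GAudio*>(this);
            return true;
        }
        if (_interfaceID == GAudio2UUIID)
        {
            *_outInterface = static_cast<GAudio2*>(this);
            return true;
        }
        return false; // this object does not support the requested interface
    }
};
```

  A caller holding only a GAudio pointer can ask for the newer interface by ID and gets a clean failure if the object does not support it.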

5. How the architecture works part 2: the reference counting system


  There are three more functions in GInterface which are implemented in every Gateware class: GetCount, IncrementCount, and DecrementCount. These three functions form the reference counting system, which you can think of as a garbage collection system. Reference counting keeps track of the number of objects and users holding pointers to a Gateware object so that when the object's reference count falls to 0, the object deletes itself.

  Normally, if all pointers to an object are lost or fall out of scope, there is no way to access that object and delete it, and its dynamically allocated memory is leaked. So when you are done with an object, you call delete on the pointer and the object is destroyed. This only applies when the object was dynamically allocated, that is, created with the keyword new. With our factory method CreateGAudio, as far as the end user is concerned, nothing was dynamically allocated, so instead of calling delete they release the object through its reference count, which starts at one. If the user creates another pointer to the object so that it is referenced from multiple places, the end user must call IncrementCount.
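  A bare-bones version of such a counter might look like the sketch below. The three function names come from the text above; their signatures, the RefCounted class, and the destroyed flag (there only so the example can show when the destructor ran) are assumptions. The sketch is also not thread-safe.

```cpp
#include <cassert>

// Hypothetical reference-counted object.
class RefCounted
{
    unsigned int refCount = 1; // whoever created the object holds the first reference
    bool* destroyed;           // illustration only: set to true by the destructor

    // Private, so the only way to destroy the object is DecrementCount.
    ~RefCounted() { if (destroyed != nullptr) *destroyed = true; }

public:
    explicit RefCounted(bool* _destroyed = nullptr) : destroyed(_destroyed) {}

    unsigned int GetCount() const { return refCount; }

    // Call once for every additional pointer that is kept.
    void IncrementCount() { ++refCount; }

    // Drops one reference; the object deletes itself when none remain.
    void DecrementCount()
    {
        assert(refCount > 0);
        if (--refCount == 0)
            delete this; // the object must not be touched after this line
    }
};
```

  Every holder calls DecrementCount exactly once when it is finished, and whichever call happens to be last frees the object.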

  To fully understand the reference counting system, we must look at a case where both the user and another part of the system are using a Gateware object. In the audio system, when a new sound is created, the end user calls audio->CreateSound(&sound). Notice how it is the audio system that creates and has a handle to the sound, but the user also has a handle? This GSound object's count starts at 2. When the user is done with the GSound object and queues the GSound for destruction (DecrementCount), GAudio still has control of the sound and may attempt to access it with PlayAll, PauseAll, or StopAll. This is why, at a safe point, GAudio iterates through its GSounds and GMusic, disconnects, and decrements the ones that are sitting at a count of 1.
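  The shared-ownership case can be sketched as follows. Sound and Audio are simplified stand-ins for GSound and GAudio, and ReleaseUnusedSounds and SoundCount are hypothetical names for the cleanup pass and a helper; none of this is Gateware's actual implementation.

```cpp
#include <cstddef>
#include <vector>

class Sound
{
    unsigned int refCount = 2; // one reference for the user, one for the audio system
    ~Sound() {}                // private: destroyed only through DecrementCount
public:
    unsigned int GetCount() const { return refCount; }
    void DecrementCount() { if (--refCount == 0) delete this; }
};

class Audio
{
    std::vector<Sound*> sounds;
public:
    // Give up the audio system's reference to every sound it still tracks.
    ~Audio() { for (Sound* sound : sounds) sound->DecrementCount(); }

    bool CreateSound(Sound** _outSound)
    {
        if (_outSound == nullptr) return false;
        Sound* sound = new Sound();
        sounds.push_back(sound); // the audio system's handle
        *_outSound = sound;      // the user's handle
        return true;
    }

    std::size_t SoundCount() const { return sounds.size(); }

    // Run at a safe point: release sounds that only the audio system still holds.
    void ReleaseUnusedSounds()
    {
        std::vector<Sound*> kept;
        for (Sound* sound : sounds)
        {
            if (sound->GetCount() == 1)
                sound->DecrementCount(); // count reaches 0, the sound deletes itself
            else
                kept.push_back(sound);
        }
        sounds.swap(kept);
    }
};
```

  After the user's DecrementCount the sound sits at a count of 1, so PlayAll and friends can still reach it safely until the next cleanup pass removes it.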

In Conclusion


  If I had known these five not-so-simple things before I started development, it would have saved me entire days of work. I write these so that you can apply them in any cross-platform environment and gain a deeper understanding of what's going on more quickly than you would have otherwise.