Avoiding Race Conditions with LabVIEW Programming

Tags: LabVIEW, Software

Race conditions are among the most common issues in software, and they introduce some of the most perplexing and difficult-to-track bugs. In this blog post, we’ll take a deeper look at race conditions: what they are, what their consequences are, and how you can avoid them through LabVIEW programming.

What are Race Conditions?

A race condition is a situation in electronics or software where the outcome depends on the order in which two or more events occur, such as which signal reaches a logic gate first or which software thread changes a stored value first. Typically, a race condition is caused by poor control over the order of operations acting on shared resources, whether those resources are shared between interleaved tasks, multiple threads, or separate programs.

When a race condition occurs, outcomes can be erratic and indeterminate, because the speed of individual operations influences the order in which operations and decisions happen. On the surface, this creates functional problems that appear very inconsistently: most of the time the software functions as intended, but occasionally it does something unintended despite no change in the user’s actions. Preventing race conditions therefore also prevents some of the most challenging bugs to track down.

In the software world, race conditions exist only where code executes concurrently (for example, in multiple threads) and where those threads depend on some shared resource (often a location in memory, or hardware access).

Race conditions can’t easily be monitored, which is one of the reasons they are so hard to find when issues arise. They are best managed by following sound software development practices and by using mitigation tools wherever a race condition could exist.

One of the simplest race conditions can be illustrated with the following LabVIEW code. This code contains two free-spinning loops that read the value from the Race Count indicator on the front panel, add one, then write that incremented value back to the indicator. These loops may represent multiple independent test counters tracking the total number of tests performed on parts: each loop would track how many times the loop’s part was tested and would be adding to the count of total units tested on the system. The race condition in this case would cause “lost” or “missed” test counts even though they were performed.

An example of a race condition, illustrated in LabVIEW.

A close-up of the Race Count indicator in LabVIEW, along with the total iteration count and the percentage of missed counts.

Ideally, the Race Count value should track the number of iterations executed across the two loops (i.e., the race count and the total iteration count should be the same), but as the results in the above example show, 26% of the counts have been lost. Breaking down this example, the first loop reads the race count, and before it can increment the value and write it back, the bottom loop reads that same value. Both loops then increment and write back the same number, so one of the counts is never recorded. To prevent this race condition, we need to control access to the value so that only one of the loops can access it at a time.
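The lost update can also be reproduced outside LabVIEW. The following is a minimal Python sketch, not the block diagram from the figure: the names `run_forced_race`, `counting_loop`, and `ROUNDS` are hypothetical, and the barrier exists only to force the unlucky ordering on every iteration so that the result is repeatable. In real code the overlap happens only occasionally, which is why the loss rate varies.

```python
import threading

ROUNDS = 1000  # iterations per loop (arbitrary value for this sketch)


def run_forced_race(rounds=ROUNDS):
    """Run two counting loops that always collide on the shared count."""
    state = {"race_count": 0}       # stands in for the Race Count indicator
    barrier = threading.Barrier(2)  # only here to force the unlucky ordering

    def counting_loop():
        for _ in range(rounds):
            local = state["race_count"]      # read
            barrier.wait()                   # both loops now hold the same value
            state["race_count"] = local + 1  # increment, then write back
            barrier.wait()                   # both writes land before the next read

    loops = [threading.Thread(target=counting_loop) for _ in range(2)]
    for loop in loops:
        loop.start()
    for loop in loops:
        loop.join()
    return state["race_count"], 2 * rounds


if __name__ == "__main__":
    count, total = run_forced_race()
    print(f"race count: {count}, total iterations: {total}")
```

Because every iteration is forced to collide, this sketch records 1000 counts for 2000 iterations, so half of the counts are lost.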

How to Prevent Race Conditions

Race conditions can produce unintended outcomes more often than 26% of the time, but also much less often, which makes erroneous events difficult to capture and investigate, especially in a large application. One way to isolate access to a resource is to use timing. The read > increment > write section of the code occurs in a brief period of time (e.g., 1-2 microseconds). Using the following code modification, we can make one loop run only on even millisecond counts and the other run only on odd millisecond counts.

An example of timing conditions used to prevent a Race Condition.

The apparent Race Count after modifying timing conditions.
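The even/odd millisecond scheme can be approximated in text form as follows. This is a hedged Python analogue rather than the LabVIEW code in the figure; `slot_is_open`, `wait_for_slot`, and the injectable `clock` parameter are hypothetical names introduced for illustration.

```python
import time


def slot_is_open(millisecond_count, parity):
    """Return True when the millisecond count matches this loop's parity.

    parity 0 means the loop may run on even counts, parity 1 on odd counts.
    """
    return millisecond_count % 2 == parity


def wait_for_slot(parity, clock=time.monotonic):
    """Spin until the current millisecond count belongs to this loop."""
    while True:
        millisecond_count = int(clock() * 1000)
        if slot_is_open(millisecond_count, parity):
            return millisecond_count
        time.sleep(0)  # yield to other threads while waiting


if __name__ == "__main__":
    print("even loop may run at millisecond", wait_for_slot(0))
    print("odd loop may run at millisecond", wait_for_slot(1))
```

Each loop would call `wait_for_slot` with its own parity before the read > increment > write step. Nothing in this sketch stops a loop from being delayed after it passes the check, so the separation depends entirely on the protected step finishing within its millisecond.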

Although the Race Count indicator suggests that modifying the timing conditions prevents the race condition, this method is not flexible and is not guaranteed to work. The timing method has some significant shortcomings:

  • A different time interval must be assigned to every location in the code that accesses the variable.

  • The more access points there are, the slower your code will run. Even in the scenario with only two operations competing for the resource, execution speed slowed by a factor of 1000.

  • In a non-real-time operating system like Windows, myriad preemptions can occur due to hard drive access, antivirus scans, and other applications monopolizing the CPU, and these can easily lead to more than a millisecond of delay between the frames shown above.

Preventing Race Conditions with a Semaphore

A more robust way to prevent race conditions is to use a resource lock, or semaphore. A semaphore is a piece of code that gates access to a resource. When writing your code, you create a semaphore and pass its reference to each loop that may need to access the resource in question. When access is desired, use the semaphore reference to request access, or “acquire” the semaphore. Whenever the resource is free, the semaphore is granted and code may proceed to access the resource. Once your segment of the code is executed, the semaphore must be “released” so that other parts of the code may access the resource in turn. A modification of our example that uses semaphores is shown below.

An example of software written using a semaphore to control access to a shared resource.
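The acquire, access, release pattern can be sketched in Python with the standard library's `threading.Semaphore`. This is an analogue of the approach, not the LabVIEW semaphore VIs themselves; `run_with_semaphore`, `counting_loop`, and `ROUNDS` are hypothetical names.

```python
import threading

ROUNDS = 10_000  # iterations per loop (arbitrary value for this sketch)


def run_with_semaphore(rounds=ROUNDS):
    """Run two counting loops that share one semaphore around the count."""
    state = {"race_count": 0}         # the shared resource
    access = threading.Semaphore(1)   # one semaphore, passed to both loops

    def counting_loop():
        for _ in range(rounds):
            access.acquire()          # wait until the resource is free
            try:
                # read > increment > write, now done by one loop at a time
                state["race_count"] = state["race_count"] + 1
            finally:
                access.release()      # always hand the resource back

    loops = [threading.Thread(target=counting_loop) for _ in range(2)]
    for loop in loops:
        loop.start()
    for loop in loops:
        loop.join()
    return state["race_count"], 2 * rounds


if __name__ == "__main__":
    count, total = run_with_semaphore()
    print(f"race count: {count}, total iterations: {total}")
```

The `try`/`finally` ensures the semaphore is released even if the protected step raises an error, which keeps the other loop from waiting forever.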

This addresses all the shortcomings of the timing example and guarantees that only one location accesses the resource at a time, provided every location uses the semaphore. Semaphores do not actually lock the resource itself: acquiring the semaphore in one location does not prevent code elsewhere from using the resource without it. The semaphore reference must therefore be passed to every location that needs access, and you and everyone on your team must consistently acquire it before using the resource.

In its simplest form, the Acquire Semaphore function blocks the calling code and waits until the semaphore is available. An optional timeout lets your loop provide alternate functionality if the resource doesn't become available within a certain amount of time.
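A timeout-based acquire might look like the following in Python, where `threading.Semaphore.acquire` accepts a `timeout` in seconds and returns `False` if the wait expires. The function name `access_or_fallback`, the return strings, and the 50 ms default are assumptions made for this sketch.

```python
import threading


def access_or_fallback(access, timeout_s=0.05):
    """Try to use the resource; take an alternate path if the wait times out."""
    if access.acquire(timeout=timeout_s):
        try:
            return "accessed resource"   # the protected work would go here
        finally:
            access.release()
    return "ran fallback"                # alternate functionality


if __name__ == "__main__":
    access = threading.Semaphore(1)
    print(access_or_fallback(access))    # semaphore is free
    access.acquire()                     # simulate another location holding it
    print(access_or_fallback(access))    # wait expires, fallback runs
    access.release()
```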

Semaphores can also be used to gate access to a limited pool of resources, such as when you have four DAQ cards but want access from five or more locations. When you obtain the semaphore reference, wire the number of resources to the size input so that the semaphore blocks further requests only once that many locks are outstanding.
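The four-cards, five-locations case can be sketched in Python with a counting semaphore. The initial value of `threading.BoundedSemaphore` plays a role similar to the size input here; `run_card_pool`, `worker`, and the 10 ms of simulated work are hypothetical, and no real DAQ hardware is involved.

```python
import threading
import time


def run_card_pool(num_cards=4, num_workers=5):
    """Let several workers share a smaller pool of cards via one semaphore."""
    cards = threading.BoundedSemaphore(num_cards)  # at most num_cards locks out
    guard = threading.Lock()                       # protects the stats below
    stats = {"in_use": 0, "peak": 0, "completed": 0}

    def worker():
        with cards:                                # blocks when all cards are taken
            with guard:
                stats["in_use"] += 1
                stats["peak"] = max(stats["peak"], stats["in_use"])
            time.sleep(0.01)                       # simulated use of a card
            with guard:
                stats["in_use"] -= 1
                stats["completed"] += 1

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return stats


if __name__ == "__main__":
    print(run_card_pool())
```

Every worker eventually completes, but the number of cards in use at any moment never exceeds the pool size.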

Avoid Race Conditions with Software from Genuen

At Genuen, all our code is developed using our company’s best practices to prevent race conditions as much as possible. Additionally, we perform formal software reviews of the modules we create, which helps us reduce the number of issues that make their way into the final stages of system debug. We offer system architecture design and software development services to those not familiar with the pitfalls and traps that can arise when developing LabVIEW software. Contact Genuen today for help developing or architecting testing software for your application.

Rebecca Slota

Rebecca is a software engineer with 8 years of experience at Genuen. She primarily designs, develops, and debugs test system software. She specializes in LabVIEW and Python integration, developing hyperautomated test processes, and NI Package Manager software distribution.

About Genuen

Our goal is to improve time to market without compromising product quality or safety standards. With experience in mission-critical applications and regulatory compliance, Genuen creates custom test systems across the product lifecycle, including hardware-in-the-loop (HIL), fluid power test, and electromechanical test. Headquartered near Kansas City, we have offices across the United States and serve clients in aerospace, transportation, national security, and beyond. The company's Quality Management System (QMS) is certified to ISO 9001.