Chapter 8: bugs, clarifications, and tips
Chapter 8: Protecting Data and Synchronizing Tasks (pg 177)
● Tip, page 178
o When running Build All, here's a way to check for build errors:
■ Before running Build All, clear the console
■ After running Build All, check the Console messages for errors.
● In the Console frame, click on the icon "Display Selected Console", then select "CDT Global Build Console". This will show the console messages for all six of the projects.
● In the console, search for "errors" to check that all six projects built correctly.
● Tip, page 181
o I found it very helpful here to take the time to analyze mainSemExample.c with SystemView, which also builds skill in using SystemView.
o SystemView tips:
■ In the Timeline window, set the zoom level: right-click : Zoom : View 200us
■ To go to an event of interest in the Timeline: in the Event or Terminal window, click on the event
● Additional info, page 181
o This info provides analysis of how the scheduler works, and its relationship to SysTick ISR and the Idle task. This info is from analyzing mainSemExample.c, and from the book and Stack Exchange.
o SysTick ISR:
■ Every 1 ms, the SysTick ISR is run
■ Info from the book on processing ticks:
● "...the scheduler is still running at predetermined intervals. No matter what is going on in the system, the scheduler will diligently run at its predetermined tick rate.", page 47
● "All of this switching [between tasks] does come at a (slight) cost – the scheduler needs to be invoked any time there is a context switch. In this example, the tasks are not explicitly calling the scheduler to run. In the case of FreeRTOS running on an ARM Cortex-M, the scheduler will be called from the SysTick interrupt...
● A considerable amount of effort is put into making sure the scheduler kernel is extremely efficient and takes as little time to run as possible. However, the fact remains that it will run at some point and consume CPU cycles. On most systems, the small amount of overhead is generally not noticed (or significant), but it can become an issue in some systems.", page 45
■ Apparently, part of what the SysTick ISR does is call scheduler-code, to perform scheduling functions. This can be seen in the SystemView screenshot below. This analysis involves some speculation, so it might not be fully correct:
● The SysTick ISR starts running at event 1053.
● The SysTick ISR runs scheduler-code, and this scheduler-code runs under the SysTick ISR.
● At event 1054, the scheduler-code has detected that BlueTaskB's delay has ended, and the scheduler-code moves BlueTaskB to the Ready state.
o Tasks that are marked ready for execution are displayed with a light grey bar until their execution starts, e.g., as here, for BlueTaskB.
● At event 1055, the scheduler-code moves BlueTaskB to the Running state. The SysTick ISR ends, and BlueTaskB is started.
■ How long the SysTick ISR runs (some speculation is involved here):
● It appears that when the SysTick ISR does not have much to do, it runs for 2-4 microseconds. This is seen in events 1051-1052 and 1059-1060.
● The SysTick ISR can run longer when it has more scheduler-code to run. This is the case from event 1053 to 1055, which is about 10 microseconds.
● Since the SysTick ISR runs every 1 ms, its system overhead appears to be around 0.2% to 0.4% for a typical tick, and about 1% for a tick that does more scheduling work.
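The overhead estimate above is simple arithmetic: ISR run time divided by the tick period. A minimal sketch of that calculation follows. The function name is made up for illustration, and the durations are the rough values read from the SystemView trace, not precise measurements.

```c
/* Percentage of CPU time spent in an ISR that runs for isr_us
 * microseconds once every period_us microseconds.
 * Hypothetical helper, not part of FreeRTOS or the book's code. */
double isr_overhead_percent(double isr_us, double period_us)
{
    return 100.0 * isr_us / period_us;
}
```

For example, with a 1000 us tick period, isr_overhead_percent(4.0, 1000.0) gives 0.4 and isr_overhead_percent(10.0, 1000.0) gives 1.0.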
o Idle task:
■ The Idle task runs when no other tasks are running.
■ For GreenTaskA and BlueTaskB, their priority is set relative to the Idle task, and higher than the Idle task.
■ For mainSemExample.c, the Idle task runs for relatively long periods. During that time, the SysTick ISR is run every 1 ms (shown in events 1058 and 1059, above)
o Scheduler
■ "...FreeRTOS doesn't even have a real scheduler. It maintains a list of runnable tasks, and at every scheduling point (return from interrupt or explicit yield), it takes the highest priority task from that list."
● https://stackoverflow.com/questions/7506461/implementing-scheduler-in-free-rtos
■ That Stack Overflow post seems plausible, but I haven't confirmed it.
■ From the screenshots above, it appears that when BlueTaskB calls vTaskDelay, scheduler-code is run (shown in the Scheduler row). The scheduler-code changes BlueTaskB to the Blocked state and starts the Idle task (event 1057 and blue vertical-line, and event 1058).
● Bug in book and code (mainSemExample.c), page 182f
o Problem:
■ The operations on the variable "flag" here need to be atomic, but they are not.
■ flag is a global variable. It's set and referenced by the two concurrent tasks GreenTaskA and BlueTaskB.
■ In general, for two tasks to use a variable like flag in this manner, atomic instructions are needed, for the code to work properly.
● If this code does not require atomic instructions to work properly, determining that with certainty would be difficult and impractical, in my estimation.
■ It appears that the generated assembly language does not use atomic instructions.
● The generated assembly language uses LDR and STR. It's shown below.
● In my Internet searches, I didn't see mention of LDR and STR being atomic. But, my search wasn't exhaustive.
o Possible solutions, for concurrent operations on the flag variable:
■ A mutex could be used, when accessing flag.
■ ARM does have atomic instructions, which might work here.
● http://www.doulos.com/knowhow/arm-embedded/implementing-semaphores-on-arm-processors/
● https://stackoverflow.com/questions/11894059/atomic-operations-in-arm
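One way to sketch the "atomic instructions" option is with C11 atomics, letting the compiler pick the instructions. This assumes the toolchain supports C11 <stdatomic.h>; I have not checked it against the book's project. The helper names flag_set and flag_test_and_clear are hypothetical, and this is not the book's code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical replacement for the plain global "flag".
 * A static atomic_bool is zero-initialized, i.e. false. */
static atomic_bool flag;

/* Called by the task that raises the flag. */
void flag_set(void)
{
    atomic_store(&flag, true);
}

/* Called by the task that consumes the flag. The read and the clear
 * happen as one atomic exchange, so a set that arrives in between
 * cannot be lost. Returns the value the flag had before clearing. */
bool flag_test_and_clear(void)
{
    return atomic_exchange(&flag, false);
}
```

The design point is that "test, then clear" becomes a single operation. With a plain variable, those are separate load and store steps, and another task can run between them.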
● Bug in the code (mainPolledExample.c), page 183
o Problem (minor bug):
■ There's a compiler warning for mainPolledExample.c, for line 124, which is:
● while(!flag);
■ Warning message:
../Src/mainPolledExample.c:124:6: warning: this 'while' clause does not guard... [-Wmisleading-indentation]
while(!flag);
^~~~~
o Solution: while(!flag){}
● Clarification, page 185
o The program described here is mainSemTimeBound.c
● Clarification (mainSemTimeBound.c), page 185
o StmRand(3,7);
■ Returns a random number between 3 and 7, inclusive.
o The only documentation I could find for StmRand() is from STM32CubeIDE. It can be accessed by right-clicking on the function, and selecting "Open Definition."
● Additional info (mainSemTimeBound.c), page 185
o In GreenTaskA, numLoops is changed on every iteration of the loop. This makes it non-trivial to figure out the expected behavior:
■ How much time is there between calls to xSemaphoreGive()?
■ How often will xSemaphoreTake() time out?
o Results from using SystemView, and running the program for over 2 minutes:
■ xSemaphoreTake() timed out about 42% of the time (107 times), and succeeded about 58% of the time (145 times).
● Clarification (mainSemTimeBound.c), pages 185-186
o The code shown on page 185 differs from the code on GitHub (mainSemTimeBound.c). The code on GitHub has calls to SEGGER_SYSVIEW_PrintfHost(), and those calls are not in the code on page 185.
o The figure on page 186 was created from running the code on GitHub. It shows the messages from calling SEGGER_SYSVIEW_PrintfHost().
● Bug in the book, page 186
o There are two related errors in the figure:
o The events shown in the Terminal window are not the same events shown in the Events-List and Timeline windows. The events in the Terminal window have different timestamps than the events in the other two windows.
o In the Terminal window, the events labeled "2" and "3" do not correspond with the events labeled "2" and "3" in the Timeline window. For each window, the time between those events is very different.
■ In the Terminal window, the time between events "2" and "3" is about 800ms.
■ In the Timeline window, the time between events "2" and "3" is about 300ms.
● Bug in the book, page 186
o There's a minor error in this sentence:
■ "Marker 1 indicates TaskB didn't receive the semaphore within 500 ms. Notice there is no follow-up execution from TaskB – it immediately went back to taking the semaphore again."
o This part of the sentence seems incorrect: "there is no follow-up execution from TaskB".
o There is "follow-up execution from TaskB". When TaskB doesn't receive the semaphore within 500 ms, it sets the red LED on.
● Bug in the book and code (mainSemPriorityInversion.c), pages 190-193
o Problem
■ The program mainSemPriorityInversion.c does not result in "priority inversion", as intended.
■ TaskB is intended to spin so much that TaskA is often prevented from getting the semaphore in time. However, TaskB doesn't spin enough for that to happen, so TaskA never times out.
o Analysis
■ Two tests were run that show this:
● The program was run for 7.5 minutes, and the SystemView log was saved to a CSV file.
o TaskA: failed to get semaphore: 0 times; semaphore taken: 1310 times
o TaskC: failed to get semaphore: 32 times; semaphore taken: 1279 times
o TaskB: iterations: 15,632
● The program was modified, so log messages contained a count of the times the semaphore was taken, and failed to be taken. The program was run for 20 minutes.
o TaskA: failed to get semaphore: 0 times; semaphore taken: 3900 times
o TaskC: failed to get semaphore: 56 times; semaphore taken: 3517 times
o TaskB: iterations: 41,000
■ The spin-loop in TaskB does not run long enough to cause priority inversion:
● In TaskB, for each iteration, there is a random delay between 10 and 25 ms, and a spin-loop that runs a random amount between 3 and 8 ms.
o From my testing, lookBusy(94000000) is about 1 second.
o So, lookBusy(StmRand(250000, 750000)) would be 3 to 8 ms.
● TaskC will hold the semaphore for at most 196 ms
o Its delays are 172 ms.
o TaskB could run (spin) concurrently for at most 24 ms: 8ms before TaskC's first delay, 8ms before the second delay, and 8ms after the second delay.
o It's very unlikely that TaskB would run concurrently for 24ms, or even 20ms.
● TaskA's timeout for the semaphore request is 200 ms. But, it will wait on TaskC holding the semaphore for at most 196 ms. So the timeout never occurs.
■ The problem is fixed by making TaskB's spin-loop run longer.
■ It was changed to run 18 to 28ms: lookBusy(StmRand(1692000, 2632000))
■ So, if TaskC has the semaphore, and it waits on TaskB's spin-loop just once, it will wait 18 to 28 ms for it. And TaskC will hold the semaphore for 190 to 200 ms.
■ Testing results for 7.5 minutes are:
● TaskA: failed to get semaphore: 1335 times; semaphore taken: 865 times
● TaskC: failed to get semaphore: 544 times; semaphore taken: 950 times
● TaskB: iterations: 7600 (rounded)
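The timing argument in this analysis can be checked with a little arithmetic. The sketch below assumes the calibration measured above (lookBusy(94000000) takes about 1 second) and the 172 ms of delays in TaskC. The function and macro names are made up for illustration.

```c
/* Assumption from the measurement above: lookBusy(94000000) spins ~1 s. */
#define LOOKBUSY_COUNTS_PER_SEC 94000000.0

/* Approximate duration, in ms, of lookBusy(counts). */
double lookbusy_ms(double counts)
{
    return 1000.0 * counts / LOOKBUSY_COUNTS_PER_SEC;
}

/* Longest time, in ms, that TaskC can hold the semaphore: its own
 * delays plus num_spins preemptions by TaskB's longest spin-loop. */
double max_hold_ms(double taskc_delays_ms, double max_spin_ms, int num_spins)
{
    return taskc_delays_ms + num_spins * max_spin_ms;
}
```

With the original upper bound, max_hold_ms(172.0, lookbusy_ms(750000.0), 3) is just under 196 ms, which is below TaskA's 200 ms timeout. With the longer spin-loop, lookbusy_ms(2632000.0) is about 28 ms, so a single preemption can already bring the hold time to about 200 ms.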
● Clarification, pages 190-191
o TaskA and TaskC use functions to set the red LED on and off (RedLed.On(), RedLed.Off()). For these functions to work properly here, the functions must be atomic operations. However, the requirement for atomic operations isn't discussed.
■ For the board and software used here, those functions do appear to be atomic operations. However, if those functions weren't atomic operations, the way they are used here would probably be incorrect.
■ And, the code is not portable to systems in which those functions are not atomic.
o Why the functions must be atomic operations
■ Functions that turn an LED on and off are likely to have shared data. If two tasks can operate on that shared data concurrently, those operations must be thread-safe.
■ In the code here, the critical sections turn the red LED off, but the LED is turned on outside of the critical sections. So, one task could be turning the LED off while another is turning it on.
o From investigating the system's source code, it appears those functions are atomic operations.
■ The functions are implemented here:
● C:/projects/packtBookRTOS/BSP/Nucleo_F767ZI_GPIO.c
■ They call HAL_GPIO_WritePin(), which is implemented here:
● C:/projects/packtBookRTOS/Drivers/STM32F7xx_HAL_Driver/Src/stm32f7xx_hal_gpio.c
■ The comments for HAL_GPIO_WritePin() state it is an atomic operation:
● "This function uses GPIOx_BSRR register to allow atomic read/modify..."
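To illustrate why a BSRR-style write avoids a read-modify-write in software, here is a small host-side model. It is only a sketch: mock_gpio_t, mock_bsrr_write, and mock_write_pin are hypothetical names, and the model imitates in plain C what the GPIO hardware does when BSRR is written. It is not the HAL source.

```c
#include <stdint.h>

/* Host-side stand-in for a GPIO port's output data register (ODR). */
typedef struct {
    uint32_t odr;
} mock_gpio_t;

/* Model of one write to BSRR: bits 0-15 set the matching ODR bits,
 * bits 16-31 reset them. If both are requested for the same pin,
 * the set wins. On real hardware this happens in one register write,
 * so software never reads ODR, changes it, and writes it back. */
void mock_bsrr_write(mock_gpio_t *port, uint32_t value)
{
    uint32_t set_bits   = value & 0xFFFFu;
    uint32_t reset_bits = (value >> 16) & 0xFFFFu;
    port->odr = (port->odr & ~reset_bits) | set_bits;
}

/* Turn one pin on or off with a single BSRR-style write. */
void mock_write_pin(mock_gpio_t *port, uint16_t pin_mask, int on)
{
    mock_bsrr_write(port, on ? (uint32_t)pin_mask : ((uint32_t)pin_mask << 16));
}
```

Because each task issues only a write that names its own pin, two tasks changing pins on the same port cannot overwrite each other's update the way they could with a software read-modify-write of ODR.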
● Bug in the book, page 193
o The figure has errors similar to those described for the figure on page 186.
o The events shown in the Terminal window are not those shown in the Events-List and Timeline windows:
■ The differing timestamps reveal that.
■ Also, in the Terminal window, there's a message from TaskB at 30.385, but it's not in the Events List window.
● Bug in the book and code (mainMutexExample.c), pages 194-195
o The code has the same bug described earlier for mainSemPriorityInversion.c.
■ The spin-loop does not run long enough to potentially cause priority inversion.
o mainMutexExample.c was tested, and it produced results very similar to those from mainSemPriorityInversion.c. TaskA's semaphore-request did not time out, but that wasn't due to the mutexes being used.
o mainMutexExample.c was fixed by making TaskB's spin-loop run longer, as with the fix in mainSemPriorityInversion.c
■ It was changed to run 18 to 28ms: lookBusy(StmRand(1692000, 2632000))
■ Testing results for 7.5 minutes are what is expected:
● TaskA: failed to get semaphore: 0 times; semaphore taken: 1700 times
● TaskC: failed to get semaphore: 608 times; semaphore taken: 1092 times
● TaskB: iterations: 7600
● Bug in the book, page 195
o The program discussed is mainMutexExample.c. Its mutex-use and task-scheduling are described, but the description is not fully correct. The incorrect parts are underlined:
■ TaskA returns the mutex, but it is immediately taken again. This is caused by a variable amount of delay in TaskA between calls to the mutex. When there is no delay, TaskC isn't allowed to run between when TaskA returns the mutex and attempts to take it again.
o "TaskA returns the mutex, but it is immediately taken again."
■ The sentence's second phrase is saying that TaskA immediately takes the mutex again, at (2) in the example. This is the case referred to as "When there is no delay,"
■ However, after TaskA returns the mutex there is always a delay of at least 5 ticks:
● vTaskDelay(StmRand(5,30)); // In the .c file (different than book)
● vTaskDelay(StmRand(10,30)); // In the book, pg 190
■ Also, that description is inconsistent with the Timeline, as the Timeline does not show TaskA returning the mutex and immediately taking it.
● Instead, the Timeline shows about a 5ms delay between when TaskA returns and takes the mutex.
● The Timeline shows that TaskA runs briefly at approximately -35ms, just before point (2). Point (2) is at approximately -30ms. A screenshot is below.
● It appears TaskA returns the mutex at -35ms. This is apparent because TaskC is waiting on the mutex then, and at that point, TaskC's row is shaded grey. The grey shading indicates TaskC is marked Ready for execution. The mutex it's waiting for is no longer held.
● However, at (2), TaskA takes the mutex. And, at (2) TaskC is no longer marked Ready for execution (TaskC is no longer shaded grey). This is because the mutex TaskC is waiting for is taken by TaskA.
● Bug in the book, page 195
o In the Terminal and Timeline windows, the corresponding events shown have different timestamps.
o I'm guessing they are not the same events. For each event in the Terminal window, its corresponding event in the Timeline window has a greater timestamp. The only way I know for that to happen is by using SystemView's "reference" time-feature.
o Regardless, using different time-frames in the two windows is confusing.
● Additional info, page 195
o This additional-info shows how a mutex is allocated, when two different tasks have requested it, and one task has higher priority than the other. In that case, the higher-priority task is given the mutex, even if the lower-priority task requested the mutex first.
o For example, task X has higher priority than task Y.
■ Task X holds the mutex, and blocks
■ Task Y requests the mutex and blocks
■ Task X releases the mutex
■ Task X requests the mutex, and it is given the mutex
o The behavior is specified in the FreeRTOS doc, Mastering the FreeRTOS Real Time Kernel. It's in the section, "Mutexes and Task Scheduling".
■ If two tasks of different priority use the same mutex, then the FreeRTOS scheduling policy makes the order in which the tasks will execute clear; the highest priority task that is able to run will be selected as the task that enters the Running state.
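The selection rule quoted above can be illustrated with a toy model. This is not FreeRTOS code: toy_task_t and toy_pick_next are hypothetical names, and the model covers only the rule "the highest-priority task that is able to run is selected".

```c
#include <stddef.h>

/* Toy task record, for illustration only. */
typedef struct {
    const char *name;
    int priority;   /* larger number = higher priority, as in FreeRTOS */
    int ready;      /* non-zero if the task is able to run */
} toy_task_t;

/* Return the highest-priority ready task, or NULL if none is ready. */
const toy_task_t *toy_pick_next(const toy_task_t *tasks, size_t count)
{
    const toy_task_t *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].ready &&
            (best == NULL || tasks[i].priority > best->priority)) {
            best = &tasks[i];
        }
    }
    return best;
}
```

In the task X / task Y example, both tasks are ready once X releases the mutex. X has the higher priority, so it is the one picked to run, and it can take the mutex again before Y gets a chance.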
● Typo in the book, page 202
o "oneShotCallBack() will simply turn off the blue LED after 1 second has elapsed"
o It should state 2.2 seconds (the value calculated by: (2200/portTICK_PERIOD_MS) )
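The 2.2-second figure can be checked with a short calculation. The sketch below uses stand-in macros rather than the real FreeRTOS headers, and it assumes a 1 kHz tick, which matches the 1 ms SysTick interval noted earlier. TICK_RATE_HZ, TICK_PERIOD_MS, ms_to_ticks, and ticks_to_seconds are hypothetical names.

```c
#include <stdint.h>

/* Stand-ins for the FreeRTOS configuration; assumes a 1 kHz tick. */
#define TICK_RATE_HZ   1000u
#define TICK_PERIOD_MS (1000u / TICK_RATE_HZ)   /* mirrors portTICK_PERIOD_MS */

/* Same arithmetic as (ms / portTICK_PERIOD_MS) in the timer setup. */
uint32_t ms_to_ticks(uint32_t ms)
{
    return ms / TICK_PERIOD_MS;
}

/* Convert a tick count back to seconds. */
double ticks_to_seconds(uint32_t ticks)
{
    return (double)ticks / TICK_RATE_HZ;
}
```

Under that assumption, ms_to_ticks(2200) is 2200 ticks, and ticks_to_seconds(2200) is 2.2 seconds rather than 1 second.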
● Bug in the book and code (mainSoftwareTimers.c), pages 204-205
o Problem:
■ SystemView can't record the events shown in the book, for mainSoftwareTimers.c
■ With this code, it's not possible to start SystemView's Record feature in time for it to record the green LED being turned on for the first time, nor to record the blue LED being turned off.
■ The figure on page 204 shows SystemView recording those events, but I don't think it's possible with this code and the info on SystemView in the book.
o Description:
■ For SystemView to record events, its Record feature has to be started after vTaskStartScheduler() is run. This SystemView requirement does not appear to be mentioned in the book. (That omission is described in the study-guide's web-page for Chapter 7, for a bug on page 163.)
■ Starting SystemView's Record feature takes about 10 seconds. So, practically, it can't be started between when vTaskStartScheduler() runs and when the timer-functions are first called.
■ The call to ReadPushButton() may be intended to facilitate starting SystemView Record at the right time. However, it doesn't solve that problem.
o Solution
■ Design
● One way to fix mainSoftwareTimers.c is to create a task in it, which we'll name startTimersTask. Then, move the timer-related code from main() to that task. The task's priority must be lower than the timer-task's.
● The task startTimersTask turns on the red LED, waits for the push-button, creates the timers, and then spins (e.g., while(1)).
● SystemView Record must be started after the red LED is on, but before pressing the push-button.
■ Implementation
● The fixed code is at the study-guide's GitHub repo:
o chapter-8--mainSoftwareTimers--fixed.c
● Below is a screenshot from a SystemView recording of the fixed code: