
To “C” or not to “C” – Challenging the Embedded Standard

26 May 2015
Reading time: 8 minutes

C is often considered the language of choice when developing embedded applications. Is C used due to its paramount suitability, or is it used due to the industry’s reluctance to adopt more modern language features?

Introduction

As an embedded software engineer, I often find myself working on a project where C is the selected language. After one or two weeks of struggling with the limitations of this language I am often led inexorably to ask the following question: “Why are we using C when we have more powerful languages at our disposal with a far greater number of features/functionality, for example C++?” When this question is raised it is frequently met with one or more of the following reservations:

  1. C is much quicker than X
  2. C is much safer than X
  3. We don’t have a compiler that supports X

I hope to demonstrate that these reservations are unfounded and that a higher level language such as C++ can be an equally suitable, if not more suitable, language for development in an embedded environment.

“C is much quicker than X”

When writing code for an embedded system, it is often necessary to write code that is highly speed critical. Interrupt Service Routines (ISRs) are a prime example.

Let’s say we have a device that is capable of sending and receiving messages across a serial communications line. A somewhat naïve approach could be to set up an interrupt handler that is triggered whenever a byte of data is received, and then use the ISR to process the incoming data. If the ISR takes too long to execute, the interrupt flag is not cleared by the time the next byte of data arrives, so that byte is discarded and the message is corrupted. It is therefore a much better approach to do as little processing as possible in the ISR. In the above example, we would push the byte onto a queue and leave the processing of the data until we are no longer executing the ISR.
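
A minimal sketch of this deferred-processing approach is shown below; the UART register address, the ISR name and the buffer size are all hypothetical and would depend on the target microcontroller:

#include <stdint.h>

#define RX_BUFFER_SIZE 64u

// Hypothetical memory-mapped receive register; the real address comes from the datasheet.
#define UART_DATA_REGISTER (*(volatile uint8_t*)0x40001000u)

static volatile uint8_t rxBuffer[RX_BUFFER_SIZE];
static volatile uint8_t rxHead = 0u;
static volatile uint8_t rxTail = 0u;

void UART_RX_ISR(void)
{
    // Read the byte to clear the interrupt, push it onto the ring buffer
    // and return immediately; the main loop does the real processing.
    uint8_t byte = UART_DATA_REGISTER;
    uint8_t nextHead = (uint8_t)((rxHead + 1u) % RX_BUFFER_SIZE);
    if (nextHead != rxTail)   // drop the byte if the buffer is full
    {
        rxBuffer[rxHead] = byte;
        rxHead = nextHead;
    }
}

void mainLoopPoll(void)
{
    while (rxTail != rxHead)
    {
        uint8_t byte = rxBuffer[rxTail];
        rxTail = (uint8_t)((rxTail + 1u) % RX_BUFFER_SIZE);
        // process the byte here, outside the ISR
        (void)byte;
    }
}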

Many C programmers are of the opinion that C++ introduces an overhead that makes it an unacceptable choice for speed critical code. Many of these reservations arise in conjunction with the use of virtual methods and inheritance.

If we look at the compiled assembly of a virtual function call, we would normally see something akin to the following pseudo-code:

mov R1, dword ptr [this]	//move the object’s address into register 1
mov R2, dword ptr [R1]		//move the address of the virtual function table into register 2
mov R3, dword ptr [R2 + offset of function] //move the address of the function into register 3
push R1		//push the object’s address onto the stack so that the this pointer is available
call R3	//call the function

A non-virtual function call would look like:

mov R1, dword ptr [this]	//move the object’s address into register 1
mov R2, dword ptr [address of function]		//move the address of the function into register 2
push R1		//push the object’s address onto the stack so that the this pointer is available
call R2	//call the function

Depending on the microprocessor’s instruction set, the virtual function call could occupy one or two additional opcodes. Given that a mov opcode generally takes in the region of 2 to 3 clock cycles, we would see a performance drop of roughly 2 to 6 clock cycles per call; hardly significant when compared to the inefficiencies we introduce through the code we write ourselves.

If we decide that the overhead introduced by virtual functions is still too great for use in an ISR, we can always revert to a non-virtual function call, which carries the same overhead as the equivalent function call in C.
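
As a concrete illustration (a sketch only; SerialPort and its member functions are made-up names), the choice between the two call styles is made per function, so virtual dispatch can be kept out of the speed critical paths:

struct SerialPort
{
    virtual ~SerialPort() {}

    // Virtual: dispatched through the vtable, paying the extra indirection
    // shown in the pseudo-assembly above.
    virtual void onByteReceived(unsigned char byte) { (void)byte; /* handle the byte */ }

    // Non-virtual: resolved at compile time, costing the same as a plain C
    // function that takes a SerialPort* as its first argument.
    void pushByte(unsigned char byte) { (void)byte; /* queue the byte */ }
};

void handleByte(SerialPort& port, unsigned char byte)
{
    port.onByteReceived(byte); // indirect call through the vtable
    port.pushByte(byte);       // direct call, identical in cost to C
}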

We have seen that C++ can be used to write code that operates at the same speed as C, but are there language features of C++ that can make the code run quicker than C?

Consider the following C code:

typedef struct _TFixedLengthString
{
    char stringData[256];
} TFixedLengthString;

void printString(TFixedLengthString fixedLengthString)
{
    //print string to the console
}

In order to print the string using the above function, the entire 256 bytes of data (one byte per character) must be pushed onto the stack, wasting both time and stack memory.

A more efficient alternative would be:

void printString(const TFixedLengthString* fixedLengthString)
{
    //print string to the console
}

Notice how the string is now passed as a pointer, meaning that only 4 bytes of data (assuming a 32-bit processor) need to be pushed onto the stack. This approach introduces an additional problem: the printString function could now be called in the following manner:

printString(NULL);

In order to avoid a segmentation fault, the function must be amended to:

void printString(const TFixedLengthString* fixedLengthString)
{
    ASSERT(fixedLengthString != NULL);
    //print string to the console
}

Here the ASSERT macro will raise some form of runtime error. C++ provides references in addition to the pointers available in C. By using a reference, we still only require 4 bytes of data (again assuming a 32-bit processor), but we no longer need to program defensively against NULLs, as a properly constructed reference always points to an object.

void printString(const TFixedLengthString& fixedLengthString)
{
    //print string to the console
}

This ultimately results in faster code, as the ASSERT code does not need to execute.
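
A quick look at the call sites (a sketch; the variable name is arbitrary, and it assumes both versions above are in scope) highlights the practical difference:

void example(void)
{
    TFixedLengthString message = { "hello" };

    // Pointer version: the caller must remember the address-of operator,
    // and nothing prevents passing NULL.
    printString(&message);
    printString(NULL);   // compiles, and relies on the ASSERT to catch it

    // Reference version: the call site reads naturally and there is no way
    // to pass "no object" by accident.
    printString(message);
}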

Now, the C programmer may point out that the ASSERT line is generally a macro that expands to nothing in a release build, causing no additional execution time. Whilst this is true, removing the ASSERT does not remove the possibility of calling the function with a NULL, which brings us nicely onto the next point.

“C is much safer than X”

This is an argument that is often put forward, but when you investigate what the programmer means, the statement transforms from “C is much safer than X” to “I have more confidence that the code I write in C does what I expect it to do”.

Let’s have a look at why a language such as C++ can produce much safer code.

Constant Data

If we consider the following two erroneous functions:

void cStyleFunction(const void* mem1, const void* mem2)
{
    char* mem1AsChar = (char*)mem1;
    char* mem2AsChar = (char*)mem2;
    if (mem1AsChar[0] = mem2AsChar[0])
    {
        //do something
    }
}

void cppStyleFunction(const void* mem1, const void* mem2)
{
    char* mem1AsChar = static_cast<char*>(mem1);
    char* mem2AsChar = static_cast<char*>(mem2);
    if (mem1AsChar[0] = mem2AsChar[0])
    {
        //do something
    }
}

These functions take two addresses in memory and compare the first byte at each address; however, the programmer has made the mistake of using an assignment (=) rather than a comparison (==), and has also cast away the const qualifier, turning pointers to constant data into pointers to mutable data.

In the first, C style, function the compiler will happily compile the code and proceed as if there is nothing wrong with it. The second, C++ style, function, on the other hand, will fail to compile: static_cast reports that it cannot convert a const void* to a char*, notifying the developer of the mistake and producing much safer code.

This example shows that the concept of constant data in C is very weak: it can easily be removed through a cast. Provided that C style casts are forbidden by the project’s coding conventions, and const_cast is only permitted under exceptional circumstances (for example, to work around mistakes in third party libraries), C++ can ensure that constant data is not modified.
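
For completeness, a corrected sketch of the C++ function keeps the const qualifier all the way through and uses a genuine comparison; with C style casts banned, stripping const would require an explicit, and therefore highly visible, const_cast:

void cppStyleFunction(const void* mem1, const void* mem2)
{
    // const is preserved through the cast, so the compiler accepts it
    const char* mem1AsChar = static_cast<const char*>(mem1);
    const char* mem2AsChar = static_cast<const char*>(mem2);
    if (mem1AsChar[0] == mem2AsChar[0])   // comparison, not assignment
    {
        //do something
    }
}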

Scope

Let’s consider another example:

typedef struct _CStruct
{
    void* memoryBuffer;
} CStruct;

void Initialise_CStruct(CStruct* this, void* buffer)
{
    this->memoryBuffer = buffer;
}

void Write_CStruct(CStruct* this, const void* data, unsigned int dataSize)
{
    //write data to memory buffer
}

struct CPPStruct
{
    CPPStruct(void* buffer): memoryBuffer(buffer) {}

    void Write(const void* data, unsigned int dataSize)
    {
        //write data to memory buffer
    }
private:
    void* memoryBuffer;
};

The above example is representative of a simple class that stores a buffer and provides functions to write data to that buffer. The intention is that, once the initialise function has been called, the memoryBuffer pointer is immutable and is guaranteed to keep pointing at the same location.

In the first C style example, the memoryBuffer is fully accessible. Anyone who has access to the CStruct object is capable of changing the memoryBuffer’s location, in contravention of the desired behaviour.

The second C++ style example provides the memoryBuffer as a private member, meaning that only the CPPStruct itself can change the data, providing a greater guarantee that the data is not going to change, and therefore ensuring a higher degree of safety.
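
A short sketch (the caller function and parameter names are hypothetical) makes the difference visible at compile time:

void misuseBuffers(CStruct* cObject, CPPStruct& cppObject, void* otherBuffer)
{
    // Legal in C: nothing stops any caller from moving the buffer.
    cObject->memoryBuffer = otherBuffer;

    // Will not compile in C++: 'memoryBuffer' is a private member of CPPStruct.
    //cppObject.memoryBuffer = otherBuffer;
}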

“We don’t have a compiler that supports X”

Of all of the reasons presented above, this is the only one that could cause me to consider using C over C++. The main reason is that several microcontroller toolchains rely on non-standard C extensions that are required to get the processor working as expected, and in that situation the use of C can be unavoidable. However, if the compiler in question accepts plain ANSI C, then even in the absence of a C++ compiler it may still be possible to build the application in C++.

There exist toolsets, both free and commercial and of varying quality, that are capable of converting C++ code into C code; Clang with llc is one example. This means that efficient, modular, object-oriented C++ code can be written and then ported onto the device in question. The generated C code may be unreadable, but provided that only the C++ code is maintained, this shouldn’t be a problem. The trade-off of this approach is that C++ can be used as the development language at the expense of longer compilation times.

Why stop at C++? Why not C# or Java?

Whilst we are exploring the use of higher level languages for embedded applications, it seems apt to ask why certain other memory-managed languages are not commonly used, e.g. C# or Java.

There are several issues with these memory-managed languages that make them unsuitable. The main one is that both require a run-time environment (the CLR and the JRE respectively) within which to operate. As many embedded platforms possess limited memory and only lightweight operating systems, it may simply not be possible to install such a run-time environment.

Other reasons why they are not suitable include:

New keyword: Often in embedded applications, the use of dynamic memory is forbidden; see MISRA C:2004 Rule 20.4. In Java, objects must be allocated using the ‘new’ keyword, effectively placing them on a heap, in violation of the dynamic memory requirement.

Garbage collection: In embedded applications where the use of dynamic memory is permitted, the ‘new’ keyword can be used. However, in Java and C# there is no ‘delete’ operator; both languages rely on a garbage collector to free memory when it is no longer required. It can be difficult to control when garbage collection is performed, and in a speed critical section of the application it is unacceptable for execution to be interrupted in order to perform memory cleanup.

Difficulty in accessing memory addresses directly: In an embedded application it may be necessary to access memory directly. Consider a Direct Memory Access (DMA) controller, for example: when an analogue signal is converted to a digital value, the hardware may store the result at a specified location in memory, which is then read back at a later point. In Java and C#, it is very difficult to gain access to memory directly; the use of unsafe code or the Unsafe class is required, which is a commonly discouraged practice.
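
In C or C++, by contrast, a memory-mapped result register can be read through a volatile pointer at a fixed address; a minimal sketch follows, in which the register name and its address are entirely made up and would in practice come from the device’s datasheet:

#include <stdint.h>

// Hypothetical ADC result register filled by the DMA controller.
#define ADC_RESULT_REGISTER (*(volatile const uint32_t*)0x40012040u)

uint32_t readLatestAdcSample(void)
{
    // volatile forces a genuine read from the hardware address on every call.
    return ADC_RESULT_REGISTER;
}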

Conclusion

It is apparent that many of the objections to using a high level language are unfounded and stem from the industry’s unwillingness to change. As any software developer will know, trying to break the “but we’ve always done it this way” mentality can be a difficult exercise.

Of the three objections presented here, the only one that may stop you from using a high level language such as C++ is the lack of compiler support. However, this obstacle only exists because of a reluctance within the industry to adopt newer languages; if more people request support for other languages, there will be more onus on the manufacturers to provide it.

 

Comments (2)


Decryphe

11 June 2015 at 12:28

What is the state of new and upcoming languages like Rust or Go?

    Christopher Burdett Smith Whitting

    11 June 2015 at 13:46

    Having studied Rust, I believe that it could become a very good language for embedded projects (there are some very nice features particularly ownership and borrowing), however, as the language is still in its infancy, it is not yet viable; no chipset manufacturer will want to provide support for a language until it stops mutating and settles down to a stable version (I believe that version 1.0.0 was released in May 2015, so it could be a while before there are commercially viable compilers!). Unfortunately I haven’t had much experience in Go, so I don’t feel comfortable commenting on how suitable it would be as a language for embedded systems.
