Monday, August 26, 2013

Is Java Riskier than C/C++?

The following is a contributed article from Jon Jarboe of Coverity:

Is Java Riskier than C?
Jon Jarboe
Senior Technical Manager, Coverity

Lately, I’ve heard a number of folks discussing whether Java development is riskier than development in C/C++ (from here on out, I’ll just refer to "C"). They’re not rehashing the age-old discussion of which language is best, but are wondering whether teams developing in Java have unique risks compared to teams developing in C. They are particularly interested to learn how projects can better understand and manage risk.

For purposes of this post, I’ll ignore sources of risk that span languages—things like social engineering and configuration/training problems, which are presumably constant. What follows is based on my own experience - biases and all - with the understanding that this isn't a numbers discussion.

To my shame, I initially thought the answer was obvious: Java was designed to address many of the most important pitfalls of C, so it must be safer. Hackers have been exploiting C vulnerabilities like out of range memory accesses and null pointer dereferences for decades, circumventing controls and interrupting business-critical—and in some cases, mission- and life-critical—functions. (Activist hacker Barnaby Jack was planning to disclose lethal, exploitable vulnerabilities in implantable medical devices before his recent, untimely passing)

Java eliminates many of these vulnerabilities entirely, providing internal protections that automatically enforce buffer bounds and intercept and pass null and invalid memory references to the application for graceful handling. Whereas C programs need to navigate a maze of APIs like POSIX and Windows for threading and concurrency management capabilities, Java provides language primitives to enable developers to consistently manage concurrency.

While C memory management is notoriously tricky, Java provides a garbage collector to make resource management almost foolproof. While C applications are compiled to have direct hardware and OS access, Java programs are run in a virtual machine sandbox that tries to prevent them from interacting with the rest of the system. Surely that means that Java is less risky.

Of course, we regularly hear about data exposures and web site attacks—areas where Java is very commonly used. In fact, the vast majority of security vulnerabilities, such as injections, cross-site scripting, and authentication/session management, originate in faulty code—meaning that Java applications can’t be immune to these security problems. There are plenty of other types of risk that originate in the application code, impacting things like availability, data integrity and performance. When it comes down to it, this question is more complex than I initially realized.

After taking a step back, I realized that my initial reaction was focused on the risk inherent in the runtime environment. Java improves upon the weaknesses of C, especially in preventing application users from being able to gain control of the machine(s) upon which the application runs. But gaining control of the machine is rarely the goal—it is often just a means to gain access to, or interrupt the flow of, data. Considering the relevance of the application vulnerabilities that obviously still exist, those improvements are apparently just a step in the right direction.

To understand the implications, maybe I should consider the bigger picture. General purpose programming languages like Java and C are by definition designed to apply to a wide variety of programming tasks. They provide the toolbox that enables developers to implement their applications. It is the developer’s responsibility to craft a good logical model for the application, and to correctly implement that model using the available tools. From that perspective, the "riskier" language might be the one where developers have limited ability to effectively manage that risk.

Both C and Java have robust tools to manage application risk—from feature-rich IDEs to sophisticated analysis and debugging tools to testing frameworks and high-quality reusable components. I don’t think that either language is at a disadvantage for managing risk, so maybe the language isn't the source of risk. Maybe it's actually the applications or the developers.

I guess we have answered the original question: neither language is riskier than the other. But I don’t think we’re done yet; there is still the interesting question of managing risk. I don’t think it’s particularly important to characterize the source of the risk further. Suffice it to say, much of it originates in source code, with the nature of the application generally defining the major types of risk.

So where does that leave us as developers?

Above all, deeply understand your toolbox. What are the strengths and weaknesses of your languages/environments of choice? Do they insulate you from system-level concerns? Are there performance tradeoffs? For each project you undertake, understand the risks you’re facing with the application. Do you need to worry about concurrency, data security and/or resource constraints?

Java has a garbage collector that will free certain resources sometime after your program finishes using them but that doesn't mean that you should be lazy. Not all resources are handled automatically, and it can take a significant amount of time for resources to become available for reuse—impacting application availability and performance. By preparing for the worst case, you can minimize the chance it will happen. Actively manage as many resources as you can, to lessen the load on the garbage collector and reduce the chance of availability and performance problems.

Concurrency is tricky, even when you have the luxury of language primitives. If you want to get it right, you need to understand your application’s behavior on a deep level and you need to understand the intricacies of the concurrency tools you are using. Even sophisticated analysis and debugging tools have their limits—just because the code is valid doesn't mean that it’s correctly implementing your requirements.

Once you understand the risks, see whether you can find existing components and frameworks that have proven themselves in similar applications—it is often much harder to invent a better wheel than we anticipate. After understanding the risks of your application and the strengths and weaknesses of your tools, you can start to match up tool strengths to application needs and find the best way forward.

Ultimately, the question of which language is best – or riskiest – is just a distraction. To understand and manage risk, you need to ensure that your developers understand the intricacies of their tools, that you have a solid plan for attacking the challenges presented by your application, and that you have sufficient internal controls to recognize when you are off course.

The article above was contributed by Jon Jarboe of Coverity. I have published this contributed article because I think it brings up some interesting points of discussion. No payment or remuneration was received for publishing this article.

No comments: