Re: firebox build fails post clang-3.4 merge

From: David Chisnall <theraven_at_FreeBSD.org>
Date: Fri, 28 Feb 2014 09:09:32 +0000

On 28 Feb 2014, at 01:51, Michael Butler <imb_at_protected-networks.net> wrote:

> I guess what I'm trying to get at is that I am used to a compiler which
> takes one of two actions, irrespective of the complexities of the source
> language or target architecture ..
> 
> 1) the compiler has no definitive translation of "semantic intent"
> because the code is ambiguous - produces an error for the programmer to
> that effect
> 
> 2) the compiler has no idea how to translate unambiguous code into
> functional machine code - produces an error for the compiler author(s)
> benefit to expose missing code-generation cases

If you're actually used to compilers like this, then I can only assume that you normally only compile programs that are a single compilation unit with no dynamic flow control (e.g. function pointers or data-dependent conditionals), because that's the only case where it is possible to implement a compiler that does what you claim you are used to.  You're certainly not used to any C/C++ compiler that's been released in the last 20 years.

The reason for the 'land mines', as you put it, is that the compiler has determined that a certain code path ought to be unreachable, either because of programmer-provided annotations or some knowledge of the language semantics, but can't statically prove that it really is unreachable, because it can't do full symbolic execution of the entire program for all possible inputs.  It then has two choices:

- Merrily continue with this assumption, and if it happens to be wrong continue executing code with the program in an undefined state.

- Insert something that will cause abnormal program termination, allowing it to be debugged and (hopefully) preventing it becoming an arbitrary code execution vulnerability.

In the vast majority of cases, sadly, it will actually do the first.  It will try, however, to do the latter if it won't harm performance too much.  You can, alternatively, ask the compiler not to take advantage of any of this knowledge for optimisation and aggressively tell you if you're doing anything that might be unsafe.  Compiling with these options gives you code that runs at around 10-50% of the speed of an optimised compile.  Maybe that's what you're used to?
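
To make the two choices above concrete, here is a minimal sketch in C.  The enum and function names are invented for illustration, and __builtin_unreachable() and __builtin_trap() are GCC/Clang builtins rather than standard C, so treat this as an approximation of what the optimiser does internally, not as its actual implementation.

```c
#include <assert.h>

/* Hypothetical example type: only these two values are ever valid. */
enum mode { MODE_SLOW = 0, MODE_FAST = 1 };

/*
 * Choice 1: carry on with the assumption.  The path after the switch is
 * declared unreachable, so the compiler may emit nothing for it.  If a
 * caller passes a corrupted value, behaviour is undefined.
 */
int scale_assumed(enum mode m)
{
    switch (m) {
    case MODE_SLOW: return 1;
    case MODE_FAST: return 4;
    }
    __builtin_unreachable();
}

/*
 * Choice 2: plant a trap.  The same "impossible" path now terminates the
 * program abnormally instead of running on in an undefined state.
 */
int scale_trapping(enum mode m)
{
    switch (m) {
    case MODE_SLOW: return 1;
    case MODE_FAST: return 4;
    }
    __builtin_trap();
}

/* Usage: for every valid input the two variants are indistinguishable. */
void check_valid_inputs(void)
{
    assert(scale_assumed(MODE_SLOW) == scale_trapping(MODE_SLOW));
    assert(scale_assumed(MODE_FAST) == scale_trapping(MODE_FAST));
}
```

The two functions differ only when the assumption is violated, which is exactly the case the compiler cannot rule out statically.  As one example of the checking mode mentioned above, Clang accepts -fsanitize=undefined to instrument many such cases at run time; whether that matches the exact set of options I had in mind for the slowdown figure is an assumption.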

Now, things are looking promising for you.  The estimates we did a couple of years ago showed that, as long as you don't use shared libraries, it should be feasible (although quite time-consuming) to do the sorts of analysis that you're used to for moderate-sized codebases once computers with 1-2TB of RAM become common.  At the current rate of development, that's only a couple more years away.  It may still take a few weeks to compile though...

David