Inline library functions based on constant-ness of arguments #352

CAFxX · 2020-06-24T12:01:11Z

Many fundamental functions could be effectively inlined without causing significant binary size increase if they could be inlined only if certain arguments were constants, as in such a case significant parts of the function body would turn into dead code, and could be therefore eliminated.

Samples of this kind of functions include: digitalWrite, digitalRead, analogWrite, analogRead, pinMode, turnOffPWM and so on.

GCC apparently does not do this automatically, but it can be done. One way is as follows (let's take analogWrite as an example):

void analogWrite(uint8_t pin, int val)
{
	// body...
}

If we accept to turn analogWrite into a macro, we can rewrite it as follows:

#define analogWrite(pin, val) \
	( __builtin_constant_p(pin) ? _analogWrite_inline((pin), (val)) : _analogWrite((pin), (val)) )

void _analogWrite(uint8_t pin, int val) {
  _analogWrite_inline(pin, val); 
}

inline __attribute__((__always_inline__)) _analogWrite_inline(uint8_t pin, int val) {
	// body...
}

In this way, callsites that have a non-constant pin argument will get the regular _analogWrite function as they do today without causing any increase in binary size, whereas callsites that have a constant pin argument will get an inlined version, and constant propagation will take care of eliminating dead code. If done consistently, this would allow GCC to turn a call to e.g. analogWrite(PIN, HIGH) into the following:

	uint8_t oldSREG = SREG;
    cli();
	*reg |= regbit; // reg, regbit are constant
	SREG = oldSREG;
	uint8_t oldSREG = SREG;
	cli();
	*out |= outbit; // out, outbit are constant
	SREG = oldSREG;

(that does not take much more space than the original call and that, potentially, opens up the potential, if all callsites use constant args, for _analogWrite and related functions to be completely omitted from the final binary)

The text was updated successfully, but these errors were encountered:

matthijskooijman · 2020-06-24T12:06:47Z

Yeah, this could indeed help a lot. I think the approach you describe is already used in some places (I think SPISettings, maybe others too).

GCC apparently does not do this automatically,

With LTO, GCC might be able to do aggresive inlining automatically, but I'm not sure how well it can predict what the effects will be.

One complication is that some functions involve PROGMEM arrays, an pgm_load_* cannot be resolved at compiletime, but I think those can also be specialcases with __builtin_constant_p (skipping the pgm_load for constants). Maybe just this would allow LTO to already automatically inline a lot. I've been meaning to look at this, but never gotten around to this.

matthijskooijman · 2020-06-25T18:53:17Z

Also see this somewhat-related discussion from a long time ago: https://groups.google.com/a/arduino.cc/forum/#!topic/developers/OGAGuF-HQJQ

CAFxX · 2020-06-26T00:41:44Z

Regarding the progmem arrays, not sure if this has been suggested before but a potential way to sidestep the issue could be to turn them into switches, that hopefully (needs to be tested) gcc will compile as jumptables in text anyway for the non-constant argument case (sure, they would likely occupy a few bytes more)

Once they're switches, we can force inlining on them and this should make everything work regardless of LTO.

matthijskooijman · 2020-06-26T08:13:27Z

Regarding the progmem arrays, not sure if this has been suggested before but a potential way to sidestep the issue could be to turn them into switches,

I would be hesistant of this approach, since I'm afraid that even if they become proper jumptables, the flash size of this is significantly higher (I expect the jump table will be 2-bytes per case, against 1 or 2 bytes per case for the current table, and then the target of each jump will need 4 more bytes for a ldi and a rjmp probably, so that means the flash size of these tables will triple or more).

I guess that a switch (which would be inside an inline function declared in the header, I suppose) is an interesting way to sneak a constant table into a header (which is not otherwise possible by duplicating it by making it static, I think, since inline variables are not introduced until C++17 IIRC).

Also, I wonder if this is really needed: if we can write things so that not having LTO will just not optimize the constant case, or maybe raise an error (but not silently break), it would be ok to only support the LTO-case with this optimization?

CAFxX · 2020-06-27T13:11:33Z

it would be ok to only support the LTO-case with this optimization?

It would, at least as far as I'm concerned. The non-LTO case was only mentioned in the link you posted.

nerdralph · 2020-07-11T16:09:28Z

With LTO, gcc will evaluate progmem arrays at compile time. It's a trick I use in my ArduinoShrink library.
https://github.com/nerdralph/ArduinoShrink

ArminJo · 2021-02-06T11:27:06Z

For the digitalRead and Write you can just use the https://github.com/watterott/Arduino-Libs/tree/master/digitalWriteFast library

matthijskooijman · 2021-05-21T15:12:03Z

Also see #264, which discusses a similar idea.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inline library functions based on constant-ness of arguments #352

Inline library functions based on constant-ness of arguments #352

CAFxX commented Jun 24, 2020 •

edited

Loading

matthijskooijman commented Jun 24, 2020

matthijskooijman commented Jun 25, 2020

CAFxX commented Jun 26, 2020 •

edited

Loading

matthijskooijman commented Jun 26, 2020

CAFxX commented Jun 27, 2020 •

edited

Loading

nerdralph commented Jul 11, 2020

ArminJo commented Feb 6, 2021

matthijskooijman commented May 21, 2021

Inline library functions based on constant-ness of arguments #352

Inline library functions based on constant-ness of arguments #352

Comments

CAFxX commented Jun 24, 2020 • edited Loading

matthijskooijman commented Jun 24, 2020

matthijskooijman commented Jun 25, 2020

CAFxX commented Jun 26, 2020 • edited Loading

matthijskooijman commented Jun 26, 2020

CAFxX commented Jun 27, 2020 • edited Loading

nerdralph commented Jul 11, 2020

ArminJo commented Feb 6, 2021

matthijskooijman commented May 21, 2021

CAFxX commented Jun 24, 2020 •

edited

Loading

CAFxX commented Jun 26, 2020 •

edited

Loading

CAFxX commented Jun 27, 2020 •

edited

Loading