Is volatile necessary?

jenz

hustbaer wrote:

@jenz
You keep mixing up two things here. One is the volatile trick, which exploits the fact that only volatile member functions may be called through a "volatile X&" (analogous to const). The other is that volatile suppresses certain optimizations.

One is guaranteed by the standard, the other is not.

Hm, actually that is exactly what I was trying not to mix up.

What exactly does the standard guarantee? Could someone quote it?
That would be great.
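
For readers unfamiliar with the "volatile trick" mentioned in the quote, here is a minimal sketch of the type-system rule it relies on. The names Device, plain_step and volatile_step are made up for illustration. It only shows the part that works like const: through a volatile reference, only volatile-qualified member functions can be called. It says nothing about optimizations or threads.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical example type, not taken from the thread.
struct Device {
    int ticks = 0;
    void plain_step() { ticks = ticks + 1; }              // not volatile-qualified
    void volatile_step() volatile { ticks = ticks + 1; }  // volatile-qualified
};

// Detects at compile time whether T allows a call to plain_step().
template <typename T, typename = void>
struct can_call_plain : std::false_type {};
template <typename T>
struct can_call_plain<T, decltype(std::declval<T>().plain_step())>
    : std::true_type {};

// Detects at compile time whether T allows a call to volatile_step().
template <typename T, typename = void>
struct can_call_volatile : std::false_type {};
template <typename T>
struct can_call_volatile<T, decltype(std::declval<T>().volatile_step())>
    : std::true_type {};

static_assert(can_call_plain<Device&>::value, "plain object: both calls work");
static_assert(can_call_volatile<Device&>::value, "plain object: both calls work");
static_assert(!can_call_plain<volatile Device&>::value,
              "a volatile reference rejects the non-volatile member");
static_assert(can_call_volatile<volatile Device&>::value,
              "a volatile reference accepts the volatile member");

// The object itself is defined non-volatile; only the reference adds volatile.
int run_steps() {
    Device d;
    volatile Device& shared = d;
    shared.volatile_step();  // allowed through the volatile reference
    d.plain_step();          // allowed only through the non-volatile name
    return d.ticks;
}
```

Writing shared.plain_step() would be rejected by the compiler, in the same way a non-const member function is rejected on a const reference.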

7H3 N4C3R

Ponto wrote:

Yes, although one should not think of mmap here, which is also called memory-mapped I/O. With mmap you don't need volatile.

Yes, I was indeed thinking of the old days, when you read the keyboard at memory address xxxx or drove the COM ports. In other words, exactly the case where hardware ports are mapped to concrete memory addresses.

I still see the theoretical problem that the compiler generates code in which shared status variables are kept in registers (even with synchronization, for all I care), but admittedly that is "dirty" anyway, since mutexes and status flags are not meant for waiting. So it is probably not a problem of practical relevance.
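
A rough sketch of the hardware-port case described above. Real port addresses are platform-specific, so this uses an ordinary variable (fake_port_storage, a made-up name) as a stand-in for the register so that it runs anywhere. On real hardware the pointer would instead be set to a fixed address from the device documentation.

```cpp
#include <cstdint>

// Stand-in for a hardware register (assumption: no real device is involved).
static std::uint8_t fake_port_storage = 0;

// All accesses go through a pointer to volatile, as they would for a real port.
volatile std::uint8_t* const kPort = &fake_port_storage;

// Both stores are volatile accesses, so the compiler may not drop the first
// one even though it is immediately overwritten.
void send_two_bytes(volatile std::uint8_t* port, std::uint8_t a, std::uint8_t b) {
    *port = a;
    *port = b;
}

// A volatile read: the value is fetched from the object on every call and
// not reused from an earlier read.
std::uint8_t read_port(volatile std::uint8_t* port) {
    return *port;
}
```

Without volatile, an optimizer would be free to remove the first store in send_two_bytes, which is exactly what must not happen when each write triggers something in a device.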

7H3 N4C3R

jenz wrote:

What exactly does the standard guarantee? Could someone quote it?

First, ISO 9899 6.7.3/5+6:

5
If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.

6
An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously.114) What constitutes an access to an object that has volatile-qualified type is implementation-defined.

A volatile declaration may be used to describe an object corresponding to a memory-mapped input/output port or an object accessed by an asynchronously interrupting function. Actions on objects so declared shall not be ‘‘optimized out’’ by an implementation or reordered except as permitted by the rules for evaluating expressions.

And 5.1.2.3/2+5+8+9:

2
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects,11) which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.

5
The least requirements on a conforming implementation are:
— At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred.

8
EXAMPLE 1 An implementation might define a one-to-one correspondence between abstract and actual semantics: at every sequence point, the values of the actual objects would agree with those specified by the abstract semantics. The keyword volatile would then be redundant.

9
Alternatively, an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries. In such an implementation, at the time of each function entry and function return where the calling function and the called function are in different translation units, the values of all externally linked objects and of all objects accessible via pointers therein would agree with the abstract semantics. Furthermore, at the time of each such function entry the values of the parameters of the called function and of all objects accessible via pointers therein would agree with the abstract semantics. In this type of implementation, objects referred to by interrupt service routines activated by the signal function would require explicit specification of volatile storage, as well as other implementation-defined restrictions.

If I have overlooked anything, please add it.

Keep in mind that the "abstract machine" is single-threaded and single-processor.
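
The one asynchronous case the quoted passages do name is a handler installed with signal. A minimal sketch of that case, within a single thread, using volatile std::sig_atomic_t. The names g_signal_seen, on_signal and raise_and_check are made up for illustration.

```cpp
#include <csignal>

namespace {
// Flag written by the handler and read by normal code.
volatile std::sig_atomic_t g_signal_seen = 0;
}

// Signal handler: only sets the flag.
extern "C" void on_signal(int) {
    g_signal_seen = 1;
}

// Installs the handler, raises the signal in the current thread and
// reports whether the flag change became visible afterwards.
bool raise_and_check() {
    g_signal_seen = 0;
    std::signal(SIGINT, on_signal);
    std::raise(SIGINT);            // the handler runs before raise returns
    std::signal(SIGINT, SIG_DFL);  // restore the default action
    return g_signal_seen == 1;
}
```

This stays inside the single-threaded abstract machine: the handler interrupts the same thread that later reads the flag, so nothing here says anything about a second thread or a second CPU.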

jenz

7H3 N4C3R wrote:

5
If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.

6
An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously.114) What constitutes an access to an object that has volatile-qualified type is implementation-defined.

A volatile declaration may be used to describe an object corresponding to a memory-mapped input/output port or an object accessed by an asynchronously interrupting function. Actions on objects so declared shall not be ‘‘optimized out’’ by an implementation or reordered except as permitted by the rules for evaluating expressions.

I'm no standards expert, but I do read this as saying that asynchronous changes can be observed.

And if I'm reading this correctly, then this (from 6)

What constitutes an access to an object that has volatile-qualified type is implementation-defined.

means that it doesn't actually have to be implemented that way after all?

That of course throws me off, and I concede that volatile should not be used for signaling, especially if you ever switch compilers.
But I still don't find this wording entirely clear.

jenz

7H3 N4C3R

Well, in the end even polling a synchronized status variable is not safe, because you may never get to see the variable in the right state at the moment you happen to acquire the lock. So you would need condition variables anyway, which you wait on in order to actually notice a change, and the need for volatile in multithreading vanishes into thin air.

The compiler can only generate the code; what the CPU does with it is another matter. So the compiler, and with it the standard, can hardly make reliable statements, especially since the abstract machine is single-threaded and no statements or assumptions are made about the existence of thread-safe operations. C/C++ itself is simply not "thread-aware"; everything you need you have to program and safeguard yourself.
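
A sketch of waiting on a condition variable instead of polling a status variable. It assumes the std::thread, std::mutex and std::condition_variable facilities that were standardized in C++11, after this discussion; at the time one would have used pthreads or a library such as Boost. The Status struct and wait_for_worker are made-up names.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical shared state, protected by the mutex m.
struct Status {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
    int payload = 0;
};

// Starts a worker, sleeps until the worker signals completion and
// returns the value the worker produced.
int wait_for_worker() {
    Status s;
    std::thread worker([&s] {
        {
            std::lock_guard<std::mutex> lock(s.m);
            s.payload = 42;
            s.ready = true;
        }
        s.cv.notify_one();  // wake the waiting thread
    });

    int result = 0;
    {
        std::unique_lock<std::mutex> lock(s.m);
        // The predicate guards against spurious wakeups and against the
        // notification arriving before the wait begins.
        s.cv.wait(lock, [&s] { return s.ready; });
        result = s.payload;
    }
    worker.join();
    return result;
}
```

No variable here is volatile. Visibility of ready and payload comes from the mutex, not from a qualifier.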

hustbaer

@jenz:
There is nothing at all about threads in there. interrupt != thread running on another CPU. Big difference.

But I still don't find this wording entirely clear.

Which is also one of the big problems with volatile.
In a library written for a specific system/compiler, perhaps portable but then with different implementations for different systems/compilers, you can use volatile if you go by what the respective system/compiler guarantees. However, you then also need good knowledge of that system/compiler to get it "right".

In "normal" code, volatile has no place IMO.

jenz

Right, on different CPUs the whole thing is a different matter again.
I hadn't thought that far yet; maybe I should have said so.

On a single CPU, though, it should behave more like an interrupt, shouldn't it?
That's asynchronous too.

hustbaer

@jenz: On single-CPU, single-core, non-hyperthreading systems, an interrupt is normally used to switch between threads.
But nobody guarantees you that either.

The processor could manage a list of threads itself and switch between them arbitrarily and at arbitrary times. The boundary to hyperthreading then blurs, of course, but the point is precisely that there are hardly any guarantees, and where there are, they usually apply only to one specific platform.

Or something else entirely: the CPU could in theory have several caches that can be switched between. If thread 1 uses cache 1 and thread 2 uses cache 2, and these caches operate in write-back mode, then thread 2 may fail to see thread 1's changes, just as on a multi-CPU system.

These are many reasons that all argue for the new standard finally defining a memory model. And I am *very* glad that it looks like this will finally arrive with C++09. With the planned atomic operations, threading support, etc. all standardized, one can finally get down to writing proper threading libraries in C++. And with such libraries, provided they are "good" enough, one could squeeze out quite a bit of speed. Then various languages with message passing or god knows what can go take a hike.
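
For orientation, a sketch of what the planned atomic operations look like in the form they eventually took in C++11 (the standard referred to above as C++09). The function name publish_and_read is made up. One thread publishes a plain int and another reads it, with the ordering guaranteed by a release store and an acquire load rather than by volatile.

```cpp
#include <atomic>
#include <thread>

// Publishes a value from one thread and reads it in another.
int publish_and_read() {
    int data = 0;                    // ordinary, non-atomic payload
    std::atomic<bool> ready(false);  // flag that carries the ordering guarantee

    std::thread producer([&data, &ready] {
        data = 7;                                      // 1. write the payload
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    // The acquire load pairs with the release store: once it observes true,
    // the write to data is guaranteed to be visible as well.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = data;

    producer.join();
    return seen;
}
```

The busy-wait loop is only there to keep the example short. For real waiting, a condition variable is the better fit, as discussed earlier in the thread.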

jenz

@jenz: On single-CPU, single-core, non-hyperthreading systems, an interrupt is normally used to switch between threads.
But nobody guarantees you that either.

I don't see that an interrupt necessarily has to be used.
Nor do I see the word interrupt in the standard in the sense in which it works on machines.

It says "interrupting functions"; those don't have to be interrupts.

Or something else entirely: the CPU could in theory have several caches that can be switched between. If thread 1 uses cache 1 and thread 2 uses cache 2, and these caches operate in write-back mode, then thread 2 may fail to see thread 1's changes, just as on a multi-CPU system.

I would consider that very strange cache behavior, though. Does something like that really exist?
And are the caches on multi-CPU systems really unsynchronized?

jenz

hustbaer

And are the caches on multi-CPU systems really unsynchronized?

More or less, yes.

On Intel x86 they are "almost fully in sync", on PowerPC rather less so, and it gets really wild with Alpha and the like. There are (or were) even systems where you had to flush the caches manually (a special ASM instruction).