Looking for a fast little-endian to big-endian conversion

Hello,

does anyone know a faster way to convert from little-endian to big-endian and vice versa?

#define SWAP32(x) (((x) & 0xff) << 24 | ((x) & 0xff00) << 8 | ((x) & 0xff0000) >> 8 | ((x) >> 24) & 0xff)

This define works, but on a small 8051 it causes a lot of overhead, especially since it has to be used very often.

Regards,
Gregor

^^It doesn't really get any better than that. If that is too much overhead, turn it into a function or write something in assembler.
Oh, and don't even start tinkering with unions. That gets neither smaller nor faster, and it is unportable on top of that.
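
A minimal sketch of the "turn it into a function" suggestion, so the shift-and-mask code exists once instead of being expanded at every use of the macro. The name swap32 and the use of uint32_t from <stdint.h> are my assumptions, not something from the original post; whether this is actually smaller or faster on an 8051 would have to be checked in the compiler output.

```c
#include <stdint.h>

/* Hypothetical helper (name not from the thread): reverses the byte
   order of a 32-bit value. Working on an unsigned type keeps every
   shift well defined. */
uint32_t swap32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24)   /* byte 0 -> byte 3 */
         | ((x & 0x0000FF00u) << 8)    /* byte 1 -> byte 2 */
         | ((x & 0x00FF0000u) >> 8)    /* byte 2 -> byte 1 */
         | ((x & 0xFF000000u) >> 24);  /* byte 3 -> byte 0 */
}
```

Called as swap32(0x12345678), this should give 0x78563412, and applying it twice gives back the original value.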

Erhard Henkes

BSWAP r32 http://faydoc.tripod.com/cpu/bswap.htm

Erhard Henkes wrote:

BSWAP r32
http://faydoc.tripod.com/cpu/bswap.htm

Is that for an 8051? I thought it can only do a swap on the accumulator (a 4-bit swap).
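
For comparison, here is roughly what the 8051's SWAP A does, written in C: it exchanges the two nibbles of one byte and does not touch byte order at all. The function name swap_nibbles is made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical C equivalent of the 8051 "SWAP A" instruction:
   exchanges the high and low 4 bits of a single byte. */
uint8_t swap_nibbles(uint8_t a)
{
    /* The operands are promoted to int, so the cast trims the result
       back to 8 bits. */
    return (uint8_t)((a << 4) | (a >> 4));
}
```

So swap_nibbles(0xA5) yields 0x5A, which is of no direct help for reversing the four bytes of a 32-bit value.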

c_newbie

I can't say whether it is faster, but a shift fills the new bit positions
with zeros, so something like this should be enough:

#define SWAP32(x) (                         \
                    ((x) << 24)             \
                   |(((x) & 0xff00) << 8)   \
                   |(((x) & 0xff0000) >> 8) \
                   |((x) >> 24)             \
                  )
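
One caveat, as a sketch: the "filled with zeros" argument holds for left shifts and for right shifts of unsigned values. Right-shifting a negative signed value is implementation-defined in C, and many compilers use an arithmetic shift that copies the sign bit instead. The variant below adds a cast to uint32_t so that the two dropped masks are safe; the cast and the name SWAP32_U are my additions and not part of the post above.

```c
#include <stdint.h>

/* The reduced macro with an added cast (an assumption of this sketch),
   so that every shift operates on an unsigned 32-bit value and the
   vacated bits really are zero. */
#define SWAP32_U(x) (                                \
      ((uint32_t)(x) << 24)                          \
    | (((uint32_t)(x) & 0xff00u) << 8)               \
    | (((uint32_t)(x) & 0xff0000u) >> 8)             \
    | ((uint32_t)(x) >> 24)                          \
    )
```

With the cast, a signed argument such as (int32_t)-2 is also handled correctly and becomes 0xFEFFFFFF.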

pointercrash**

This one works, and it is quite fast:

void SwapOrder32(int *in32)
{
	char c;
	char *swap32;

	swap32 = (char*) in32;

	c = swap32[0];
	swap32[0] = swap32[3];
	swap32[3] = c;

	c = swap32[1];
	swap32[1] = swap32[2];
	swap32[2] = c;
}

So far I have never had a processor where char != 8 bits; until that happens I won't bother wrapping this in defines.
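
If you do want that wrapping, a compile-time guard could look roughly like this. It is only a sketch: the typedef name is invented, and the second check reflects my assumption that SwapOrder32 is meant for a 4-byte int. As far as I know, int is only 16 bits on typical 8051 compilers, so there the parameter would have to be a long instead.

```c
#include <limits.h>

/* Refuse to compile if a char is not exactly 8 bits wide. */
#if CHAR_BIT != 8
#error "SwapOrder32 assumes 8-bit chars"
#endif

/* Pre-C11 static assertion: the array size is negative (a compile
   error) unless int is exactly 4 bytes. Typedef name is hypothetical. */
typedef char swaporder32_needs_4_byte_int[(sizeof(int) == 4) ? 1 : -1];
```

Both checks cost nothing at run time; they only stop the build on a platform where the byte indexing 0..3 would be wrong.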

c_newbie

This saves you one more variable:

void SwapOrder32(int *in32){

    char *swap32 = (char*) in32;

    if(swap32[0] != swap32[3]){
        swap32[0] ^= swap32[3];
        swap32[3] ^= swap32[0];
        swap32[0] ^= swap32[3];
    }

    if(swap32[1] != swap32[2]){
        swap32[1] ^= swap32[2];
        swap32[2] ^= swap32[1];
        swap32[1] ^= swap32[2];
    }
}

pointercrash**

Oh right, I had almost forgotten that one ^^^^^ ... tricky.
Although "fast" then becomes rather processor-dependent ...

I wouldn't bother with the if checks. Beyond that, when in doubt I would look at the assembler output to see whether the variant with the temporary "variable" isn't the better one after all. It is certainly more readable.
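
To illustrate why the if checks can go, a small sketch (the function name is invented): an XOR swap also works when both bytes hold the same value. It only breaks when both operands are the same memory location, and that cannot happen with the fixed indices 0/3 and 1/2.

```c
/* Hypothetical helper: XOR-swaps two bytes stored at DIFFERENT addresses.
   Equal values are fine: a becomes 0, b stays b, a becomes b again.
   Passing the same address twice would zero the byte instead. */
void xor_swap_bytes(unsigned char *a, unsigned char *b)
{
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}
```

Swapping two bytes that both contain 0x5A leaves both at 0x5A, so comparing the values first only adds a branch.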

c_newbie

So my conclusion looks like this:

The good old XOR swap makes no sense in this case.
For a small memory footprint, the solution from pointercrash() is best.
For absolute performance, the proposal from Gregor_.
My first post, which suggested dropping two of the ANDs, made no difference.

@Gregor_

 8048420:       55                      push   %ebp
 8048421:       89 e5                   mov    %esp,%ebp
 8048423:       53                      push   %ebx
 8048424:       8b 5d 08                mov    0x8(%ebp),%ebx
 8048427:       8b 0b                   mov    (%ebx),%ecx
 8048429:       89 ca                   mov    %ecx,%edx
 804842b:       89 c8                   mov    %ecx,%eax
 804842d:       81 e2 00 ff 00 00       and    $0xff00,%edx
 8048433:       25 00 00 ff 00          and    $0xff0000,%eax
 8048438:       c1 f8 08                sar    $0x8,%eax
 804843b:       c1 e2 08                shl    $0x8,%edx
 804843e:       09 c2                   or     %eax,%edx
 8048440:       89 c8                   mov    %ecx,%eax
 8048442:       c1 e0 18                shl    $0x18,%eax
 8048445:       09 c2                   or     %eax,%edx
 8048447:       c1 e9 18                shr    $0x18,%ecx
 804844a:       09 ca                   or     %ecx,%edx
 804844c:       89 13                   mov    %edx,(%ebx)
 804844e:       5b                      pop    %ebx
 804844f:       5d                      pop    %ebp
 8048450:       c3                      ret

@pointercrash()

 8048460:       55                      push   %ebp
 8048461:       89 e5                   mov    %esp,%ebp
 8048463:       8b 45 08                mov    0x8(%ebp),%eax
 8048466:       0f b6 08                movzbl (%eax),%ecx
 8048469:       0f b6 50 03             movzbl 0x3(%eax),%edx
 804846d:       88 48 03                mov    %cl,0x3(%eax)
 8048470:       0f b6 48 01             movzbl 0x1(%eax),%ecx
 8048474:       88 10                   mov    %dl,(%eax)
 8048476:       0f b6 50 02             movzbl 0x2(%eax),%edx
 804847a:       88 48 02                mov    %cl,0x2(%eax)
 804847d:       88 50 01                mov    %dl,0x1(%eax)
 8048480:       5d                      pop    %ebp
 8048481:       c3                      ret

@xor-swap

 8048490:       55                      push   %ebp
 8048491:       89 e5                   mov    %esp,%ebp
 8048493:       8b 55 08                mov    0x8(%ebp),%edx
 8048496:       53                      push   %ebx
 8048497:       0f b6 0a                movzbl (%edx),%ecx
 804849a:       0f b6 42 03             movzbl 0x3(%edx),%eax
 804849e:       38 c1                   cmp    %al,%cl
 80484a0:       74 0c                   je     80484ae <swap32_3+0x1e>
 80484a2:       31 c8                   xor    %ecx,%eax
 80484a4:       88 02                   mov    %al,(%edx)
 80484a6:       32 42 03                xor    0x3(%edx),%al
 80484a9:       30 02                   xor    %al,(%edx)
 80484ab:       88 42 03                mov    %al,0x3(%edx)
 80484ae:       0f b6 4a 01             movzbl 0x1(%edx),%ecx
 80484b2:       8d 5a 01                lea    0x1(%edx),%ebx
 80484b5:       0f b6 42 02             movzbl 0x2(%edx),%eax
 80484b9:       38 c1                   cmp    %al,%cl
 80484bb:       74 0d                   je     80484ca <swap32_3+0x3a>
 80484bd:       31 c8                   xor    %ecx,%eax
 80484bf:       88 42 01                mov    %al,0x1(%edx)
 80484c2:       32 42 02                xor    0x2(%edx),%al
 80484c5:       88 42 02                mov    %al,0x2(%edx)
 80484c8:       30 03                   xor    %al,(%ebx)
 80484ca:       5b                      pop    %ebx
 80484cb:       5d                      pop    %ebp
 80484cc:       c3                      ret

Nobuo T

c_newbie: You do realize that this is about the performance of the code on an 8-bit µC? You can't compare that with the Intel monster.
Besides, IMHO all of these listings are rubbish in terms of optimization anyway. See, for example, Erhard's remark.

pointercrash**

Nobuo T wrote:

c_newbie: You do realize that this is about the performance of the code on an 8-bit µC?

I have now benchmarked this on a 16-bit part. A compiler for an 8-bit part would be ideal, but I don't have one at hand right now.
- Gregor's macro, remodeled into a function, costs 148 cycles.
- c_newbie's XOR swap still needs 142 cycles.
- My version needs 70 cycles.
The difference is probably even more dramatic on an 8-bit part.

Nobuo T wrote:

You can't compare that with the Intel monster.

The 8051 is a pretty horrid Intel monster too. I don't know why it is so widespread; it must be something along the lines of flies and cow dung.

Nobuo T wrote:

Besides, IMHO all of these listings are rubbish in terms of optimization anyway. See, for example, Erhard's remark.

Do you mean the compiler output or the C input?