Creating 32-bit code in .com files with MASM

supernicky

Hello and good morning,

I took a look at the .lst file after assembling.
Unfortunately, MASM puts a 066h prefix in front of every instruction that uses 32-bit registers.

Is there a directive for this, like [Bits 32] in NASM?

Nicky

supernicky wrote:

Hello and good morning,

I took a look at the .lst file after assembling.
Unfortunately, MASM puts a 066h prefix in front of every instruction that uses 32-bit registers.

In 16-bit address mode (whether in real mode or in protected mode), these operand-size and address-size prefixes must be used in a 16-bit code segment whenever you want to use 32-bit operands/registers and 32-bit addresses.

The D flag in the code-segment descriptor in a GDT/LDT determines the default size of the operands and addresses of our code segment. If no GDT/LDT exists yet, we get the 16-bit address mode after power-on. (Exception: mainboards with a (U)EFI BIOS.)

Is there a directive for this, like [Bits 32] in NASM?

Nicky

[USE32]

http://www.tortall.net/projects/yasm/manual/html/nasm-directives.html

With this directive, the assembler assumes that the code that follows was written for the 32-bit address mode and that the D flag has already been changed accordingly.

Dirk

Hi.

Here are the relevant statements about the operand-size and address-size prefixes from Intel and from AMD.

Intel:

Instruction prefixes can be used to override the default operand size and address size of a code segment. These prefixes can be used in real-address mode as well as in protected mode and virtual-8086 mode. An operand-size or address-size prefix only changes the size for the duration of the instruction.

The following two instruction prefixes allow mixing of 32-bit and 16-bit operations within one segment:
• The operand-size prefix (66H)
• The address-size prefix (67H)

These prefixes reverse the default size selected by the D flag in the code-segment descriptor. For example, the processor can interpret the (MOV mem, reg) instruction in any of four ways:
• In a 32-bit code segment:
  - Moves 32 bits from a 32-bit register to memory using a 32-bit effective address.
  - If preceded by an operand-size prefix, moves 16 bits from a 16-bit register to memory using a 32-bit effective address.
  - If preceded by an address-size prefix, moves 32 bits from a 32-bit register to memory using a 16-bit effective address.
  - If preceded by both an address-size prefix and an operand-size prefix, moves 16 bits from a 16-bit register to memory using a 16-bit effective address.

• In a 16-bit code segment:
  - Moves 16 bits from a 16-bit register to memory using a 16-bit effective address.
  - If preceded by an operand-size prefix, moves 32 bits from a 32-bit register to memory using a 16-bit effective address.
  - If preceded by an address-size prefix, moves 16 bits from a 16-bit register to memory using a 32-bit effective address.
  - If preceded by both an address-size prefix and an operand-size prefix, moves 32 bits from a 32-bit register to memory using a 32-bit effective address.
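The four readings per segment type can be illustrated with one concrete byte sequence. The following is a minimal Python sketch, not a real disassembler: it assumes the single encoding 89 07 (opcode 89 /r with ModRM byte 07), where r/m = 111b means [BX] with 16-bit addressing and [EDI] with 32-bit addressing. The function name `describe_mov_mem_reg` is made up for this illustration.

```python
OPERAND_PREFIX = 0x66
ADDRESS_PREFIX = 0x67

def describe_mov_mem_reg(code: bytes, default_bits: int) -> str:
    """Describe the bytes [66] [67] 89 07 for a 16- or 32-bit code segment.

    default_bits is the segment default (16 or 32), i.e. what the D flag selects.
    """
    operand_bits = address_bits = default_bits
    pos = 0
    # Each prefix selects the non-default size for this one instruction only.
    while code[pos] in (OPERAND_PREFIX, ADDRESS_PREFIX):
        if code[pos] == OPERAND_PREFIX:
            operand_bits = 48 - default_bits   # 16 -> 32, 32 -> 16
        else:
            address_bits = 48 - default_bits
        pos += 1
    if code[pos:pos + 2] != b"\x89\x07":
        raise ValueError("this sketch only handles opcode 89 with ModRM 07")
    reg = "eax" if operand_bits == 32 else "ax"    # ModRM reg field 000b
    base = "edi" if address_bits == 32 else "bx"   # ModRM r/m field 111b
    return f"mov [{base}], {reg}"
```

For example, `describe_mov_mem_reg(b"\x66\x89\x07", 16)` describes the case a 16-bit .com program hits when it stores a 32-bit register: the same opcode bytes, plus one 66h prefix.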

The previous examples show that any instruction can generate any combination of operand size and address size regardless of whether the instruction is in a 16- or 32-bit segment. The choice of the 16- or 32-bit default for a code segment is normally based on the following criteria:
• Performance: Always use 32-bit code segments when possible. They run much faster than 16-bit code segments on P6 family processors, and somewhat faster on earlier IA-32 processors.
• The operating system the code segment will be running on: If the operating system is a 16-bit operating system, it may not support 32-bit program modules.
• Mode of operation: If the code segment is being designed to run in real-address mode, virtual-8086 mode, or SMM, it must be a 16-bit code segment.
• Backward compatibility to earlier IA-32 processors: If a code segment must be able to run on an Intel 8086 or Intel 286 processor, it must be a 16-bit code segment.

The D flag in a code-segment descriptor determines the default operand-size and address-size for the instructions of a code segment. (In real-address mode and virtual-8086 mode, which do not use segment descriptors, the default is 16 bits.) A code segment with its D flag set is a 32-bit segment; a code segment with its D flag clear is a 16-bit segment.

Executable code segment. The flag is called the D flag and it indicates the default length for effective addresses and operands referenced by instructions in the segment. If the flag is set, 32-bit addresses and 32-bit or 8-bit operands are assumed; if it is clear, 16-bit addresses and 16-bit or 8-bit operands are assumed.
The instruction prefix 66H can be used to select an operand size other than the default, and the prefix 67H can be used to select an address size other than the default.
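To make the D flag concrete: in the 8-byte descriptor layout, the D/B bit is bit 22 of the upper doubleword, which is bit 54 when the descriptor is read as one 64-bit value. The Python sketch below extracts it. The two descriptor constants are common illustrative values (a flat 4 GB code segment and a 64 KB 16-bit code segment), not values taken from this thread, and the function name is hypothetical.

```python
def default_size_bits(descriptor: int) -> int:
    """Return 32 if the D flag of a code-segment descriptor is set, else 16.

    descriptor is the 8-byte GDT/LDT entry interpreted as a 64-bit integer.
    """
    d_flag = (descriptor >> 54) & 1
    return 32 if d_flag else 16

# Illustrative descriptors (assumed values, base 0, present, ring 0, code):
FLAT_CODE_32 = 0x00CF9A000000FFFF   # G=1, D=1: 32-bit defaults
CODE_16 = 0x00009A000000FFFF        # G=0, D=0: 16-bit defaults
```

With D clear, 32-bit registers stay usable, but every such instruction carries a 66h prefix.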

The 32-bit operand prefix can be used in real-address mode programs to execute the 32-bit forms of instructions. This prefix also allows real-address mode programs to use the processor’s 32-bit general-purpose registers.
The 32-bit address prefix can be used in real-address mode programs, allowing 32-bit offsets.

The IA-32 processors beginning with the Intel386 processor can generate 32-bit offsets using an address override prefix; however, in real-address mode, the value of a 32-bit offset may not exceed FFFFH without causing an exception.
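The offset limit can be sketched as a small address calculation. This Python sketch assumes plain real-address mode (segment times 16 plus offset) and models the exception as a `ValueError`; it ignores A20 wrap-around and unreal mode. The function name is made up for illustration.

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Compute segment * 16 + offset, rejecting offsets above FFFFH."""
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment must fit in 16 bits")
    if not 0 <= offset <= 0xFFFF:
        # A 32-bit offset is encodable with the 67H prefix, but a value above
        # FFFFH would raise an exception on the processor in real-address mode.
        raise ValueError("offset exceeds FFFFH")
    return (segment << 4) + offset
```

So an instruction such as `mov eax, [ebx]` is fine in real-address mode as long as EBX holds a value no larger than FFFFH.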

Assembler Usage:
If a code segment that is going to run in real-address mode is defined, it must be set to a USE 16 attribute. If a 32-bit operand is used in an instruction in this code segment (for example, MOV EAX, EBX), the assembler automatically generates an operand prefix for the instruction that forces the processor to execute a 32-bit operation, even though its default code-segment attribute is 16-bit.
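This is exactly what shows up in the .lst file from the first post. A rough Python sketch of the assembler's decision follows, assuming register-to-register MOV only and the opcode form 89 /r. A real assembler may pick the equivalent form 8B /r instead, so the bytes in a MASM listing can differ. The function and table names are invented for this example.

```python
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REGS16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

def encode_mov_reg_reg(dst: str, src: str, bits: int) -> bytes:
    """Encode 'mov dst, src' for a segment whose default size is bits (16 or 32)."""
    if dst in REGS32 and src in REGS32:
        operand_bits, regs = 32, REGS32
    elif dst in REGS16 and src in REGS16:
        operand_bits, regs = 16, REGS16
    else:
        raise ValueError("sketch handles two registers of equal size only")
    # ModRM: mod=11b (register direct), reg=source, r/m=destination.
    modrm = 0xC0 | (regs.index(src) << 3) | regs.index(dst)
    # The 66h prefix is emitted only when the operand size is not the default.
    prefix = b"\x66" if operand_bits != bits else b""
    return prefix + bytes([0x89, modrm])
```

`encode_mov_reg_reg("eax", "ebx", 16)` yields the prefixed form, while the same instruction with `bits=32` yields the bare two bytes. That difference is what [Bits 32] or USE32 changes: nothing about the processor, only what the assembler assumes.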

The 32-bit operand prefix allows a real-address mode program to use the 32-bit general-purpose registers (EAX, EBX, ECX, EDX, ESP, EBP, ESI, and EDI).

When moving data in 32-bit mode between a segment register and a 32-bit general-purpose
register, the Pentium Pro processor does not require the use of a 16-bit operand size prefix;
however, some assemblers do require this prefix. The processor assumes that the 16 least-significant
bits of the general-purpose register are the destination or source operand. When moving a
value from a segment selector to a 32-bit register, the processor fills the two high-order bytes of
the register with zeros.

--------------------------------------------------

AMD:

3.3.2. 32-Bit vs. 16-Bit Address and Operand Sizes
The processor can be configured for 32-bit or 16-bit address and operand sizes. With 32-bit
address and operand sizes, the maximum linear address or segment offset is FFFFFFFFH
(2^32 - 1), and operand sizes are typically 8 bits or 32 bits. With 16-bit address and operand sizes,
the maximum linear address or segment offset is FFFFH (2^16 - 1), and operand sizes are typically
8 bits or 16 bits.
When using 32-bit addressing, a logical address (or far pointer) consists of a 16-bit segment
selector and a 32-bit offset; when using 16-bit addressing, it consists of a 16-bit segment selector
and a 16-bit offset.
Instruction prefixes allow temporary overrides of the default address and/or operand sizes from
within a program.
When operating in protected mode, the segment descriptor for the currently executing code
segment defines the default address and operand size. A segment descriptor is a system data
structure not normally visible to application code. Assembler directives allow the default
addressing and operand size to be chosen for a program. The assembler and other tools then set
up the segment descriptor for the code segment appropriately.
When operating in real-address mode, the default addressing and operand size is 16 bits. An
address-size override can be used in real-address mode to enable 32-bit addressing; however, the
maximum allowable 32-bit linear address is still 000FFFFFH (2^20 - 1).

3.6. OPERAND-SIZE AND ADDRESS-SIZE ATTRIBUTES
When the processor is executing in protected mode, every code segment has a default operand-size
attribute and address-size attribute. These attributes are selected with the D (default size)
flag in the segment descriptor for the code segment (see Chapter 3, Protected-Mode Memory
Management, in the Intel Architecture Software Developer’s Manual, Volume 3). When the D
flag is set, the 32-bit operand-size and address-size attributes are selected; when the flag is clear,
the 16-bit size attributes are selected. When the processor is executing in real-address mode,
virtual-8086 mode, or SMM, the default operand-size and address-size attributes are always 16
bits.
The operand-size attribute selects the sizes of operands that instructions operate on. When the
16-bit operand-size attribute is in force, operands can generally be either 8 bits or 16 bits, and
when the 32-bit operand-size attribute is in force, operands can generally be 8 bits or 32 bits.
The address-size attribute selects the sizes of addresses used to address memory: 16 bits or 32
bits. When the 16-bit address-size attribute is in force, segment offsets and displacements are 16
bits. This restriction limits the size of a segment that can be addressed to 64 KBytes. When the
32-bit address-size attribute is in force, segment offsets and displacements are 32 bits, allowing
segments of up to 4 GBytes to be addressed.
The default operand-size attribute and/or address-size attribute can be overridden for a particular
instruction by adding an operand-size and/or address-size prefix to an instruction (see
“Instruction Prefixes” in Chapter 2 of the Intel Architecture Software Developer’s Manual,
Volume 3). The effect of this prefix applies only to the instruction it is attached to.
Table 3-1 shows effective operand size and address size (when executing in protected mode)
depending on the settings of the D flag and the operand-size and address-size prefixes.
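The table itself is not reproduced in this post. As a stand-in, the rule stated in the text (a prefix selects the non-default size) is enough to enumerate all eight combinations. The Python sketch below derives them from that rule; it is a reconstruction from the quoted description, not a copy of Table 3-1, and the function names are hypothetical.

```python
from itertools import product

def effective_sizes(d_flag: int, operand_prefix: bool, address_prefix: bool):
    """Return (operand_size, address_size) in bits for one instruction."""
    default = 32 if d_flag else 16
    other = 48 - default            # the non-default size
    return (other if operand_prefix else default,
            other if address_prefix else default)

def size_table():
    """List rows of (D flag, 66H present, 67H present, operand bits, address bits)."""
    return [(d, op, ad) + effective_sizes(d, op, ad)
            for d, op, ad in product((0, 1), (False, True), (False, True))]

if __name__ == "__main__":
    for row in size_table():
        print(row)
```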

Dirk

supernicky

Hello Dirk,

so with MASM you can create "only" 16-bit .com files?

Jonas OSDever

As far as I know, .com files may by definition contain only 16-bit code. COM is about the simplest executable format there is. It has no header, the code starts right at the beginning of the file (at offset address 0x100, because DOS places the PSP in front of it), and there is no distinction between sections either. The program can use at most the 64 KB of memory that it can address through one segment (at least according to the assembler book by Joachim Rode). That means that, if at all, you would have to switch to protected mode by hand in order to execute 32-bit code. That may well have worked under DOS, and if you switched back to real mode at program exit it was completely transparent (there certainly were PM programs under DOS). But under a current Windows, and there only in the 32-bit variants, you really can execute only 16-bit code in a COM file. Any attempt to use privileged instructions such as lgdt would end in a general protection fault. And under 64-bit Windows you cannot execute any 16-bit code at all (without emulators or VMs), because VM86 no longer works in long mode.
If you want to write 32-bit code, do yourself a favor and use EXE files.

supernicky

Jonas OSDever wrote:

As far as I know, .com files may by definition contain only 16-bit code.

Hello Jonas,

with NASM you can use

[Bits 32]

to generate 32-bit code in .com files as well.

So I will probably stick with NASM a little longer.

Thanks and regards

Nicky

Jonas OSDever

Well, USE32 should work with MASM too (no guarantee, I only use NASM myself). But does that really work? It would be news to me that you could execute 32-bit code in a 16-bit environment...

supernicky

Hello Jonas,

there is no denying that MASM is somewhat more comfortable than NASM.

Why do I need .com? Precisely because the format is so simple, I can
easily load programs built with it later on (application programs), and the entry point is
quickly found as well.

Regards, Nicky

Jonas OSDever wrote:

As far as I know, .com files may by definition contain only 16-bit code.

I think this definition is wrong, because DOS does not check at all what kind of code the file contains. You can, for example, use MMX instructions there, which did not even exist in the DOS era, and you can switch to 32-bit PM (on an 80386 or later) and request memory for that from DOS in the lower MB, if only one MB is physically installed.

COM is about the simplest executable format there is. It has no header, the code starts right at the beginning of the file (at offset address 0x100, because DOS places the PSP in front of it), and there is no distinction between sections either. The program can use at most the 64 KB of memory that it can address through one segment (at least according to the assembler book by Joachim Rode).

A COM file can be at most 64 KB in size.
From there, however, you can request and use as much memory as DOS will give you, and if you switch to PM yourself, or to unreal mode/big real mode, you can manage and use all free memory in the 4 GB address space yourself.
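The load layout mentioned above (PSP first, code from offset 0x100, one segment) can be sketched as a simple offset calculation. This Python sketch assumes only those two facts from the thread. The practical size limit is slightly lower than what the check enforces, because DOS also needs room for the stack at the top of the segment. The names are made up for illustration.

```python
COM_LOAD_OFFSET = 0x100     # first 256 bytes of the segment hold the PSP
SEGMENT_SIZE = 0x10000      # one 64 KB segment

def com_memory_offset(file_offset: int, file_size: int) -> int:
    """Map a byte position in a .com file to its offset within the load segment."""
    if file_size > SEGMENT_SIZE - COM_LOAD_OFFSET:
        raise ValueError("image does not fit into one 64 KB segment after the PSP")
    if not 0 <= file_offset < file_size:
        raise ValueError("file_offset lies outside the file")
    return COM_LOAD_OFFSET + file_offset
```

Because byte 0 of the file always lands at offset 0x100, the entry point of a loaded .com program is trivially known, which is the convenience supernicky describes.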

That means that, if at all, you would have to switch to protected mode by hand in order to execute 32-bit code. That may well have worked under DOS, and if you switched back to real mode at program exit it was completely transparent (there certainly were PM programs under DOS).

But under a current Windows, and there only in the 32-bit variants, you really can execute only 16-bit code in a COM file. Any attempt to use privileged instructions such as lgdt would end in a general protection fault. And under 64-bit Windows you cannot execute any 16-bit code at all (without emulators or VMs), because VM86 no longer works in long mode.
If you want to write 32-bit code, do yourself a favor and use EXE files.

Or you simply use a real DOS for it, because a DOS COM program does not need Windows.

Dirk