Date posted: 24-Jun-2015
64 bits for developers
By Roman Okolovich
Introduction
x86-64 is a superset of the x86 instruction set
architecture. x86-64 processors can run existing 32-bit or
16-bit x86 programs at full speed, but also support new
programs written with a 64-bit address space and other
additional capabilities.
The x86-64 specification was designed by Advanced
Micro Devices (AMD), who have since renamed it
AMD64. This was the first time any company other than
Intel made significant additions to the IA-32 architecture.
x86-64 is backwards compatible with 32-bit code without
any performance loss.
x64 architectural features
64-bit integer capability: All general-purpose registers (GPRs) are expanded from 32 bits to 64 bits, and all arithmetic and logical operations, memory-to-register and register-to-memory operations, etc. can now operate directly on 64-bit integers. Pushes and pops on the stack are always in 8-byte strides, and pointers are 8 bytes wide.
Additional registers: In addition to increasing the size of the general-purpose registers, the number of named general-purpose registers is increased from eight (i.e. eax,ebx,ecx,edx,ebp,esp,esi,edi) in x86-32 to 16. It is therefore possible to keep more local variables in registers rather than on the stack, and to let registers hold frequently accessed constants; arguments for small and fast subroutines may also be passed in registers to a greater extent. However, AMD64 still has fewer registers than many common RISC processors (which typically have 32–64 registers) or VLIW-like machines such as the IA-64 (which has 128 registers).
Additional XMM (SSE) registers: Similarly, the number of 128-bit XMM registers (used for Streaming SIMD instructions) is also increased from 8 to 16.
Larger virtual address space: Current processor models implementing the AMD64 architecture can address up to 256 TB (281,474,976,710,656 bytes)[4] of virtual address space. This limit can be raised in future implementations to 16 EB (18,446,744,073,709,551,616 bytes). This is compared to just 4 GB (4,294,967,296 bytes) for 32-bit x86. This means that very large files can be operated on by mapping the entire file into the process' address space (which is sometimes faster than working with file read/write calls), rather than having to map regions of the file into and out of the address space.
Larger physical address space: Current implementations of the AMD64 architecture can address up to 1 TB (1,099,511,627,776 bytes) of RAM; the architecture permits extending this to 4 PB (4,503,599,627,370,496 bytes) in the future (limited by the page table entry format). In legacy mode, Physical Address Extension (PAE) is included, as it is on most current 32-bit x86 processors, allowing access to a maximum of 64 GB (68,719,476,736 bytes).
Instruction pointer relative data access: Instructions can now reference data relative to the instruction pointer (RIP register). This makes position independent code, as is often used in shared libraries and code loaded at run time, more efficient.
SSE instructions: The original AMD64 architecture adopted Intel's SSE and SSE2 as core instructions. SSE3 instructions were added in April 2005. SSE2 replaces the x87 instruction set's IEEE 80-bit precision with the choice of either IEEE 32-bit or 64-bit floating-point mathematics. This provides floating-point operations compatible with many other modern CPUs. The SSE and SSE2 instructions have also been extended to operate on the eight new XMM registers. SSE and SSE2 are available in 32-bit mode in modern x86 processors; however, 32-bit programs that use them will only run on processors that have the feature. This is not an issue in 64-bit programs, as all AMD64 processors have SSE and SSE2, so using SSE and SSE2 instructions instead of x87 instructions does not reduce the set of machines on which x64 programs can be run. SSE and SSE2 are generally faster than, and duplicate most of the features of, the traditional x87 instructions, MMX, and 3DNow!.
No-Execute bit: The "NX" bit (bit 63 of the page table entry) allows the operating system to specify which pages of virtual address space can contain executable code and which cannot. An attempt to execute code from a page tagged "no execute" will result in a memory access violation, similar to an attempt to write to a read-only page. This should make it more difficult for malicious code to take control of the system via "buffer overrun" or "unchecked buffer" attacks. A similar feature has been available on x86 processors since the 80286 as an attribute of segment descriptors; however, this works only on an entire segment at a time. Segmented addressing has long been considered an obsolete mode of operation, and all current PC operating systems in effect bypass it, setting all segments to a base address of 0 and a size of 4 GB (4,294,967,296 bytes). AMD was the first x86-family vendor to implement no-execute in linear addressing mode. The feature is also available in legacy mode on AMD64 processors, and recent Intel x86 processors, when PAE is used.
Removal of older features: A number of "system programming" features of the x86 architecture are not used in modern operating systems and are not available on AMD64 in long (64-bit and compatibility) mode. These include segmented addressing (although the FS and GS segments are retained in vestigial form for use as extra base pointers to operating system structures)[5], the task state switch mechanism, and Virtual 8086 mode. These features do of course remain fully implemented in "legacy mode," thus permitting these processors to run 32-bit and 16-bit operating systems without modification.
Virtual address space details
Although virtual addresses are 64 bits wide in 64-bit mode, current implementations (and any chips known to be in the planning stages) do not allow the entire virtual address space of 16 EB (18,446,744,073,709,551,616 bytes) to be used. Most operating systems and applications will not need such a large address space for the foreseeable future (for example, Windows implementations for AMD64 are only populating 16 TB (17,592,186,044,416 bytes), or 44 bits' worth).
AMD therefore decided that, in the first implementations of the architecture, only the least significant 48 bits of a virtual address would actually be used in address translation (page table lookup).
[Figure: virtual address space layout for the current 48-bit implementation, a 56-bit implementation, and a 64-bit implementation]
Advantages of using x64
Compiling code as 64-bit can increase performance:
the expected gain from a plain recompilation is 5-15%;
Adobe claims that the 64-bit version of Photoshop CS4 is 12% faster than the 32-bit version.
Some programs that deal with large data arrays may run faster thanks to the larger address space.
Using ptrdiff_t, size_t, and types derived from them for indexing and pointer arithmetic can speed up program code by up to 30%.
Support x64 in Visual Studio (1 of 2)
/Wp64 - Detects 64-bit portability problems on types that are also marked with the __w64 keyword.
The /Wp64 compiler option and __w64 keyword are deprecated and will be removed in a future version of the compiler.
Instead of using this option and keyword to detect 64-bit portability issues, use a Visual C++ compiler that targets a 64-bit platform. For more information, see 64-Bit Programming with Visual C++.
Support x64 in Visual Studio (2 of 2)
How to: Configure Visual C++ Projects to Target 64-Bit Platforms
1. Click Configuration Manager to open the Configuration Manager dialog box.
2. Click the Active Solution Platform list, and then select the <New…> option to open the New Solution Platform dialog box.
3. Click the "Type or select the new platform" drop-down arrow, and then select a 64-bit platform.
4. Click OK. The platform you selected in the preceding step will appear under Active Solution Platform in the Configuration Manager dialog box.
5. Click Close in the Configuration Manager dialog box, and then click OK in the <Projectname> Property Pages dialog box.
Data model
(sizes in bits)
Data type           LP32  ILP32  ILP64  LLP64  LP64
char                   8      8      8      8     8
short                 16     16     16     16    16
int32                  -      -     32      -     -
int                   16     32     64     32    32
long                  32     32     64     32    64
long long (int64)      -      -      -     64     -
pointer               32     32     64     64    64
• ISO/IEC 9899:1990, Programming Languages - C (ISO C) left the definition of the short int, the int, the long int, and the pointer deliberately vague to avoid constraining hardware architectures that might benefit from defining these data types independently from one another.
• The relationship between the fundamental data types can be expressed as:
sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) = sizeof(size_t)
  Note that the final equality does not hold in LLP64, where long is 32 bits and size_t is 64 bits.
• Notation: int (I), long (L), and pointer (P)
• ILP32 is used in Win32
• LLP64 is used in Win64
• LP64 is used in UNIX
General issues relating to x64 (1 of 4)
An int and a long are 32-bit values on 64-bit Windows
operating systems.
You should not assign pointers to 32-bit variables. Pointers are 64-bit on 64-bit platforms, and you will truncate the pointer value if you assign it to a 32-bit variable.
Common assumptions about the relationships between the fundamental data types, such as
sizeof(int) = sizeof(long) = sizeof(pointer)
may no longer be valid in a 64-bit data model.
General issues relating to x64 (2 of 4)
size_t, time_t, and ptrdiff_t are 64-bit values on 64-bit Windows operating systems.

size_t n = bigValue;
for (unsigned i = 0; i != n; ++i)
{ ... }
This is fine only if bigValue <= UINT_MAX. Otherwise i wraps around to 0 before it can reach n, and the loop never ends.

size_t a;
int b = (int)a;
int b = (int)(a);
int b = int(a);
int b = static_cast<int>(a);
These are alternative spellings of the same cast. Whichever one is used, the resulting value will be truncated if a does not fit in an int.

size_t n = bigValue;
unsigned index = 0;
for (size_t i = 0; i != n; ++i)
{
    array[index++] = 10;
}
Don't mix size_t with fundamental data types: an unsigned int cannot exceed UINT_MAX. Use only size_t for indexing, because an array can contain more than UINT_MAX items.
General issues relating to x64 (3 of 4)
The %x (hex int format) printf modifier will not work as
expected on a 64-bit Windows operating system. It will
only operate on the first 32 bits of the value that is
passed to it.
Use %I32x to display an integer on a Windows 32-bit
operating system.
Use %I64x to display an integer on a Windows 64-bit
operating system.
The %p (hex format for a pointer) will work as expected on a
64-bit Windows operating system.
General issues relating to x64 (4 of 4)
Data alignment
The MyStruct2 structure is 12 bytes in a 32-bit program, and only 16 bytes in a 64-bit program. From the point of view of data access efficiency, the MyStruct1 and MyStruct2 structures are equivalent.
A common recommendation is the following: declare the members of a structure in descending order of their size.
References
x86-64 (wikipedia)
64-bit Programming (How Do I in Visual C++)
64-Bit Programming with Visual C++
FAQ for Development on 64-bit Windows
Common Visual C++ 64-bit Migration Issues
Viva64, a tool for porting your applications to 64-bit platforms
Optimization of 64-bit programs