64-bit insider volume 1 issue 14
8/6/2019 64-Bit Insider Volume 1 Issue 14
64-bit Insider Newsletter
Volume 1, Issue 14
Introduction to optimization for multi-core and hyper-threaded processors
As 64-bit processors become more widespread, so do multi-core processors. Many PCs today already have multi-core 64-bit processors installed. Servers often contain multiple multi-core processors. To add to the mix, many of these processors also support hyper-threading. How can these technologies help my performance? What are the differences between them? How can I be sure that my applications take advantage of them effectively? This newsletter starts to answer these questions and points you to where to go for further information.
Multi-processor, multi-core, & hyper-threading: What are they?
Normally a processor reads instructions from RAM into one or more caches on the processor itself before loading each instruction into the execution core and executing it. A processor's speed is often measured by the number of clock cycles it completes per second. However, the more instructions a processor can execute per clock cycle, the faster it will be.
The 64-bit Advantage
The computer industry is changing, and 64-bit technology is the next, inevitable step. The 64-bit Insider newsletter will help you adopt this technology by providing tips and tricks for a successful port.
Development and migration of 64-bit technology is not as complicated as the 16-bit to 32-bit transition. However, as with any new technology, several areas do require close examination and consideration. The goal of the 64-bit Insider newsletter is to identify potential migration issues and provide viable, effective solutions to these issues. With a plethora of Web sites already focused on 64-bit technology, the intention of this newsletter is not to repeat previously published information. Instead, it will focus on 64-bit issues that are somewhat isolated yet extremely important to understand. It will also connect you to reports and findings from 64-bit experts.
With a single-processor system, the processor can normally read a single instruction into the execution core per clock cycle. However, many instructions take more than a single clock cycle to execute because of memory latency. For example, a load instruction must read data from memory, which can take many clock cycles to arrive at the processor. As a result, the processor is idle much of the time, and instructions are executed and their results committed to memory at a rate well below one instruction per clock cycle.
Hyper-threading is an Intel technology that enables the processor to execute alternative instructions when it would normally be idle. It is still a single processor, but the processor presents itself as two logical processors to allow the operating system to schedule separate threads of instructions for execution on each logical processor. Now, when one of the instructions for the first logical processor pauses while it waits for data to arrive from main memory, the execution core can execute instructions that were scheduled for the other logical processor.
A multi-core processor also presents itself as multiple logical processors to the operating system (two, in the dual-core case described here). However, this time there really are two execution cores on the processor. These two cores can execute separate threads of instructions in a truly simultaneous fashion, but they are located on a single processor die and therefore use up less space. Unlike a system with two separate processors, the two execution cores share some of the same cache, which can be very helpful if the two threads of execution are executing the same instructions and working on the same set of data.
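From a program's point of view, hyper-threading and multiple cores both show up simply as additional logical processors. The following is a minimal sketch, assuming a C++11 compiler and using `std::thread` from the C++ standard library rather than the Windows or .NET APIs discussed in this newsletter. The function names `logical_processor_count` and `worker_threads_for` are hypothetical helpers introduced only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Number of logical processors reported to the program. The standard allows
// hardware_concurrency() to return 0 when the value cannot be determined,
// so this sketch falls back to 1 in that case.
unsigned logical_processor_count() {
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1u : n;
}

// Hypothetical helper: never plan more worker threads than there are
// logical processors, and never more than there are independent tasks.
unsigned worker_threads_for(unsigned independent_tasks) {
    assert(independent_tasks > 0);
    return std::min(independent_tasks, logical_processor_count());
}
```

Note that this count does not say whether the logical processors come from hyper-threading, from multiple cores, or from multiple physical processors; it only reports how many threads the hardware can run at once.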
How do I take advantage of these technologies?
The operating system can easily take advantage of multiple processors because it deals mostly with whole processes. It can schedule one process to run on one processor and another process to run on another processor. However, if there is only one process executing, how can it distribute the work done by this process among multiple logical or physical processors?

Fig. 1. Operation of a single-core processor with no hyper-threading (diagram labels: EC, CACHE, RAM)
Fig. 2. Operation of a single-core processor with hyper-threading (diagram labels: EC, CACHE, RAM)
Fig. 3. Operation of a multi-core processor with hyper-threading (diagram labels: EC, EC, CACHE, RAM)
A key term that appears several times in the description of the different types of processors above is threads. Creating several threads of execution means writing a program that at some point splits into two or more separate sequences of code, in such a way that all of them continue to run at the same time. A program written in this way is called a multithreaded program, and it explicitly demarcates independent sequences of instructions that the operating system can run on different processors.
Multithreaded programming has been around for a long time. It involves creating several threads of execution inside your application that all run at the same time. Most modern applications use threads today, even if you are not fully aware of them. For example, ASP.NET web applications normally use a single thread of execution for each HTTP request that is received. Multiple simultaneous requests mean multiple simultaneous threads of execution.
Threads also allow you to create highly responsive GUI applications that perform long, compute-intensive operations but still respond appropriately to user input. For example, threads allow you to create a progress bar that is painted correctly on the desktop while the application simultaneously scans the hard disk for viruses.
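The progress-bar scenario can be sketched without any GUI code. This is a minimal illustration, assuming C++11 and `std::thread` rather than a Windows or .NET API: a worker thread stands in for the long-running scan, and the calling thread stays free to sample progress, which is where a real application would repaint its progress bar. The function name `run_with_progress`, the sleep intervals and the notion of "steps" are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Runs `steps` units of simulated work on a worker thread and returns the
// progress values that the calling thread observed while it was waiting.
std::vector<int> run_with_progress(int steps) {
    assert(steps > 0);
    std::atomic<int> done{0};  // shared counter, safe to access from both threads

    // Worker thread: stands in for a long job such as a disk scan.
    std::thread worker([&done, steps] {
        for (int i = 0; i < steps; ++i) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));  // simulated work
            done.fetch_add(1);
        }
    });

    // Calling thread: remains responsive and samples progress periodically.
    std::vector<int> samples;
    while (true) {
        int seen = done.load();
        samples.push_back(seen);  // a GUI would repaint the progress bar here
        if (seen == steps) break;
        std::this_thread::sleep_for(std::chrono::milliseconds(2));
    }

    worker.join();  // wait for the worker to finish before returning
    return samples;
}
```

The shared counter is a `std::atomic<int>` so that one thread can update it while the other reads it without a data race.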
Adding threads to your applications
To add threads to your application, you need to use some mechanism to tell the operating system that you wish to do so. Running processes are scheduled by the operating system. That is, they are given time on the processor depending on how many other processes are running. Similarly, multiple threads of execution within a process must be scheduled by the operating system.
There are two ways to add threads to your program. One is to use a specialized API, such as the Windows API or the System.Threading namespace of the .NET Framework, to create and control threads manually. The other is to use compiler directives as defined by the OpenMP standard to have the threads created for you automatically.
Threading in .NET
The .NET Framework provides a set of classes to enable multithreaded programming. In
its most basic form starting a thread is just a matter of creating a Thread object, passing
it the name of a method that will do the work of the thread and then calling its Start()
method. When the Start() method returns there are two threads of execution. Have a look
at the following example:
using System.Threading;

class ThreadingExample
{
    static void Main(string[] args)
    {
        // Wrap the worker method in a delegate and run it on a second thread.
        ThreadStart work = new ThreadStart(printEvens);
        Thread thread = new Thread(work);
        thread.Start();

        // The main thread carries on here while printEvens runs.
        printOdds();
        System.Console.WriteLine("Done.");
    }

    private static void printEvens()
    {
        for(int i=0; i
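Because the listing above breaks off inside printEvens, the following is a separate, self-contained sketch of the same pattern in C++ rather than a reconstruction of the original C# code. It assumes C++11 and `std::thread`; the names `every_other`, `EvensAndOdds` and `run_two_threads`, and the `limit` parameter, are hypothetical. To keep the result deterministic, each thread collects its numbers into its own vector instead of printing them.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Collects start, start + 2, start + 4, ... while the value is below limit.
std::vector<int> every_other(int start, int limit) {
    std::vector<int> values;
    for (int i = start; i < limit; i += 2)
        values.push_back(i);
    return values;
}

struct EvensAndOdds {
    std::vector<int> evens;
    std::vector<int> odds;
};

EvensAndOdds run_two_threads(int limit) {
    assert(limit >= 0);
    EvensAndOdds out;

    // Second thread: plays the role of printEvens and writes only out.evens.
    std::thread worker([&out, limit] { out.evens = every_other(0, limit); });

    // Calling thread: plays the role of printOdds and writes only out.odds,
    // so the two threads never touch the same data.
    out.odds = every_other(1, limit);

    // A std::thread must be joined (or detached) before it is destroyed.
    worker.join();
    return out;
}
```

Calling `run_two_threads(10)` yields the even numbers 0 through 8 in one vector and the odd numbers 1 through 9 in the other, with the two sequences produced concurrently.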
void multiply_vectors(float a[][COLS], float b[], float result[])
{
    #pragma omp parallel for
    for (int i = 0; i < ROWS; i++)
    {
        result[i] = 0;
        for (int j = 0; j < COLS; j++)
            result[i] += a[i][j] * b[j];
    }
}
In this example, we have some C++ code that multiplies a matrix and a vector. If these objects are very large, then it might make sense to perform some of the matrix multiplication in different threads. The great thing about OpenMP is that we can test this theory with just a single line of code!
The pragma in this sample code tells the compiler to create a set of threads before the for loop begins and to distribute the iterations of the loop evenly among all the threads.
Just like in the API example, we have potential problems when data is shared between the OpenMP threads. In this example, there is no problem because there are no dependencies between the iterations of the loops and every iteration writes to a different part of the result array.
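One way to test the theory is to run the parallel loop next to a plain serial version and compare the results. The sketch below does that under stated assumptions: the values of `ROWS` and `COLS` are invented for illustration (the newsletter does not show how they are defined), and `multiply_vectors_serial` and `check_parallel_matches_serial` are hypothetical helpers. OpenMP usually has to be enabled with a compiler switch (for example `/openmp` in Visual C++ or `-fopenmp` in GCC); without it the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>

// Assumed dimensions, chosen only for this sketch.
constexpr int ROWS = 4;
constexpr int COLS = 3;

// The loop from the listing above: each iteration of the outer loop writes
// only result[i], and j is declared inside the loop, so nothing is shared.
void multiply_vectors(float a[][COLS], float b[], float result[]) {
    #pragma omp parallel for
    for (int i = 0; i < ROWS; i++) {
        result[i] = 0;
        for (int j = 0; j < COLS; j++)
            result[i] += a[i][j] * b[j];
    }
}

// Serial reference version: identical code without the pragma.
void multiply_vectors_serial(float a[][COLS], float b[], float result[]) {
    for (int i = 0; i < ROWS; i++) {
        result[i] = 0;
        for (int j = 0; j < COLS; j++)
            result[i] += a[i][j] * b[j];
    }
}

// Hypothetical check: both versions must produce the same result vector.
void check_parallel_matches_serial(float a[][COLS], float b[]) {
    float parallel[ROWS];
    float serial[ROWS];
    multiply_vectors(a, b, parallel);
    multiply_vectors_serial(a, b, serial);
    for (int i = 0; i < ROWS; i++)
        assert(parallel[i] == serial[i]);
}
```

The two versions agree exactly because every row is computed by a single thread in the same order as in the serial loop; only the assignment of rows to threads changes.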
There are many options in OpenMP to configure some aspects of the parallelism. For example, you can specify things like how many threads are created, how many iterations are given to each thread, and how to share data between the threads so as to avoid conflicts. However, flexibility is currently limited. For example, you would find it difficult to use OpenMP to manage elements of your user interface.
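The three kinds of options mentioned above correspond to standard OpenMP clauses: `num_threads` for the number of threads, `schedule` for how iterations are handed out, and data-sharing clauses such as `reduction` for avoiding conflicts on shared data. The following is a minimal sketch of a dot product that uses all three. The function name `dot`, the thread count of 4 and the chunk size of 64 are arbitrary choices made for illustration, not recommendations.

```cpp
#include <cassert>
#include <vector>

// Dot product of two equally sized vectors using explicit OpenMP options.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    const int n = static_cast<int>(x.size());
    double sum = 0.0;

    // num_threads(4):       request four threads for this loop.
    // schedule(static, 64): hand out iterations in fixed chunks of 64.
    // reduction(+:sum):     give each thread a private partial sum and add
    //                       the partial sums together when the loop ends,
    //                       so the threads never update `sum` concurrently.
    #pragma omp parallel for num_threads(4) schedule(static, 64) reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];

    return sum;
}
```

Without the `reduction` clause, every thread would update the shared variable `sum` at the same time, which is exactly the kind of conflict that arises when data is shared between threads. Note that with floating-point data a reduction may add the terms in a different order than a serial loop, so results can differ in the last digits.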
OpenMP is supported by Visual C++ 2005 from Microsoft. It is also supported by the Intel C++ compiler. Another advantage of OpenMP is that it is portable. It is a standard that is understood by many different compilers on different platforms. Compilers that do not understand OpenMP can ignore the pragmas.
Summary
To take advantage of the multiple physical and logical processors available in 64-bit systems and in some 32-bit systems, you need to understand the concept of threads and you need to implement threads in your own application.
APIs exist for most languages on Windows that allow you to create threads in your own programs. Two common APIs are the one in .NET and the standard Windows API. In addition, the OpenMP standard defines pragmas that can be used to create threads in a more declarative fashion.
In a future newsletter we will look at the issues that surround synchronizing multiple threads and how to identify and resolve those issues.
URLs
What is Multicore? -
http://en.wikipedia.org/wiki/Multicore
What is hyperthreading? -
http://en.wikipedia.org/wiki/Hyper-threading
Reap the Benefits of Multithreading without All the Work -
http://msdn.microsoft.com/msdnmag/issues/05/10/OpenMP/
Multithreading for Rookies -
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndllpro/html/msdn_threads.asp