10/28/2015
OPTIMIZATION CSC342–FALL 2015–Prof. IZIDOR GERTNER
VITO KLAUDIO
OPTIMIZATION VITO KLAUDIO
Table of contents
1. Objective
2. Overview
3. Compiler Generated Index Performance
4. Compiler Generated Pointer Performance
5. Optimized Index Performance
6. Optimized Pointer Performance
7. Analysis
8. Conclusion
9. Appendix
1. Objective
The objective of this project is to show that the assembly code the compiler generates for a
function is not very good in terms of running time. To test the running time I will use a simple
function that clears an array, i.e. sets all of its elements to zero. I will use two versions of the
same program, one that clears the array by indices and one that clears it by pointers, with array
sizes ranging from 10 to 1,000,000. The compiler generated assembly code for these functions will
then be optimized manually. I will measure the running time of the optimized versions and plot the
measurements in a graph to show that the optimized code for clearing an array using pointers is
the fastest.
Keep in mind that I will be using a computer with a multi-core processor, so it is hard to get
accurate time measurements. I will run each test five times and take the average of the five runs
to approximate the true running time of the algorithm.
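The five-run averaging described above can be sketched in portable C++. This is only an
illustration and not the code used in this report: the helper name average_seconds is hypothetical,
and std::chrono::steady_clock stands in for the Windows performance counter that the report uses.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not part of the report's code): calls `work`
// `runs` times and returns the mean wall-clock time of one call, in seconds.
double average_seconds(const std::function<void()>& work, int runs = 5) {
    assert(runs > 0);
    double total = 0.0;
    for (int r = 0; r < runs; ++r) {
        const auto start = std::chrono::steady_clock::now();
        work();  // the code segment being timed
        const auto stop = std::chrono::steady_clock::now();
        total += std::chrono::duration<double>(stop - start).count();
    }
    return total / runs;
}
```

Averaging reduces the influence of a single unusually slow run, but it does not remove it, so one
outlier can still shift the reported average noticeably.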
2. Overview
In order to optimize the machine instructions for my functions I will generate the assembly
code and make some modifications to it. All my work will be done in Microsoft’s Visual Studio
2012. There is one problem in this case: I need to assemble the assembly language code and then
link it to a C++ main file. Visual Studio does not know how to deal with assembly language files by
default, therefore we will need to perform a “Custom Build” operation to achieve our goal. Once
the compilation and linking, called “Build Solution” in Visual Studio, are successful the path is
clear to start timing my functions.
To approximate the running time of my function I will use the “QueryPerformanceCounter”
function, which is described on the official Microsoft web page at the following link:
https://support.microsoft.com/en-us/kb/815668. We use this function because we want the
highest resolution timer available. It gives a close approximation of the real running time of the
function, although the exact running time of a function cannot be measured on a multi-core
processor.
3. Compiler Generated Index Performance
This part of the project deals with the performance of the compiler generated assembly code
of a function that clears an array using indices. I will start a new project in Visual Studio as a
“Console Application” and set it up as an “Empty Project”. There are two files that I am going to
use. The first is the “main” file, which is the main program that calls the array-clearing
function. In a separate file, I will code the “ClearArrayUsingIndex” function. I will start my
project as simply as possible, so we initially consider an array of size 10. The following
picture shows how these functions are set up in my Visual Studio project. The code for all the
functions used in this project is available in the appendix section.
Figure 1 – Main Function Compilation
Figure 2 – ClearArrayUsingIndex Compilation
As we can see from the above screenshots, the two source files are compiled successfully and
separately. We can now build the solution, which means linking the two object files generated by
the compilation. After the linking is completed we can go ahead and generate the assembly code for
the ClearArrayUsingIndex function, but we need to let the compiler know that we want to do that.
I do this by right clicking on the “ClearArrayUsingIndex” file and selecting “Properties”. In the
window that appears, expand the “C/C++” menu and go to the “Output Files” section. The second
entry from the top, “Assembler Output”, is what we are looking for. I change it to
“Assembly-Only Listing”. The following screenshot depicts what the window should look like before
clicking “Apply” and then “OK”:
Figure 3 – Generating Assembly Code from C/C++ Files
After I complete this task, I have to recompile the function. After the compilation is successful
the “.asm” file is generated by the compiler and it can be found in the “debug” folder of my
working directory. I can add this file to my project by right clicking on the “Source Files” folder,
choosing “Add -> Add Existing File” and selecting the file from the already mentioned directory. At
this point we do not need the C/C++ file anymore, so we right click it and remove it by
clicking on “Exclude From Project”. We open the “ClearArrayUsingIndex.asm” file to check its
contents, and it is what we expected: assembly code to clear an array using indices. Now I
need to assemble this file in order to create the object file that will be linked with the main file.
Notice that when we right click the .asm file, the compile feature is disabled. This happens because
the Visual Studio compiler does not know by default how to compile assembly code, hence we
need to “Custom Build” our function. To do this we need to right click on the
“ClearArrayUsingIndex.asm” file and select “Properties”. The following window will appear:
Figure 4 – Custom Building Assembly Files (Part 1)
We can see that we do not have any custom building options here yet. To tell Visual Studio that
we want to custom build our file we need to click on the “Item Type” list, select “Custom Build
Tool” and click “Apply”. The following screenshot shows the effect of this:
Figure 5 - Custom Building Assembly Files (Part 2)
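As we can see, the “Custom Build Tool” entry appears in the left hand side panel, marked with a
red circle in the screenshot. We select it and now we need to change the “Command Line” and
“Outputs” fields. We don’t need to know in detail how the command works; copying and pasting the
values below is enough for the moment. In short, “ml” is the Microsoft assembler, “-c” assembles
without linking, “-Fl” names the listing file and “-Fo” names the object file.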
Command Line:
ml -c "-Fl$(IntDir)%(FileName).lst" "-Fo$(IntDir)%(FileName).obj" "%(FullPath)"
Outputs:
$(IntDir)%(FileName).obj;
After these operations are performed we apply the modification and then assemble the assembly
file. I notice that there are two errors generated by the assembler. The errors point to the lines
marked in red in the following screenshot:
Figure 6 - Custom Building Assembly Files (Part 3)
We can easily solve this problem by deleting these lines, or just commenting them out, and trying
to compile again. This time the compilation is successful and we can proceed with the linking of
the two files by clicking on “Build -> Rebuild Solution”. After the linking is successful we need
to time our program. We do this by using the “QueryPerformanceCounter” function. The main
function needs to be modified for this. We include two header files, “tchar.h” and
“windows.h”. Furthermore, I use the “System” namespace, which provides the console output
functions used to print the timing results. Another modification lets the main program fill an
array of arbitrary size: we take the “size” variable out of the main scope, declare it as “const”
and use a “for” loop to fill the array, then use the performance counter to time the function. The
following screenshot shows the new main function:
Figure 7 – Query Performance Counter Function
We can now run this program to check its timing by clicking on “Debug” and then “Start
Without Debugging”. The following screenshot shows a sample output of this program:
Figure 8 – Results from QueryPerformanceCounter
The value produced is a sample value for clearing an array of arbitrary size.
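The elapsed time printed by the program is derived from two counter readings and the counter
frequency, as in the appendix expression (ctr2 - ctr1) * 1.0 / freq. A minimal sketch of that
conversion follows. The helper name ticks_to_seconds is hypothetical, and plain integers stand in
for the values returned by QueryPerformanceCounter and QueryPerformanceFrequency.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts two raw counter readings into elapsed
// seconds, given the counter frequency in ticks per second.
double ticks_to_seconds(std::int64_t start, std::int64_t stop,
                        std::int64_t frequency) {
    assert(frequency > 0);
    return static_cast<double>(stop - start) / static_cast<double>(frequency);
}
```

The frequency also fixes the resolution of the measurement: one tick corresponds to 1/frequency
seconds, which is the “minimum resolution” line that the program prints.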
I chose to repeat my experiment five times for each size from 10 to 1,000,000 and take the
average of the five runs in order to approximate the real running time as closely as possible. The
following tables show the results:
Table 1 - ClearArrayUsingIndex( Size = 10)
Runs Running Time (seconds) Average (seconds)
1 0.000225839938509944 0.0002707940626337985
2 0.000248081750635923
3 0.000241238116135622
4 0.000334910363358497
5 0.000272889925699516
Table 2 - ClearArrayUsingIndex( Size = 100)
Runs Running Time (seconds) Average (seconds)
1 0.000310102188294904 0.0002873043308657753
2 0.000261341292480257
3 0.000281872195981161
4 0.000337048999139841
5 0.00023824402604174
Table 3 - ClearArrayUsingIndex( Size = 1,000)
Runs Running Time (seconds) Average (seconds)
1 0.000281872195981161 0.0002945756925223456
2 0.000291282193419076
3 0.00030069219085699
4 0.000245943114854579
5 0.00024551538769831
Table 4 - ClearArrayUsingIndex( Size = 10,000)
Runs Running Time (seconds) Average (seconds)
1 0.000468788963270641 0.0004392330167724649
2 0.000416178523049575
3 0.00121688375958483
4 0.000305824916732216
5 0.000366134445766121
Table 5 - ClearArrayUsingIndex( Size = 100,000)
Runs Running Time (seconds) Average (seconds)
1 0.0014354523364382 0.0010076396347381172
2 0.00117710513405183
3 0.000880262487601259
4 0.000878551578976184
5 0.00114716423311301
Table 6 - ClearArrayUsingIndex( Size = 1,000,000)
Runs Running Time (seconds) Average (seconds)
1 0.00593257565744872 0.007948368199512477
2 0.00882230032520096
3 0.00683593541148849
4 0.00689538948620986
5 0.00813152096782679
We can visualize these results better by plotting them into a graph.
Graph 1 – Compiler Generated Time of ClearArrayUsingIndex
[Line chart: “Compiler Generated Running Time of ClearArrayUsingIndex”. X-axis: Size (10 to 1,000,000). Y-axis: Time (seconds). Series: Index.]
4. Compiler Generated Pointer Performance
I continue my study of compiler generated assembly language code for the same function, but
with a different approach: using pointers. I create a new project in Visual Studio and add two
separate files, the same as when using indices to clear the array. One of the files contains the
main function, which calls the function in the second file, “ClearUsingPointers.cpp”. The
following screenshots show the modified function:
Figure 9 – ClearUsingPointers Function
Figure 10 – Main Function for ClearUsingPointers
We can see that the method of clearing the array has changed. We are now using pointers to
clear our array. We expect the pointer approach to be more efficient than the index approach. I
come to this expectation because with indices the address of each element has to be worked out
from the base address of the array and the index on every iteration, which means going back and
forth to memory, while with pointers the pointer already holds the address of the element to be
cleared. We already know that register operations are much faster than memory operations.
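The two approaches compared in this report differ only in how the address of each element is
produced. The sketch below puts them side by side. The names clear_by_index and clear_by_pointer
are hypothetical and are not the names used in the project files; the actual sources are listed in
the appendix.

```cpp
#include <cassert>
#include <cstddef>

// Index form: each store addresses arr[i], i.e. base address + i * sizeof(int).
void clear_by_index(int arr[], std::size_t size) {
    assert(arr != nullptr || size == 0);
    for (std::size_t i = 0; i < size; ++i)
        arr[i] = 0;
}

// Pointer form: p itself holds the address of the element being cleared
// and is advanced by one element per iteration.
void clear_by_pointer(int* arr, std::size_t size) {
    assert(arr != nullptr || size == 0);
    for (int* p = arr; p < arr + size; ++p)
        *p = 0;
}
```

Both functions leave the array in the same state, so any difference between them is purely a
matter of the instructions generated for the loop.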
We continue our study by generating the assembly code for the ClearUsingPointers.cpp
function and assembling it, following the same steps that we used for clearing the array using
indices. Again, there will be some errors during compilation, which we solve by simply commenting
out the lines that produce the errors. Once we have the compiled file we link it to the main
function by using the “Build Solution” option of Visual Studio. It is time to test our function by
running several tests on different sizes of the array, which again vary over the same range as
before, 10 to 1,000,000 elements. The following tables show the results:
Table 7 - ClearUsingPointers (Size = 10)
Runs Running Time (seconds) Average (seconds)
1 0.000204025853540234 0.0002282779833006765
2 0.000254925385136225
3 0.000226267665666213
4 0.000225839938509944
5 0.000227550847135019
Table 8 - ClearUsingPointers (Size = 100)
Runs Running Time (seconds) Average (seconds)
1 0.000289143557637731 0.0002439755699357425
2 0.000227978574291288
3 0.000260913565323988
4 0.000228406301447557
5 0.000223273575572331
Table 9 - ClearUsingPointers (Size = 1,000)
Runs Running Time (seconds) Average (seconds)
1 0.000322934002982969 0.0002602719745895851
2 0.00024936493210473
3 0.000236105390260396
4 0.000233539027322783
5 0.000268612654136827
Table 10 - ClearUsingPointers (Size = 10,000)
Runs Running Time (seconds) Average (seconds)
1 0.000304969462419678 0.0003327717275771524
2 0.000308391279669829
3 0.000305397189575947
4 0.000304541735263409
5 0.000316945822795206
Table 11 - ClearUsingPointers (Size = 100,000)
Runs Running Time (seconds) Average (seconds)
1 0.000998742909887726 0.0009799656877275237
2 0.000775041607159126
3 0.00101071927026325
4 0.00104750380570237
5 0.000888817030726636
Table 12 - ClearUsingPointers (Size = 1,000,000)
Runs Running Time (seconds) Average (seconds)
1 0.00762680292342957 0.007945716291143611
2 0.00766786473043138
3 0.00767513609208795
4 0.00847626905577947
5 0.00890314075773577
The averages in these tables show that the pointer based clearing of the array is slightly
faster than the index method. But this is just the compiler generated code. Let’s take a look at
the graph we get by plotting these values:
Graph 2 – Compiler Generated Time of ClearUsingPointers
If we compare this graph to Graph 1, we can see that the curve for the method using pointers
grows slightly more slowly, meaning that the running time is slightly better than the one achieved
by the index method. We are not satisfied with these results. We need to optimize the assembly
code manually.
[Line chart: “Compiler Generated Running Time of ClearUsingPointers”. X-axis: Size (10 to 1,000,000). Y-axis: Time (seconds). Series: Pointer.]
5. Optimized Index Performance
In order to optimize our methods of clearing an array of arbitrary numbers we will take the
assembly code and see what we can remove without ruining the algorithm or, more importantly,
which instructions we can substitute. This means that we will take some instructions, two or three
at a time, and substitute them with just one instruction, which should reduce the running time
considerably. I will focus on the instructions that deal with memory. I will try to reduce as much
as possible the number of instructions that make the processor go back and forth to memory
throughout the program. The following screenshot shows the results after the optimization:
Figure 11 – Optimized ClearArrayUsingIndex
Here, the red squares mark the instructions that were substituted, while the green squares
contain the new instructions, which we expect to run faster than before. We optimized the function
by substituting the instructions that work with memory: the loop counter, the array size and the
array address are now kept in registers instead of being reloaded from the stack on every
iteration. We run the program and test it for different sizes of the array. The tables below show
the results:
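The effect of keeping the loop counter in a register can be imitated in C++. The sketch below is
an analogy and not the code that was measured: the function names are hypothetical, and the
volatile qualifier is used only to force the counter to be read from and written to memory on
every use, much like the stack slot of i in the unoptimized listing. Whether the second version
really keeps its variables in registers depends on the compiler and its settings.

```cpp
#include <cassert>

// Imitates the unoptimized listing: `volatile` makes every use of i a
// real memory access, so the counter is reloaded and stored each iteration.
void clear_index_memory_counter(int arr[], int size) {
    assert(size >= 0);
    volatile int i = 0;
    for (i = 0; i < size; i = i + 1)
        arr[i] = 0;
}

// Imitates the hand-optimized listing: the counter, the limit and the
// base address are plain locals that the compiler may keep in registers.
void clear_index_register_counter(int arr[], int size) {
    assert(size >= 0);
    int* const base = arr;   // array address, loaded once
    const int limit = size;  // array size, loaded once
    for (int i = 0; i < limit; ++i)
        base[i] = 0;
}
```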
Table 13 - ClearUsingIndexOptimized( Size = 10)
Runs Running Time (seconds) Average (seconds)
1 0.000236533117416665 0.000188028857895779
2 0.000112919969254972
3 0.000112064514942434
4 0.000242949024760697
5 0.000235677663104127
Table 14 - ClearUsingIndexOptimized( Size = 100)
Runs Running Time (seconds) Average (seconds)
1 0.000214291305290686 0.0002800757419248322
2 0.000237816298885471
3 0.000239954934666816
4 0.00034474808795268
5 0.000363568082828508
Table 15 - ClearUsingIndexOptimized( Size = 1,000)
Runs Running Time (seconds) Average (seconds)
1 0.000254925385136225 0.0002647631097304078
2 0.000291709920575344
3 0.0002566362937613
4 0.000266901745511752
5 0.000253642203667418
Table 16 - ClearUsingIndexOptimized( Size = 10,000)
Runs Running Time (seconds) Average (seconds)
1 0.000215146759603223 0.000322335184964193
2 0.000459806692988996
3 0.000447830332613469
4 0.000236960844572934
5 0.000251931295042343
Table 17 - ClearUsingIndexOptimized( Size = 100,000)
Runs Running Time (seconds) Average (seconds)
1 0.000767770245502556 0.0005071988619035828
2 0.000427299429112565
3 0.000490175321084083
4 0.000426016247643758
5 0.000424733066174952
Table 18 - ClearUsingIndexOptimized( Size = 1,000,000)
Runs Running Time (seconds) Average (seconds)
1 0.00338631589618035 0.002520339495598474
2 0.00216387168356403
3 0.00249279386673476
4 0.00234651117929082
5 0.00221220485222241
We can already see the difference. For arrays of 10,000 elements or more, the optimized index
method works faster than the pointer method without optimization. Let’s make it clear by plotting
it on a graph:
Graph 3 – Optimized Running Time of ClearArrayUsingIndex
Let’s continue and see how optimization affects pointer based array clearing.
[Line chart: “Optimized Running Time of ClearArrayUsingIndex”. X-axis: Size (10 to 1,000,000). Y-axis: Time (seconds). Series: Optimized Index.]
6. Optimized Pointer Performance
We follow the same procedure as before for optimizing the pointer method. The following
screenshot shows the optimized assembly code for the ClearUsingPointers method:
We removed the redundant instructions and substituted them with fewer instructions: the pointer
and the end address of the array are kept in registers, so the loop no longer reloads them from
the stack. Now it is time to test our optimization. The following tables show the results from
testing the optimized pointer method for different sizes of the array:
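In C++ terms, the optimized pointer loop computes the end address of the array once, before the
loop, instead of recomputing it on every comparison. The sketch below illustrates that idea under
a hypothetical function name; it is an analogy to the optimized assembly listing in the appendix,
not a translation of it.

```cpp
#include <cassert>

// Hypothetical C++ counterpart of the optimized pointer loop: the end
// address is computed once and then only compared against.
void clear_pointer_hoisted_end(int* arr, int size) {
    assert(size >= 0);
    int* const end = arr + size;  // one address computation before the loop
    for (int* p = arr; p < end; ++p)
        *p = 0;
}
```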
Table 19 - ClearUsingPointerOptimized( Size =10)
Runs Running Time (seconds) Average (seconds)
1 0.000260913565323988 0.000254839839704971
2 0.000233966754479052
3 0.000229689482916364
4 0.00023482220879159
5 0.000314807187013861
Table 20 - ClearUsingPointerOptimized( Size =100)
Runs Running Time (seconds) Average (seconds)
1 0.000224129029884869 0.0002680993815493048
2 0.000352874903921788
3 0.000277167197262204
4 0.000227550847135019
5 0.000258774929542644
Table 21 - ClearUsingPointerOptimized( Size =1,000)
Runs Running Time (seconds) Average (seconds)
1 0.000227550847135019 0.0002276363925662732
2 0.000226695392822482
3 0.000226695392822482
4 0.000228406301447557
5 0.000228834028603826
Table 22 - ClearUsingPointerOptimized( Size =10,000)
Runs Running Time (seconds) Average (seconds)
1 0.000242949024760697 0.0002857217403875806
2 0.000246370842010848
3 0.00024551538769831
4 0.000266901745511752
5 0.000426871701956296
Table 23 - ClearUsingPointerOptimized( Size =100,000)
Runs Running Time (seconds) Average (seconds)
1 0.000417461704518381 0.0004474026054571994
2 0.00041788943167465
3 0.000427727156268833
4 0.000361429447047164
5 0.000612505287776969
Table 24 - ClearUsingPointerOptimized( Size =1,000,000)
Runs Running Time (seconds) Average (seconds)
1 0.00196583401021156 0.00240801834436228
2 0.00247140750892132
3 0.00245258751404549
4 0.00217199849953314
5 0.00297826418909989
We already have the feeling that the optimized pointer method will be the fastest of all of
them. Let’s see the graph that we obtain from these averages:
Graph 4 – Optimized Running Time of ClearUsingPointers
We can see the difference now. For arrays of 1,000 elements or more, the optimized version of
the pointer method runs faster than any of the previous methods.
[Line chart: “Optimized Running Time of ClearUsingPointers”. X-axis: Size (10 to 1,000,000). Y-axis: Time (seconds). Series: Optimized Pointer.]
7. Analysis
In this section we analyze the running time of all our examples presented in this laboratory.
We combine the results from the timing of the functions and create a new table for them:
Table 25 – Optimization Summary (average running time in seconds)
Size        Index      Pointers   Index Optimized   Pointer Optimized
10          0.000271   0.000228   0.000188          0.000255
100         0.000287   0.000244   0.000280          0.000268
1,000       0.000295   0.000260   0.000265          0.000228
10,000      0.000439   0.000333   0.000322          0.000286
100,000     0.001008   0.000980   0.000507          0.000447
1,000,000   0.007948   0.007946   0.002520          0.002408
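One way to read the summary table is as speedup ratios, i.e. the running time of a compiler
generated version divided by the running time of its optimized counterpart. The helper below is a
hypothetical illustration and is not part of the project code.

```cpp
#include <cassert>

// Hypothetical helper: how many times faster the optimized routine is
// than its baseline, given both running times in seconds.
double speedup(double baseline_seconds, double optimized_seconds) {
    assert(optimized_seconds > 0.0);
    return baseline_seconds / optimized_seconds;
}
```

For example, with the index averages for 1,000,000 elements, speedup(0.007948, 0.002520) is
roughly 3.15.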
The results of this lab are summarized by the following graph:
Graph 5 – Optimization Results
[Line chart: “Optimization Results”. X-axis: Size (10 to 1,000,000). Y-axis: Time (seconds). Series: Optimized Pointer, Optimized Index, Pointer, Index.]
8. Conclusion
In this laboratory we considered two methods for clearing an array. The first method uses
indices to clear the array while the second method uses pointers to do the same thing. We tested
the running time of these methods. Afterwards we reduced the running time by substituting
instructions in the assembly code generated by the compiler, and then measured the running time
again.
The conclusion of this laboratory is that the optimized version of clearing an
array by pointers is the fastest way to complete the task.
9. Appendix
CLEAR USING INDICES
Main.cpp
#include <tchar.h>
#include <windows.h>

void ClearUsingIndex(int [], int);

using namespace System;

const int n = 100000000;
static int arr[n];

int main() {
    // Fill the array with non-zero values before clearing it.
    for (int i = 0; i < n; i++)
        arr[i] = i+1;

    __int64 ctr1 = 0, ctr2 = 0, freq = 0;
    int acc = 0, i = 0;

    // Start timing the code.
    if (QueryPerformanceCounter((LARGE_INTEGER *)&ctr1)!= 0)
    {
        // Code segment is being timed.
        ClearUsingIndex(arr, n);

        // Finish timing the code.
        QueryPerformanceCounter((LARGE_INTEGER *)&ctr2);

        Console::WriteLine("Start Value: {0}",ctr1.ToString());
        Console::WriteLine("End Value: {0}",ctr2.ToString());

        QueryPerformanceFrequency((LARGE_INTEGER *)&freq);
        // Versions before Visual Studio 2005 used the S"..." string prefix here.
        Console::WriteLine("QueryPerformanceCounter minimum resolution: 1/{0} Seconds.",
            freq.ToString());
        // Elapsed seconds = counter difference / counter frequency.
        Console::WriteLine("100 Increment time: {0} seconds.",
            ((ctr2 - ctr1) * 1.0 / freq).ToString());
    }
    else
    {
        DWORD dwError = GetLastError();
        Console::WriteLine("Error value = {0}",dwError.ToString());
    }

    // Make the console window wait.
    Console::WriteLine();
    Console::Write("Press ENTER to finish.");
    Console::Read();
    return 0;
}
ClearArrayUsingIndex.cpp
void ClearUsingIndex(int arr[], int size)
{
    int i;
    for (i = 0; i < size; i++)
        arr[i] = 0;
}
ClearArrayUsingIndex.asm
; Listing generated by Microsoft (R) Optimizing Compiler Version 17.00.50727.1
TITLE C:\Users\Klaudio\Desktop\CSC342-343 CLASS\csc342Project - optimization\CSC342_Project\CSC342_Project\ClearArrayUsingIndex.cpp
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB MSVCRTD
INCLUDELIB OLDNAMES
PUBLIC ?ClearArrayUsingIndex@@YAXQAHH@Z ; ClearArrayUsingIndex
EXTRN __RTC_InitBase:PROC
EXTRN __RTC_Shutdown:PROC
; COMDAT rtc$TMZ
rtc$TMZ SEGMENT
__RTC_Shutdown.rtc$TMZ DD FLAT:__RTC_Shutdown
rtc$TMZ ENDS
; COMDAT rtc$IMZ
rtc$IMZ SEGMENT
__RTC_InitBase.rtc$IMZ DD FLAT:__RTC_InitBase
rtc$IMZ ENDS
; Function compile flags: /Odtp /RTCsu /ZI
; COMDAT ?ClearArrayUsingIndex@@YAXQAHH@Z
_TEXT SEGMENT
_i$ = -8 ; size = 4
_arr$ = 8 ; size = 4
_size$ = 12 ; size = 4
?ClearArrayUsingIndex@@YAXQAHH@Z PROC ; ClearArrayUsingIndex, COMDAT
; File c:\users\klaudio\desktop\csc342-343 class\csc342project - optimization\csc342_project\csc342_project\cleararrayusingindex.cpp
; Line 2
push ebp
mov ebp, esp
sub esp, 204 ; 000000ccH
push ebx
push esi
push edi
lea edi, DWORD PTR [ebp-204]
mov ecx, 51 ; 00000033H
mov eax, -858993460 ; ccccccccH
rep stosd
; Line 3
mov DWORD PTR _i$[ebp], 0
; Line 4
mov DWORD PTR _i$[ebp], 0
jmp SHORT $LN3@ClearArray
$LN2@ClearArray:
mov eax, DWORD PTR _i$[ebp]
add eax, 1
mov DWORD PTR _i$[ebp], eax
$LN3@ClearArray:
mov eax, DWORD PTR _i$[ebp]
cmp eax, DWORD PTR _size$[ebp]
jge SHORT $LN4@ClearArray
; Line 5
mov eax, DWORD PTR _i$[ebp]
mov ecx, DWORD PTR _arr$[ebp]
mov DWORD PTR [ecx+eax*4], 0
jmp SHORT $LN2@ClearArray
$LN4@ClearArray:
; Line 6
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret 0
?ClearArrayUsingIndex@@YAXQAHH@Z ENDP ; ClearArrayUsingIndex
_TEXT ENDS
END
ClearArrayUsingIndexOptimized.asm
; Listing generated by Microsoft (R) Optimizing Compiler Version 15.00.21022.08
TITLE c:\Users\.....\ClearArrayIndexOptimized.cpp
.686P
.XMM
include listing.inc
.model flat
;
; OPTIMIZED!!!!
; Custom Build Step, including a listing file placed in intermediate directory
; but without Source Browser information
; debug:
; ml -c -Zi "-Fl$(IntDir)\$(InputName).lst" "-Fo$(IntDir)\$(InputName).obj" "$(InputPath)"
; release:
; ml -c "-Fl$(IntDir)\$(InputName).lst" "-Fo$(IntDir)\$(InputName).obj" "$(InputPath)"
; outputs:
; $(IntDir)\$(InputName).obj
PUBLIC ?ClearUsingIndexOptimized@@YAXQAHH@Z ; ClearUsingIndexOptimized
_TEXT SEGMENT
_i$ = -8 ; size = 4
_Array$ = 8 ; size = 4
_size$ = 12 ; size = 4
;
?ClearUsingIndexOptimized@@YAXQAHH@Z PROC ; ClearUsingIndexOptimized, COMDAT
; Line 3
push ebp
mov ebp, esp
sub esp, 204 ; 000000ccH
push ebx
push esi
push edi
lea edi, DWORD PTR [ebp-204]
mov ecx, 51 ; 00000033H
mov eax, -858993460 ; ccccccccH
rep stosd
; Line 5
; mov DWORD PTR _i$[ebp], 0 ; i =0 on stack
mov eax, 0 ; initialize i in EAX to 0
mov edx, DWORD PTR _size$[ebp] ; store ARRAY size in EDX
mov ecx, DWORD PTR _Array$[ebp] ; move address of the ARRAY from stack to ecx
jmp SHORT $LN3@ClearUsing
$LN2@ClearUsing:
; mov eax, DWORD PTR _i$[ebp] ; move again i from stack to eax
add eax, 1 ; increment i in EAX
; mov DWORD PTR _i$[ebp], eax ; move eax onto stack
$LN3@ClearUsing:
; mov eax, DWORD PTR _i$[ebp] ; move i from stack to eax
; cmp eax, DWORD PTR _size$[ebp] ; compare i in eax with ARRAY size on stack
cmp eax, edx ; compare i in eax with ARRAY size in EDX
jge SHORT $LN4@ClearUsing ; if done exit
; Line 6
; mov eax, DWORD PTR _i$[ebp] ; move again i into eax
; mov ecx, DWORD PTR _Array$[ebp] ; move address of the ARRAY from stack to ecx
mov DWORD PTR [ecx+eax*4], 0 ; compute the effective address and move zero to the address
jmp SHORT $LN2@ClearUsing ; jump to the beginning of the LOOP
$LN4@ClearUsing:
; Line 7
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret 0
?ClearUsingIndexOptimized@@YAXQAHH@Z ENDP ; ClearUsingIndexOptimized
_TEXT ENDS
END
CLEAR USING POINTERS
Main.cpp
#include <tchar.h>
#include <windows.h>

void ClearUsingPointers(int*, int);

using namespace System;

const int n = 100000000;
static int arr[n];

int main() {
    // Fill the array with non-zero values before clearing it.
    for (int i = 0; i < n; i++)
        arr[i] = i+1;

    __int64 ctr1 = 0, ctr2 = 0, freq = 0;
    int acc = 0, i = 0;

    // Start timing the code.
    if (QueryPerformanceCounter((LARGE_INTEGER *)&ctr1)!= 0)
    {
        // Code segment is being timed.
        ClearUsingPointers(arr, n);

        // Finish timing the code.
        QueryPerformanceCounter((LARGE_INTEGER *)&ctr2);

        Console::WriteLine("Start Value: {0}",ctr1.ToString());
        Console::WriteLine("End Value: {0}",ctr2.ToString());

        QueryPerformanceFrequency((LARGE_INTEGER *)&freq);
        // Versions before Visual Studio 2005 used the S"..." string prefix here.
        Console::WriteLine("QueryPerformanceCounter minimum resolution: 1/{0} Seconds.",
            freq.ToString());
        // Elapsed seconds = counter difference / counter frequency.
        Console::WriteLine("100 Increment time: {0} seconds.",
            ((ctr2 - ctr1) * 1.0 / freq).ToString());
    }
    else
    {
        DWORD dwError = GetLastError();
        Console::WriteLine("Error value = {0}",dwError.ToString());
    }

    // Make the console window wait.
    Console::WriteLine();
    Console::Write("Press ENTER to finish.");
    Console::Read();
    return 0;
}
ClearUsingPointers.cpp
void ClearUsingPointers(int* arr, int size) {
    int *p;
    for (p = &arr[0]; p < &arr[size]; p = p+1)
        *p = 0;
}
ClearUsingPointers.asm
; Listing generated by Microsoft (R) Optimizing Compiler Version 17.00.50727.1
TITLE C:\Users\Klaudio\Desktop\CSC342-343 CLASS\10-28-2015\10-28-2015\ClearUsingPointers.cpp
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB MSVCRTD
INCLUDELIB OLDNAMES
PUBLIC ?ClearUsingPointers@@YAXPAHH@Z ; ClearUsingPointers
EXTRN __RTC_InitBase:PROC
EXTRN __RTC_Shutdown:PROC
; COMDAT rtc$TMZ
rtc$TMZ SEGMENT
;__RTC_Shutdown.rtc$TMZ DD FLAT:__RTC_Shutdown
rtc$TMZ ENDS
; COMDAT rtc$IMZ
rtc$IMZ SEGMENT
;__RTC_InitBase.rtc$IMZ DD FLAT:__RTC_InitBase
rtc$IMZ ENDS
; Function compile flags: /Odtp /RTCsu /ZI
; COMDAT ?ClearUsingPointers@@YAXPAHH@Z
_TEXT SEGMENT
_p$ = -8 ; size = 4
_arr$ = 8 ; size = 4
_size$ = 12 ; size = 4
?ClearUsingPointers@@YAXPAHH@Z PROC ; ClearUsingPointers, COMDAT
; File c:\users\klaudio\desktop\csc342-343 class\10-28-2015\10-28-2015\clearusingpointers.cpp
; Line 1
push ebp
mov ebp, esp
sub esp, 204 ; 000000ccH
push ebx
push esi
push edi
lea edi, DWORD PTR [ebp-204]
mov ecx, 51 ; 00000033H
mov eax, -858993460 ; ccccccccH
rep stosd
; Line 3
mov eax, 4
imul eax, 0
add eax, DWORD PTR _arr$[ebp]
mov DWORD PTR _p$[ebp], eax
jmp SHORT $LN3@ClearUsing
$LN2@ClearUsing:
mov eax, DWORD PTR _p$[ebp]
add eax, 4
mov DWORD PTR _p$[ebp], eax
$LN3@ClearUsing:
mov eax, DWORD PTR _size$[ebp]
mov ecx, DWORD PTR _arr$[ebp]
lea edx, DWORD PTR [ecx+eax*4]
cmp DWORD PTR _p$[ebp], edx
jae SHORT $LN4@ClearUsing
; Line 4
mov eax, DWORD PTR _p$[ebp]
mov DWORD PTR [eax], 0
jmp SHORT $LN2@ClearUsing
$LN4@ClearUsing:
; Line 5
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret 0
?ClearUsingPointers@@YAXPAHH@Z ENDP ; ClearUsingPointers
_TEXT ENDS
END
ClearUsingPointersOptimized.asm
; Listing generated by Microsoft (R) Optimizing Compiler Version 17.00.60610.1
.686P
.XMM
include listing.inc
.model flat
INCLUDELIB MSVCRTD
INCLUDELIB OLDNAMES
PUBLIC ?ClearArrayPointerOptimized@@YAXPAHH@Z ; clear_array_pointer
EXTRN __RTC_InitBase:PROC
EXTRN __RTC_Shutdown:PROC
; COMDAT rtc$TMZ
rtc$TMZ SEGMENT
; __RTC_Shutdown.rtc$TMZ DD FLAT:__RTC_Shutdown
rtc$TMZ ENDS
; COMDAT rtc$IMZ
rtc$IMZ SEGMENT
; __RTC_InitBase.rtc$IMZ DD FLAT:__RTC_InitBase
rtc$IMZ ENDS
; Function compile flags: /Odtp /RTCsu /ZI
; COMDAT ?clear_array_pointer@@YAXPAHH@Z
_TEXT SEGMENT
_p$ = -8 ; size = 4
_ary$ = 8 ; size = 4
_size$ = 12 ; size = 4
?ClearArrayPointerOptimized@@YAXPAHH@Z PROC ; clear_array_pointer, COMDAT
; 2 : {
push ebp
mov ebp, esp
sub esp, 204 ; 000000ccH
push ebx
push esi
push edi
lea edi, DWORD PTR [ebp-204]
mov ecx, 51 ; 00000033H
mov eax, -858993460 ; ccccccccH
rep stosd
; 3 : int *p;
; 4 : for(p = &ary[0]; p<&ary[size]; p= p+1)
mov eax, DWORD PTR _ary$[ebp]
mov DWORD PTR _p$[ebp], eax
mov ebx, DWORD PTR _size$[ebp]
lea edx, DWORD PTR [eax+ebx*4]
jmp SHORT $LN3@clear_arra
$LN2@ClearArrayPointerOptimized:
add eax, 4
$LN3@clear_arra:
cmp eax, edx
jae SHORT $LN4@ClearArrayPointerOptimized
; 5 : *p = 0;
mov DWORD PTR [eax], 0
jmp SHORT $LN2@ClearArrayPointerOptimized
$LN4@ClearArrayPointerOptimized:
; 6 : }
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret 0
?ClearArrayPointerOptimized@@YAXPAHH@Z ENDP ; clear_array_pointer
_TEXT ENDS
END