rt0950 eliminatingrubygilthroughhtm slides ja...microsoft powerpoint -...
© 2013 IBM Corporation
���������� ���������Ruby���������� ��!"��#$
����������� ������,
Jose G. Castanos, IBM Research – Watson Research Center
IBM Research – Tokyo
%�� �&'������ ��!"�
� ��������RubyPython�� ��� �������
� ����������� !"#$%& '()*
– JIT������Rubinius�ytljit�PyPy�Fiorano�
– HPC Ruby
� +�,������-./0�12��13��45678
–�� 1��������������������
☺�������������� !�"#�$�#%&'
� ()*�+,�-�.��/�01%2"
���������� �����(HTM)()*+,*�./3��!TM49:;<�=>?.:&��� �@�
� AB�CDETM56FGH2I2J��!K2�?LM
• Blue Gene/Q, 2012
• zEC12, 2012
• Rock Processor (Sun Microsystems), cancelled
• Transactional Synchronization eXtensions (Intel), 2013
-./�01
� GIL!HTM?STUVWX LE�� ��-@
–34�������"�56Global Interpreter Lock (GIL)
� Y VZ4[�\]^_`ab-@
zEC12atomic { }
• Eliminate the Ruby GIL using the zEC12 HTM and evaluate it with the Ruby NAS Parallel Benchmarks
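Related work
• Eliminating Python's GIL with HTM
– Micro-benchmark evaluation on a non-cycle-accurate simulator [Riley+, 2006]
– Micro-benchmark evaluation on a cycle-accurate simulator [Blundell+, 2010]
– Micro-benchmark evaluation on Rock's HTM [Tabba, 2010]
• Eliminating the GILs of Ruby and Python with fine-grained locking
– JRuby, IronRuby, Jython, IronPython, etc.
– Large implementation effort
– Compatibility problems in the multithreaded behavior of class libraries
• This work: eliminate Ruby's GIL with HTM and evaluate it with non-micro-benchmarks; small implementation effort and no class-library compatibility problem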
-./�67
�g�
� �h�i��jk�lm�4nGd
� Ruby GIL4nGd
� HTM45�GIL ST
�Lopq
�pr
���� �����
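• At programming time
– Enclose critical sections with transaction begin and end instructions
• At execution time
– Memory operations inside a transaction appear to other threads as a single step
– Multiple transactions run in parallel as long as their memory operations do not conflict, which gives higher parallelism than locks
Lock-based:    lock(); a->count++; unlock();
Transactional: tbegin(); a->count++; tend();
Thread X                          Thread Y
tbegin(); a->count++; tend();     tbegin(); a->count++; tend();     (same data: conflict)
tbegin(); a->count++; tend();     tbegin(); b->count++; tend();     (different data: no conflict)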
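HTM in zEC12
• Instruction set
– TBEGIN: begin a transaction
– TEND: end a transaction
– Others: TABORT, NTSTG, etc.
• Implementation
– Read set held in the L1 and L2 caches (~2MB)
– Write set held in the L1 cache and store buffer (8KB)
– Conflicts detected through the cache coherence protocol
• Execution rolls back to just after the TBEGIN instruction in the following cases
– Read-set or write-set conflict
– Read-set or write-set overflow
– Restricted instructions
– Others, such as interrupts
TBEGIN
if (cc != 0) goto abort handler
...
...
TEND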
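Ruby multithreaded programming and the GIL (based on 1.9.3-p194)
• Ruby language
– Programs use the Thread, Mutex, and ConditionVariable classes
• Ruby virtual machine
– Ruby threads are mapped to native threads (Pthread)
– Because of the GIL, only one thread can run the interpreter at any time
• The GIL is acquired when a thread starts and released when it finishes
• The GIL is released around blocking operations such as I/O
• The GIL is also yielded at predefined yield points
– Loop back-edges and exits from methods and blocks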
Ruby�GIL>?�@AB
�����?�4GIL!��$���H2I2J��� 10+2����!cGd250s��41���
$�%=#5@ �N?� \(=#5@
250�
��
�1�
��
GIL���r<O�Y5Z
r<O�Y5Z
GIL���
if (flag is set) {
  gil_release();
  sched_yield();
  gil_acquire();
}
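The flag check above can be sketched in C as follows. This is a minimal illustration, not the Ruby VM's code: the GIL is modeled as a plain pthread mutex, and `timer_tick`, `yield_point`, and `yields_performed` are hypothetical names. Having the running thread clear the flag is also an assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static pthread_mutex_t gil = PTHREAD_MUTEX_INITIALIZER; /* stand-in for the GIL */
static atomic_int yield_requested;  /* set periodically by a timer thread */
static int yields_performed;        /* only touched while holding the GIL */

/* Timer thread side: ask the running thread to give up the GIL. */
static void timer_tick(void)
{
    atomic_store(&yield_requested, 1);
}

/* Called by the GIL holder at each yield point. */
static void yield_point(void)
{
    /* Read and clear the flag in one step so one tick causes one yield. */
    if (atomic_exchange(&yield_requested, 0)) {
        pthread_mutex_unlock(&gil);   /* gil_release() */
        sched_yield();                /* give other threads a chance to run */
        pthread_mutex_lock(&gil);     /* gil_acquire() */
        yields_performed++;
    }
}
```

A pthread mutex gives no fairness guarantee, so `sched_yield()` only makes a hand-off likely. A waiting thread may still lose the race to reacquire the lock.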
z����?Bh�!��
8�m�=�C"?(O���m �w¡cLM¢£
-./�67
�g�
� �h�i��jk�lm�4nGd
� Ruby GIL4nGd
� HTM45�GIL ST
–qr+)s�tb
– ;��<59:�u-vw�xy
–WX-z{
�Lopq
�pr
HTM+CDGIL#$�EF���-�h�i��j��UdLu
–=>?@A|}KGIL-~�?��?��|}��qr��
� �h�i��j�����E�2�UV�GIL!|}
Z<([� \(]^
Z<([� \(_`
xy ��ZKw+¤8A
¥¦���Z �EGIL���
GIL���GIL��§¨
���� �GH
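• Persistent aborts
– Overflow
– Restricted instructions
• All other aborts are transient
– Read-set and write-set conflicts, etc.
• The abort reason is reported by the CPU
– Written to a specified memory area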
if (TBEGIN()) {
  /* transaction */
  if (GIL.acquired) TABORT();
} else {
  /* abort handling */
  if (GIL.acquired) {
    if (retried 16 times) acquire GIL;
    else wait for GIL release and retry;
  } else if (persistent abort) {
    acquire GIL;
  } else { /* transient abort */
    if (retried 3 times) acquire GIL;
    else retry;
  }
}
execute Ruby code;
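The retry policy in the pseudocode above can be exercised without transactional hardware. The sketch below replaces TBEGIN with a scripted stub, so every name in it (`scripted_htm`, `htm_begin`, `wait_for_gil_release`, `gil_acquire`, `transaction_begin`) is hypothetical. Only the limits 16 and 3 come from the slide. Counting the limit in aborts rather than in retries is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of one attempt to start a transaction (hypothetical classes). */
typedef enum { TX_STARTED, TX_ABORT_TRANSIENT, TX_ABORT_PERSISTENT } tx_status;

enum { GIL_RETRY_MAX = 16, TRANSIENT_RETRY_MAX = 3 };

typedef struct { bool acquired; } gil_t;

/* Scripted stand-in for TBEGIN: replays fixed outcomes, then succeeds. */
typedef struct {
    const tx_status *script;
    int len, pos, attempts;
} scripted_htm;

static tx_status htm_begin(scripted_htm *h)
{
    h->attempts++;
    return h->pos < h->len ? h->script[h->pos++] : TX_STARTED;
}

/* Simulation only: real code would block until the holder releases the GIL. */
static void wait_for_gil_release(gil_t *g) { g->acquired = false; }
static void gil_acquire(gil_t *g) { g->acquired = true; }

/* Returns true if the caller runs transactionally, false if it took the GIL. */
static bool transaction_begin(gil_t *gil, scripted_htm *htm)
{
    int gil_retries = 0, transient_retries = 0;
    assert(gil && htm);
    for (;;) {
        tx_status st = htm_begin(htm);
        if (st == TX_STARTED && !gil->acquired)
            return true;
        /* Abort path; a started transaction that sees the GIL held would TABORT. */
        if (gil->acquired) {
            if (++gil_retries >= GIL_RETRY_MAX) break;
            wait_for_gil_release(gil);
        } else if (st == TX_ABORT_PERSISTENT) {
            break;                       /* retrying cannot help */
        } else if (++transient_retries >= TRANSIENT_RETRY_MAX) {
            break;
        }
    }
    gil_acquire(gil);                    /* fall back to the GIL */
    return false;
}
```

Falling back to the GIL keeps the interpreter making progress even when a transaction can never commit, which is why persistent aborts skip the retries entirely.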
���� �GHIJKLM
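• Transactions should end and yield at the points where the GIL is acquired, released, and yielded
– These points are guaranteed to be critical section boundaries
• However, the original yield points are too coarse-grained and cause many overflows
• Bytecode boundaries are safe places to end and yield transactions
– Bytecode can be generated in arbitrary order
– An interpreter should therefore be able to yield the GIL at any bytecode boundary
• Ruby programs that are not correctly synchronized may change behavior
• The following bytecode instructions were added as transaction boundaries (yield points)
– getinstancevariable, getclassvariable, getlocal, send, opt_plus, opt_minus, opt_mult, opt_aref
– Chosen because they are executed frequently or are complex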
���� �JK=>?LM�NO
Ending a transaction:
if (GIL.acquired) release GIL;
else TEND();
At a transaction boundary (yield point):
if (--yield_counter == 0) {
  end the transaction;
  begin the transaction;
}
Z<([� \(´�µ¯�¶·��¸�
� A2��2� \]^-¢Å BÆ0�4Ç���È®¯3��-\]^�ºÉ4Ê�Ë�
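The `yield_counter` logic above can be sketched as a small state machine in C. TBEGIN and TEND are replaced by counters so the behavior can be checked without HTM. The `tx_thread` struct and the `tx_*` function names are hypothetical, and resetting the counter to a per-thread `length` is an assumption about how the transaction length is applied.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool gil_acquired;   /* true if this thread fell back to the GIL */
    int length;          /* transaction length, in yield points */
    int yield_counter;   /* yield points left before the transaction is split */
    int begun, ended;    /* counters standing in for TBEGIN and TEND */
} tx_thread;

static void tx_begin(tx_thread *t)
{
    /* Real code would issue TBEGIN (with GIL fallback) here. */
    t->yield_counter = t->length;
    t->begun++;
}

static void tx_end(tx_thread *t)
{
    if (t->gil_acquired)
        t->gil_acquired = false;   /* release the GIL */
    /* else: real code would issue TEND here */
    t->ended++;
}

/* Called at every transaction yield point. */
static void tx_yield_point(tx_thread *t)
{
    assert(t->yield_counter > 0);
    if (--t->yield_counter == 0) {
        tx_end(t);
        tx_begin(t);
    }
}
```

With `length` set to 3, every third yield point commits the current transaction and starts a new one. A length of 1 splits at every yield point.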
���� P��:��QR
�����?�4�h�i��j�!~�N{$�[�-GÌ �h�i��j�Í-���� ®¯?Î\
�ÍGÏÐ�N{~� ÑÒH2I2J���Ó¨G
�ÔGÏÐ�E�2� H2I2J���Ó¨G
– +·�;� i¸��¢a¹�º�M�»¼½%¾�
–��t��M-[®_%¾�
´1Z<([� \(
¹1Z<([� \(
]^ ]^_`
]^ ]^_` ]^_` ]^_`
VºE�¬w»¼
���� P�S1TUV
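Adjust the transaction length separately at each yield point
1. Initialize the transaction length at every yield point to a large value (255)
2. At each yield point, count the following two values to compute the abort ratio
– Number of transactions started at that yield point
– Number of those transactions that aborted
3. If the abort ratio exceeds a threshold (1%), shorten the transaction length (x 0.75) and return to step 2
4. If 300 transactions have been executed without the abort ratio exceeding the threshold, finish the adjustment for that yield point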
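The four steps above can be sketched as a per-yield-point profile in C. The constants 255, 1%, 0.75, and 300 are the ones on the slide. The `yield_point` struct and function names are hypothetical. Measuring the abort ratio against the full 300-transaction period (so the fourth abort triggers shortening) is an assumption about the accounting.

```c
#include <assert.h>

enum {
    INITIAL_LENGTH = 255,      /* step 1 */
    PROFILING_PERIOD = 300,    /* step 4 */
    ABORT_THRESHOLD_PCT = 1    /* step 3: 1% */
};

typedef struct {
    int length;    /* current transaction length, in yield points */
    int started;   /* transactions started here since the last adjustment */
    int aborted;   /* how many of those aborted */
    int fixed;     /* non-zero once adjustment has finished */
} yield_point;

static void yield_point_init(yield_point *yp)
{
    yp->length = INITIAL_LENGTH;
    yp->started = yp->aborted = yp->fixed = 0;
}

/* Record one transaction that started at yp; was_aborted is non-zero on abort. */
static void record_transaction(yield_point *yp, int was_aborted)
{
    assert(yp->length >= 1);
    if (yp->fixed)
        return;
    yp->started++;
    if (was_aborted)
        yp->aborted++;
    if (yp->aborted * 100 > PROFILING_PERIOD * ABORT_THRESHOLD_PCT) {
        /* Step 3: too many aborts, shorten to 75% and profile again. */
        int shorter = yp->length * 3 / 4;
        yp->length = shorter < 1 ? 1 : shorter;
        yp->started = yp->aborted = 0;
    } else if (yp->started >= PROFILING_PERIOD) {
        yp->fixed = 1;             /* step 4: keep this length */
    }
}
```

Integer arithmetic avoids floating-point rounding in the threshold test. Starting from 255, successive shortenings give 191, 143, 107, and so on.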
WT5X�YZLM=YZ�#$ (1/2)
�MæLuç ����!#$./\Ü
–��|}Á ���£M�-�Â�ªWX
�Pthread-����"�7)�4 Ãv
� 0�h0�PQ��ès�t ]`Ä£
– Gn��ÄÅo���������V+58�� Æ�9ÇÈ�É�NÊË�Ìm�9Ç
–������ÍÎ�-�ÏÐÑ��Q�����WX
������Ìm�9Ç$��¾RÑÒÓ��Ô
���! é�VZ4-A2��2�!º&ê\]
�GÐë>4-G¥Gì�\]-Y�í��îÜïu
WT5X�YZLM=YZ�#$ (2/2)� H�ðD��Ú6ñ>t ./B�2�����
�����"�7)Õ����;�Ö×
• ½m�¬w+�¾r??=Z¿+256ÀÁ¬JÂ�
� òsóZ
–UV����%s$�Ø�OP�Ñ��WX
–ÙÚ����%s$�Ø�a];��<59:���tÛM
�RubyÜ��-Ý«��t�400MB Þßa±à�áâ
• uN-ÃÄK�1LÅmÆÇ
� ����ôõÉ(rb_thread_t) false sharing
–ãä�����"�7)åæ�çèÑ�OP éê���Ìm�9Ç��� ��Q-ë���WX
�����ìêí�Ìm�9Ç��� +����adZîï�
[\]^�_`a
� PthreadöÜ?GIL|}÷ø$�Á4�ù�CD0�!úû
– ð�lñ)-GILKá±à���~�?���M��
– HTM-òóÒô��a-GILKõ� ^±à�~�?��
\ÁöOS�÷a��� !à%ø�ªâ%�
–ù� O¢aK�ú-ûüý�Pthread�Ô%þ �E
�üý��!þ�G��L� setjmp()!�c
– z/OS-setjmp()K+�����IJBC��E
– Ruby�-setjmp()-`� K�`�l��-�f���
-./�67
�g�
� �h�i��jk�lm�4nGd
� Ruby GIL4nGd
� HTM45�GIL ST
�Lopq
�pr
bc[\Idefg
� Ruby 1.9.3-p194�L�
� z/OS 1.13 UNIX System Services���
– EBCDIC-NØ Õ)�®-RubyK/)����
– �+-5�������-h���;Ñ�miniruby��
� 8�E5.5 GHz zEC12?Lo
– 1�+1Æ����+����
��Ò
– GIL°ð�lñ)-Ruby
– HTM-n (n = 1, 16, 256)°��;��<59:�u�n-1�-��|}��Ì��Ñ�
– HTM-dynamic°vw�;��<59:�uxy
h98�� !;�i
� +0�3��,+2�2n
– While��0��������Ñ��� !�"#�b
– HTMK1����GIL ��a12�����11�-_®�,
– 9�#)����-ð������Kûá5-14%���
� Ruby NAS Parallel Benchmarks (NPB) [� �,2012]
–�¥�7�-�"#�b
– 9�#)�������()*������%�g
� GIL���z{�a]����.��/�01K����
� Web�2� �2�32�-��ç
Ruby NPB throughput (1/2)
[Figure: three panels, CG, BT, and FT. x-axis: number of threads (0 to 13); y-axis: throughput (1 = 1 thread GIL); series: GIL, HTM-1, HTM-16, HTM-256, HTM-dynamic]
• FT: 4.4x speedup
� HTM-dynamic�7'()%��Ũ6Ê�HTM�Ë�ÉÌ�SÍ
� HTM-1�ÌBqÎ5@
� HTM-256�Ì��ZÏ
Ruby NPB throughput (2/2)
[Figure: four panels, IS, LU, MG, and SP. x-axis: number of threads (0 to 13); y-axis: throughput (1 = 1 thread GIL)]
�j��k
� HTM-dynamic-Z<([� \(´�Åж· �ÑE�ÒÓÔ�1%Õ�mÖ×J1�
� =�<?�ØE�ÙÚ�Û+��1
[Figure: Abort ratios of HTM-dynamic. x-axis: number of threads (0 to 13); y-axis: abort ratio (%), 0 to 3; series: BT, CG, FT, IS, LU, MG, SP]
CPUl����mn� =�<?�ØE�ÙÚ�Û+��1
– IS�8A²;�79%�ÜÝÞ;��ßàᩪ-â¡J1���cCPU�����ãä�åæç
[Figure: Cycle breakdowns (0% to 100%) for BT, CG, FT, IS, LU, MG, SP. Categories: Transaction begin/end, Successful transactions, GIL acquired, Aborted transactions, Waiting for GIL release]
�j��Oo+CDmn
�����óX ���E�2�£¤ .�!�Z�
– Cache fetch-related + Fetch conflict
[Figure: Abort categorization by reasons (HTM-dynamic / 8 threads), 0% to 100%, for BT, CG, FT, IS, LU, MG, SP. Categories: TABORT_instruction, Undetermined_condition, Cache_other, Cache_store-related, Cache_fetch-related, Nesting_depth_exceeded, Program-interruption_condition, Restricted_instruction, Store_conflict, Fetch_conflict, Store_overflow, Fetch_overflow, I/O_interruption, Machine-check_interruption, Program_interruption, External_interruption, Restart_interruption]
pBqBrs�YZ�tuv�2w+CDmn
�./B�2��� ��(rb_newobj)����02�(gc_lazy_sweep)��� �� !�ÖUdG�
–3¥�%Floatð�l�5;�-�2.0-Flonum�� !
[Figure: Abort categorization by functions (HTM-dynamic / 12 threads / Cache fetch-related), 0% to 100%, for BT, CG, FT, IS, LU, MG, SP. Categories: Others, vm_getivar, vm_exec_core, tend, rb_newobj, rb_mutex_sleep..., rb_mutex_lock, libc, gc_lazy_sweep]
CRubyxJRuby=Java�%y��z�{|de
� CRuby (HTM���) �Java (Java VM������ �����)���
� CRuby������������� !"�#$%&�'�()����
� CRuby*JRuby (+,-%� ���)�./����������01�23
� 456�CRuby*JRuby�78���������9:
[Figure: three panels, Scalability of HTM-dynamic / CRuby, Scalability of JRuby (12x Intel Xeon), and Scalability of Java NPB (12x Intel Xeon). x-axis: number of threads (0 to 13); y-axis: throughput (1 = 1 thread), 0 to 7; series: BT, CG, FT, IS, LU, MG, SP]
�;�%:"�Q�}�~"�
� ������� ����������-��
� 1����U¥GWXHTM?-&GIL!cG����
� +0�3��,+2�?�5-14% H2I2J��
� H2I2J�� !à
–��|}�-*��5�ÐN���|}-"�
– Pthread-����"�7)�4#-+58�-$��z/OS�Î
��
� GIL!HTM?STUVWX LE�� ��-@
– 12�����û34.4�
� Y VZ4[�\]^_`ab-@
– HTM�`��NØ %'�ÏKÚ¥Õ&�) Y�
–WX�z{Ñ�NØ-ÏK¿M'M^äV��
–vw ;��<59:�u�xyÑ�»(h�)*
�"GA2��2�\]?GIL! ����!"&��¢Å �3�h#?$%$�&c'(