JCAL (Java Channel Access Light library)
H. Ikeda, Visual Information Center, Inc.
H. Sako, Japan Atomic Energy Agency (JAEA)
EPICS Collaboration Meeting, Kobe
Oct 2009
Outline
• Introduction
• Problems of JCA and CAJ
• JCAL
• Benchmark tests
• Summary
Introduction
• JCA – “standard” Java Channel Access libraries
• JCA-API
• Two implementations
– JCA-JNI
» Used at SNS, J-PARC, …
– JCA-CAJ (pure Java version)
» Used at J-PARC
• CAJ – a pure Java implementation is desirable at J-PARC
– Control and commissioning applications are written in Java
– Multi-platform (Windows, Linux)
• However, CAJ behaves unstably (especially CAJ-1.1.3)
– Connections fail for a few channels out of a few thousand
– Sometimes an application using CAJ never terminates (CAJContext.dispose() does not return)
• The JCA-API and CAJ code was therefore investigated
Problems/vulnerabilities in JCA-API and CAJ
– Thread safety is broken
• Insufficient or inconsistent synchronization
• Examples of broken thread safety
– Invoking Object.wait() without a conditional loop
» Should be avoided according to the javadoc of java.lang.Object
– Starting a thread in its constructor
• Could cause unexpected runtime problems
– Vulnerable internal structure
• Broken encapsulation
– Returning mutable fields without a defensive copy
– Exposing the this reference in the constructor
• Strong inter-dependencies among packages
– Hard to maintain or modify
– Overly concrete implementation in the API
• The API should be more abstract (for maintenance)
• Users are forced to use problematic implementations
– DBR (the fundamental data type) is neither immutable nor thread safe
– Sub-classes inherit it
– Other (minor) problems in JCA-API
• Non-standard naming styles
• Hardwired numbers for network handling
– Other (minor) problems in CAJ
• A finalizer is used
– It is unclear whether or when the Java VM executes it
These problems are hard to repair without a major redesign.
We therefore tried to develop a new API and library.
JCAL (Java Channel Access Light library)
• Limited functionality and simplified architecture
– New API built on interfaces
• Clearly separating interfaces from implementations makes maintenance and improvement easier
– Pure Java
• Java 5.0
• Apache Commons
– Client function only
– No Repeater implemented
• JCAL does not start an external Repeater process (as CAJ does).
• If a Repeater is already running, JCAL uses it.
• Convenient adaptor library for JCA-API
– JCA-JCAL
– Existing applications using JCA-JNI or JCA-CAJ can use JCAL
• By passing "jp.go.jaea.jcal.jca.JcalContext" to JCALibrary.createContext
Thread safe design of JCAL
[Class diagram: Context, Manager, Channel, Client, Monitor, Subscription, ClientManager, ClientTransport, BroadcastTransport, and RepeaterTransport, with <<thread safe>> stereotypes, "create" associations, and multiplicities (*); the elements are divided between the internal thread and the outer threads]
Manager: controls the internal structure
ClientManager: manager for Clients and Subscriptions
Client: inner class of Channel
Subscription: inner class of Monitor
RepeaterTransport: communicates with the Repeater (UDP)
BroadcastTransport: communicates by broadcast (UDP)
ClientTransport: communicates with the server (TCP)
Internal single thread / outer threads (user threads)
Thread-safe
• Single-threaded architecture
• Immutable fundamental data structure (Dbr)
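The single-threaded idea can be illustrated with a minimal sketch. This is not JCAL's code: the class ChannelRegistry and its methods are hypothetical names chosen for illustration. All mutable state is touched only by one internal thread, and outer (user) threads reach it by submitting tasks, so the state itself needs no locks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of thread confinement; not part of JCAL.
final class ChannelRegistry {
    // Accessed only from the internal thread, so no synchronization is needed.
    private final Map<String, Integer> monitorCounts = new HashMap<>();
    private final ExecutorService internal = Executors.newSingleThreadExecutor();

    // May be called from any outer (user) thread. The update itself runs on
    // the internal thread; the caller blocks until it has been applied.
    int addMonitor(final String channelName)
            throws InterruptedException, ExecutionException {
        Callable<Integer> task = () -> monitorCounts.merge(channelName, 1, Integer::sum);
        return internal.submit(task).get();
    }

    // Stops the internal thread once queued tasks have finished.
    void shutdown() {
        internal.shutdown();
    }
}
```

Because every update is serialized through one thread, concurrent callers cannot interleave inside the map, at the cost of one hand-off per call.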
API
Dbr: fundamental data
Context: library environment
Channel: channel
Monitor: monitor
Benchmark test
• Process time for connect, get, and put
– JCAL
– JCA-JCAL
– JCA-JNI single-threaded (2.3.2)
– JCA-JNI thread-safe (2.3.2)
– JCA-CAJ (1.1.5b)
• Number of channels
– 4 to 4000
• Environment
– Soft IOC: Dell Dimension 4500C, CPU: Pentium 4 2.4 GHz
– Client: Dell PowerEdge 830, CPU: Pentium D 3 GHz x 2, Memory: 1 GB
• Conditions
– A single thread is used for the tests
– The total time to process all channels is measured
– 10 sec wait before each test to avoid effects of the previous test
– System.gc() is executed before each test (to avoid GC during the test)
– Each test is iterated 10 times in one VM; the average time of the 2nd to 10th runs is taken
• So that class loading time is not counted
– Async mode with callback listeners
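The measurement conditions above could be coded roughly as follows. This is a sketch under stated assumptions, not the authors' harness: the class Bench and the method averageMillis are hypothetical, and the task being timed is supplied by the caller.

```java
// Hypothetical harness following the conditions listed on the slide.
final class Bench {
    // Runs the task `iterations` times and returns the mean elapsed time in
    // milliseconds over runs 2..iterations. The first run is discarded so
    // that class loading time is not counted.
    static double averageMillis(Runnable task, int iterations, long waitMillis)
            throws InterruptedException {
        if (iterations < 2) {
            throw new IllegalArgumentException("need at least 2 iterations");
        }
        long totalNanos = 0;
        for (int i = 0; i < iterations; i++) {
            Thread.sleep(waitMillis);  // let effects of the previous run settle
            System.gc();               // request a collection before timing starts
            long start = System.nanoTime();
            task.run();
            long elapsed = System.nanoTime() - start;
            if (i > 0) {
                totalNanos += elapsed; // skip the first (warm-up) run
            }
        }
        return totalNanos / 1e6 / (iterations - 1);
    }
}
```

With the slide's settings the call would be Bench.averageMillis(task, 10, 10000). Note that System.gc() is only a request to the VM, so it reduces rather than excludes the chance of a collection during the timed section.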
Connect Test
• JCAL: 0.038 msec / ch
• JCAL ~ JNI (single) < JCA-JCAL < JNI (thread-safe) < CAJ
• The results may depend on the algorithm used to send broadcast messages
[Plot: process time (msec) vs. number of channels for JCAL, JCA-JCAL, CAJ, JNI (single), and JNI (thread-safe); higher is slower]
Get Test
• JCAL: 0.013 msec / ch
• CAJ ~ JNI (single) < JCAL ~ JCA-JCAL < JNI (thread-safe)
[Plot: process time (msec) vs. number of channels for JCAL, JCA-JCAL, CAJ, JNI (single), and JNI (thread-safe)]
Put Test
• JCAL: 0.038 msec / ch
• JCA-JCAL < JCAL < CAJ < JNI (single) < JNI (thread-safe)
[Plot: process time (msec) vs. number of channels for JCAL, JCA-JCAL, CAJ, JNI (single), and JNI (thread-safe)]
Benchmark Test Summary
Average time in msec / channel
test     JCAL   JCA-JCAL  CAJ     JCA-JNI single  JCA-JNI thread-safe
connect  0.037  0.051     0.099   0.035           0.072
get      0.011  0.012     0.0061  0.0074          0.074
put      0.038  0.022     0.029   0.044           0.048
• JCAL and JCA-JCAL are comparable with JCA-JNI (single-threaded) and CAJ.
• JCA-JNI (thread-safe) is slower than JCA-JNI (single-threaded).
– Overhead for thread safety
Summary
• The JCA-API and CAJ code was found to have problems in thread safety and robustness.
• To overcome these problems, a newly designed Java CA client library, JCAL, has been experimentally developed; it achieves thread safety through a single-threaded architecture.
• Benchmark tests show that JCAL performs comparably to JCA-JNI and CAJ.
• Stability and reliability tests of JCAL continue in the J-PARC control system.
– So far, no problems have been found with beam monitor applications
Sync and Async
• The definitions of sync and async differ between JCA and JCAL
– Sync (get)
• JCA: pendIO(t) blocks until the transmission completes within duration t. If it does not complete, a timeout error is thrown.
• JCAL: similar to JCA. “get” blocks until the transmission has completed or a timeout occurs.
– Sync (put)
• JCA: returns immediately after the request has been sent; pendIO(t) does not wait for a response from the server.
• JCAL: different from JCA. It blocks until it receives the response from the server.
– Async (get/put)
• JCA and JCAL (common): does not block but returns immediately. Completion of the transmission is reported to the registered listener.
• We use async mode for the JCA-JNI/CAJ/JCAL comparison
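The difference between the two calling styles can be sketched as follows. The types here (GetListener, FakeChannel) are hypothetical and are neither JCA nor JCAL classes; the "server" is simulated by a thread that delivers a fixed value. The sketch shows an async get that returns immediately and reports through a listener, and a sync get built on top of it that blocks until completion or timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical types for illustration only; not JCA or JCAL classes.
interface GetListener {
    void getCompleted(String value);
}

final class FakeChannel {
    private final String value;

    FakeChannel(String value) {
        this.value = value;
    }

    // Async get: returns immediately; the listener is called later from
    // another thread, which stands in for the server response.
    void asyncGet(final GetListener listener) {
        new Thread(() -> listener.getCompleted(value)).start();
    }

    // Sync get: blocks until the response has arrived or the timeout expires.
    String get(long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        final CountDownLatch done = new CountDownLatch(1);
        final AtomicReference<String> result = new AtomicReference<>();
        asyncGet(v -> {
            result.set(v);
            done.countDown();
        });
        if (!done.await(timeout, unit)) {
            throw new TimeoutException("get did not complete in time");
        }
        return result.get();
    }
}
```

A sync put in the JCAL sense would follow the same shape as get here, waiting for the acknowledgement instead of returning once the request has been sent.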
Fundamental Data Structure (Dbr)
• Dbr
– unit data of channel access
• DbrType
– data type
• Attribute
– a marker interface for attributes
• StsAttribute, …
– specific interfaces for each kind of attribute
[Class diagram]
<<interface>> Dbr: stringValue(), shortValue(), ...
<<enum>> DbrType: STRING, SHORT, ..., STS_STRING, STS_SHORT, ...
<<interface>> Attribute (marker interface)
<<interface>> StsAttribute: getStatus(), getSeverty()
<<interface>> GrAttribute: getUnits(), getUpperDispLimit(), ..., getPrecision()
<<interface>> CtrlAttribute: getUpperCtrlLimit(), getLowerCtrlLimit()
<<interface>> TimeAttribute: getTimeStamp()
<<interface>> LabelsAttribute: getLabels()
<<interface>> StsAckAttribute: getAckT(), getAckS()
Thread-safe Implementation of Dbr sub-classes
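One way such a sub-class could be made thread safe is by immutability, sketched below. The interface and method names Dbr, Attribute, StsAttribute, stringValue(), shortValue(), and getStatus() come from the slides; the return types, the choice that StsAttribute extends Attribute, the class StsShortDbr, and the accessor toShortArray() are assumptions made for this illustration and may differ from JCAL.

```java
// Interface names follow the slides; signatures are assumed.
interface Attribute { }                 // marker interface

interface StsAttribute extends Attribute {
    int getStatus();
}

interface Dbr {
    String stringValue();
    short shortValue();
}

// Hypothetical immutable implementation: all fields are final, the array is
// copied on the way in and on the way out, and there are no setters.
final class StsShortDbr implements Dbr, StsAttribute {
    private final short[] values;       // never exposed directly
    private final int status;

    StsShortDbr(short[] values, int status) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one value is required");
        }
        this.values = values.clone();   // defensive copy in
        this.status = status;
    }

    public String stringValue() { return Short.toString(values[0]); }

    public short shortValue() { return values[0]; }

    public int getStatus() { return status; }

    // Hypothetical accessor, not named on the slides.
    short[] toShortArray() { return values.clone(); }   // defensive copy out
}
```

An instance built this way can be handed from the internal thread to any number of user threads without synchronization, since nothing can change it after construction.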
How to avoid congestion
• CAJ
– Reduces the frequency of the broadcast messages that search for channels in servers, as a function of time
• Similar logic to TCP congestion control
• But there is no way to know how many packets are really lost in CA, since we do not know how many servers exist
• JCAL
– Sets a small buffer size for UDP sockets
• Effectively reduces the rate of broadcasts
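The idea of lowering the search frequency over time can be sketched as a back-off schedule. The slides do not give CAJ's actual function, so the doubling rule, the cap, and the class name SearchBackoff below are assumptions for illustration only.

```java
// Hypothetical back-off schedule; CAJ's real timing function is not given
// in the slides.
final class SearchBackoff {
    private final long initialMillis;
    private final long maxMillis;

    SearchBackoff(long initialMillis, long maxMillis) {
        if (initialMillis <= 0 || maxMillis < initialMillis) {
            throw new IllegalArgumentException("require 0 < initial <= max");
        }
        this.initialMillis = initialMillis;
        this.maxMillis = maxMillis;
    }

    // Delay before search attempt number `attempt` (0, 1, 2, ...): the delay
    // doubles with each attempt and is capped at maxMillis.
    long delayMillis(int attempt) {
        long delay = initialMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            // Compare against max/2 first so that doubling cannot overflow.
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return delay;
    }
}
```

With an initial delay of 100 ms and a cap of 1000 ms, the successive delays are 100, 200, 400, 800, and then 1000 ms for every later attempt.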
wait() method without a conditional loop
In the javadoc of java.lang.Object:
public final void wait() throws InterruptedException
/* Causes current thread to wait until another thread invokes the notify() or notifyAll() method for this object…*/
/* A thread can also wake up without being notified, interrupted, or timing out, a so-called spurious wakeup. While this will rarely occur in practice, applications must guard against it by testing for the condition that should have caused the thread to be awakened, and continuing to wait if the condition is not satisfied. In other words, waits should always occur in loops, like this one: */
synchronized (obj) {
    while (<condition does not hold>)
        obj.wait();
    ... // Perform action appropriate to condition
}
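A concrete, compilable instance of the javadoc idiom is sketched below. The class ConnectedFlag is a hypothetical example and is not taken from JCA, CAJ, or JCAL.

```java
// Hypothetical one-shot flag illustrating wait() inside a conditional loop.
final class ConnectedFlag {
    private boolean connected;   // guarded by the lock on this

    // Sets the condition and wakes every waiting thread.
    synchronized void markConnected() {
        connected = true;
        notifyAll();
    }

    synchronized boolean isConnected() {
        return connected;
    }

    // Blocks until markConnected() has been called. The loop re-checks the
    // condition after every wakeup, so a spurious wakeup simply waits again.
    synchronized void awaitConnected() throws InterruptedException {
        while (!connected) {
            wait();
        }
    }
}
```

Replacing the while with an if would reproduce the problem described above: a thread woken without the condition being true would proceed as if the connection existed.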
Exposing the this object / Starting a thread in a constructor
public class Unsafe implements Runnable {
    public Unsafe(SomeOtherClass someOtherClass) {
        // Unsafe because SomeOtherClass has the "this" reference
        // and may change its properties before the constructor completes.
        someOtherClass.registerObject(this);
        // Unsafe because the "this" object will be visible from the
        // new thread before the constructor completes.
        Thread thread = new Thread(this);
        thread.start();
    }

    public void run() { /* thread body */ }
}
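A common way to avoid both problems is to keep the constructor free of side effects and to publish or start the object from a static factory method. The sketch below is illustrative only; SafeWorker and its methods are hypothetical names.

```java
// Hypothetical safe counterpart: the constructor only initializes fields,
// and the thread is started after construction has finished.
final class SafeWorker implements Runnable {
    private final String name;
    private volatile boolean ran;
    private Thread thread;   // assigned by the factory, after construction

    private SafeWorker(String name) {
        this.name = name;    // no "this" escapes from here
    }

    // Static factory: the object is fully constructed before any other
    // thread can observe it.
    static SafeWorker createAndStart(String name) {
        SafeWorker worker = new SafeWorker(name);
        worker.thread = new Thread(worker, name);
        worker.thread.start();
        return worker;
    }

    public void run() {
        ran = true;          // sees a completely initialized object
    }

    boolean hasRun() {
        return ran;
    }

    String name() {
        return name;
    }

    // Waits for the worker thread to finish.
    void join() throws InterruptedException {
        thread.join();
    }
}
```

The private constructor forces callers through createAndStart, so the unsafe ordering cannot be reintroduced by accident.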
Defensive copying
• A mutable object is simply an object which can change its state after construction. For example, StringBuilder and Date are mutable objects, while String and Integer are immutable objects. A class may have a mutable object as a field. There are two possible cases for how the state of a mutable object field can change:
– its state can be changed only by the native class - the native class creates the mutable object field, and is the only class which is directly aware of its existence
– its state can be changed both by the native class and by its callers - the native class simply points to a mutable object which was created elsewhere
• Both cases are valid design choices, but you must be aware of which one is appropriate for each case. If the mutable object field's state should be changed only by the native class, then a defensive copy of the mutable object must be made any time it is passed into (constructors and set methods) or out of (get methods) the class. If this is not done, then it is simple for the caller to break encapsulation, by changing the state of an object which is simultaneously visible to both the class and its caller.
• Example
• Planet has a mutable object field fDateOfDiscovery, which is defensively copied in all constructors and in getDateOfDiscovery. Planet represents an immutable class and has no set methods for its fields. Note that if the defensive copy of fDateOfDiscovery is not made, then Planet is no longer immutable!
Returning a mutable field without a defensive copy
import java.util.Date;

public final class Planet {
    /**
     * A mutable object field. In this case, the state of this mutable field
     * is to be changed only by this class. (In other cases, it makes perfect
     * sense to allow the state of a field to be changed outside the native
     * class; this is the case when a field acts as a "pointer" to an object
     * created elsewhere.)
     */
    private final Date fDateOfDiscovery;

    public Planet(Date aDateOfDiscovery) {
        // Defensive copy on the way in.
        fDateOfDiscovery = new Date(aDateOfDiscovery.getTime());
    }

    // BAD function (renamed here so that both variants compile side by side)
    // Returns a mutable object without a defensive copy.
    // The caller gets a direct reference to the internal field. This is usually dangerous,
    // since the Date object state can be changed both by this class and its caller.
    // That is, this class is no longer in complete control of fDateOfDiscovery.
    public Date getDateOfDiscoveryUnsafe() {
        return fDateOfDiscovery;
    }

    /**
     * GOOD function
     * Returns a defensive copy of the field.
     * The caller of this method can do anything they want with the
     * returned Date object, without affecting the internals of this
     * class in any way.
     */
    public Date getDateOfDiscovery() {
        return new Date(fDateOfDiscovery.getTime());
    }
}
Leader-Followers Pattern
[Diagram labels: ServerConnection node; Serialize]
Future Pattern
• A method which does a job if it is ready, or waits until it becomes ready if not.
– DbrFuture Channel.asyncGet()
– CaWaiter Channel.asyncPut()
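The pattern can be sketched with a minimal hand-written future. ResultFuture is a hypothetical class for illustration; JCAL's DbrFuture and CaWaiter are not shown in the slides and may be designed differently.

```java
import java.util.concurrent.TimeoutException;

// Hypothetical minimal future: the caller receives it immediately and asks
// for the result later.
final class ResultFuture<T> {
    private T result;       // guarded by the lock on this
    private boolean done;   // guarded by the lock on this

    // Called once by the producing thread when the result arrives.
    synchronized void complete(T value) {
        result = value;
        done = true;
        notifyAll();
    }

    synchronized boolean isDone() {
        return done;
    }

    // Returns at once if the result is ready; otherwise waits until it
    // becomes ready or timeoutMillis has elapsed.
    synchronized T get(long timeoutMillis)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!done) {                       // conditional loop around wait()
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                throw new TimeoutException("result not ready");
            }
            // wait(0) would wait forever, so always wait at least 1 ms.
            wait(Math.max(1, remainingNanos / 1_000_000L));
        }
        return result;
    }
}
```

In current Java the same behavior is available from java.util.concurrent.CompletableFuture; the hand-written version is shown only to make the waiting logic visible.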
Exposing the this object
import java.util.logging.Level;
import java.util.logging.Logger;

// Interface ExceptionReporter
public interface ExceptionReporter {
    public void setExceptionReporter(ExceptionReporter er);
    public void report(Throwable exception);
}

// Class ExceptionReporters
public class ExceptionReporters implements ExceptionReporter {
    public ExceptionReporters(ExceptionReporter er) {
        /* Carry out initialization */
        er.setExceptionReporter(this); // incorrectly publishes the "this" reference
    }
    public void report(Throwable exception) { /* default implementation */ }
    public final void setExceptionReporter(ExceptionReporter er) { /* sets the reporter */ }
}

// Class MyExceptionReporter derives from ExceptionReporters
public class MyExceptionReporter extends ExceptionReporters {
    private final Logger logger;
    public MyExceptionReporter(ExceptionReporter er) {
        super(er); // calls superclass's constructor
        logger = Logger.getLogger("com.organization.Log");
    }
    public void report(Throwable t) {
        logger.log(Level.FINEST, "Loggable exception occurred", t);
    }
}
Repeater (in server)
• Technical problem in implementing it in pure Java
– A packet sent from a client to a server via the Repeater carries the Repeater’s IP/port as its source address. For the server to send its response to the client, the Repeater must replace the source address with that of the client, but this is not possible in Java.
– The CA protocol has no parameter for the source IP address and port.
[Diagram: Client, Repeater, Server1, and Server2; arrows labeled broadcast, Reply, Source: client, Source: Repeater (without packet modification), and Source: client (with packet modification)]
Sync and Async
• The definitions of sync and async differ between JCA and JCAL
– Sync (get)
• JCA: pendIO(t) blocks until the transmission has completed within duration t. If it does not complete, a timeout error is thrown.
• JCAL: “get” blocks until the transmission has completed or a timeout occurs.
– Sync (put)
• JCA: returns immediately after the request has been sent; pendIO(t) does not wait for a response from the server.
• JCAL: blocks until it receives the response from the server.
– Async (get/put)
• JCA: does not block but returns immediately. Completion of the transmission is reported to the registered listener.
– preemptive_callback=true
» The listener’s callback is invoked only while the event loop of Context is running.
• JCAL: similar to JCA
– (no preemptive_callback flag)
Connect Test
Ideal: Soft IOC: Dell Dimension 4500C, Pentium 4 2.4 GHz; Client: Dell PowerEdge 830, CPU: Pentium D 3 GHz x 2, Memory: 1 GB
Realistic: Beam monitor IOCs: VME Advme7501AR; Client: IBM ThinkPad Lenovo R60, CPU: Genuine Intel T2300 1.7 GHz x 2, Memory: 1 GB
[Plots (two panels): duration (ms) vs. number of channels for JCAL (async) and for JCA-JCAL, CAJ, JCA-Single, and JCA-Multi (each sync and async)]
• Soft IOC: 0.038 msec / ch (JCAL); JCAL = JCA (single) < JCA-JCAL < JCA (multi) < CAJ
• Monitor IOCs: 0.28 msec / ch (JCAL); JCA-JCAL = JCAL < JCA < CAJ
• Factor 7 longer (preliminary)
Get Test
[Plots (two panels: a Soft IOC; beam monitor IOCs): duration (ms) vs. number of channels for JCAL (async) and for JCA-JCAL, CAJ, JCA-Single, and JCA-Multi (each sync and async)]
• Soft IOC: 0.013 msec / ch; CAJ = JCA (single) < JCAL = JCA-JCAL < JCA (multi)
• Monitor IOCs: 0.040 msec / ch (JCAL)
• Similar (preliminary)
Put Test
[Plots (two panels: beam monitor IOCs; a Soft IOC): duration (ms) vs. number of channels for JCAL (async) and for JCA-JCAL, CAJ, JCA-Single, and JCA-Multi (each sync and async)]
Note: the CAJ and JCA “sync” puts are not really synchronous; they do not wait for the server response.
• Soft IOC: 0.038 msec / ch (JCAL)
• Beam monitor IOCs: 0.16 msec / ch (JCAL)
• async: JCA-JCAL < JCAL < CAJ < JCA (single) < JCA (multi)
• Factor 4 longer (preliminary)
Benchmark test
• Process time for connect, get, and put
– JCAL
– JCA-JCAL
– JCA-JNI single-threaded (2.3.2)
– JCA-JNI multi-threaded (2.3.2)
– JCA-CAJ (1.1.5b)
• Number of channels
– 4 to 4000
• Conditions
– 10 sec wait before each test to avoid effects of the previous test
– System.gc() is executed before each test (to avoid GC during the test)
– Each test is iterated 10 times in one VM; the average time of the 2nd to 10th runs is taken
• So that class loading time is not counted
– Async mode with callback listeners
Ideal: Soft IOC: Dell Dimension 4500C, Pentium 4 2.4 GHz; Client: Dell PowerEdge 830, CPU: Pentium D 3 GHz x 2, Memory: 1 GB
Realistic: Beam monitor IOCs: VME Advme7501AR; Client: IBM ThinkPad Lenovo R60, CPU: Genuine Intel T2300 1.7 GHz x 2, Memory: 1 GB
Benchmark Test Summary
Average time in msec / channel
environment  test     JCAL   JCA-JCAL  CAJ     JCA-JNI single-th.  JCA-JNI multi-th.
Soft IOC     connect  0.037  0.051     0.099   0.035               0.072
Mon IOCs     connect  0.31   0.34      1.01    0.59                0.63
Soft IOC     get      0.011  0.012     0.0061  0.0074              0.074
Mon IOCs     get      0.040  0.071     0.026   0.031               0.101
Soft IOC     put      0.038  0.022     0.029   0.044               0.048
Mon IOCs     put      0.16   0.37      0.15    0.15                0.18
Summary
– JCAL is comparable with JCA-JNI (single-threaded) and CAJ.
• JCA-JNI (single-threaded) is fastest.
– Why does it take much longer with the real control VME IOCs?
• CPU of client/servers? Network conditions? To be investigated.
• JCAL is slightly slower than JCA-JNI.
• JCAL is faster than CAJ in connection, but slightly slower in get/put.
– JCA-JCAL is slightly slower than JCAL.
– JCA-JNI (multi-threaded) is slow.