
The digital magazine for enterprise developers

Issue August 2015 | presented by www.jaxenter.com #46

The future of the cloud – Databases are where it’s at

Performance in an API-driven world – And why REST APIs are changing the game

The speed of Java 8 lambdas – and way more tips, tricks and tutorials

©iStockphoto.com/VikaSuh

Microservices – Are they for everyone?


Editorial


The hype shows no sign of subsiding. Microservices, DevOps, Continuous Delivery – the latest trends in IT are truly changing how businesses innovate. And most of all, they are putting the programmer at the heart of good business strategies. But there’s a flipside. The gravity of hype can pull many organisations towards concepts they simply aren’t ready for.

So as you’re watching the herd of enterprises flocking towards DevOpsian IT, it’s good to stand back and have a think. Are microservices really for everyone? How is it that Etsy is making such strides in continuous delivery with a monolithic system? And are there really no bumps on the road to continuous delivery?

Looking beyond the hype

These are exactly the kinds of questions we’ll be finding answers to at the JAX London conference in October. And with JAX London season just around the corner, we’ve asked a selection of our conference speakers to give us a sneak preview of what we’ll be learning at the JAX London. From database testing and smart benchmarking to microservices reality checks and continuous delivery tips, this is a special issue for anyone looking to update their IT approach. We’re even going to learn what Parisian history can teach us about software architecture!

Coman Hamilton, Editor

Index

Microservices: Storm in a teacup, or teacups in a storm? – More hype, anyone? – Holly Cummins (page 4)
A tale of two teams – Smoothing the continuous delivery path – Lyndsay Prewer (page 6)
Fielding, Fowler and Haussmann – Network-based architectures: learning from Paris – Eric Horesnyi (page 7)
Let’s talk speed – Java performance tutorial: How fast are the Java 8 streams? – Angelika Langer (page 10)
A world beyond Java – Coding for desktop and mobile with HTML5 and Java EE 7 – Geertjan Wielenga (page 13)
JEP 222 – JShell, the Java 9 REPL: What does it do? – Werner Keil (page 14)
MySQL is a great NoSQL – Making the right database decisions – Aviran Mordo (page 16)
Business intelligence must evolve – Rethinking how we think about self-service cloud BI – Chris Neumann (page 17)
The future of cloud computing – Database solutions – Zigmars Raascevskis (page 22)
Testing the Database Layer – Dos and Don’ts – Colin Vipurs (page 24)
Private cloud trends – Financial services PaaS and private clouds: Managing and monitoring disparate environments – Patricia Hines (page 27)
Intelligent traffic management in the modern application ecosystem – The future of traffic management technology – Kris Beevers (page 29)
Trade-offs in benchmarking – Cost, scope and focus – Aysylu Greenberg (page 31)
Common threats to your VoIP system – Five tips to stay secure – Sheldon Smith (page 32)
Why reusable REST APIs are changing the game – No more custom API mazes – Ben Busse (page 34)
Considering the performance factor in an API-driven world – Milliseconds matter – Per Buer (page 38)


Hot or Not


Programming in Schwarzenegger quotes

Programming in C can be fun and all. But wouldn’t you rather make your commands in the voice of the Terminator? Who wouldn’t want to return a string with “I’ll be back”? “Listen to me very carefully”, “Hasta la vista, baby” and “Do it now” are clearly way more effective than boring old DeclareMethod, EndMethodDeclaration and CallMethod. And frankly, this language is simply far cooler than esoteric alternatives like brainfuck, Hodor-lang and Deadfish. It may not be the youngest language on the scene anymore, but old age certainly hasn’t stopped Arnie from being a bad-ass Terminator.

The Stack Overflow trolls

Earlier this year, Google was forced to shut down Google Code, its version of GitHub, because the product had become overrun with trolls. Meanwhile, the popular IT Q&A site Stack Overflow has been losing dedicated members as a result of its toxic moderator behaviour, over-embellished points system and systemic hatred for newbies. However, some users are debating whether the Stack Overflow problem extends beyond a minority of trolls, and encompasses a fundamental hostility at the heart of the website’s culture of asking questions. Should novice programmers be afraid of being laughed at when asking beginner questions?

Java without the Unsafe class

It’s being referred to as a “disaster”, “misery” and even the “Javapocalypse”. Oracle has announced the projected removal of the private API used by almost every tool, piece of infrastructure software and high-performance library built using Java. Java 9 is showing the door to the sun.misc.Unsafe class. As far as ideas go, this one doesn’t sound so good – at least on paper. It’s particularly the library developers that are annoyed by this change. Numerous libraries like Netty will collapse when Oracle pulls out this Jenga block from the Java release. But then again, as the name of the class suggests, it is “unsafe”. Change is a bitch.

Developer grief

We don’t want developers to be sad. Or angry. Or decaffeinated. But conference speaker and developer Derick Bailey has shown us the rocky road that many devs go down when it comes to the human side of software development. The five stages of developer grief is the account of the journey that many programmers experience when writing code and marketing their product. Bailey explains how denial, anger, bargaining, depression and acceptance all leave their mark on initially energetic devs, which results in them feeling pretty e-motion sick from the ride. Feelings suck.

Oracle vs. Google

We know it’s summer. But that doesn’t stop there from being quite a lot of not-so-hot things happening in IT right now, like the final decision in the Oracle vs. Google debacle. Most of the sensible parts of the software industry have been hoping for a reduction in the destructive copyright crackdown on Java usage. But with President Obama himself coming down on Oracle’s side, coupled with the US Supreme Court’s decision against a review of the ruling, there’s little hope of a happy ending to the Android API saga.


Microservices


by Holly Cummins

Folks, we have reached a new phase on the Microservices Hype Cycle. Discussion of the microservices hype has overtaken discussion of the actual microservices technology. We’re all talking about microservices, and we’re all talking about how we’re all talking about microservices. This article is, of course, contributing to that cycle. Shall we call it the checkpoint of chatter?

Let’s step back. I think we’re all now agreed on some basic principles. Distributing congealed tea across lots of teacups doesn’t make it any more drinkable; microservices are not a substitute for getting your codebase in order. Microservices aren’t the right fit for everyone. On the other hand, microservices do encourage many good engineering practices, such as clean interfaces, loose coupling, and high cohesion. They also encourage practices that are a bit newer, but seem pretty sensible, such as scalability through statelessness and development quality through accountability (“you write it, you make it work in the field”).

Many of these architectural practices are just good software engineering. You’ll get a benefit from adopting them – but if you haven’t already adopted them, will you be able to do that along with a shift to microservices? A big part of the microservices debate now centres on the best way to transition to microservices. Should it be a big bang, or a gradual peeling of services off the edge, or are microservices something which should be reserved for greenfield projects?

I’m part of the team that writes WebSphere Liberty. As a super-lightweight application server, we’re more of an enabler of microservices than a direct consumer. However, we have experience of a similar internal transformation. We had a legacy codebase that was awesome in many ways, but it was pretty monolithic and pretty big. We needed to break it up, without breaking it. We knew we could become more modular by rebasing on OSGi services, a technology which shares many characteristics (and sometimes even a name) with microservices. OSGi services allow radical decoupling, but their dynamism can cause headaches for the unwary. What worked for us was writing a brand new kernel, and adapting our existing libraries to the new kernel one by one.

Thinking about failure is critical. Imagine a little microservice teacup, bobbing along in those rough network waters, with occasional hardware lightning strikes. Not only is failure a possibility, it’s practically a certainty. Tolerance for failure needs to be built in at every level, and it needs to be exercised at every stage of testing. Don’t get too attached to any particular service instance. This was one of our biggest lessons along the way. We made sure our design ensured that code had the framework support to consume services in a robust way, even though they were liable to appear and disappear. Along the way, we discovered that many of our tests, and some of our code, made assumptions about the order in which things happened, or the timing of services becoming available. These assumptions were inevitably proved wrong, usually at 1 am when we were trying to get a green build.

More hype, anyone?

Microservices: Storm in a teacup, or teacups in a storm?

Somehow, the buzz surrounding microservices has us believing that every single employee and enterprise must break up their monolith empires and follow the microservices trend. But it’s not everyone’s cup of tea, says JAX London speaker Holly Cummins.


The human implications of microservices

Although we tend to talk about the technological implications of microservices, it’s important to think about the human implications, too. Not everyone is comfortable coding to cope with services dropping in and out of existence, so you may find you end up jettisoning some people along with the monolith. Build in a period of adjustment and education, and remember to take time to develop new skills as well as new code. By the time most of our team shifted to service-oriented development, we had a beta which clearly demonstrated how well the new model worked. The coolness of where we were going was a good compensation for occasional 1 am head-scratching over unexpected behaviour.

Microservices: From dream to reality in an hour

Hear Holly Cummins speak at the JAX London: Are microservices a wonder-pattern for rescuing intractably complex applications? Or are they just a restatement of the software engineering best practices we all should be following anyway? Or something in between? How do they work? How should they be written? What are the pitfalls? What are the underpinning technologies?

What I find exciting about the microservices discussion is how it’s making us think about architectural patterns, team organisation, fault tolerance, and the best way to write code and deliver services. That’s got to be a good thing, even if microservices themselves don’t end up being everyone’s cup of tea.

Holly Cummins is a senior software engineer developing enterprise middleware with IBM WebSphere, and a committer on the Apache Aries project. She is a co-author of Enterprise OSGi in Action and has spoken at Devoxx, JavaZone, The ServerSide Java Symposium, JAX London, GeeCon, and the Great Indian Developer Summit, as well as a number of user groups.

“A big part of the microservices debate now centres on the best way to transition to microservices.”



CD and CI


by Lyndsay Prewer

To paraphrase Wikipedia, Continuous Delivery is a software engineering approach that produces valuable software in short cycles and enables production releases to be made at any time. Continuous Delivery is gaining recognition as a best practice, but adopting it and iteratively improving it is challenging. Given the diversity of teams and architectures that do Continuous Delivery well, it’s clear that there is no single, golden path.

This article explores how two very different teams successfully practiced and improved Continuous Delivery. Both teams were sizeable and mature in their use of agile and lean practices. One team chose microservices, Scala, MongoDB and Docker on a greenfield project. The other faced the constraints of a monolithic architecture, legacy code, .NET, MySQL and Windows.

Patterns for successful practice

From observing both teams, some common patterns were visible that contributed to their successful Continuous Delivery.

Continuous Integration that works: Continuous Integration (CI) is the foundation that enables Continuous Delivery. To be a truly solid foundation though, the CI system must maintain good health, which only happens if the team exercises it and cares for it. Team members need to be integrating their changes regularly (multiple times per day) and responding promptly to red builds. The team should also be eliminating warnings and addressing long-running CI steps. These important behaviours ensure that release candidates can be created regularly, efficiently and quickly. Once this process starts taking hours instead of minutes, Continuous Delivery becomes a burden instead of an enabler.

Automated tests: Managing the complexity of software is extremely challenging. The right mix of automated tests helps address the risk present when changing a complex system, by identifying areas of high risk (e.g. lack of test coverage or broken tests) that need further investigation. When practicing automated testing, it’s important to get the right distribution of unit, integration and end-to-end tests (the well-documented “test pyramid”).

Both teams I worked with moved towards a tear-drop distribution: a very small number of end-to-end tests, sitting on top of a high number of integration tests, with a moderate number of unit tests at the base. This provided the best balance between behavioural coverage and cost of change, which, in turn, allowed the risk present in a software increment to be more easily identified.

Low-cost deployment (and rollback): Once a release candidate has been produced by the CI system, and the team is happy with its level of risk, one or more deployments will take place, to a variety of environments (normally QA, Staging/Pre-Production, Production). When practicing Continuous Delivery, it’s typical for these deployments to happen multiple times per week, if not per day. A key success factor is thus to minimise the time and effort of these deployments. The microservice team were able to reduce this overhead down to minutes, which enabled multiple deployments per day. The monolith team reduced it to hours, in order to achieve weekly deployments.

Regardless of how frequently production deployments happen, the cost and impact of rolling back must be tiny (seconds), to minimise service downtime. This makes rolling back pain-free and not a “bad thing” to do.

Monitoring and alerting: No matter how much testing (manual or automated) a release candidate has had, there is always a risk that something will break when it goes into Production. Both teams were able to monitor the impact of a release in near real-time using tools such as Elasticsearch, Kibana, Papertrail, Splunk and New Relic. Having such tools easily available is great, but they’re next to useless unless people look at them, and they are coupled to automated alerting (such as PagerDuty). This required a culture of “caring about Production”, so that the whole team (not just Operations, QA or Development) knew what “normal” looked like, and noticed when Production’s vital signs took a turn for the worse.

Conclusion

This article has highlighted how different teams, with very different architectures, both successfully practiced Continuous Delivery. It’s touched on some of the shared patterns that have enabled this. If you’d like to hear more about their Continuous Delivery journeys, including the different blockers and accelerators they faced, and the ever-present impact of Conway’s Law, I’ll be speaking on this topic at JAX London on 13–14th October 2015.

Smoothing the continuous delivery path

A tale of two teams

Continuous Delivery is gaining recognition as a best practice, but adopting it and iteratively improving it is challenging.

Lyndsay Prewer is an Agile Delivery Lead, currently consulting for Equal Experts. He focuses on helping people, teams and products become even more awesome, through the application of agile, lean and systemic practices. A former rocket scientist and software engineer, over the last two decades he’s helped ten companies in two hemispheres improve their delivery.


Architecture


by Eric Horesnyi

As developers or architects, we often have to communicate fine concepts of network or system architecture to “decision-makers”. In my case, I have been using the smart city analogy for twenty years. And to celebrate the 25th birthday of the Web, I propose to draw an in-depth analogy between a designed city, Paris, and the Web. Going through Fielding’s thesis, we will compare Paris to the Web in terms of constraints, inherited features and architectural style choices, and finally assess whether these choices meet the objectives. All this with a focus on a transformational period of Paris, 1853–1869, under Haussmann as Architect – an approach worth applying to the many large corporate information systems looking to adopt a microservice style, as proposed by Fowler and Newman.

Here are the first two episodes out of seven that we will cover during the session at the upcoming JAX London. Our audience is, by design, either software experts interested in city architecture as an illustration of the beauty of HTTP REST and Continuous Delivery, or anybody wanting to understand the success of the Web and get acquainted with key web lingo and concepts.

EPISODE I: Who was Haussmann, and what was his challenge for Paris?

Eugène Haussmann was a civil servant chosen by Napoleon III to accelerate the modernisation of Paris in 1853. He reigned over Paris architecture for sixteen years. When he took office, Paris UX was awful, probably worth less than a star on the Play or Apple store: high dropout rate (cholera struck every five years with 10,000 dead each time), servers (houses) were congested (up to 1 person per square metre – US translation: 100 people per townhouse floor), and datacentres would work only intermittently: no power (gas, water), no cooling (sewage). No cache (cellar), no proxy (concierge), no streaming (subway) … a UX nightmare.

Outside was even worse: ridden with cybercrime (you get it) in obscure streets, slow with narrow access lines (streets) and non-existent backbones (boulevards), without a shared protocol for polling traffic (sidewalks for pedestrians) or streaming (subway), and no garbage collection (gotcha). Worse, when a user would go from one page to another, page-load time was terrible because of these congested, un-protocoled lines, and they would come home with viruses for lack of continuous delivery of patches to the servers (no air circulation or sun down these narrow streets hidden by overly high buildings).

To top it off, service was fully interrupted and redesigned regularly (Revolutions in 1789, 1815, the three Glorieuses in 1830, 1848), without backward compatibility.

Network-based architectures: learning from Paris

Fielding, Fowler and Haussmann

For all the great strides that IT is taking to bring us to better futures faster, it turns out that everything we need to know about the development of the web can be learned from the history of urban Paris.

[Image captions: 19,000 deaths of cholera in Paris, 1832 – Congested servers: crowded apartments in Paris – Eugène Haussmann]


Although users would benefit from these changes in the long run, they definitely did not appreciate the long period of adaptation to the new UI, not to mention calls and escalations to a non-existent service desk (votes for the poor and women). Well, actually, these small access lines made it easy to mount DDoS attacks (barricades), a feature the business people did not like (Napoleon III).

EPISODE II: What properties did Haussmann select for his system?

Some elements of style that Haussmann ultimately selected were actually imposed upon him by his boss, Napoleon III, nephew of Napoleon. My intention here is definitely not to write an apologia for Napoleon, who spread war for years across Europe, just to point to one feature he introduced in the process: the Code Civil. The Code Civil details the way people can interact together, the things they can and cannot do. It came out of heated debates during the French Revolution. Its publication had an impact similar to the CERN initial paper on HTTP in 1990. The Code Civil rules citizens’ lives the same way HTTP is the protocol governing our exchanges on the Web.

A fundamental principle in the Code Civil is the separation of concerns: it defines what a citizen is (client), and separates their interests and interactions from companies, associations and the state (servers: .com, .org or .gov). Any mixing of interests is unlawful (misuse of social assets), pretty much like HTTP is by design client-server and stateless. This also means that a server cannot treat two clients differently (equality), or two servers unequally (this links to the current debate on Net Neutrality):

•Client-Server: the most fundamental concept allowing for a network-based architecture: the system is considered as a set of services provided by servers (shops, public infrastructure services, associations) to clients (citizens). Once roles and possible interactions are defined, each can evolve independently: a citizen can grow from student to shop-owner, a grocery store to a juice provider. People living in a building are not forced to buy their meat from the store in the same building, and a store-owner may practice whatever price he wants. This principle of separation of concerns allows for citizens’ freedom to choose, and for entrepreneurial spirit (the ability to create, invest, adapt) to service citizens (with food, entertainment, and social bonds through associations or culture).

•Stateless: no session state; a request from client to server must contain all the information necessary for the server to process it. Whoever the citizen, the server must serve him without knowing more about him than about other citizens. No mixing of genres is allowed between client and server: all citizens are treated equal, and must be serviced the same way for the same request. Citizens cannot be bound to a single service by services creating dependencies or a local monopoly over their services. This separation of concerns from one building to another, and from one person to a company, is a foundation of citizen life even today (usually).

Another feature Haussmann had to take into consideration was the addressing scheme of Paris, defined in 1805, similar to our DNS scheme used for HTTP requests:

[Image captions: Pre-Haussmannian street in Paris, unsafe, ridden with viruses, no cooling – “Liberty Leading the People, July 28th, 1830” by Delacroix – Paris numbering scheme, example on a Haussmannian building – First page of the Code Civil, 1804 – Napoleon III describing his mission to Haussmann, 1853 – Low entry barrier for citizens: Montmartre, example of a popular neighbourhood in Paris]


•The main backbone, la Seine, defines the start of every street (our root)

•Each association (.org), state organization (.gov) and company (.com) can define its scheme within its own domain

Wanting to build on the Code Civil/HTTP, Napoleon III’s ego would not tolerate his capital city being less civilized than London or emerging New York. In terms of performance, his network-based architecture needed to do the job on:

•Network (streets) performance: throughput (fast pedestrians and carriages), small overhead (no need for a citizen to walk the street with a policeman to protect himself), bandwidth (wide streets)

•User-perceived performance: latency (the ability to quickly reach the ground floor of buildings), and completion (getting business done)

•Network efficiency: the best way to be efficient is to avoid using the street too much. Examples are homeworking (differential data) and news or music kiosks, which avoid having music only at the Opera or news only at Le Monde headquarters (cache)

Fielding, likewise, would evaluate an architectural style against the following properties:

1. Scalable: make it possible for Paris to grow
2. Simple: citizens, civil servants and visitors would need to understand the way the city worked without a user manual
3. Modifiable: the ability to evolve in the future through change
4. Extensible: add new neighbourhoods without impacting the system
5. Customizable: specialize a building without impacting others
6. Configurable: easily modify a building (component) after construction (post-deployment)
7. Reusable: a building hosting an accounting firm one day can serve as a creamery the next
8. Visible: to provide the best security and auditability of the system, interactions between components needed to be visible (people should see each other in the street)
9. Portable: the style should work well in other regions, with other materials and weather conditions
10. Reliable: resistant to failure (no single event could stop water, gas or circulation for citizens)

Looking into the challenges he wanted to address in Paris through his architectural style, Haussmann weighted each of these properties for his evaluation criteria. The main objectives appeared to be:

•Low entry barrier: citizens are not forced to live in Paris, and Haussmann wanted to provide them the best possible UX to increase adoption. A citizen needed to be able to simply find an address, and a builder to publish a new reference, allowing for the creation of an initial system very quickly.

•Extensibility: a low entry barrier would help create the modern community Haussmann wanted, and in the long term, Paris needed to be ready for changes in its style to adapt to new technologies.

•Distributed hypermedia: Paris needed to provide citizens with life experience ranging from music (the Opera and kiosks), films (actual theatres), ecommerce (food from Les Halles) and health (parks). All these experiences were rich in content and would attract many citizens, so much so that they needed to be distributed across the city.

•Anarchic scalability: once the first set of new neighbourhoods was in place, the city could grow in one direction or another, at a very large scale, without the need for centralized control (anarchy) to ensure the integrity of the entire system. This required each building to ensure its own authentication, and to be able to inspect incoming traffic through firewalls (double door, concierge).

•Independent deployment: each server (building) or application (neighbourhood) could be deployed independently from the others, without compromising the system. Legacy systems (older neighbourhoods that could/should not be changed, e.g. Notre Dame de Paris) needed to be easily encapsulated to interact with and be part of the entire system.

[Image captions: Paris architecture extended to new technologies, e.g. Metro streaming network – Paris supporting hypermedia, e.g. Pigalle, distributed in various neighbourhoods]

Fowler, Fielding, and Haussmann – Network-based Architectures

Hear Eric Horesnyi speak at the JAX London (Oct. 12–14, 2015). Why is Paris so beautiful, Netflix so scalable and REST now a standard? This is about analyzing the constraints leading to architecture styles in network-based software as well as buildings. Haussmann invented a scalable model for the city, Fielding established the principles of an internet-scale software architecture (REST), and Fowler described in detail how microservices can get an application to massively scale.

Eric Horesnyi was a founding team member at Internet Way (French B2B ISP, sold to UUNET), then Radianz (Global Finance Cloud, sold to BT). He is a High Frequency Trading infrastructure expert, passionate about Fintech and Cleantech. Eric looks after 3 bozons and has worked in San Francisco, NYC, Mexico and now Paris.



Java


by Angelika Langer

Java 8 came with a major addition to the JDK collection framework, namely the Stream API. Similar to collections, streams represent sequences of elements. Collections support operations such as add(), remove(), and contains() that work on a single element. Streams, in contrast, have bulk operations such as forEach(), filter(), map(), and reduce() that access all elements in a sequence. The notion of a Java stream is inspired by functional programming languages, where the corresponding abstraction is typically called a sequence, which also has filter-map-reduce operations. Due to this similarity, Java 8 – at least to some extent – permits a functional programming style in addition to the object-oriented paradigm that it supported all along.
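To make the contrast concrete, here is a minimal filter-map-reduce pipeline (an illustrative sketch, not taken from the tutorial’s benchmark):

import java.util.Arrays;
import java.util.List;

public class StreamIntro {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ada", "Linus", "Grace", "James");

        // One declarative bulk pipeline instead of an element-by-element loop
        int totalLength = names.stream()
                .filter(n -> n.length() > 3)   // keep names longer than 3 characters
                .map(String::length)           // transform each name into its length
                .reduce(0, Integer::sum);      // fold the lengths into a single sum

        System.out.println(totalLength);       // prints 15 (Linus + Grace + James)
    }
}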

Perhaps contrary to widespread belief, the designers of the Java programming language did not extend Java and its JDK to allow functional programming in Java or to turn Java into a hybrid “object-oriented and functional” programming language. The actual motivation for inventing streams for Java was performance or – more precisely – making parallelism more accessible to software developers (see Brian Goetz, State of the Lambda). This goal makes a lot of sense to me, considering the way in which hardware evolves. Our hardware has dozens of CPU cores today and will probably have hundreds some time in the future. In order to effectively utilize the hardware capabilities and thereby achieve state-of-the-art execution performance we must parallelize. After all, what is the point of running a single thread on a multicore platform? At the same time, multithreaded programming is considered hard and error-prone, and rightly so. Streams, which come in two flavours (sequential and parallel streams), are designed to hide the complexity of running multiple threads. Parallel streams make it extremely easy to execute bulk operations in parallel – magically, effortlessly, and in a way that is accessible to every Java developer.

Java performance tutorial – How fast are the Java 8 streams?

Let’s talk speed

Java 8 brought with it a major change to the collection framework in the form of streams. But how well do they really perform?

©istockphoto.com/Viorika


So, let’s talk about performance. How fast are the Java 8 streams? A common expectation is that parallel execution of stream operations is faster than sequential execution with only a single thread. Is it true? Do streams improve performance?

In order to answer questions regarding performance we must measure, that is, run a micro-benchmark. Benchmarking is hard and error-prone, too. You need to perform a proper warm-up, and watch out for all kinds of distorting effects, from optimizations applied by the virtual machine’s JIT compiler (dead code elimination being a notorious one) up to hardware optimizations (such as increasing one core’s CPU frequency if the other cores are idle). In general, benchmark results must be taken with a grain of salt. Every benchmark is an experiment. Its results are context-dependent. Never trust benchmark figures that you haven’t produced yourself in your context on your hardware. This said, let us experiment.
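For reference, a micro-benchmark of this kind would typically be set up with JMH, the OpenJDK benchmark harness, which takes care of warm-up and measurement iterations. The following is only a sketch of such a setup (assuming JMH on the classpath); it is not the harness the figures below were produced with.

import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class MaxBenchmark {
    int[] ints;

    @Setup
    public void setup() {
        // 500,000 random ints, matching the experiment described below
        ints = new Random(42).ints(500_000).toArray();
    }

    @Benchmark
    public int forLoop() {
        int m = Integer.MIN_VALUE;
        for (int i = 0; i < ints.length; i++)
            if (ints[i] > m) m = ints[i];
        return m; // returning the result keeps dead code elimination at bay
    }

    @Benchmark
    public int sequentialStream() {
        return Arrays.stream(ints).reduce(Integer.MIN_VALUE, Math::max);
    }
}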

Comparing streams to loops

First, we want to find out how a stream’s bulk operation compares to a regular, traditional for-loop. Is it worth using streams in the first place (for performance reasons)?

The sequence which we will use for the benchmark is an int-array filled with 500,000 random integral values. In this array we will search for the maximum value. Here is the traditional solution with a for-loop:

int[] a = ints;
int e = ints.length;
int m = Integer.MIN_VALUE;
for (int i = 0; i < e; i++)
    if (a[i] > m) m = a[i];

Here is the solution with a sequential IntStream:

int m = Arrays.stream(ints)
              .reduce(Integer.MIN_VALUE, Math::max);

We measured on outdated hardware (dual core, no dynamic overclocking) with proper warm-up and all it takes to produce halfway reliable benchmark figures. This was the result in that particular context:

int-array, for-loop   : 0.36 ms
int-array, seq. stream: 5.35 ms

The result is sobering: the good old for-loop is 15 times faster than the sequential stream. How disappointing! Years of development effort spent on building streams for Java 8 and then this? But, wait! Before we conclude that streams are abysmally slow, let us see what happens if we replace the int-array by an ArrayList<Integer>. Here is the for-loop:

int m = Integer.MIN_VALUE;
for (int i : myList)
    if (i > m) m = i;

Here is the stream-based solution:

int m = myList.stream()
              .reduce(Integer.MIN_VALUE, Math::max);

These are the results:

ArrayList, for-loop   : 6.55 ms
ArrayList, seq. stream: 8.33 ms

Again, the for-loop is faster than the sequential stream operation, but the difference on an ArrayList is not nearly as significant as it was on an array. Let’s think about it. Why do the results differ that much? There are several aspects to consider.

First, access to array elements is very fast. It is an index-based memory access with no overhead whatsoever. In other words, it is plain down-to-the-metal memory access. Elements in a collection such as ArrayList, on the other hand, are accessed via an iterator, and the iterator inevitably adds overhead. Plus, there is the overhead of boxing and unboxing collection elements, whereas int-arrays use plain primitive ints. Essentially, the measurements for the ArrayList are dominated by the iteration and boxing overhead, whereas the figures for the int-array illustrate the advantage of for-loops.
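To see the boxing cost in isolation, the reduction over the ArrayList can be kept primitive by dropping into an IntStream first. This variant is illustrative only; it was not part of the measurements above.

import java.util.Arrays;
import java.util.List;

public class BoxingDemo {
    public static void main(String[] args) {
        List<Integer> myList = Arrays.asList(3, 1, 4, 1, 5, 9, 2, 6);

        // Boxed: Math::max operates on Integer objects here, unboxing and
        // re-boxing at every reduction step.
        int boxed = myList.stream().reduce(Integer.MIN_VALUE, Math::max);

        // Primitive: unbox once into an IntStream, then reduce on plain ints.
        int unboxed = myList.stream()
                .mapToInt(Integer::intValue)
                .reduce(Integer.MIN_VALUE, Math::max);

        System.out.println(boxed + " " + unboxed); // 9 9
    }
}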

Secondly, had we seriously expected that streams would be faster than plain for-loops? Compilers have 40+ years of experience optimizing loops, and the virtual machine’s JIT compiler is especially apt at optimizing for-loops over arrays with an equal stride like the one in our benchmark. Streams, on the other hand, are a very recent addition to Java, and the JIT compiler does not (yet) perform any particularly sophisticated optimizations on them.

Thirdly, we must keep in mind that we are not doing much with the sequence elements once we have got hold of them. We spend a lot of effort trying to get access to an element and then we don’t do much with it. We just compare two integers, which after JIT compilation is barely more than one assembly instruction.

Workshop: Lambdas and Streams in Java 8

This JAX London workshop led by Angelika Langer is devoted to the stream framework, which is an extension to the JDK collection framework. Streams offer an easy way to parallelize bulk operations on sequences of elements. The Stream API differs from the classic collection API in many ways: it supports a fluent programming style and borrows elements from functional languages.

“Our hardware has dozens of CPU cores today and will probably have hundreds some time in the future.”


For this reason, our benchmarks illustrate the cost of element access – which need not necessarily be a typical situation. The performance figures change substantially if the functionality applied to each element in the sequence is CPU intensive. You will find that there is no measurable difference any more between for-loop and sequential stream if the functionality is heavily CPU bound.

The ultimate conclusion to draw from this benchmark experiment is NOT that streams are always slower than loops. Yes, streams are sometimes slower than loops, but they can also be equally fast; it depends on the circumstances. The point to take home is that sequential streams are no faster than loops. If you use sequential streams then you don’t do it for performance reasons; you do it because you like the functional programming style.

So, where is the performance improvement streams were invented for? So far we have only compared loops to streams. How about parallelization? The point of streams is easy parallelization for better performance.

Comparing sequential streams to parallel streams

As a second experiment, we want to figure out how a sequential stream compares to a parallel stream performance-wise. Are parallel stream operations faster than sequential ones?

We use the same int-array filled with 500,000 integral values. Here is the sequential stream operation:

int m = Arrays.stream(ints)
              .reduce(Integer.MIN_VALUE, Math::max);

This is the parallel stream operation:

int m = Arrays.stream(ints).parallel()
              .reduce(Integer.MIN_VALUE, Math::max);

Our expectation is that parallel execution should be faster than sequential execution. Since the measurements were made on a dual-core platform, parallel execution can be at most twice as fast as sequential execution. Ideally, the ratio of sequential to parallel performance should be 2.0. Naturally, parallel execution does introduce some overhead for splitting the problem, creating subtasks, running them in multiple threads, gathering their partial results, and producing the overall result. The ratio will be less than 2.0, but it should come close. These are the actual benchmark results:

            sequential   parallel   seq./par.
int-array   5.35 ms      3.35 ms    1.60

“The point to take home is that sequential streams are no faster than loops.”

The reality check via our benchmark yields a ratio (sequential/parallel) of only 1.6 instead of 2.0, which illustrates the amount of overhead that is involved in going parallel and how (well or poorly) it is compensated for (on this particular platform).

You might be tempted to generalise these figures and conclude that parallel streams are always faster than sequential streams – perhaps not twice as fast (on dual-core hardware), as one might hope for, but at least faster. However, this is not true. Again, there are numerous aspects that contribute to the performance of a parallel stream operation.

One of them is the splittability of the stream source. An array splits nicely; it just takes an index calculation to figure out the mid element and split the array into halves. There is no overhead and thus barely any cost of splitting. How easily do collections split compared to an array? What does it take to split a binary tree or a linked list? In certain situations you will observe vastly different performance results for different types of collections.

Another aspect is statefulness. Some stream operations maintain state. An example is the distinct() operation. It is an intermediate operation that eliminates duplicates from the input sequence, i.e., it returns an output sequence with distinct elements. In order to decide whether the next element is a duplicate or not, the operation must compare it to all elements it has already encountered. For this purpose it maintains some sort of data structure as its state. If you call distinct() on a parallel stream, its state will be accessed concurrently by multiple worker threads, which requires some form of coordination or synchronisation, which adds overhead, which slows down parallel execution, up to the extent that parallel execution may be significantly slower than sequential execution.
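For illustration, calling the stateful distinct() on a parallel stream looks as harmless as any other pipeline, which is exactly why its hidden coordination cost can surprise (a hypothetical example, not one of the benchmarks above):

import java.util.Arrays;
import java.util.Random;

public class DistinctDemo {
    public static void main(String[] args) {
        // 500,000 values drawn from only 1,000 distinct numbers
        int[] ints = new Random(42).ints(500_000, 0, 1_000).toArray();

        // distinct() must remember every element seen so far; on a parallel
        // stream that shared state has to be coordinated across worker threads.
        long uniques = Arrays.stream(ints)
                .parallel()
                .distinct()
                .count();

        System.out.println(uniques); // at most 1,000
    }
}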

With this in mind it is fair to say that the performance model of streams is not a trivial one. Expecting that parallel stream operations are always faster than sequential stream operations is naive. The performance gain, if any, depends on numerous factors, some of which I briefly mentioned above. If you are familiar with the inner workings of streams you will be capable of coming up with an informed guess regarding the performance of a parallel stream operation. Yet, you need to benchmark a lot in order to find out for a given context whether going parallel is worth doing or not. There are indeed situations in which parallel execution is slower than sequential execution, and blindly using parallel streams in all cases can be downright counter-productive.

The realisation is: yes, parallel stream operations are easy to use and often they run faster than sequential operations, but don’t expect miracles. Also, don’t guess; instead, benchmark a lot.

Angelika Langer works as a trainer and consultant with a course curriculum of Java and C++ seminars. She enjoys speaking at conferences, among them JavaOne, JAX, JFokus, JavaZone and many more. She is author of the online “Java Generics FAQs” and a “Lambda Tutorial & Reference” at www.AngelikaLanger.com.


Java


Geertjan Wielenga is a developer and author at Sun Microsystems and Oracle, working on NetBeans IDE and the NetBeans Platform, a speaker at JavaOne, Devoxx, JAX London and other international software development conferences, a Java and JavaScript enthusiast, and a JavaOne Rock Star.

by Geertjan Wielenga

We can no longer make assumptions about where and how the applications we develop will be used. Where originally HTML, CSS and JavaScript were primarily focused on presenting documents in a nice and friendly way, the utility of the browser has exploded beyond what could ever have been imagined. And, no, it’s not all about multimedia – i.e., no, it’s not all about video and audio and the like. It’s all about full-blown applications that can now be programmed for the browser. Why the browser? Because the browser is everywhere: on your mobile device, on your tablet, on your laptop, and on your desktop computer.

Seen from the perspective of the Java ecosystem, this development is a bit of a blow. All along, we thought the JVM would be victorious, i.e., we thought the “write once, run anywhere” mantra would be exclusively something that we as Java developers could claim to be our terrain. To various extents, of course, that’s still true, especially if you see Android as Java for mobile. Then you could make the argument that on all devices, some semblance of Java is present. The arguments you’d need to make would be slightly complicated by the fact that most of your users don’t actually have Java installed – i.e., they physically need to install it, or your application needs to somehow physically install Java on your user’s device. Whether you’re a Java enthusiast or not, you need to admit that the reach of the browser is far broader and more intuitively present than Java, at this point.

So, how do we deal with this reality? How can you make sure that your next application supports all these different devices, which each have their own specificities and eccentricities? On the simplest level, each device has its own screen size. On a more complex level, not every device needs to enable interaction with your application in the same way. Some of those devices have more problems with battery life than others. Responsive design via CSS may not be enough, simply because CSS hides DOM elements. It does not prevent the loading of resources, meaning that the heavy map technology that you intend for the tablet user is going to be downloaded all the same for the mobile user, even though it will not be shown, thanks to CSS.

Did you know?

Did you know there’s something called “responsive JavaScript”, which is much more powerful than “responsive CSS”? Did you know that there are a number of techniques you can use when creating enterprise-scale JavaScript applications, including modularity via RequireJS? Did you know that AngularJS is not the only answer when it comes to JavaScript application frameworks?

And finally, are you aware of the meaningful roles that Java, especially Java EE, can continue to play in the brave new old world of JavaScript? These questions and concerns will be addressed during my session at JAX London, via a range of small code snippets and examples, i.e., you will certainly see as much code and as many technical tips and tricks as you will see slides. Undoubtedly, you will leave the session with a lot of new insights and questions to consider when starting your next enterprise-scale applications, whether in Java or in JavaScript!

Coding for desktop and mobile with HTML5 and Java EE 7

A world beyond Java

Full-blown applications programmed for your browser – that’s where it’s at, right now, says JAX London speaker Geertjan Wielenga. And this should be of some concern to Java developers out there.


Java


by Werner Keil

As proposed in OpenJDK JEP 222 [1], JShell offers a REPL (Read-Eval-Print Loop) to evaluate declarations, statements and expressions of the Java language, together with an API allowing other applications to leverage its functionality. The idea is not exactly new. BeanShell [2] has existed for over 15 years now, nearly as long as Java itself, not to mention that JVM languages like Scala and Groovy have long featured similar shells.
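A session in an early-access build looks roughly like the following (the exact prompt and feedback wording vary between builds):

jshell> 2 + 2
|  Expression value is: 4
|    assigned to temporary variable $1 of type int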

BeanShell (just like Groovy, by the way) made an attempt at standardisation through the Java Community Process [3] in JSR 274 – a JSR that did not produce any notable output, in spite of the fact that (or perhaps because?) two major companies, Sun and Google, had joined the expert group. Under the JCP.next initiative this JSR was declared “Dormant”.

An eyebrow-raising approach

Adding a new Java feature like this via a JEP, rather than waking up the “Dormant” JSR (which anyone could, including Oracle, who now owns former EG member Sun), raised some eyebrows among JCP EC members. One concern was that, after the JCP had just merged its ME and SE/EE parts into a single body, developing more and more platform features not as JSRs but as JEPs under the OpenJDK would create another rift between ME/SE (JDK) and EE, where most remaining JSRs then resided.

Device I/O [4], derived from an Oracle proprietary predecessor under Java ME, was already developed as an OpenJDK project. Without a JEP, it seems Oracle at least can also ratify such projects without a prior proposal. The farce around JSR 310, which neither produced an actual spec document (mandatory for pretty much all JSRs) nor (according to Co-Spec Lead Stephen Colebourne) comes with a real API similar to other SE platform JSRs like Collections, was another example of where the JSR should have been withdrawn or declared dormant when the JEP started.

Figure 1: JShell arithmetic

JShell, the Java 9 REPL: What does it do?

JEP 222

Among the few truly new features coming in Java 9 (alongside Project Jigsaw’s modularity) is a Java Shell that has recently been confirmed. Java Executive Committee member Werner Keil explains how Java’s new REPL got started and what it’s good for.

©istockphoto.com/dinn


It was just meant to rubber-stamp some JDK part by the EC, without the actual result of the JSR existing outside of the OpenJDK.

Every class has some Javadoc, so that doesn’t really count. Given Oracle’s strong involvement we are likely to see more JEPs under the OpenJDK. And having a transparent open-source effort behind these parts of the Java ecosystem is still better than a closed environment, so even if it may disenfranchise and weaken some of the JCP (and EC), it is better than no open development at all.

Potential uses of the JShell

Having such a shell in Java is certainly not a bad idea. Regardless of its development under Java SE, future versions of Java EE may find a standard shell even more appealing than Java SE. The value for Java ME remains to be seen, especially if down-scaling like Device I/O is even possible. But at the very least, IoT devices running Java SE Embedded should clearly benefit.

Windows PowerShell [5] has become a strong preference for system administration or DevOps, at least on Windows and .NET. On the JVM, a Scala shell is used by the Play Framework for administrative tasks, while Groovy is used for similar purposes by the Spring Framework, or under the hood of the JBoss Admin Shell [6]. Meanwhile, WebLogic Scripting Tool (WLST) emerged from Jython, a Python shell on the JVM. The Java EE Reference Implementation GlassFish has an admin shell called asadmin. Being able to tap into a unified Java shell in future versions could certainly make life easier for many Java-based projects, as well as the products, developers and ops using them.

Other interesting fields of use are domain-specific extensions. Groovy, Scala or other shell-enabled languages (both on the JVM and outside of it) are very popular for business or scientific calculations. Based on early impressions of JShell [7], messages like “assigned to temporary variable $3 of type int” can be quite misleading (Figure 1).

The financial domain, in particular, tends to think of US dollars when it reads “$”, so that still has room for improvement. But almost-natural-language queries have great potential: Google answers questions like “what is 2 plus 2”, and Q&A [8], a pretty NoSQL DB of its time, offered such features ten years before the Java language even started. Instead of simply asking “2+2” questions, users may ask what the temperature in their living room is, when backed by a Smart Home solution. Or, using JSRs like 354, the recently finished Money API [9], questions like “2$ in CHF” or similar would make great sense too. That’s where temporary variables quoting $ amounts would be a bit confusing, but maybe the JDK team behind JShell will find other ways to phrase that.
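To give a flavour of what could sit behind such a query, here is a minimal JSR 354 sketch using the Moneta reference implementation; the actual rate comes from whichever exchange-rate provider is configured, so the printed result is environment-dependent.

import javax.money.MonetaryAmount;
import javax.money.convert.CurrencyConversion;
import javax.money.convert.MonetaryConversions;
import org.javamoney.moneta.Money;

public class TwoDollarsInChf {
    public static void main(String[] args) {
        MonetaryAmount twoDollars = Money.of(2, "USD");
        // Obtain a conversion to Swiss francs from the configured providers
        CurrencyConversion toChf = MonetaryConversions.getConversion("CHF");
        System.out.println(twoDollars.with(toChf)); // amount in CHF at the current rate
    }
}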

Another great example of a Java-powered REPL and expression language for scientific and other arithmetic challenges is Frink [10], named after the weird scientist character in The Simpsons TV series. It answers all sorts of questions, starting from date/time or time zone conversions (which java.time, aka JSR 310, could certainly be used for, too) or currency conversions like:

"600 baht -> USD"

Frink provides many more mathematical and physical formulas, including unit conversion. Based on JSR 363, the upcoming Java Units of Measurement standard [11], this will be possible in a similar way. Groovy co-founder Guillaume Laforge documented a DSL/REPL for Units of Measurement using JSR 275 a while back [12]. Their solution was used in real-life medical research for Malaria treatments. Of course, being written in Java, someone might also simply expose the actual Frink language and system via JShell under Java 9!

Werner Keil is an Agile Coach and Java EE and IoT/Embedded/Real Time expert. Helping Global 500 enterprises across industries and leading IT vendors, he has worked for over 25 years as Program Manager, Coach, SW architect and consultant for the Finance, Mobile, Media, Transport and Public sectors. Werner is an Eclipse and Apache Committer and a JCP member in JSRs like 333 (JCR), 342 (Java EE 7), 354 (Money), 358/364 (JCP.next), Java ME 8, 362 (Portlet 3), 363 (Units, also Spec Lead), 365 (CDI 2), 375 (Java EE Security) and in the Executive Committee.

“The value for Java ME remains to be seen, especially if down-scaling like Device I/O is even possible.”

References

[1] http://openjdk.java.net/jeps/222

[2] http://www.beanshell.org/

[3] http://jcp.org

[4] http://openjdk.java.net/projects/dio/

[5] https://en.wikipedia.org/wiki/Windows_PowerShell

[6] http://teiid.jboss.org/tools/adminshell/

[7] http://blog.takipi.com/java-9-early-access-a-hands-on-session-with-jshell-the-java-repl/

[8] https://en.wikipedia.org/wiki/Q%26A_(Symantec)

[9] http://www.javamoney.org

[10] https://futureboy.us/frinkdocs/

[11] http://unitsofmeasurement.github.io/

[12] https://dzone.com/articles/domain-specific-language-unit-


Databases


by Aviran Mordo

NoSQL is a set of database technologies built to handle massive amounts of data or specific data structures foreign to relational databases. However, the choice to use a NoSQL database is often based on hype, or on the wrong assumption that relational databases cannot perform as well as a NoSQL database. Operational cost is often overlooked by engineers when it comes to selecting a database.

When building a scalable system, we found that an important factor is using proven technology, so that we know how to recover fast if there’s a failure. Pre-existing knowledge and experience with the system and its workings – as well as being able to Google for answers – is critical for swift mitigation. Relational databases have been around for over 40 years, and there is vast industry knowledge of how to use and maintain them. This is one reason we usually default to using a MySQL database instead of a NoSQL database, unless NoSQL is a significantly better solution to the problem.

However, using MySQL in a large-scale system may present performance challenges. To get great performance from MySQL, we employ a few usage patterns. One of these is avoiding database-level transactions. Transactions require that the database maintain locks, which has an adverse effect on performance.

Instead, we use logical application-level transactions, thus reducing the load and extracting high performance from the database. For example, let's think about an invoicing schema. If there's an invoice with multiple line items, instead of writing all the line items in a single transaction, we simply write line by line without any transaction. Once all the lines are written to the database, we write a header record, which has pointers to the line items' IDs. This way, if something fails while writing the individual lines to the database and the header record is never written, the whole logical transaction has failed. A possible downside is that there may be orphan rows in the database. We don't see this as a significant issue, though, as storage is cheap and these rows can be purged later if more space is needed.
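As a sketch, the header-last pattern could look like the following with plain JDBC (auto-commit on, no database transaction). The table, column and type names are illustrative assumptions, not Wix's actual schema:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class InvoiceWriter {

  interface InvoiceLine {
    String getId();
    String getDescription();
    long getAmountCents();
  }

  interface Invoice {
    String getId();
    List<InvoiceLine> getLines();
  }

  // Write each line item as an independent insert, then write the header
  // last. Until the header row exists, readers never see the invoice; if we
  // crash before it is written, the lines are merely purgeable orphan rows.
  public void write(Connection connection, Invoice invoice) throws SQLException {
    List<String> lineIds = new ArrayList<>();
    try (PreparedStatement insertLine = connection.prepareStatement(
        "INSERT INTO invoice_lines (id, description, amount_cents) VALUES (?, ?, ?)")) {
      for (InvoiceLine line : invoice.getLines()) {
        insertLine.setString(1, line.getId());
        insertLine.setString(2, line.getDescription());
        insertLine.setLong(3, line.getAmountCents());
        insertLine.executeUpdate(); // no surrounding database transaction
        lineIds.add(line.getId());
      }
    }
    try (PreparedStatement insertHeader = connection.prepareStatement(
        "INSERT INTO invoices (id, line_ids) VALUES (?, ?)")) {
      insertHeader.setString(1, invoice.getId());
      insertHeader.setString(2, String.join(",", lineIds)); // pointers to the lines
      insertHeader.executeUpdate();
    }
  }
}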

High-performance MySQL usage patterns

Here are some of our other usage patterns to get great performance from MySQL:

• Do not have queries with joins; only query by primary key or index.

• Do not use sequential primary keys (auto-increment), because they introduce locks. Instead, use client-generated keys, such as GUIDs. Also, when you have master-master replication, auto-increment causes conflicts, so you will have to create key ranges for each instance.

• Any field that is not indexed has no right to exist. Instead, we fold such fields into a single text field (JSON is a good choice).

We often use MySQL simply as a key-value store. We store a JSON object in one of the columns, which allows us to extend the schema without making database schema changes. Accessing MySQL by primary key is extremely fast, and we get submillisecond read time by primary key, which is excellent for most use cases. So we found that MySQL is a great NoSQL that's ACID-compliant.
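Pulling those patterns together, a minimal sketch of the key-value usage might look like this – client-generated GUID keys, primary-key-only access and the whole document folded into one JSON column (schema and names are illustrative assumptions):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.UUID;

// Assumed schema: CREATE TABLE profiles (id CHAR(36) PRIMARY KEY, body MEDIUMTEXT NOT NULL);
public class ProfileStore {
  private final Connection connection;

  public ProfileStore(Connection connection) {
    this.connection = connection;
  }

  public String save(String json) throws SQLException {
    String id = UUID.randomUUID().toString(); // client-generated key: no auto-increment locks
    try (PreparedStatement ps = connection.prepareStatement(
        "INSERT INTO profiles (id, body) VALUES (?, ?)")) {
      ps.setString(1, id);
      ps.setString(2, json); // the entire record as JSON: schema changes need no DDL
      ps.executeUpdate();
    }
    return id;
  }

  public String findById(String id) throws SQLException {
    try (PreparedStatement ps = connection.prepareStatement(
        "SELECT body FROM profiles WHERE id = ?")) { // primary-key lookup only, no joins
      ps.setString(1, id);
      try (ResultSet rs = ps.executeQuery()) {
        return rs.next() ? rs.getString(1) : null;
      }
    }
  }
}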

In terms of database size, we found that a single MySQL instance can work perfectly well with hundreds of millions of records. Most of our use cases do not have more than several hundred million records in a single instance. One big advantage to using relational databases as opposed to NoSQL is that you don't need to deal with the eventually consistent nature displayed by most NoSQL databases. Our developers all know relational databases very well, and it makes their lives easy.

Don't get me wrong, there is a place for NoSQL; relational databases have their limits – single-host size and strict data structures. But operational cost is often overlooked by engineers in favour of the cool new thing. If both options are viable, we believe you need to really consider what it takes to maintain each in production and decide accordingly.

Aviran Mordo is the head of back-end engineering at Wix. He has over twenty years of experience in the software industry and has filled many engineering roles and leading positions, from designing and building the US national Electronic Records Archives prototype to building search engine infrastructures.


From 0 to 60 Million Users: Scaling with Microservices and Multi-Cloud Architecture

Hear Aviran Mordo speak at the JAX London: Many small startups build their systems on top of a traditional toolset. These systems are used because they facilitate easy development and fast progress, but many of them are monolithic and have limited scalability. As a startup grows, the team is confronted with the problem of how to evolve and scale the system.


Rethinking how we think about self-service cloud BI

Business intelligence must evolve

Every employee and every end user should have the right to find answers using data analytics. But the current reliance on IT for key information is creating an unnecessary bottleneck, says DataHero's Chris Neumann.

by Chris Neumann

"Self-service" is a term that gets used a lot in the business intelligence (BI) space these days. In reality, data analytics has largely ignored the group of users that really needs self-service, even as that user base has grown. More than ever, people realize the value of data, but non-technical users are still left out of the conversation. While everything from storage to collaboration tools has become simple enough for anyone to download and begin using, BI and data analytics tools still require end users to be experts or to seek the help of experts. That needs to change.

Users should be able to get up and running on data analytics and connect to the services they use most, easily. More employees in every department are expected to make decisions based on their data, but that doesn't mean everyone needs to be a data analyst or data scientist. Business users want to analyse data that lives in the services they use every day – like Google Analytics, HubSpot, Marketo, and Shopify, and even Excel – and they know the questions they need answered. What they need are truly self-service tools to get those answers.

Calls for change

While vendor jargon and the obsession with big data may be clouding the self-service cloud BI conversation, experts and enterprises are recognizing that things need to change. Leading analyst firms like Forrester and Gartner agree that BI must evolve. When business users depend on IT teams to get answers, a bottleneck is created. End users are demanding tools they can use on their own without having to go to IT.

There are a number of vendors connecting to cloud services. But connecting in a way that facilitates effective data analysis presents a myriad of additional challenges, from navigating the sheer variety of formats to categorizing unspecified data.

At DataHero, we've built the requisite connectors for accessing the data within cloud services. We've also taken the next steps with a data classification engine that automates ETL and recognizes that what a cloud service might call "text" is actually an important field. In order to successfully integrate these connections, solutions must automatically normalize the data from disparate services, matching attributes and allowing the data to be combined and analysed. Without automatic normalization and categorization, self-service cloud BI isn't possible.

The whole is greater than the sum of its parts

While self-service cloud BI is already possible, the users are often new to the world of data analytics. That means that the tools, too, must evolve as the users become more sophisticated and new possibilities emerge.

For example, without data analytics, a marketer might log into a Google Analytics dashboard, then MailChimp, then Salesforce to take the pulse of a marketing campaign. Each service provides its own value, but when they are combined, the marketer can use a common attribute, like email address, to create a third dataset. What comes out of that is a much purer answer to the marketer's question: "How successful is my campaign?"

Google Analytics, MailChimp and Salesforce are a common combination, but there are many combinations that may be just as valuable and have yet to be explored. With the proliferation of cloud applications, the possibilities are nearly endless.

The new users of BI and data analytics have also never had the opportunity to work with one another. To continue with the example, a marketer may have created the charts needed to monitor KPIs and put them into a dashboard, but these KPIs need to be shared with internal teams, clients and executives. Reporting is normally a one-way process, when it should be iterative and collaborative, allowing clients and executives to provide real feedback on the most up-to-date numbers.

The consumerization of BI

BI and data analytics have largely missed the consumerization of IT trend, despite industry-wide use of the term self-service. That doesn't mean that change isn't coming. The shift to the cloud continues to accelerate, and the emerging self-service cloud BI space is quickly heating up, driven by user demand and a need to decouple analytics from IT.

Chris Neumann is the founder and Chief Product Officer of DataHero, where he aims to help everyone unmask the clues in their data. Previously he was the first employee at Aster Data Systems; he describes himself as a data-analytics junkie, a bona fide techie and a self-proclaimed foodie.



www.jaxlondon.com

Business Design Centre, London
October 12th – 14th, 2015

In London since 2010

Follow us: @JAXLondon

The Enterprise Conference on Java, Web & Mobile, Developer Practices, Agility and Big Data!

The Conference for Java & Software Innovation


Group discount: save 30%


jaxlondon.com

October 12th – 14th, 2015
Business Design Centre, London

JAX London provides a three-day conference experience for cutting-edge software engineers and enterprise-level professionals, with attendees from across the globe. JAX brings together the world's leading Java and JVM experts as well as many innovators in the fields of Microservices, Continuous Delivery and DevOps to share their knowledge and experience. In the spirit of agile methodology and lean business, JAX London is the place to define the next level of ultra-efficient and super-adaptive technology for your organization.

Learn how to increase your productivity, identify which technologies and practices suit your specific requirements, and learn about new approaches. Monday is a pre-conference workshop and tutorial day; the half-day and full-day workshops are overseen by experts. On Tuesday and Wednesday the conference proper takes place – with more than 60 technical sessions, keynotes, the JAX Expo, community events and more. For more information and the latest news about the conference and our speakers, check out www.jaxlondon.com.

Join us for JAX London 2015

Keynotes

Jeff Sussna (Ingineering.IT)
Jeff Sussna is Founder and Principal of Ingineering.IT, a Minneapolis technology consulting firm that helps enterprises and Software-as-a-Service companies adopt 21st-century IT tools and practices. Jeff has nearly 25 years of IT experience. He has led high-performance teams across the Development/QA/Operations spectrum, and has a track record of driving quality improvements through practical innovation. Jeff has done work for a diverse range of companies, including Fortune 500 enterprises, major technology companies, software product and service startups, and media conglomerates. Jeff combines engineering expertise with the ability to bridge business, creative, and technical perspectives. He has the insight and experience to uncover problems and solutions others miss, and is a highly sought-after speaker and writer respected for his insights on topics such as Agile, DevOps, Service Design, and cloud computing. Jeff's interests focus on the intersection of development, operations, design, and business. He is the author of "Designing Delivery: Rethinking IT in the Digital Service Economy", which explores the relationship between IT and business in the 21st century and presents a unified approach to designing and operating responsive digital services.

From Design Thinking to DevOps and Back Again: Unifying Design and Operations
The era of digital service is shifting customers' brand expectations from stability to responsiveness. Optimizing delivery speed is only half of this new equation. Companies also need to optimize their ability to listen and to act on what they hear. In order to maximize both velocity and responsiveness, companies need to transform up-front design into a continuous, circular design-operations loop that unifies marketing, design, development, operations, and support.

Adrian Colyer (Accel Partners)
Adrian is a Venture Partner with Accel Partners in London, and the author of "The Morning Paper," where he reviews an interesting CS-related paper every weekday. He's also an advisor to ClusterHQ, Skipjaq, and Weaveworks. Previously Adrian served in CTO roles at Pivotal, VMware, and SpringSource. Adrian's extensive open source experience includes working with the teams that created the Spring Framework and related Spring projects, Cloud Foundry, RabbitMQ, Redis, Groovy, Grails, and AspectJ, as well as with team members making significant contributions to Apache Tomcat and Apache HTTP Server.

VC from the inside – a techie's perspective
After many years in CTO roles with SpringSource, VMware, and Pivotal, and having experienced what it is like to work in a VC-backed company, in June of 2014 Adrian switched sides and joined the venture capital firm Accel Partners in London. So what exactly does a technologist do inside a venture capital firm? And having been part of the process from the inside, how do investment decisions get made? In this talk Adrian will share some of the lessons he's learned since embedding in the world of venture capital, and how you can maximise your chances of investment and a successful company-building partnership.

Rachel Davies (Unruly)
Rachel Davies coaches product development teams at Unruly (tech.unruly.co) in London. She is the author of "Agile Coaching" and an invited speaker at industry events around the globe. Her mission is to create workplaces where developers enjoy delivering valuable software. Rachel is a strong advocate of XP approaches and an organiser of the Extreme Programmers London meet-up.

The Art of Shifting Perspectives
Developers love writing code, but to build resilient industry-scale systems we often need to persuade others to make changes to both code and working practices. As a coach, my job is to help developers spot areas for improvement and act on their ideas. Core to this work is opening up different 'ways of seeing' the work that lies ahead.

In this new talk, I will share some stories of changes the teams I work with have made, and explain some of the mechanisms we applied to make those changes. The teams I work with at Unruly use eXtreme Programming (XP) techniques to build our systems. Modern XP has many counter-intuitive practices – such as mob and pair programming. How did new ways of seeing old problems help us resolve them?

Come along to this talk to hear about some practical techniques you can use to help solve tricky problems and get others on board with your idea by shifting perspective.


jaxlondon.com

October 12th – 14th, 2015
Business Design Centre, London

Timetable

Monday – October 12th

09:00 – 17:00 Design & Implementation of Microservices James Lewis

Designing and Operating User-Centered Digital Services Jeff Sussna

Workshop: Lambdas and Streams in Java 8 Angelika Langer, Klaus Kreft

Workshop: Crafting Code Sandro Mancuso

Workshop on Low Latency logging and replay Peter Lawrey

Tuesday – October 13th

09:00 – 10:00 KEYNOTE: From Design Thinking to DevOps and Back Again: Unifying Design and Operations Jeff Sussna

10:15 – 11:05 Benchmarking: You’re Doing It Wrong Aysylu Greenberg

The Performance Model of Streams in Java 8 Angelika Langer

Open Source workflows with BPMN 2.0, Java and Camunda BPM Niall Deehan

DevOps, what should you decide, when, why & how Vinita Rathi

11:40 – 12:10 Java Generics: Past, Present and Future Richard Warburton, Raoul-Gabriel Urma

11:40 – 12:30 Smoothing the continuous delivery path – a tale of two teams Lyndsay Prewer

14:30 – 15:20 2000 Lines of Java or 50 Lines of SQL? The Choice is Yours Lukas Eder

From 0 to 60 Million Users: Scaling with Microservices and Multi-Cloud Architecture Aviran Mordo

How to defeat feature gluttony? Kasia Mrowca

15:50 – 16:40 Costs of the Cult of Expertise Jessica Rose

Cluster your Application using CDI and JCache Jonathan Gallimore

Distributed Systems in one Lesson Tim Berglund

Garbage Collection Pause Times Angelika Langer

Technology Innovation Diffusion Jeremy Deane

17:10 – 18:00 Continuous delivery – the missing parts Paul Stack

Pragmatic Functional Refactoring with Java 8 Richard Warburton, Raoul-Gabriel Urma

Preparing your API Strategy for IoT Per Buer

Use your type system; write less code Samir Talwar

A pattern language for microservices Chris Richardson

18:15 – 18:45 All Change! How the new Economics of Cloud will make you think differently about Java Steve Poole, Chris Bailey

Le Mort du Product Management Nigel Runnels-Moss

20:00 – 21:00 KEYNOTE: VC from the inside – a techie's perspective Adrian Colyer

Wednesday – October 14th
09:00 – 09:45 KEYNOTE: The Art of Shifting Perspectives Rachel Davies

10:00 – 10:50 Advanced A/B Testing Aviran Mordo

Architectural Resiliency Jeremy Deane

Cassandra and Spark Tim Berglund

Lambdas Puzzler Peter Lawrey

11:20 – 12:10 Coding for Desktop and Mobile with HTML5 and Java EE 7 Geertjan Wielenga

Intuitions for Scaling Data-Centric Architectures Benjamin Stopford

Microservices: From dream to reality in an hour Dr. Holly Cummins

12:20 – 13:10 Does TDD really lead to good design? Sandro Mancuso

DevOps and the Cloud: All Hail the (Developer) King! Daniel Bryant, Steve Poole

Fowler, Fielding, and Haussmann – Network-based Architectures Eric Horesnyi

Java vs. JavaScript for Enterprise Web Applications Chris Bailey

15:30 – 16:20 The Dark Side of Software Metrics Nigel Runnels-Moss

The Unit Test is dead. Long live the Unit Test! Colin Vipurs

Events on the outside, on the inside and at the core Chris Richardson

Architecting for a Scalable Enterprise John Davies

Page 21: Performance in an API-driven world - JAXenter · 2018-06-27 · Performance in an API-driven world And why REST APIs are changing the game The speed of Java 8 lambdas ... MySQL is

October 12th – 14th, 2015
Business Design Centre, London

Workshop on Low Latency logging and replay
Peter Lawrey (Higher Frequency Trading Ltd)

A workshop for beginners to advanced developers on how to write and read data efficiently in Java. The workshop will cover the following:
• An advanced review of how the JVM really uses memory: what references are, what compressed OOPs are, and how the fields in an object are laid out
• Using Maven to build a project using Chronicle: setting up a simple Maven project, using modules from Maven Central, assembling a Maven build
• How memory-mapped files work on Windows and Linux: storing data in a memory-mapped file, sharing data between JVMs via memory-mapped files
• What Unsafe is and how it works: using Unsafe to see the contents of an object in memory and to access native memory
• Writing and reading data to a Chronicle Queue, using raw bytes and using a wire format
• Designing a system with low-latency persisted IPC: a simple order-matching system example
Advanced content will be added to the early sessions to keep advanced users interested, and the later topics will have pre-built working examples to build on.

Workshop: Crafting Code
Sandro Mancuso (Codurance)

This course is designed to help developers write well-crafted code – code that is clean, testable, maintainable, and an expression of the business domain. The course is entirely hands-on, designed to teach developers practical techniques they can immediately apply to real-world projects. Software Craftsmanship is at the heart of this course: throughout, you will learn about the Software Craftsmanship attitude to development and how to apply it to your workplace. Writing clean code is difficult; cleaning existing code, even more so. You should attend if you want to:
• Write clean code that is easy to understand and maintain
• Become more proficient in Test-Driven Development (TDD), using tests to design and build your code base
• Focus your tests and production code according to business requirements using Outside-In TDD (a.k.a. the London School of TDD)
Clean code necessitates good design. In the process of driving your code through tests, you will learn how to understand the design principles that lead to clean code, and how to avoid over-engineering and large rewrites by incrementally evolving your design using tests. Once you have an understanding of the principles at work, we will apply them to legacy code to help you gain confidence in improving legacy projects through testing, refactoring and redesigning. The content covers: the TDD lifecycle and the Outside-In style of TDD; writing unit tests that express intent, not implementation; using unit tests as a tool to drive good design; expressive code; and testing and refactoring legacy code.

Workshop: Lambdas and Streams in Java 8
Angelika Langer (Angelika Langer Training/Consulting), Klaus Kreft

This workshop is devoted to the stream framework, which is an extension to the JDK collection framework. Streams offer an easy way to parallelize bulk operations on sequences of elements. The stream API differs from the classic collection API in many ways: it supports a fluent programming style and borrows elements from functional languages. For instance, streams have operations such as filter, map, and reduce. The new language features of lambda expressions and method references have been added to Java for effective and convenient use of the stream API. In this workshop we will introduce lambda expressions and method/constructor references, give an overview of the stream operations, and discuss the performance characteristics of sequential vs. parallel stream operations. Attendants are encouraged to bring their notebooks. We will not only explore the novelties in theory, but intend to provide enough information to allow for hands-on experiments with lambdas and streams.

Designing and Operating User-Centered Digital Services
Jeff Sussna (Ingineering.IT)

With software eating the world, 21st-century business increasingly depends on IT, not just for operational efficiency, but for its very existence. In a highly disruptive service economy, IT-driven businesses must continually adapt to ever-changing customer needs and market demands. To power the adaptive organization, IT needs to become a medium for continuous, empathic customer conversations. This workshop teaches participants how to design and operate systems and organizations that help businesses create value through customer empathy. It introduces them to the theory and practice of Continuous Design, a cross-functional practice that interconnects marketing, design, development, and operations into a circular design/operations loop. Participants learn how to:
• align software designs with operational, business, and customer needs
• maximize quality throughout the design, development, and operations lifecycle
• create highly resilient and adaptable systems, practices, and organizations
The workshop takes place in two sessions. Morning – Introduction to Continuous Design: this session introduces the principles of Continuous Design. It grounds those principles in the historical, philosophical, and economic underpinnings that link methodologies such as Design Thinking, Agile, DevOps, and Lean. By providing a strong theoretical grounding in new ways of knowing, this session gives participants the ability to evaluate the effectiveness of specific tools and practices, and to continually adapt them to meet their own needs and constraints. Afternoon – Applying Continuous Design: this session introduces a concrete methodology for applying Continuous Design to real-world problems.

Design & Implementation of Microservices
James Lewis (ThoughtWorks)

Microservices Architecture is a concept that aims to decouple a solution by decomposing functionality into discrete services. Microservice architectures can lead to easier-to-change, more maintainable systems which can be more secure, performant and stable. In this workshop you will discover a consistent and reinforcing set of tools and practices, rooted in the philosophy of small and simple, that can help you move towards a microservice architecture in your own organisation: small services, communicating via the web's uniform interface, with single responsibilities and installed as well-behaved operating system services. However, with these finer-grained systems come new sources of complexity. What you will learn: during this workshop you will understand in more depth what the benefits of finer-grained architectures are, how to break apart your existing monolithic applications, and what the practical concerns of managing these systems are. We will discuss how to ensure your systems can be made more stable, how to handle security, and how to handle the additional complexity of monitoring and deployment. We will cover the following topics: principle-driven evolutionary architecture; capability modelling and the town planning metaphor; REST, web integration and event-driven systems of systems; microservices, versioning, consumer-driven contracts and Postel's law. Who should attend: developers, architects, technical leaders, operations engineers and anybody interested in the design and architecture of services and components.

JAX London Workshop Day

jaxlondon.com


Database solutions

The future of cloud computing

The cloud has changed everything. And yet the cloud revolution at the heart of IT is only getting started. As data becomes more and more important, we're beginning to realise how central a role the database will play in future.

by Zigmars Raascevskis

Cloud computing engines today allow businesses to easily extend their IT infrastructure at any time. This means that you can rent servers with only a few clicks, and various software stacks including web servers, middleware and databases can be installed and run on these server instances with little-to-no effort. With data continuing to aggregate at a rapid pace, the database is becoming a large part of this infrastructure. By leveraging conventional cloud computing, every business can run its own database stack in the cloud the same way as if it were on-premise.

There's still a huge amount of potential to accelerate speed and efficiency by using a multi-tenant database. For multi-tenant distributed databases, a certain number of servers in a cloud footprint are set aside for managing databases, but these resources are shared by many users. This opens up the possibility of improving the speed and efficiency of the IT infrastructure within organizations. A combined database footprint has massive resources and the ability to parallelize a much wider range of requests than users with their own dedicated servers. Such a setup allows faster run time and avoids the painful sizing and provisioning process associated with on-premise infrastructure and traditional cloud computing. So what should businesses look for when selecting a database solution? A multi-tenant database solution is worth considering, given that it can help overcome the following challenges.




I – Failure tolerance of distributed systems

By design, distributed systems with state replication are resistant to most forms of single-machine failure. Guarding against single-machine hardware failures is relatively straightforward: with the distributed database design, every database is hosted on multiple machines that replicate each partition several times. Therefore, in the case of server failure, the system routes traffic to healthy replicas and makes sure that data is replicated elsewhere – ensuring higher availability. Making distributed systems tolerant of software failures, however, is a much more difficult challenge, due to common cause. The ultimate power of distributed systems comes from parallelism, but this also means that the same code is executed on every server participating in fulfilling a request. If working on a particular request causes a fatal failure that has a negative impact on the operation of a system, or even crashes it, the entire cluster is immediately affected.

Sophisticated methods are necessary to avoid such correlated failures, which may be rare but have devastating effects. One method involves trying each query on a few isolated computational nodes before sending it down to the entire cluster with massive parallelism. Once failures are observed in the sandbox, suspicious requests are immediately quarantined and isolated from the rest of the system.
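A toy sketch of that canary idea is below; Node, Query and Result are assumed abstractions standing in for the database's internals, and a real system would add timeouts and a persistent quarantine list:

import java.util.List;
import java.util.concurrent.RejectedExecutionException;
import java.util.stream.Collectors;

public class CanaryExecutor {
  interface Query {}
  interface Result {}
  interface Node {
    Result run(Query query);
  }

  private final List<Node> canaries; // small, isolated subset of the cluster
  private final List<Node> cluster;  // the remaining nodes

  public CanaryExecutor(List<Node> canaries, List<Node> cluster) {
    this.canaries = canaries;
    this.cluster = cluster;
  }

  public List<Result> execute(Query query) {
    for (Node canary : canaries) {
      try {
        canary.run(query); // sandboxed trial run
      } catch (Throwable t) {
        // Quarantine: the suspicious query never reaches the full cluster,
        // so a crashing request cannot take every replica down at once.
        throw new RejectedExecutionException("query quarantined", t);
      }
    }
    // Only proven-harmless queries enjoy the full parallelism of the cluster.
    return cluster.parallelStream()
        .map(node -> node.run(query))
        .collect(Collectors.toList());
  }
}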

II – Performance guarantees in a multi-tenant environment

Another common problem that often manifests itself in public clouds is the "noisy neighbour" issue. When many users share computational resources, it is important to ensure that they are prioritized and isolated properly, so that sudden changes in the behaviour of one user do not have an adverse impact on another. A common approach for computing engines has been isolation of resources into containers. This requires giving each user a certain sized box that it cannot break out from – providing a level of isolation – however, it's not flexible in terms of giving users enough resources exactly when they need them. Effective workload scheduling, low-level resource prioritization and isolation are key techniques for achieving predictable performance.

A multi-tenant database software stack actually provides more opportunities to share and prioritize resources dynamically while providing performance guarantees. This is possible because the database software can manage access to critical resources, like a CPU core or a spinning disk, through a queue of the requests that are accessing the resource. The provisioning process ensures that there are enough aggregated resources in the cluster. In the case that some user behaves unpredictably, however, the software stack is able to control the queues and can make sure that only the offender is affected; other users whose resource usage patterns are unchanged remain unaffected. Additionally, management of request queues can ensure, through prioritization in picking the next request from the queue, that end-user latency metrics are optimised.
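As an illustration (all names are assumptions), a per-tenant queue in front of a critical resource might look like the sketch below – an unpredictable tenant can only ever lengthen its own backlog:

import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class ResourceScheduler {
  private final Map<String, BlockingQueue<Runnable>> queuesByTenant = new ConcurrentHashMap<>();
  private final int perTenantLimit;

  public ResourceScheduler(int perTenantLimit) {
    this.perTenantLimit = perTenantLimit;
  }

  public boolean submit(String tenant, Runnable work) {
    BlockingQueue<Runnable> queue = queuesByTenant.computeIfAbsent(
        tenant, t -> new LinkedBlockingQueue<>(perTenantLimit));
    return queue.offer(work); // false = only this tenant is throttled
  }

  // A single worker owning the resource (a CPU core, a disk) drains the
  // tenants round-robin, keeping latency predictable for well-behaved users.
  public void drainOnce() {
    for (BlockingQueue<Runnable> queue : queuesByTenant.values()) {
      Runnable next = queue.poll();
      if (next != null) {
        next.run();
      }
    }
  }
}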

III – ACID-compliant transactions: A NoSQL challenge

Another obstacle for massively parallel distributed systems has been consistency guarantees. For NoSQL distributed databases, ensuring transactional consistency and ACID properties has been a real problem. This is due to the fact that with a distributed database, many nodes have to be involved in processing a transaction, and it is not obvious how to act in cases of failure. Plus, the state of the cluster has to be synchronized to ensure consistency, which presents high overheads in a highly distributed environment.

Instead of compromising performance or consistency, investment needs to be made to make database software scale while preserving consistency. For example, transactional consistency can be managed through the use of a transaction log, which can in turn be distributed and replicated for high throughput and durability.
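A minimal sketch of that idea: a totally ordered log whose entries are acknowledged only once a majority of replicas have stored them. Replica is an assumed abstraction; fsync, leader election and retries are deliberately elided:

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

public class TransactionLog {
  interface Replica {
    boolean store(long sequence, byte[] entry);
  }

  private final List<Replica> replicas = new CopyOnWriteArrayList<>();
  private final AtomicLong sequence = new AtomicLong();

  public void addReplica(Replica replica) {
    replicas.add(replica);
  }

  public long append(byte[] entry) {
    long seq = sequence.incrementAndGet(); // a total order over all transactions
    int stored = 0;
    for (Replica replica : replicas) {
      if (replica.store(seq, entry)) {
        stored++;
      }
    }
    // A majority quorum gives durability without waiting for every replica.
    if (stored < replicas.size() / 2 + 1) {
      throw new IllegalStateException("transaction " + seq + " is not durable");
    }
    return seq;
  }
}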

Distributed databases can serve as a solid foundation for distributed computing that is massively parallel and instantly scalable. In this respect, NoSQL technologies and their community can leverage this trend to contribute to the architecture of a "future computer". By understanding the benefits of a multi-tenant system and adopting the appropriate solutions, organizations can experience instant scalability and massive parallelism within their own data infrastructures.

"Distributed databases can serve as a solid foundation for distributed computing that is massively parallel and instantly scalable."

Zigmars Raascevskis left a senior engineering position at Google to join Clusterpoint as the company CEO, foreseeing that document-oriented databases would take the market by storm. Prior to joining Clusterpoint, Zigmars worked for 8 years at Google, where among other projects he managed the web search backend software engineering team in Zurich. Before his Google career, Zigmars worked for Exigen, a leading regional IT company, and Lursoft, a leading regional information subscription service company.


Dos and Don'ts

Testing the Database Layer

There's one thing we can agree on when it comes to database tests: they ain't easy. Testing guru and JAX London speaker Colin Vipurs runs through the strengths and weaknesses of common approaches to testing databases.

by Colin Vipurs

Over my many years of software development I've had to perform various levels of testing against many different database instances and types, including RDBMS and NoSQL, and one thing remains constant – it's hard. There are a few approaches that can be taken when testing the database layer of your code and I'd like to go over a few of them, pointing out the strengths and weaknesses of each.

Mocking

This is a technique that I have used in the past but I would highly recommend against doing now. In my book "Tests Need Love Too" I discuss why you should never mock any third-party interface, but just in case you haven't read it (you really should!) I'll go over it again.

As with mocking any code you don't own, what you're validating is that you're calling the third-party code in the way you think you should, but – and here's the important part – this might be incorrect. Unless you have higher-level tests covering your code, you're not going to know until it hits production. In addition to this, mocking raw JDBC is hard, like really hard. Take for example the test code snippet in Listing 1.

Within this test, not only is there a huge number of expectations to set up, but in order to verify that all the calls happen in the correct order, jMock "states" are used extensively. Because of the way JDBC works, this test also violates the guideline of never having mocks return mocks, and in fact goes several levels deep! Even if you manage to get all of this working, something as simple as a typo in your SQL can mean that although your tests are green, your code will still fail when it goes to production.

A final note on mocking – no sane developer these days would be using raw JDBC, but rather one of the higher-level abstractions available, and the same rules apply to these. Imagine a suite of tests set up to mock against JDBC when your code switches to Spring JdbcTemplate, jOOQ or Hibernate. Your tests will now have to be rewritten to mock against those frameworks instead – not an ideal solution.

Testing Against a Real Database

It may sound silly, but the best way to verify that your database interaction code works as expected is to actually have it interact with a database!

Listing 1

@Test
public void testJdbc() {
  final Connection connection = context.mock(Connection.class);
  final ResultSet resultSet = context.mock(ResultSet.class);
  final PreparedStatement preparedStatement = context.mock(PreparedStatement.class);
  final States query = context.states("query").startsAs("pre-prepare");

  context.checking(new Expectations() {{
    oneOf(connection).prepareStatement("SELECT firstname, lastname, occupation FROM users");
      then(query.is("prepared"));
      will(returnValue(preparedStatement));
    oneOf(preparedStatement).executeQuery();
      when(query.is("prepared"));
      then(query.is("executed"));
      will(returnValue(resultSet));
    oneOf(resultSet).next();
      when(query.is("executed"));
      then(query.is("available"));
    oneOf(resultSet).getString(1);
      when(query.is("available"));
      will(returnValue("Hermes"));
    oneOf(resultSet).getString(2);
      when(query.is("available"));
      will(returnValue("Conrad"));
    oneOf(resultSet).getString(3);
      when(query.is("available"));
      will(returnValue("Bureaucrat"));
    oneOf(resultSet).close();
      when(query.is("available"));
    oneOf(preparedStatement).close();
      when(query.is("available"));
  }});
}



As well as ensuring you're using your chosen API correctly, this technique can verify things that mocking never can – for example, that your SQL is syntactically correct and does what you hope.

In-Memory Databases: One of the easiest and quickest ways to get set up with a database to test against is to use one of the in-memory versions available, e.g. H2, HSQL or Derby. If you're happy introducing a Spring dependency into your code, then the test setup can be as easy as this (Listing 2).

This code will create an instance of the H2 database, load the schema defined in schema.sql and load any test data in test-data.sql. The returned object implements javax.sql.DataSource, so it can be injected directly into any class that requires it.

One of the great benefits of this approach is that it is fast. You can spin up a new database instance for each and every test requiring it, giving you a cast-iron guarantee that the data is clean. You also don't need any extra infrastructure on your development machine, as it's all done within the JVM. This mechanism isn't without its drawbacks though.

Unless you're deploying against the same in-memory database that you're using in your tests, inevitably you will run up against compatibility issues that won't surface until you hit higher-level testing or, god forbid, production. Because you're using a different DataSource to your production instance, it can be easy to miss configuration options required to make the driver operate correctly. Recently I came across such a setup where H2 was configured to use a DATETIME column requiring millisecond precision. The same schema definition was used on a production MySQL instance, which not only required this to be DATETIME(3) but also needed the useFractionalSeconds=true parameter provided to the driver. This issue was only spotted after the tests were migrated from using H2 to a real MySQL instance.
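Spelled out, the mismatch looks something like this – the production schema needs the explicit precision and the MySQL driver of that era needs its flag, neither of which H2 requires, which is exactly why the tests stayed green (database name and credentials are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class MySqlMillisConfig {
  // H2 (tests):         CREATE TABLE events (created DATETIME);
  // MySQL (production): CREATE TABLE events (created DATETIME(3));
  public static Connection connect() throws SQLException {
    return DriverManager.getConnection(
        "jdbc:mysql://localhost:3306/app?useFractionalSeconds=true",
        "app", "secret");
  }
}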

Real Databases: Where possible I would highly recommend testing against a database that's as close as possible to the one being run in your production environment. A variety of factors can make this difficult or even impossible, such as commercial databases requiring a license fee, meaning that installing on each and every developer machine is prohibitively costly. A classic way to get around this problem is to have a single development database available for everyone to connect to. This in itself can cause a different set of problems, not least of which are performance (these always seem to get installed on the cheapest and oldest hardware) and test repeatability. The issue with sharing a database with other developers is that two or more people running the tests at the same time can lead to inconsistent results and data shifting in unexpected ways. As the number of people using the database grows, this problem gets worse – throw the CI server into the mix and you can waste a lot of time re-running tests and trying to find out if anyone else is running tests right now in order to get a clean build.

If you're running a "free" database such as MySQL or one of the many free NoSQL options, installing on your local development machine can still be problematic – issues such as needing to run multiple versions concurrently, or keeping everyone informed of exactly what infrastructure needs to be up and what ports it needs to be bound to. This model also requires the software to be up and running prior to performing a build, making onboarding staff onto a new project more time consuming than it needs to be.

Thankfully, over the last few years several tools have appeared to ease this, the most notable being Vagrant and Docker. As an example, spinning up a local version of MySQL in Docker can be as easy as issuing the following command:

$ docker run -p 3306:3306 -e MYSQL_ROOT_PASSWORD=bob mysql

This will start up a self-contained version of the latest MySQL mapped to the local port of 3306, using the root password provided. Even on my 4-year-old MacBook Pro, after the initial image download, this only takes 12 seconds. If you need Redis 2.8 running as well, you can tell Docker to do that too:

$ docker run -p 6389:6379 redis:2.8

Or the latest version running on a different local port:

$ docker run -p 6390:6379 redis:latest

This can be easily plugged into your build system to make the whole process automated, meaning the only software your developers need on the local machine is Docker (or Vagrant), and the infrastructure required for the build can be packaged into the build script!

Listing 2

public class EmbeddedDatabaseTest {
  private DataSource dataSource;

  @Before
  public void createDatabase() {
    dataSource = new EmbeddedDatabaseBuilder().
        setType(EmbeddedDatabaseType.H2).
        addScript("schema.sql").
        addScript("test-data.sql").
        build();
  }

  @Test
  public void aTestRequiringADataSource() {
    // execute code using DataSource
  }
}

The Unit Test is dead. Long live the Unit Test!

Hear Colin Vipurs speak at the JAX London: Unit tests are the lifeblood of any modern development practice, helping developers not only ensure the robustness of their code but also speed up the development cycle by providing fast feedback on code changes. In reality this isn't always the case, and even with the most diligent of refactorings applied, unit tests can actually become a hindrance to getting the job done effectively.



Testing Approach: Now you have your database up and running, the question becomes "how should I test?". Depending on what you're doing, the answer will vary. A greenfield project might see a relational schema changing rapidly in the early stages, whereas an established project will care more about reading existing data. Is the data transient or long-lived? Most* applications making use of Redis would be doing so with it acting like a cache, so you need to worry less about reading existing data.

* Most, not all. I’ve worked with a fair few systems where Redis is the primary data store.

The first thing to note is that for functional tests the best thing to do is start with a clean, empty database. Repeatability is key, and an empty database is a surefire way to ensure this. My preference is for the test itself to take care of this, purging all data at the beginning of the test, not the end. In the event of a test failure, having the database still populated is an easy way to diagnose problems. Cleaning up state at the end of the test leaves you no trace, and as long as every test follows this pattern you're all good.
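A minimal sketch of the purge-first pattern (table names and the connection wiring are assumptions):

import java.sql.Connection;
import java.sql.Statement;
import org.junit.Before;

public class UserRepositoryTest {
  private Connection connection; // wired up elsewhere, e.g. in a base class

  @Before
  public void purgeDatabase() throws Exception {
    try (Statement statement = connection.createStatement()) {
      // Clean at the START of the test, never at the end – after a failure
      // the still-populated database is the easiest way to see what went wrong.
      statement.execute("DELETE FROM invoice_lines");
      statement.execute("DELETE FROM invoices");
      statement.execute("DELETE FROM users");
    }
  }
}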

A popular technique for seeding test data is to use a tool like DbUnit, which lets you express your data in files and have it easily loaded. I have two problems with this; the first is that if you're using a relational database, there is duplication between the DB schema itself and the test data. Not only does a schema change require changing the dataset file(s), but the test data is no longer in the test class itself, meaning a context switch between tests and data. For an example of a DbUnit XML file, see Listing 3.

One question I usually hear from newcomers to DB testing is whether they should round-trip the data or poke the database directly for verification. Round-tripping is an important part of the testing cycle, as you really need to know that the data you're writing can be read back. An issue with this, though, is that you're essentially testing two things at once, so if there is a failure on one side it can be hard to determine what it is. If you're using TDD (of course you are) then tackling the problem will likely feel very uncomfortable, as the time between red and green can be quite high and you won't be getting the fast feedback you're used to.

The technique I have adopted is a hybrid approach that lets me get the best of both approaches while mostly avoiding the drawbacks of each. The first test I write will be a pure read test that inserts the data by hand within the test itself. Although this seems like duplication – and it is, a little bit – the test code bypasses any logic the write path might apply. For example, an insert that has an "ON DUPLICATE KEY" clause will not exercise it; the test assumes the record does not exist, as it is in complete control of the state of the data. The test then uses the production code to read back what the test has inserted and, presto, the read path is verified. An example of a read test can be seen in Listing 4.

Once the read path is green, the write tests will round-trip the data using production code for both writing and reading. Because the read path is known to be good, there is only the write path to worry about. A failure on the read path at some point in the future will cause both sets of tests to fail, but a failure only on the write path helps isolate where the problem is. In addition, if you're using a test DSL for verifying the read path, it can be reused here to save you time writing those pesky assertions! An example of a round-trip test can be seen in Listing 5.

Listing 3

<dataset>
  <USER FIRST_NAME="John" SURNAME="Smith" DOB="19750629"/>
  <USER FIRST_NAME="Jane" SURNAME="Doe" DOB="19780222"/>
</dataset>

Listing 4

def "existing user can be read"() { given: sql.execute('INSERT INTO users (id, name) VALUES (1234, "John Smith")') when: def actualUser = users.findById(1234) then: actualUser.id == 1234 actualUser.name == 'John Smith'}

Listing 5

def "new user can be stored"() { given: def newUser = new User(1234, "John Smith") when: users.save(newUser) then def actualUser = users.findById(1234) actualUser.id == 1234 actualUser.name == 'John Smith'}

Colin Vipurs started professional software development in 1998 and released his first production bug shortly after. He has spent his career working in a variety of industries using a wide range of technologies, always attempting to release bug-free code. He holds an MSc from Liverpool University and currently works at Shazam as a Developer/Evangelist. He has spoken at numerous conferences worldwide.


Financial services PaaS and private clouds: Managing and monitoring disparate environments

Private cloud trends

Not all enterprises and IT teams can enjoy the luxuries of the public cloud. So let's take a look at the limits and the risks of the alternative: the private cloud and PaaS.

by Patricia Hines

Financial institutions (FIs) find that deploying PaaS and IaaS solutions within a private cloud environment is an attractive alternative to the technology silos created by disparate server hardware, operating systems, applications and application programming interfaces (APIs). Private cloud deployments enable firms to take a software-defined approach to scaling and provisioning hardware and computing resources.

While other industries have long enjoyed the increased agility, improved business responsiveness and dramatic cost savings of shifting workloads to public clouds, many firms in highly regulated industries like financial services, healthcare and government are reluctant to adopt the public cloud. As a result of the increased regulatory and compliance scrutiny these firms face, the potential risks of moving workloads to public clouds outweigh any potential savings.

Private cloud and PaaS trends

Definitions of what comprises a private cloud deployment vary, with some analysts and vendors equating private cloud with Infrastructure as a Service (IaaS) and others broadening the term to encompass both IaaS and Platform as a Service (PaaS). Whatever the definition, many financial services firms have already deployed private cloud, IaaS and PaaS technologies, often driven by platform simplification and consolidation initiatives.

Vendor platforms for private PaaS are gaining popularity, with a wide range of proprietary and open source solutions available. Proprietary vendors include Apprenda and Pivotal (whose offering is a commercial version built on Cloud Foundry). Open source platforms include Cloud Foundry, OpenShift, Apache Stratos and Cloudify. Many banks are choosing open source-based solutions as an insurance policy against vendor lock-in. Moreover, with the source code under the pressure of public scrutiny, the quality of these applications is often higher than that of their proprietary rivals.

Business drivers for private cloud and PaaS adoption

According to Forrester, the top two business drivers for private cloud adoption are improved IT manageability and flexibility, followed by a transformed IT environment with optimized systems of record and empowered developers. For those citing improved IT manageability and flexibility, there is a desire to collect, analyse and centralize error and event logs to manage and monitor performance against SLAs. For those adopting private cloud to empower developers, the choice is viewed as a foundational element allowing developer self-service for provisioning application environments and deploying code throughout the application lifecycle. PaaS promises to abstract applications from their underlying infrastructure, enabling faster deployment and time to market.

Limitations of private cloud and PaaS

Most large banks have thousands of systems in place to support millions of customers. They host these systems on a complex, heterogeneous mix of platforms, many of which have been in place for a long time. For example, many core banking systems are still running on IBM mainframes and AS/400 platforms because of their security, reliability, scalability and resiliency. FIs continue to depend on third-party hosted applications for functions ranging from bill pay to credit checks, which, along with SaaS applications for CRM and HR management, will remain outside the private cloud's domain.

As firms evaluate their private cloud architecture, they need to consider how they can achieve their business goals of improved IT manageability and empowered developers across a heterogeneous, hybrid environment. Although it is possible to re-host and re-architect core legacy systems onto modern platforms like Java and .NET, these projects will extend far into the future. As a result, financial institutions need to manage and monitor disparate environments, each with its own challenges and restrictions, for the foreseeable future.

When an FI adopts private cloud and PaaS technologies to simplify IT management for application deployment, it is adding another technology stack to an already complex mix. To make matters worse, some FIs have deployed (or are evaluating) multiple private cloud and PaaS platforms, often with disparate capabilities and restrictions, and proprietary APIs. With the mix of private cloud, IaaS, and PaaS environments that must coexist with legacy infrastructure, critical "health" management and monitoring becomes more difficult.



Even if a firm decides to eventually re-architect legacy applications for private PaaS hosting or move workloads across multiple PaaS solutions, it is critical that the organization develop an overarching connectivity strategy to seamlessly tie together systems, data and workflow – one that accommodates a long-term migration journey. In order to achieve a "single pane of glass" for managing and monitoring, organizations need the ability to connect and integrate the various environments and to enable service discovery, naming, routing, and rollback for SOAP web services, REST APIs, microservices and data sources.

Managing disparate environments

The combination of endpoints – data sources, applications, web services, APIs and processes – is ever growing and evolving. In order to orchestrate a well-governed but agile application landscape, IT architects need to reconsider their integration approach. A unified integration platform can handle any type of integration scenario, particularly high-complexity requirements for high performance, throughput and security involving a combination of application, B2B, and SaaS integration needs, whether on-premises or in the cloud. Organizations facing the need to manage heterogeneous architectural environments have an opportunity to address a wide range of requirements by means of a unified, full stack for connectivity on one platform – connectivity, orchestration, services, and APIs.

As firms adopt multi-vendor solutions, they need a way to abstract away the complexity of their private cloud vendor and architecture decisions. With a unified connectivity solution, you can beta test multiple PaaS environments using an independent orchestration layer with a single API layer to back-end systems and databases. The connectivity layer helps you to avoid PaaS vendor lock-in while increasing interoperability and data portability.

A unified integration layer enables organizations to take an API-led connectivity approach to xPaaS (Application Platform-as-a-Service, Database Platform-as-a-Service, Middleware Platform-as-a-Service, etc.) integration and management. API-led connectivity packages underlying connectivity and orchestration services into easily composable, discoverable and reusable building blocks. Reusable building blocks accelerate time to market for new products and services, whether packaged or custom, on-premise or off-premise. Rather than each developer needing a deep understanding of an external application's API intricacies, they can use the integration layer to compose their applications with connectivity as needed, easily automating tasks, accessing databases and calling web services by leveraging APIs.

Private cloud, IaaS and PaaS technologies are on the IT agendas of many financial services firms. But those technologies are just one piece of the infrastructure puzzle. In order to simplify IT management and empower developers, you need a blending and bridging of environments that delivers agility across infrastructure silos. MuleSoft's Anypoint Platform is the only solution that enables end-to-end connectivity across API, service orchestration and application integration in a single platform.

The single platform enables IT organizations to take a bimodal approach to private cloud management – driving speed to market and agility while enforcing a governance process to avoid fragmentation and duplication of services. MuleSoft, a proven on-premises, hybrid and cloud integration leader, provides a virtual agility layer, allowing new services on the PaaS to interact with legacy on-premise mainframes or SaaS environments in the cloud (Figure 1).

Each of the building blocks in Anypoint Platform delivers purposefully productized APIs, a powerful Anypoint core and ubiquitous connectivity. Based on consistent and repeatable guiding principles, the Anypoint Platform delivers tools and services for runtime, design time, and engagement that enable successful delivery for each audience, whether internal or external. MuleSoft's Anypoint Platform is architecturally independent – it is agnostic in terms of private cloud, IaaS or PaaS solutions, whether custom-built or purchased from a third-party provider. Customers have the freedom and agility to abstract connectivity and integration from the underlying infrastructure, platform and application environments, maximizing efficiency and business value.

Part of simplifying your architecture and becoming more agile is having flexibility. MuleSoft's unique connectivity approach allows you to plan for the future. You may start with an established infrastructure provider and move to an emerging pure-play PaaS provider. You may build applications for on-premises deployment but later decide to host them in the cloud. Anypoint Platform has a single code base for on-premises, hybrid and cloud deployment, adapting to changing business and regulatory conditions. This single code base ensures integration and interoperability across the enterprise with transparent access to data, seamless monitoring and security, and the agility to respond to changing business needs.

Figure 1: MuleSoft

Patricia Hines is the financial services industry marketing director at MuleSoft, a San Francisco-based company that makes it easy to connect applications, data and devices.


Web

The future of traffic management technology

Intelligent traffic management in the modern application ecosystem

As application architecture continues to undergo change, modern applications are now living in increasingly distributed and dynamic infrastructure. Meanwhile, DNS and traffic management markets are finally shifting to accommodate the changing reality.

©iStockphoto.com/retrorocket

by Kris Beevers

Internet-based applications are built markedly differently today than they were even just a few years ago. Application architectures are largely molded by the capabilities of the infrastructure and core services upon which they are built. In recent years we've seen tectonic shifts in the ways infrastructure is consumed, code is deployed and data is managed.

A decade ago, most online properties lived on physical infrastructure in co-location environments, with dedicated connectivity and big-iron database back ends, managed by swarms of down-in-the-muck systems administrators with arcane knowledge of config files, firewall rules and network topologies. Applications were deployed in monolithic models, usually in a single datacentre – load balancers fronting web heads backed by large SQL databases, maybe with a caching layer thrown in for good measure.

Since the early 2000s, we've seen a dramatic shift toward "cloudification" and infrastructure automation. This evolution has led to an increase in distributed application topologies, especially when combined with the explosion of database technologies that solve replication and consistency challenges, and configuration management tools that keep track of dynamically evolving infrastructures.



Today, most new applications are built to be deployed – at minimum – in more than one datacentre, for redundancy in disaster recovery scenarios. Increasingly, applications are deployed at the far-flung "edges" of the Internet to beat latency and provide lightning-fast response times to users who've come to expect answers (or cat pictures) in milliseconds.

As applications become more distributed, the tools we use to get eyeballs to the "right place" and to provide the best service in a distributed environment have lagged behind. When an application is served from a single datacentre, the right service endpoint to select is obvious and there's no decision to be made, but the moment an application is in more than one datacentre, endpoint selection can have a dramatic impact on user experience.

Imagine someone in California interacting with an application served out of datacentres in New York and San Jose. If the user is told to connect to a server in New York, most of the time they'll have a significantly worse experience with the application than if they'd connected to a server in San Jose. An additional 60–80 milliseconds in round-trip time is tacked onto every request sent to New York, drastically decreasing the application's performance. Modern sites often have 60–70 assets embedded in a page, and poor endpoint selection can impact the time to load every single one of them.

Solving endpoint selection

How have we solved endpoint selection problems in the past? The answer is, we haven't – at least, not very effectively.

If you operate a large network and have access to deep pockets and a lot of low-level networking expertise, you might take advantage of IP anycasting, a technique for routing traffic to the same IP address across multiple datacentres. Anycasting has proven too costly and complex to be applied to most web applications.

Most of the time, endpoint selection is solved by DNS, the domain name system that translates hostnames to IP addresses. A handful of DNS providers support simple notions of endpoint selection for applications hosted in multiple datacentres. For example, the provider might ping your servers, and if a server stops responding, it is removed from the endpoint selection rotation. More interestingly, the provider may use a GeoIP database or other mechanism to take a guess at who's querying the domain and where they're located, and send the user to the geographically closest application endpoint. These two simple mechanisms form the basis of many large distributed infrastructures on the Internet today, including some of the largest content delivery networks (CDNs).
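To make the mechanics concrete, here is a minimal sketch in Java of the two techniques just described: dropping unhealthy endpoints, then picking the geographically closest survivor. The Endpoint type, its coordinates and the haversine distance are illustrative assumptions, not any particular provider's API.

import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Minimal sketch: drop unhealthy endpoints, then pick the geographically
// closest one. Endpoint and its fields are illustrative assumptions.
class Endpoint {
    final String ip;
    final double lat, lon;
    volatile boolean healthy; // updated by a background health checker (pings)

    Endpoint(String ip, double lat, double lon, boolean healthy) {
        this.ip = ip; this.lat = lat; this.lon = lon; this.healthy = healthy;
    }
}

class GeoSelector {
    // Great-circle distance in km between two lat/lon points (haversine formula).
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1), dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 6371 * 2 * Math.asin(Math.sqrt(a));
    }

    // userLat/userLon would come from a GeoIP guess at the querying resolver.
    static Optional<Endpoint> select(List<Endpoint> endpoints, double userLat, double userLon) {
        return endpoints.stream()
                .filter(e -> e.healthy)
                .min(Comparator.comparingDouble(e -> distanceKm(e.lat, e.lon, userLat, userLon)));
    }
}

A real DNS-side implementation would run the health checks asynchronously and answer from the current selection at query time.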

On today's Internet, applications live in increasingly distributed and dynamic infrastructure. The DNS and traffic management markets are finally shifting to accommodate these realities.

Modern DNS and traffic management providers are beginning to incorporate real-time feedback from application infrastructures, network sensors, monitoring networks and other sources into endpoint selection decisions. While basic health checking and geographic routing remain tools of the trade, more complex and nuanced approaches for shifting traffic across datacentres are emerging. For example, some of the largest properties on the Internet, including major CDNs, are today making traffic management decisions based not only on whether a server is "up" or "down," but on how loaded it is, in order to utilize the datacentre to capacity, but not beyond.

Several traffic management providers have emerged that measure response times and other metrics between an application's end users and datacentres. These solutions leverage data in real time to route users to the application endpoint that's providing the best service, for the user's network, right now, ditching geographic routing altogether. Additional traffic management techniques, previously impossible in the context of DNS, are finding their way to market, such as endpoint stickiness, complex weighting and prioritizing of endpoints, ASN- and IP-prefix-based endpoint selection and more.

The mechanisms and interfaces for managing DNS configuration are improving, as new tools mature for making traffic management decisions in the context of DNS queries. While legacy DNS providers restrict developers to a few proprietary DNS record types to enact simplistic traffic management behaviours, modern providers offer far more flexible toolkits. These enable developers either to write actual code to make endpoint selection decisions or to use flexible, easy-to-use rules engines to mix and match traffic routing algorithms into complex behaviours.

What's next for traffic management technology?

As with many industries, traffic management will be driven by data. Leading DNS and traffic management providers, such as NSONE, already leverage telemetry from application infrastructure and Internet sensors. The volume and granularity of this data will only increase, as will the sophistication of the algorithms that act on it to automate traffic management decisions.

DNS and traffic management providers have found additional uses for this data outside of making real-time endpoint selection decisions. DNS providers are already working with larger customers to leverage DNS and performance telemetry to identify opportunities for new datacentre deployments to maximize performance impact. DNS-based traffic management will be an integral part of a larger application delivery puzzle that sees applications themselves shift dynamically across datacentres in response to traffic, congestion and other factors.

Applications and their underlying infrastructure have changed significantly in the last decade. Now, the tools and systems we rely on to get users to the applications are finally catching up.

Kris Beevers is an internet infrastructure geek and serial entrepreneur who's started two companies, built the tech for two others, and has a particular specialty in architecting high-volume, globally distributed internet infrastructure. Before NSONE, Kris built CDN, cloud, bare metal, and other infrastructure products at Voxel, a NY-based hosting company that sold to Internap (NASDAQ:INAP) in 2011.



Benchmarks

Cost, scope and focus

Trade-offs in benchmarking

Is it quality you're looking to improve? Or performance? Before you decide on what kind of a benchmark your system needs, you need to know the spectrum of cost and benefit trade-offs.

by Aysylu Greenberg

Benchmarking software is an important step in maturing a system. It is best to benchmark a system after correctness, usability, and reliability concerns have been addressed. In the typical lifetime of a system, emphasis is first placed on correctness of implementation, which is verified by unit, functional, and integration tests. Later, the emphasis is placed on the reliability and usability of the system, which is confirmed by the monitoring and alerting setup of a system running in production for an extended period of time. At this point, the system is fully functional, produces correct results, and has the necessary set of features to be useful to the end client. At this stage, benchmarking the software helps us to gain a better understanding of what improvement work is necessary to help the system gain a competitive edge.

There are two types of benchmarks one can create – performance and quality. Performance benchmarks generally measure latency and throughput. In other words, they answer the questions: "How fast can the system answer a query?", "How many queries per second can it handle?", and "How many concurrent queries can the system handle?" Quality benchmarks, on the other hand, address domain-specific concerns, and do not translate well from one system to another. For instance, on a news website, a quality benchmark could be the total number of clicks, comments, and shares on each article. In contrast, a different website may include not only those properties but also what the users clicked on. This might happen because the website's revenue is dependent on the number of referrals, rather than how engaging a particular article was.
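As a minimal illustration of the performance side, the sketch below measures average latency and throughput for a batch of queries in plain Java. handleQuery is a placeholder for the system under test; a serious benchmark would add warm-up runs, percentiles and the statistical analysis discussed below.

// Minimal sketch of a performance benchmark: reports average latency and
// throughput over n sequential calls. handleQuery stands in for the real
// system; the sink field keeps the JIT from eliminating the work.
public class MiniBenchmark {
    static double sink;

    static double handleQuery(int i) {
        return Math.log(i + 1); // stand-in workload
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            sink += handleQuery(i);
        }
        long elapsed = System.nanoTime() - start;
        System.out.printf("avg latency: %.1f ns/query, throughput: %.0f queries/s%n",
                (double) elapsed / n, n / (elapsed / 1e9));
        // Caveat: without warm-up and repeated runs, a loop like this is exactly
        // the kind of misleading micro-benchmark described below.
    }
}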

Speaking of revenue, the goal of a benchmark is to guide optimizations in the system and to define the performance goal. A good benchmark should be able to answer the question "How fast is fast enough?" It allows the company to keep the users of the system happy and keep the infrastructure bills as low as possible, instead of wasting money on unneeded hardware.

There's a spectrum of cost and benefit trade-offs a benchmark designer should be aware of. Specialized benchmarks that utilize realistic workloads and model the production environment closely are expensive to set up. A common problem is that special infrastructure needs to exist to be able to duplicate the production workload. Aggregation and verification of results is also a very involved process, as it requires thorough analysis and application of moderately sophisticated statistical techniques. On the other hand, micro-benchmarks are quick and easy to set up, but they often produce misleading results, since they might not be measuring a representative workload or set of functionality.

To get started with designing a benchmark, it is helpful to pose a question for the system, e.g. "How fast does the page load for the user when they click to see the contents of their cart?" Pairing that with the goal of the benchmark, e.g. "How fast does the page need to load for a pleasant user experience?", gives the team guidance for their optimization work and helps to determine when a milestone is reached.

Benchmarking is both an engineering and a business problem. Clearly defining the question and the goal for the benchmark helps utilize compute and engineer hours effectively. When designing a benchmark, it's important to consider how much "bang for the buck" the system will receive from the benchmarking work. Benchmarks with wide coverage of the system's functionality and thorough analysis of the results are expensive to design and set up, but also provide more confidence in the behaviour of the system. On the other hand, smaller benchmarks might answer narrow questions very well and help get the system closer to the goal much faster.


Benchmarking: You’re Doing It Wrong

Hear Aysylu Greenberg speak at the JAX London: Knowledge of how to set up good benchmarks is invaluable in understanding performance of the system. Writing correct and useful benchmarks is hard, and verification of the results is difficult and prone to errors. When done right, benchmarks guide teams to improve the performance of their systems. In this talk, we will discuss what you need to know to write better benchmarks.

Aysylu Greenberg works at Google on a distributed build system. In her spare time, she works on open source projects in Clojure, ponders the design of systems that deal with inaccuracies, paints and sculpts.


Security

Five tips to stay secure

Common threats to your VoIP system

VoIP remains a popular system for telephone communication in the enterprise. But have you ever considered the security holes this system is leaving you open to? And what company secrets are at risk of eavesdropping, denial of service and "vishing" attacks?

by Sheldon Smith

Using a VoIP system to handle calls for your company? You're not alone. In 2014, the worldwide VoIP services market reached almost $70 billion and is on pace for another banner year in 2015. Despite the usability, flexibility and cost-effectiveness of VoIP systems, companies need to be aware of several common threats that could dramatically increase costs or put company secrets at risk. Here are five of the most common VoIP threats and how your company can stay secure.

I – Transmission issues

Unlike plain old telephone service (POTS), VoIP systems rely on packet-switched telephony to send and receive messages. Instead of creating a dedicated channel between two endpoints for the duration of a call using copper wires and analog voice information, call data is transmitted using thousands of individual packets. By utilizing packets, it's possible to quickly send and receive voice data over an internet connection, and VoIP technologies are designed in such a way that packets are re-ordered at their destination so calls aren't out of sync or jittery.
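The receive-side reordering described above is typically done with a small jitter buffer keyed on sequence numbers. The sketch below is a simplified Java illustration of the idea, using a hypothetical packet type rather than a real RTP implementation.

import java.util.Comparator;
import java.util.PriorityQueue;

// Simplified sketch of receive-side packet reordering: packets may arrive
// out of order, and a buffer releases them strictly by sequence number.
class VoicePacket {
    final int seq;
    final byte[] audio;
    VoicePacket(int seq, byte[] audio) { this.seq = seq; this.audio = audio; }
}

class JitterBuffer {
    private final PriorityQueue<VoicePacket> buffer =
            new PriorityQueue<>(Comparator.comparingInt((VoicePacket p) -> p.seq));
    private int nextSeq = 0;

    void receive(VoicePacket p) { buffer.add(p); }

    // Returns the next in-order packet, or null if it hasn't arrived yet.
    VoicePacket nextInOrder() {
        VoicePacket head = buffer.peek();
        if (head != null && head.seq == nextSeq) {
            nextSeq++;
            return buffer.poll();
        }
        return null; // gap: wait briefly, or conceal the loss in real systems
    }
}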



What's the risk? The transmission medium itself. POTS lines are inherently secure since a single, dedicated connection is the only point of contact between two telephones. But when voice data is transmitted over the internet at large, it becomes possible for malicious actors to sniff out traffic and either listen in on conversations or steal key pieces of data. The solution? Encrypt your data before it ever leaves local servers. You've got two choices here: set up your own encryption protocols in-house, or opt for a VoIP vendor that bundles a virtual private network (VPN), which effectively creates a secure "tunnel" between your employees and whoever they call.

II – Denial of service

The next security risk inherent to VoIP? Attacks intended to slow down or shut down your voice network for a period of time. As noted by a SANS Institute whitepaper, malicious attacks on VoIP systems can happen in a number of ways. First, your network may be targeted by a denial of service (DoS) flood, which overwhelms the system. Hackers may also choose buffer overflow attacks or infect the system with worms and viruses in an attempt to cause damage or prevent your VoIP service from being accessed. As noted by a recent CBR article, VoIP attacks are rapidly becoming a popular avenue for malicious actors – UK-based Nettitude said that within minutes of bringing a new VoIP server online, attack volumes increased dramatically.

Dealing with these threats means undertaking a security audit of your network before adding VoIP. Look for insecure endpoints, third-party applications and physical devices that may serve as jumping-off points for attackers to find their way into your system. This is also a good time to assess legacy apps and older hardware to determine if they're able to handle the security requirements of internet-based telephony. It's also worth taking a hard look at any network protection protocols and firewalls to determine if changes must be made. Best bet? Find an experienced VoIP provider who can help you assess existing security protocols.

III – Eavesdropping

Another issue for VoIP systems is eavesdropping. If your traffic is sent unencrypted, for example, it's possible for motivated attackers to "listen in" on any call made. The same goes for former employees who haven't been properly removed from the VoIP system or had their login privileges revoked. Eavesdropping allows malicious actors to steal classified information including phone numbers, account PINs and users' personal data. Impersonation is also possible – hackers can leverage your VoIP system to make calls and pose as a member of your company. Worst-case scenario? Customers and partners are tricked into handing over confidential information.

Handling this security threat means developing policies and procedures that speak to the nature of the problem. IT departments must regularly review who has access to the VoIP system and how far this access extends. In addition, it's critical to log and review all incoming and outgoing calls.

IV – Vishing

According to the Government of Canada's "Get Cyber Safe" website, another emerging VoIP threat is voice phishing or "vishing". This occurs when malicious actors redirect legitimate calls to or from your VoIP network and instead connect them to online predators. From the perspective of an employee or customer the call seems legitimate, and they may be convinced to provide credit card or other information. Spam over Internet Telephony (SPIT) is also a growing problem; here, hackers use your network to send thousands of voice messages to unsuspecting phone numbers, damaging your reputation and consuming your VoIP transmission capacity. To manage this issue, consider installing a separate, dedicated internet connection for your VoIP alone, allowing you to easily monitor traffic apart from other internet sources.

V – Call fraud

The last VoIP risk comes from call fraud, also called toll fraud. This occurs when hackers leverage your network to make large-volume and lengthy calls to long-distance or "premium" numbers, resulting in massive costs to your company. In cases of toll fraud, calls are placed to revenue-generating numbers – such as international toll numbers – which generate income for attackers and leave you with the bill.

Call monitoring forms part of the solution here, but it's also critical to develop a plan that sees your VoIP network regularly patched with the latest security updates. Either create a recurring patch schedule or find a VoIP provider that automatically updates your network when new security updates become available.

VoIP systems remain popular thanks to their ease of use, agility and global reach. They're not immune to security issues – but awareness of common threats coupled with proactive IT efforts helps you stay safely connected.

"Dealing with these threats means undertaking a security audit of your network before adding VoIP."

Sheldon Smith is a Senior Product Manager at XO Communications. XO provides unified communications and cloud services. XO's solutions help companies become more efficient, agile, and secure. Sheldon has extensive product management and unified communications experience.


REST

No more custom API mazes

Why reusable REST APIs are changing the game

REST APIs make our lives easier – but we're still in the dark ages when it comes to making our APIs general-purpose, portable and reusable. DreamFactory evangelist Ben Busse describes some common pitfalls of hand-coding custom REST APIs and explores the architectural advantages and technical characteristics of reusable REST APIs.

by Ben Busse

Where I work at DreamFactory, we designed and built some of the very first applications that used web services on Salesforce.com, AWS and Azure. Over the course of ten years, we learned many painful lessons trying to create the perfect RESTful backend for our portfolio of enterprise applications.

When a company decides to start a new application project, the "business" team first defines the business requirements and then a development team builds the actual software. Usually there is a client-side team that designs the application and a server-side team that builds the backend infrastructure. These two teams must work together to develop a REST API that connects the backend data sources to the client application.

One of the most laborious aspects of the development process is the "interface negotiation" that occurs between these two teams (Figure 1). Project scope and functional requirements often change throughout the project, affecting API and integration requirements. The required collaboration is complex and encumbers the project.

Dungeon master development: Complex mazes of custom, handcrafted APIs

You can get away with slow, tedious interface negotiation if you're just building one simple application. But what if you need to ship dozens, hundreds or even thousands of API-driven applications for employees, partners and customers? Each application requires a backend, APIs, user management and security, and you're on a deadline.

Building one-off APIs and a custom backend for each and every new application is untenable. Mobile is forcing companies to confront this reality (or ignore it at their own peril). With the acceptance of BYOD ("bring your own device") and the proliferation of mobile devices, the modern enterprise may need hundreds or even thousands of mobile applications. Backend integration, custom API development, backend security and testing comprise the lion's share of a typical enterprise mobile application project (more than half of the time on average).

Most enterprises today are woefully unable to address API complexity at its root cause. Mobile projects typically have new requirements that were not anticipated by the existing REST APIs that are now in production. You could expand the scope of your existing API services, but they are already in production.

Figure 1: Interface negotiation

So the default option is to create a new REST API for each new project! The API building process continues for each new app with various developers, consultants and contractors. The result is custom, one-off APIs that are highly fragmented, fragile, hard to centrally manage and often insecure. The API dungeon is an ugly maze of complexity (Figure 2).

Figure 2: The API dungeon

•Custom, manually coded REST APIs for every new application project, written with different tools and developer frameworks.
•REST APIs are hardwired to different databases and file storage systems.
•REST APIs run on different servers or in the cloud.
•REST APIs have different security mechanisms, credential strategies, user management systems and API parameter names.
•Data access rights are confused, user management is complex and application deployment is cumbersome.
•The system is difficult to manage, impossible to scale and full of security holes.
•API documentation is often non-existent. Often, companies can't define what all the services do, or even where all of the endpoints are located.

The future: reusable REST APIs

The core mistake with the API dungeon is that development activity starts with business requirements and application design, and then works its way back to server-side data sources and software development. This is the wrong direction.

The best approach is to identify the data sources that need to be API-enabled and then create a comprehensive and reusable REST API platform that supports general-purpose application development (Figure 3).

Figure 3: Reusable REST APIs

There are huge benefits to adopting a reusable REST API strategy.

•APIs and documentation are programmatically generated and ready to use.
•There's no need to keep building server-side software for each new application project.
•Client-side application design is decoupled from security and administration.
•The "interface negotiation" is simplified.
•Development expenses and time to market are dramatically reduced.
•Developers don't have to learn a different API for each project.
•RESTful services are no longer tied to specific pieces of infrastructure.
•Companies can easily move applications between servers and from development to test to production.

Technical characteristics of a reusable API

This sounds good in theory, but what are the actual technical characteristics of reusable REST APIs? And how should reusable APIs be implemented in practice? The reality is that there's no obvious way to arrive at this development pattern until you've tried many times the wrong way, at which point it's usually too late.

DreamFactory tackled the API complexity challenge for over a decade, built a reusable API platform internally for our own projects and open sourced the platform for any developer to use. We had to start from scratch many times before hitting on the right design pattern that enables our developers to build applications out of general-purpose interfaces.

There are some basic characteristics that any reusable REST API should have:

•REST API endpoints should be simple and provide parameters to support a wide range of use cases.
•REST API endpoints should be consistently structured for SQL, NoSQL and file stores.
•REST APIs must be designed for high transaction volume, hence simply designed.
•REST APIs should be client-agnostic and work interchangeably well for native mobile, HTML5 mobile and web applications.

A reusable API should have the attributes below to support a wide range of client access patterns:


•Noun-based endpoints and HTTP verbs are highly effective. Noun-based endpoints should be programmatically generated based on the database schema.
•Requests and responses should include JSON or XML with objects, arrays and sub-arrays.
•All HTTP verbs (GET, PUT, DELETE, etc.) need to be implemented for every use case.
•Support for web standards like OAuth, CORS, GZIP and SSL is also important.

It's crucially important to have a consistent URL structure for accessing any backend data source. The File Storage API should be a subset of the NoSQL API, which should be a subset of the SQL API (Figure 4).

Figure 4: SQL API and subsets
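As an illustration – the paths here are hypothetical, not DreamFactory's actual endpoint names – such a nested structure might look like the following, with the same verbs and parameter names at every level:

GET /rest/sql/contacts?filter=last_name='Smith'&order=first_name
GET /rest/nosql/contacts?filter=last_name='Smith'&order=first_name
GET /rest/files/exports/contacts.csv

(Filter values are shown unencoded for readability; in practice they would be URL-encoded.)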

Parameter names should be reused across services where possible. This presents developers with a familiar interface for any data source. The API should include automatically generated, live interactive documentation that allows developers to quickly experiment with different parameters (Figure 5).

Figure 5: Interactive, auto-generated API docs

In general, the structure of the request URL and associated parameters needs to be very flexible and easy to use, but also comprehensive in scope. Looking at the example below, there is a base server, an API version, the backend database (the API name) and a particular table name in the request URL string. Then the parameters specify a filter with a field name, operator and value. Lastly, an additional order parameter sorts the returned JSON data array (Figure 6).

Figure 6: Request URL

A huge number of application development scenarios can be implemented just with the filter parameter. This allows any subset of data to be identified and operated on. For example, objects in a particular date range could be loaded into a calendar interface with a filter string (Figure 7).

Figure 7: Find Task records
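Here is a sketch of that calendar scenario in Java 8, against the same hypothetical /rest/sql endpoint as above: the client builds a date-range filter string, URL-encodes it and issues a plain GET.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Sketch: load all tasks due in August 2015 via a date-range filter.
// The host and endpoint are hypothetical; the point is the filter/order pattern.
public class CalendarQuery {
    public static void main(String[] args) throws Exception {
        String filter = "(due_date >= '2015-08-01') AND (due_date <= '2015-08-31')";
        String url = "https://api.example.com/rest/sql/tasks"
                + "?filter=" + URLEncoder.encode(filter, "UTF-8")
                + "&order=" + URLEncoder.encode("due_date ASC", "UTF-8");

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Accept", "application/json");

        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            in.lines().forEach(System.out::println); // JSON array of task records
        }
    }
}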

Complex logical operations should also be supported, and the filter string interface needs to protect against SQL injection attacks. Other database-specific features include:

•Pagination and sorting
•Rollback and commit
•Role-based access controls on tables
•Role-based access controls on records
•Stored functions and procedures

A comprehensive reusable REST API should also support operations on arrays of objects, but you can also specify related objects as a URL parameter. This allows complex documents to be downloaded from a SQL database and used immediately as a JSON object. The data can be edited along with the objects (Figure 8). When committed back to the server, all of the changes are updated, including parent, child and junction relationships between multiple tables. This flexibility supports a huge number of very efficient data access patterns.

Figure 8: Loading a project and all related tasks

The vast majority of application development use cases can be supported with a reusable REST API right out of the box. For special cases, a server-side scripting capability can be used to customize behavior at any API endpoint (both request and response) or create brand new custom API calls. DreamFactory uses the V8 JavaScript engine for this purpose.

Some of the special cases that you might want to implement with server-side scripting include:

•Custom business logic
•Workflow triggers
•Formula fields
•Field validation
•Web service orchestration

Figure 9: Field validation and workflow trigger

Conclusion

REST API complexity is an important problem for companies building API-driven applications. The tendency to build new APIs for each new project has negative consequences over time. Adopting a REST API platform strategy with reusable and general-purpose services addresses this problem and provides many benefits in terms of more agile development and quicker time to value.

Ben Busse is a developer evangelist with DreamFactory in San Francisco. He's passionate about open source software, mobile development and hunting wild mushrooms in northern California.


APIs

Milliseconds matter

Considering the performance factor in an API-driven world

With visitors demanding immediate response times, the fate of a website and the performance of APIs are becoming increasingly intertwined.

by Per Buer

In recent years, web APIs have exploded. Various tech industry watchers now see them as providing the impetus for a whole "API economy". As a result, and in order to create a fast track for business growth, more and more companies and organizations are opening up their platforms to third parties. While this can create a lot of opportunities, it can also have huge consequences and pose risks. These risks don't have to be unforeseen, however.

Companies' checklists for building or selecting API management tools can be very long. Most include the need to offer security (both communication security – TLS – and actual API security – keys), auditing, logging, monitoring, throttling, metering and caching. However, many overlook one critical factor: performance. This is where you can hedge your bets and plan for the potential risk.

There's an interesting analogy between APIs and the long path websites have travelled since the nineties. Back then, websites had few objects and not that many visitors, so performance and scalability mattered less. This has changed dramatically over the last decade. Today, increasingly impatient visitors penalise slow websites by leaving quickly, and in many cases never returning. Microsoft computer scientist Harry Shum says that sites that open just 250 milliseconds faster than competing sites – a fraction of an eye blink – will gain an advantage.

Preparing your API Strategy for IoT

Hear Per Buer speak at the JAX London: Not that long ago, API calls were counted per hour. Evaluations for API management tools typically have long lists of criteria, but performance is usually left off. That might be fine in certain environments but not where IoT and mobile are concerned. For these environments the number of API calls has increased to the point that even the typical rate of 200 API calls per second is no longer enough.

©iStockphoto.com/Inok

APIs have travelled a similar path. Ten to fifteen years ago, most API management tools out there had very little to do and performance wasn't an issue. The number of API calls handled was often measured in calls per hour. Consequently, these tools were designed to deal with things other than thousands of API calls per second. What a difference a decade can make! According to Statista, worldwide mobile app downloads are expected to reach 268.69 billion by 2017. But API management tools haven't caught up. Even nowadays, many of the products in the various vendors' top-right quadrant will only handle rates of 200 API calls per second per server. Their focus has been on features, not performance.

If you open up your API platform, you probably want a lot of developers to use it. However, most web services have introduced a rate limit for API calls. If set high enough, the limit is reasonable to ensure availability and quality of service. But what is high enough to provide a competitive advantage in our accelerated times? Take for example an industry like banking, where many players are opening up their platforms in a competitive bid to attract developers who create third-party apps and help monetise the data. The ones that set the API call limit too low create a bad developer experience, pushing developers towards friendlier environments.
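Rate limits like these are commonly enforced with a token bucket; the sketch below, in plain Java and purely illustrative, shows the idea. Capacity and refill rate are exactly the knobs a provider tunes when deciding how friendly its limit is.

// Minimal sketch of a token-bucket rate limiter for API calls.
class TokenBucket {
    private final long capacity;        // maximum burst size
    private final double refillPerNano; // tokens added per nanosecond
    private double tokens;
    private long lastRefill;

    TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1e9;
        this.tokens = capacity;
        this.lastRefill = System.nanoTime();
    }

    synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;  // admit the API call
        }
        return false;     // over the limit: typically answered with HTTP 429
    }
}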

A limited number of API calls in web services also affects the end-customer. Take for example online travel operators or online media. In these environments a lot of data needs to flow through the APIs. These are becoming more dependent on fast and smooth communication between their services and their various apps. If these services slow down due to API call limitations, customers will defect to faster sites.

I compared the situation of APIs with that of the web ten years ago, when performance started to matter. The situation that actually developed is much more serious than I initially predicted. Consumers increasingly demand instant gratification. This means that the window for companies to ensure the performance of their APIs is closing. Being able to deliver performance and set a higher limit of API calls can make a huge difference. Otherwise, developers will go elsewhere to help grow another company's business. If you want to future-proof for the API boom, it's time to consider the performance factor.

Per Buer is the CTO and founder of Varnish Software, the company behind the open source project Varnish Cache. Buer is a former programmer turned sysadmin, then manager turned entrepreneur. He runs, cross-country skis and tries to keep his two boys from tearing down the house.

Imprint

Publisher: Software & Support Media GmbH

Editorial Office Address: Software & Support Media, Saarbrücker Straße 36, 10405 Berlin, Germany, www.jaxenter.com

Editor in Chief: Sebastian Meyen

Editors: Coman Hamilton, Natali Vlatko

Authors: Kris Beevers, Per Buer, Ben Busse, Holly Cummins, Aysylu Greenberg, Patricia Hines, Eric Horesnyi, Werner Keil, Angelika Langer, Aviran Mordo, Chris Neumann, Lyndsay Prewer, Zigmars Raascevskis, Sheldon Smith, Colin Vipurs, Geertjan Wielenga

Copy Editor: Jennifer Diener

Creative Director: Jens Mainz

Layout: Flora Feher, Dominique Kalbassi

Sales Clerk: Anika Stock, +49 (0) 69, [email protected]

Entire contents copyright © 2015 Software & Support Media GmbH. All rights reserved. No part of this publication may be reproduced, redistributed, posted online, or reused by any means in any form, including print, electronic, photocopy, internal network, Web or any other method, without prior written permission of Software & Support Media GmbH.

The views expressed are solely those of the authors and do not reflect the views or position of their firm, any of their clients, or Publisher. Regarding the information, Publisher disclaims all warranties as to the accuracy, completeness, or adequacy of any information, and is not responsible for any errors, omissions, inadequacies, misuse, or the consequences of using any information provided by Publisher. Rights of disposal of rewarded articles belong to Publisher. All mentioned trademarks and service marks are copyrighted by their respective owners.


