Embedded Multicore processing for Mobile Communications Real-time Contracts and Thread Profiling York Jack Whitham


Post on 31-Dec-2015


Embedded Multicore processing for Mobile Communications

Real-time Contracts and Thread Profiling

York

Jack Whitham

2

Threads and Applications

[Figure: threads and applications.]

A thread is an execution context.

An application is a collection of threads.

Not shown: a task (a memory space shared by one or more threads).

3

Thread Utilisation

Each thread has an associated utilisation U: the amount of CPU resource required by the thread.

$U = C / T$

C = execution time estimate

T = thread period
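As a small worked illustration of the formula, the sketch below computes U from C and T. The function name and the example numbers are hypothetical and are not taken from the slides.

```python
def utilisation(c: float, t: float) -> float:
    """Return U = C / T for execution time estimate C and thread period T."""
    if t <= 0:
        raise ValueError("thread period T must be positive")
    if c < 0:
        raise ValueError("execution time estimate C must be non-negative")
    return c / t


# Hypothetical thread: needs 2 ms of CPU time in every 10 ms period.
u = utilisation(2.0, 10.0)
print(f"U = {u:.2f}")
```

C and T must be expressed in the same time unit; a thread with U above 1 cannot be served by a single CPU.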

York’s work attempts to ensure that every application receives the CPU resource that it needs.

How? Contracts and Servers.

4

What is a contract?

[Figure: threads, applications and contracts.]

A contract specifies the CPU resources needed by an application.

• 1-1 relationship between contracts and applications.

Contracts are generated by the CG tool (made in York) based on the requirements of an application (thread utilisation data).

5

What is a server?

A server represents a virtual CPU offering a guaranteed utilisation to a group of threads.

Server utilisation: $U_{server} = \sum U_{thread}$

A physical CPU can be partitioned into two or more servers.
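The sum can be sketched as follows. The helper names and the thread utilisations are hypothetical, and the check that the servers sharing one CPU need no more than 100% of it is an assumption: it is only a necessary condition, not the schedulability test used by the York tools.

```python
import math


def server_utilisation(thread_utilisations):
    """U_server = sum of U_thread over the threads assigned to the server."""
    return math.fsum(thread_utilisations)


def can_share_cpu(server_utilisations):
    """Assumed necessary condition: servers on one CPU need at most 100% of it."""
    return math.fsum(server_utilisations) <= 1.0


# Hypothetical servers, each built from a group of thread utilisations.
server_a = server_utilisation([0.1, 0.2, 0.3])
server_b = server_utilisation([0.2, 0.1])

print(can_share_cpu([server_a, server_b]))
```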

[Figure: threads with utilisations 0.1, 0.2, 0.5 and 0.1, shown with their applications, contracts and servers; labelled values include U = 0.5, 0.2 and U = 0.6.]

6

CG (contract generator) tool

The CG tool aims to partition a system efficiently:
– Always allocate dedicated CPUs to each application (or component) whenever possible.
– Otherwise, try to use as few servers as possible.

CG tool input:
– For each thread: thread ID and thread utilisation.

The CG tool groups threads according to their utilisation:
– CG places threads into groups.
– Whenever the total utilisation of a group of threads, Utotal, satisfies Threshold_L <= Utotal <= Threshold_H, a dedicated CPU is required.
– Otherwise, a server with a utilisation of Utotal (probably with some margin) will be used.

CG tool output:
– One contract for each application.
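The threshold rule can be sketched as below. This is only an illustration of the decision for one group that has already been formed: the threshold and margin values are hypothetical, the slides do not say how CG forms the groups, and rejecting a group with Utotal above 1 is an added assumption.

```python
import math

# Hypothetical values; the slides do not give the real thresholds or margin.
THRESHOLD_L = 0.8
THRESHOLD_H = 1.0
MARGIN = 0.05


def classify_group(thread_utils: dict) -> tuple:
    """Decide whether a group of threads gets a dedicated CPU or a server.

    thread_utils maps thread ID to thread utilisation.
    Returns (kind, utilisation), where kind is "dedicated_cpu" or "server".
    """
    u_total = math.fsum(thread_utils.values())
    if u_total > 1.0:
        # Assumption: such a group cannot run on one CPU and must be split.
        raise ValueError("group utilisation exceeds one CPU")
    if THRESHOLD_L <= u_total <= THRESHOLD_H:
        return ("dedicated_cpu", u_total)
    # Otherwise use a server sized to Utotal plus some margin, capped at 1.
    return ("server", min(u_total + MARGIN, 1.0))


print(classify_group({"f": 0.5, "g": 0.4}))
print(classify_group({"h": 0.2}))
```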

7

Contract Negotiation

Negotiation determines which servers should run on each CPU in order to satisfy contracts.

Negotiation may fail if insufficient CPU resources are available to schedule the servers.

A contract (generated by CG) contains the following information:
1. Number of dedicated CPUs required.
2. Thread allocation to these CPUs.
3. Number of servers required.
4. The utilisation of each server.
5. Thread allocation to each server.
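One way to picture these five items is as a small record type. The class and field names below are hypothetical and do not reflect the actual format produced by the CG tool; the example values are those of the contract in the negotiation example that needs 1 CPU and 1 server.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServerSpec:
    utilisation: float   # item 4: utilisation of the server
    threads: List[str]   # item 5: threads allocated to the server


@dataclass
class Contract:
    application: str
    # Items 1 and 2: one list of thread IDs per dedicated CPU.
    dedicated_cpus: List[List[str]] = field(default_factory=list)
    # Items 3 to 5: one ServerSpec per server.
    servers: List[ServerSpec] = field(default_factory=list)

    @property
    def num_dedicated_cpus(self) -> int:
        return len(self.dedicated_cpus)

    @property
    def num_servers(self) -> int:
        return len(self.servers)


# "app2" is a hypothetical application name.
contract = Contract(
    application="app2",
    dedicated_cpus=[["f", "g"]],
    servers=[ServerSpec(utilisation=0.2, threads=["h"])],
)
print(contract.num_dedicated_cpus, contract.num_servers)
```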

8

Example of Negotiation

[Figure: threads a to j with their applications, contracts, servers and CPUs, annotated "Generated by CG", "Specified by contract" and "Negotiated".]

Needs 2 servers. Server 1 contains threads [a, b, c] and has utilisation 0.5. Server 2 contains threads [d, e] and has utilisation 0.6.

Needs 1 CPU and 1 server. The CPU has threads [f, g] and has utilisation 0.9. The server contains thread [h] and has utilisation 0.2.

Needs 1 server. The server contains threads [i, j] and has utilisation 0.3.

9

CN (contract negotiation) tool

The CN tool (developed in York) assigns CPU resources in order to satisfy contracts.

CN tool input:
– Number of available CPUs (N).
– User-specified shortest period of any server (T).

CN tool algorithm:
– Allocate dedicated CPUs first (must be less than N).
– Use First-Fit-Decreasing-Utilisation algorithm to allocate servers (of all contracts) to the remaining CPUs and see whether they are schedulable.

CN tool output, for each CPU:
– Set of threads (and priorities) to run on this CPU.
– Set of servers (and priorities) to run on this CPU.
– Server periods (deadline = period) and budgets.
– IDs of threads that will run under a specific server (including priorities).
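The allocation step can be sketched as a first-fit pass over the servers in decreasing order of utilisation. This is a minimal sketch rather than the CN tool itself: the function name and server labels are hypothetical, and the schedulability check is replaced by the simplifying assumption that the servers on one CPU may use at most 100% of it. The data are the four servers and one dedicated CPU from the negotiation example.

```python
def allocate_servers(server_utils, num_cpus, num_dedicated):
    """First-fit allocation of servers in decreasing order of utilisation.

    server_utils maps a server label to its utilisation.
    Returns one list of server labels per remaining CPU, or None on failure.
    """
    if num_dedicated >= num_cpus:
        return None  # dedicated CPUs must number fewer than N
    remaining = num_cpus - num_dedicated
    cpus = [[] for _ in range(remaining)]
    loads = [0.0] * remaining
    ordered = sorted(server_utils.items(), key=lambda kv: kv[1], reverse=True)
    for name, u in ordered:
        for i in range(remaining):
            # Assumed stand-in for the real schedulability test.
            if loads[i] + u <= 1.0 + 1e-9:
                cpus[i].append(name)
                loads[i] += u
                break
        else:
            return None  # no CPU can take this server: negotiation fails
    return cpus


servers = {
    "app1/server1": 0.5,
    "app1/server2": 0.6,
    "app2/server": 0.2,
    "app3/server": 0.3,
}
print(allocate_servers(servers, num_cpus=3, num_dedicated=1))
print(allocate_servers(servers, num_cpus=2, num_dedicated=1))
```

Under these assumptions the example fits on three CPUs (one dedicated, two shared by the four servers) and fails on two, because the servers alone need a total utilisation of 1.6.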

10

Results

CPU time is guaranteed to each server.– Note that this is not a hard real-time guarantee for each thread.

Synchronisation and communication between threads are not supported in the current version but they will be added soon.

Contracts require utilisation data for each thread:

$U = C / T$

C = execution time estimate

T = thread period

To use contracts, L4 and LOBA must support the concept of a server.

11

Obtaining utilisation data

York has extended L4 with a thread profiling tool called “csamon”.

The tool dumps live thread execution statistics from L4.

We can use it to calculate the utilisation of each thread.

We can also use it to visualise what is happening inside the system.

12

csamon

We need your help with “csamon”.

• We have to test it with VAST.
• We need utilisation data for LTE and other parts of the system.

Download “csamon” from York:

• http://www.jwhitham.org.uk/c/csamon/
• Instructions on the web.
• Patches can be applied to L4 snapshots from Dresden.

13

Todo: Server support

A server is a group of threads that share an execution time budget.

CN gives a budget and period to each server. The budget is smaller than the period.

When the execution time budget of a server is exhausted, the threads in that server cannot run until the budget is replenished at the end of the period.
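The budget rule can be illustrated with a toy discrete-time simulation. It assumes time advances in unit ticks, that the server's threads are always ready to run, and that no other server competes for the CPU; none of this describes how L4 or LOBA would implement servers.

```python
def simulate_server(budget: int, period: int, horizon: int):
    """Return, for each tick, whether the server's threads were allowed to run."""
    if not 0 < budget < period:
        raise ValueError("budget must be positive and smaller than the period")
    remaining = budget
    ran = []
    for tick in range(horizon):
        if tick % period == 0:
            remaining = budget  # replenish at the start of each new period
        if remaining > 0:
            remaining -= 1      # a thread in the server consumes one tick
            ran.append(True)
        else:
            ran.append(False)   # budget exhausted: threads wait for replenishment
    return ran


timeline = simulate_server(budget=2, period=5, horizon=10)
print("".join("X" if r else "." for r in timeline))
```

With a budget of 2 and a period of 5, the threads run for the first two ticks of each period and are blocked for the remaining three, which corresponds to a server utilisation of 2/5 = 0.4.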

In order to use contracts, L4 and LOBA must be aware of servers.

14

Research work from York

Time-predictable access to dynamic data from C programs.
– As discussed in October, tasks on MPcore/Cortex-A9 are not time-predictable in a hard real-time sense; how could this be improved?
– What’s the problem? Data caches; shared memory; heuristic optimisations in each CPU, e.g. branch prediction.

Replace data cache with data scratchpad/tightly coupled memory?
– Creates a problem for dynamic data structures.

No problem if all variables are local or global, but what about malloc in C, and new in Java/C++?

15

Research work from York

New scratchpad technology is required.
– Initial simulator and MicroBlaze implementations working.
– Data cache can be entirely replaced by a time-predictable component which can store dynamic data structures.
– Details: http://www.jwhitham.org.uk/c/smmu.html
– Automatically applied to SPEC and MiBench software with good results in many cases, but some programs need to be modified to make best use of the new technology.
– Two conference papers in submission, one technical report published, downloadable software and HDL on website.

Future work: more analysis and development required!

16

Summary

York has developed a contracts model and three tools (CG, CN, csamon). York has done some research into time-predictable memory.

York needs the utilisation data for the threads in the finished system.

York must work with Dresden to implement support for servers in L4 and LOBA.

More information about contracts and servers:
– See emuco-wp2 mailing list archive for 18th June 2009.

17

www.emuco.eu

Prof. Dr. Ing. Attila Bilgic

E-mail: [email protected]

Ruhr-Universität Bochum

Institute for Integrated Systems

Universitätsstrasse 150

D-44780 Bochum

GERMANY

ICT-eMuCo is a European project supported under the Seventh Framework Programme (7FP) for research and technological development

Project Coordination:

Jack Whitham / Yang Chang / Neil Audsley

jack/yang/[email protected]