
SW Interfaces for an Open Compute Project Switch

Authors: Aviad Raveh, Matty Kadosh, Ariel Almog


January 2014

1  Scope

This document provides technical specifications for the software interfaces of an Open Compute Project switch.

2  Contents

1 Scope
2 Contents
3 Overview
    3.1 License
4 Interfaces Definition
    4.1 Open Ethernet Switch API
    4.2 OCP Platform Control
5 Revision History


Open Compute Project SW interfaces for an OCP Switch

http://opencompute.org

3  Overview

This document suggests standard, unified interfaces to control the silicon and platform of an OCP switch. To comply with the interfaces below, each OCP switch should be supplied with a Software Development Kit (SDK) containing drivers that implement the logic for these interfaces.

Once this is done, the unified interface significantly reduces the effort of porting protocol and system monitoring stacks when integrating a solution with any system or switch hardware vendor.

Figure 1 shows a high-level software (SW) architecture containing the following components:

-  Open Network Install Environment (ONIE) compatible boot loader
-  Network OS: software that manages the switching/routing protocols together with the system management logic. This element is outside the scope of this document.
-  Open Ethernet Switch (OES) APIs: a set of functions that invoke configuration of the fast path switch HW.
-  Platform interfaces: a set of functions that control the system components: fans, temperature sensors, power supplies and LEDs. Other components, such as a watchdog and an interrupt controller, can be added as the standard is adopted by the OCP community.

Figure 1 - OCP Software Components

3.1  License

As of April 7, 2011, the following persons or entities have made this Specification available under the Open Web Foundation Final Specification Agreement (OWFa 1.0), which is available at http://www.openwebfoundation.org/legal/the-owf-1-0-agreements/owfa-1-0:

Facebook, Inc.

You can review the signed copies of the Open Web Foundation Agreement Version 1.0 for this Specification at http://opencompute.org/licensing/, which may also include additional parties to those listed above.


Your use of this Specification may be subject to other third party rights. THIS SPECIFICATION IS PROVIDED "AS IS." The contributors expressly disclaim any warranties (express, implied, or otherwise), including implied warranties of merchantability, non-infringement, fitness for a particular purpose, or title, related to the Specification. The entire risk as to implementing or otherwise using the Specification is assumed by the Specification implementer and user.

IN NO EVENT WILL ANY PARTY BE LIABLE TO ANY OTHER PARTY FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER FROM ANY CAUSES OF ACTION OF ANY KIND WITH RESPECT TO THIS SPECIFICATION OR ITS GOVERNING AGREEMENT, WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR NOT THE OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


4  Interfaces Definition

The following section describes generic interfaces for switch and platform management.

4.1  Open Ethernet Switch API

Current merchant switch silicon devices are supplied with proprietary SDKs. A protocol-driven SW stack needs to implement a hardware abstraction layer (HAL) to isolate its unified code from the vendor-specific logic. This work must be repeated for each hardware (HW) device that the stack supports.

The Open Ethernet Switch (OES) is a standards-driven (e.g. 802.1Q) interface. Each HW vendor should provide a glue layer between OES and its SDK APIs. This implies that, for all standards-driven functionality, the protocol stack needs no changes when porting from one HW device to another. The SDK and glue logic provided by each vendor are responsible for implementing the HW-specific logic.

Since HW devices are not completely aligned in their functionality, the OES interface definitions include, for each API, a pointer for vendor-specific logic. This pointer can be assigned a null value when no unique logic is needed.

The lack of a feature in a given device cannot always be compensated for by adding a vendor extension parameter to a single API. This document suggests not solving such conflicts in the OES layer. The solution is to implement a per-HW HAL in the protocol stack that calls the SDK API directly. With the OES API in place, this effort is required only for the conflicting features.

The OES API repository is located at https://github.com/open-ethernet/OES. Vendors should consider uploading their OES/SDK glue logic to the same repository.

OES API Convention

The APIs presented below are given as examples of the API coding convention.

This API creates or destroys a LAG ports group, and adds ports to or deletes ports from an existing LAG ports group.

    oes_status_e oes_api_lag_port_group_set(
        const oes_access_cmd_e access_cmd,
        const int br_id,
        unsigned long *lag_port_p,
        const unsigned long *log_port_list_p,
        const unsigned short port_cnt,
        void *lag_port_group_vs_ext_p
    );

Parameters:

    [in] access_cmd - CREATE/DESTROY/ADD/DELETE
    [in] br_id - bridge ID
    [in,out] lag_port_p - In: already created LAG ports group ID. Out: newly created LAG ports group ID. This parameter should be set to null in all the commands besides ADD.
    [in] log_port_list_p - list of logical ports to ADD/DELETE to/from a LAG ports group
    [in] port_cnt - number of logical ports to ADD/DELETE to/from a LAG ports group
    [in,out] lag_port_group_vs_ext_p - vendor specific extension

Returns:

    OES_STATUS_SUCCESS - operation completes successfully
    OES_STATUS_PARAM_ERROR - parameter is invalid
    OES_STATUS_RESOURCE_UNAVAILABLE - resource error
    OES_STATUS_ERROR - general error
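To make the calling convention concrete, here is a minimal sketch of a create-then-add sequence against a toy, in-memory stand-in for the vendor glue layer. It is not any vendor's implementation. The enumerator names (OES_ACCESS_CMD_*), the toy_* storage and the example_build_lag helper are assumptions made for illustration; the specification gives only the command and status names. The sketch also assumes the lag_port_p pointer itself is always valid, because CREATE writes the new ID through it and the other commands read an ID from it.

```c
#include <stddef.h>

/* Hypothetical enumerator spellings and values. */
typedef enum { OES_ACCESS_CMD_CREATE, OES_ACCESS_CMD_DESTROY,
               OES_ACCESS_CMD_ADD, OES_ACCESS_CMD_DELETE } oes_access_cmd_e;
typedef enum { OES_STATUS_SUCCESS, OES_STATUS_PARAM_ERROR,
               OES_STATUS_RESOURCE_UNAVAILABLE, OES_STATUS_ERROR } oes_status_e;

/* Toy backing store standing in for switch hardware (assumption). */
#define TOY_MAX_LAGS  4
#define TOY_MAX_PORTS 8
struct toy_lag { int in_use; unsigned short cnt; unsigned long ports[TOY_MAX_PORTS]; };
static struct toy_lag toy_lags[TOY_MAX_LAGS];

oes_status_e oes_api_lag_port_group_set(
    const oes_access_cmd_e access_cmd, const int br_id,
    unsigned long *lag_port_p, const unsigned long *log_port_list_p,
    const unsigned short port_cnt, void *lag_port_group_vs_ext_p)
{
    (void)br_id;                    /* single bridge in this toy model */
    (void)lag_port_group_vs_ext_p;  /* no vendor extension used here */
    if (lag_port_p == NULL)
        return OES_STATUS_PARAM_ERROR;

    if (access_cmd == OES_ACCESS_CMD_CREATE) {
        for (unsigned long id = 0; id < TOY_MAX_LAGS; id++) {
            if (!toy_lags[id].in_use) {
                toy_lags[id].in_use = 1;
                toy_lags[id].cnt = 0;
                *lag_port_p = id;   /* Out: newly created group ID */
                return OES_STATUS_SUCCESS;
            }
        }
        return OES_STATUS_RESOURCE_UNAVAILABLE;
    }

    /* Remaining commands operate on an already created group. */
    if (*lag_port_p >= TOY_MAX_LAGS || !toy_lags[*lag_port_p].in_use)
        return OES_STATUS_PARAM_ERROR;
    struct toy_lag *lag = &toy_lags[*lag_port_p];

    switch (access_cmd) {
    case OES_ACCESS_CMD_DESTROY:
        lag->in_use = 0;
        return OES_STATUS_SUCCESS;
    case OES_ACCESS_CMD_ADD:
        if (log_port_list_p == NULL)
            return OES_STATUS_PARAM_ERROR;
        if (lag->cnt + port_cnt > TOY_MAX_PORTS)
            return OES_STATUS_RESOURCE_UNAVAILABLE;
        for (unsigned short i = 0; i < port_cnt; i++)
            lag->ports[lag->cnt++] = log_port_list_p[i];
        return OES_STATUS_SUCCESS;
    case OES_ACCESS_CMD_DELETE:
        if (log_port_list_p == NULL)
            return OES_STATUS_PARAM_ERROR;
        for (unsigned short i = 0; i < port_cnt; i++)
            for (unsigned short j = 0; j < lag->cnt; j++)
                if (lag->ports[j] == log_port_list_p[i]) {
                    lag->ports[j] = lag->ports[--lag->cnt];
                    break;
                }
        return OES_STATUS_SUCCESS;
    default:
        return OES_STATUS_PARAM_ERROR;
    }
}

/* Example call sequence: create a group, then add two logical ports. */
oes_status_e example_build_lag(unsigned long *lag_id_out)
{
    const unsigned long ports[] = { 11, 12 };
    oes_status_e rc = oes_api_lag_port_group_set(
        OES_ACCESS_CMD_CREATE, 0, lag_id_out, NULL, 0, NULL);
    if (rc != OES_STATUS_SUCCESS)
        return rc;
    return oes_api_lag_port_group_set(
        OES_ACCESS_CMD_ADD, 0, lag_id_out, ports, 2, NULL);
}
```

A protocol stack written against this signature would not change when the toy storage is replaced by a vendor SDK call, which is the portability argument the section makes.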

This API retrieves the ports of an existing LAG ports group.

    oes_status_e oes_api_lag_port_group_get(
        const int br_id,
        const unsigned long lag_port,
        unsigned long *log_port_list_p,
        unsigned short *port_cnt_p,
        void *lag_port_group_vs_ext_p
    );

Parameters:

    [in] br_id - bridge ID
    [in] lag_port - LAG ports group ID
    [out] log_port_list_p - list of logical ports that are queried
    [in,out] port_cnt_p - number of logical ports that are queried. If a smaller number is provisioned, the actual count is returned.
    [in,out] lag_port_group_vs_ext_p - vendor specific extension

Returns:

    OES_STATUS_SUCCESS - operation completes successfully
    OES_STATUS_PARAM_ERROR - parameter is invalid
    OES_STATUS_ERROR - general error
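The in/out behaviour of port_cnt_p is easiest to see in code. The sketch below is a toy stand-in, not a vendor implementation: it serves one fixed LAG from a static table (the toy_* names and values are assumptions). It also assumes one reading of the parameter description: the caller passes the number of ports it asks for, and the function writes back the number actually returned, which is smaller when fewer ports are provisioned.

```c
#include <stddef.h>

/* Enumerator values are assumptions; only the names come from the text. */
typedef enum { OES_STATUS_SUCCESS, OES_STATUS_PARAM_ERROR,
               OES_STATUS_ERROR } oes_status_e;

/* Toy data: LAG 7 holds three logical ports (illustrative values). */
static const unsigned long toy_lag_id = 7;
static const unsigned long toy_members[] = { 21, 22, 23 };
static const unsigned short toy_member_cnt = 3;

oes_status_e oes_api_lag_port_group_get(
    const int br_id,
    const unsigned long lag_port,
    unsigned long *log_port_list_p,
    unsigned short *port_cnt_p,
    void *lag_port_group_vs_ext_p)
{
    (void)br_id;                    /* single bridge in this toy model */
    (void)lag_port_group_vs_ext_p;  /* no vendor extension used here */

    if (log_port_list_p == NULL || port_cnt_p == NULL ||
        lag_port != toy_lag_id)
        return OES_STATUS_PARAM_ERROR;

    /* Copy at most the requested number of ports. */
    unsigned short n = *port_cnt_p;
    if (n > toy_member_cnt)
        n = toy_member_cnt;
    for (unsigned short i = 0; i < n; i++)
        log_port_list_p[i] = toy_members[i];

    *port_cnt_p = n;                /* Out: actual count returned */
    return OES_STATUS_SUCCESS;
}
```

With this reading, a caller can size its buffer for the largest LAG it expects, pass that size in, and then trust only the first *port_cnt_p entries.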

OES API Names

Each API is named using the following convention: oes_api_<subfile>_<functionality>_set/get. A “set” API changes the configuration, while a “get” API queries the existing HW values.

Access Command

To reduce the number of APIs, a function can offer multiple operations, selected by the “access_cmd” parameter. Some of the access command values are as follows:

-  Create: Allocates and sets a new resource
-  Destroy: De-allocates and deletes an already allocated resource
-  Add: Sets a value to a field where no resource needs to be allocated
-  Delete: Erases a previously configured value and returns to the default
-  Get; get first; get next: Controls the read location from the database

Variable Types

Variables should be as generic as possible (int, long, short, etc.). Typedefs should be avoided. Structures/enumerators should be declared at the function header.

Return Value

The API may return one of the following statuses:

-  Success: Everything is OK

-  Parameter error: Something is invalid in the parameter sanity check

-  Error: Unexpected error from the SDK

-  Resource unavailable: No HW resources to accept the configuration
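The four return values listed above map naturally onto a small enumeration. The sketch below is only illustrative: the OES_STATUS_* names appear in the API examples, but their numeric values and the oes_status_str helper are assumptions.

```c
/* Status names as used in the API examples; values are assumptions. */
typedef enum {
    OES_STATUS_SUCCESS,
    OES_STATUS_PARAM_ERROR,
    OES_STATUS_ERROR,
    OES_STATUS_RESOURCE_UNAVAILABLE
} oes_status_e;

/* Hypothetical helper that turns a status into a short log string. */
const char *oes_status_str(oes_status_e status)
{
    switch (status) {
    case OES_STATUS_SUCCESS:
        return "success";               /* everything is OK */
    case OES_STATUS_PARAM_ERROR:
        return "parameter error";       /* parameter sanity check failed */
    case OES_STATUS_ERROR:
        return "error";                 /* unexpected error from the SDK */
    case OES_STATUS_RESOURCE_UNAVAILABLE:
        return "resource unavailable";  /* no HW resources available */
    }
    return "unknown status";
}
```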

Vendor Extensions

Each API supports vendor-specific enhancements through a pointer to memory that is allocated by the calling function and freed by the SDK implementation. The type that the pointer refers to should be agreed on by both the stack and the vendor implementation, and is outside the OES scope. Vendor extensions are ignored when a NULL pointer is passed.
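The ownership rule for vendor extensions can be sketched as follows. Everything named here is hypothetical: the vendor "acme", the acme_lag_ext structure and its hash_mode field stand in for whatever type a stack and a vendor might agree on. The sketch shows only the two behaviours stated above: a NULL pointer means no extension, and a non-NULL pointer is allocated by the caller and freed by the implementation.

```c
#include <stdlib.h>

/* Hypothetical extension type agreed between stack and vendor. */
struct acme_lag_ext {
    int hash_mode;
};

#define ACME_DEFAULT_HASH_MODE 0

/* Stand-in for vendor glue logic behind one OES call. Returns the hash
 * mode it would program into hardware. */
int acme_apply_lag_ext(void *lag_port_group_vs_ext_p)
{
    int hash_mode = ACME_DEFAULT_HASH_MODE;

    if (lag_port_group_vs_ext_p != NULL) {
        struct acme_lag_ext *ext = lag_port_group_vs_ext_p;
        hash_mode = ext->hash_mode;
        free(ext);  /* the implementation frees the caller's allocation */
    }
    return hash_mode;
}

/* Caller side: allocate the extension and hand ownership to the callee. */
int example_call_with_ext(int hash_mode)
{
    struct acme_lag_ext *ext = malloc(sizeof *ext);
    if (ext == NULL)
        return -1;
    ext->hash_mode = hash_mode;
    return acme_apply_lag_ext(ext);  /* do not touch ext after this call */
}
```

Because the callee frees the memory, the caller must pass heap memory and must not reuse the pointer afterwards.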

4.2  OCP Platform Control

The platform interface is implemented by accessing a set of files exposed by the Linux kernel (sysfs). The following paragraphs define the sysfs tree and the file syntax of the HW components.

To implement basic system control, the following components should be monitored/configured:

-  Fan speed
-  Various thermal sensors
-  Power supply units
-  LEDs

Figure 2 shows a high-level view of the management SW, from the user interface to the HW drivers. This document focuses on the kernel-user interface. Each OCP platform vendor should implement platform drivers aligned with the logic defined below.


Figure 2 - Platform Control Structure

[Figure labels: WEB, CLI, SNMP; platform daemon (monitor/configure/alarms); sensors.conf; platform interfaces; sysfs; structural entity interface; user/kernel (U/K) boundary; vendor Linux drivers; USB, I2C, PCI; temp1, temp2, LED1. Legend: platform provider code, management code.]

To leverage existing open source implementations, the lm-sensors framework (http://www.lm-sensors.org/) is adopted to control a switch system.

Platform Daemon

The platform daemon is platform management code that should be platform-independent. The per-platform variables are defined in a sensors.conf file that follows a predefined syntax and contains the translation from the driver semantics (e.g. fan1) to the platform semantics (e.g. fan_cpu_1).

Some sensors might be replaceable (e.g. a replaceable power supply). The platform daemon shall provide insert and eject events to the upper layers. If an upper layer attempts to set information on a device that is not currently present in the system, the configuration shall be stored in the platform daemon. If an upper layer attempts to get information from a device that is not currently present in the system, the platform daemon returns a failure notification with a matching cause.
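The set/get rules for absent devices can be sketched as a small state holder. All names below (pd_device, pd_set, pd_get, pd_on_insert, pd_on_eject and the result codes) are hypothetical. The text requires that a configuration for an absent device be stored; applying the stored value when the device is inserted is an assumption of this sketch.

```c
/* Hypothetical result codes for daemon requests. */
enum pd_result { PD_OK, PD_STORED, PD_NOT_PRESENT };

/* Minimal per-device state kept by the platform daemon (assumption). */
struct pd_device {
    int present;        /* 1 while the hardware is inserted */
    int hw_value;       /* value currently programmed in hardware */
    int has_pending;    /* 1 if a stored configuration is waiting */
    int pending_value;  /* configuration stored while absent */
};

/* Set: program the device, or store the value if it is absent. */
enum pd_result pd_set(struct pd_device *dev, int value)
{
    if (!dev->present) {
        dev->pending_value = value;
        dev->has_pending = 1;
        return PD_STORED;
    }
    dev->hw_value = value;
    return PD_OK;
}

/* Get: fail with a matching cause if the device is absent. */
enum pd_result pd_get(const struct pd_device *dev, int *value_out)
{
    if (!dev->present)
        return PD_NOT_PRESENT;
    *value_out = dev->hw_value;
    return PD_OK;
}

/* Insert event: mark present and apply any stored configuration
 * (applying it here is an assumption, not stated in the text). */
void pd_on_insert(struct pd_device *dev)
{
    dev->present = 1;
    if (dev->has_pending) {
        dev->hw_value = dev->pending_value;
        dev->has_pending = 0;
    }
}

/* Eject event: mark the device absent. */
void pd_on_eject(struct pd_device *dev)
{
    dev->present = 0;
}
```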

The sensors.conf file contains platform-specific data. Below is an example of default temperature sensor values:

    set temp1_max_hyst 45
    set temp1_max 52
    set temp1_crit_hyst 57
    set temp1_crit 62
    label temp1 "CPU Temp"
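As a rough illustration of how a platform daemon might read such lines, the sketch below parses a single "set" or "label" statement. The conf_* names are hypothetical, and the parser is deliberately narrow: it assumes plain numeric values and double-quoted labels as in the example above, and does not attempt the full sensors.conf syntax.

```c
#include <stdio.h>
#include <string.h>

enum conf_kind { CONF_NONE, CONF_SET, CONF_LABEL };

/* One parsed statement (hypothetical structure). */
struct conf_stmt {
    enum conf_kind kind;
    char name[32];    /* e.g. "temp1_max" or "temp1" */
    double value;     /* valid when kind == CONF_SET */
    char label[64];   /* valid when kind == CONF_LABEL */
};

/* Parse one line; unrecognized lines yield kind == CONF_NONE. */
struct conf_stmt conf_parse_line(const char *line)
{
    struct conf_stmt st;
    memset(&st, 0, sizeof st);

    if (sscanf(line, " set %31s %lf", st.name, &st.value) == 2)
        st.kind = CONF_SET;
    else if (sscanf(line, " label %31s \"%63[^\"]\"",
                    st.name, st.label) == 2)
        st.kind = CONF_LABEL;

    return st;
}
```

A daemon could feed each line of the file through this function and apply CONF_SET results to the matching sysfs field, while CONF_LABEL results populate its name translation table.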

The “set” commands assign values to their matching fields. A label is used to identify devices in a more readable way. A set of common label names is used (# represents a numeric variable):

-  fan_ps_#
-  fan_chassis_#
-  ps_#
-  led_ps_#
-  led_chassis_#

In some systems, the hardware is more autonomous than in others. For example, some platforms might control the fan speed by hardware while others might do it by software. To support both models, the platform code shall support monitoring functionality for temperature and voltage. The platform daemon is configured by the sensors.conf file to handle monitoring by hardware or by software.

sysfs Interface

In the sysfs interface, each hardware device gets its own directory under the /sys/devices tree. The sensors utility scans the symbolic links under the /sys/class/hwmon/ tree to locate the platform devices. Figure 3 shows the relations between the per-bus and per-device sysfs trees.

Figure 3 - sysfs Structure

[Figure labels: / > sys > devices and class; under devices: i2c..., xxx; under class: hwmon with hmon1 and hmon2; device entries fan1 and temp1, each with input and min files; symbolic link.]

When a device is not currently present in the system (e.g. if the fan tray is pulled out for maintenance/replacement), the driver deletes this device from the file system.
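Because the driver removes the files of an absent device, a reader of the sysfs tree has to treat a missing file as "device not present" rather than as a generic failure. The sketch below shows one way to do that with standard C I/O. The attr_result codes and the read_sysfs_long helper are hypothetical, and the path is supplied by the caller because the exact file layout is platform-defined.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical result codes for an attribute read. */
enum attr_result { ATTR_OK, ATTR_ABSENT, ATTR_ERROR };

/* Read one integer attribute (for example a fan speed in RPM or a
 * temperature in millidegrees Celsius) from the given file. */
enum attr_result read_sysfs_long(const char *path, long *value_out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        /* A missing file is taken to mean the device was removed. */
        return errno == ENOENT ? ATTR_ABSENT : ATTR_ERROR;
    }

    int matched = fscanf(f, "%ld", value_out);
    fclose(f);
    return matched == 1 ? ATTR_OK : ATTR_ERROR;
}
```

A daemon built this way can turn ATTR_ABSENT into the failure notification with a matching cause that the Platform Daemon section calls for.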

Some sensors may be related to others. For example, a fan tray may have two fans, each equipped with its own speed meter, with a single PWM unit shared by both fans. In such a case, “set speed” commands set the speed of both fan units (i.e. setting the first fan's speed also sets the second fan's speed, and vice versa).

Fans

Note: The fan definitions in this document are based on lm-sensors 3.3.4 (May 13).

The fan file system supports the following information (or part of it).

File Name | Description | Access | Note
----------|-------------|--------|-----
input | Measured fan speed (in RPM) | RO |
min | Minimum fan speed value (in RPM) | RW | If the software controls the speed. Only the min shows.
max | Maximum fan speed value (in RPM) | RW |
div | Fan divisor. Integer value in powers of two (1, 2, 4, 8, 16, 32, 64, 128). | RW |
pulses | Number of tachometer pulses per fan revolution. Integer value, typically between 1 and 4. | RW |
alarm | 1 means an alarm condition exists; 0 means no alarm | RO | “alarm” won't show if “min_alarm” or “max_alarm” are presented, and vice versa
min_alarm | 1 means an alarm condition exists; speed too slow | RO |
max_alarm | 1 means an alarm condition exists; speed too high | RO |
fault | This can be used to notify open diodes, unconnected fans etc. | RO |
beep | Beeped when an alarm occurs | RW |

LEDs

The LEDs file system supports the following information.

File Name | Description | Access
----------|-------------|-------
operation | Color and flashing rate: nocolor, yellow, green, red, blue, yellow_blink, green_blink, red_blink, yellow_fast, green_fast, red_fast, hw_control. | RW

Temp

Note: The temp definitions in this document are based on lm-sensors 3.3.4 (May 13).

Figure 4 shows the various thresholds and alarms for temperature sensors.

Figure 4 - Temp Sensor Alarms and Thresholds

[Figure labels, thresholds: lcrit, min, max, crit, emergency; hysteresis levels: max hyst, crit hyst, emergency hyst; alarms: Lcrit alarm, Min alarm, Max alarm, Crit alarm, Emergency alarm. Legend: alarm on, alarm off.]

The temp file system supports the following information (or part of it).

File Name | Description | Access | Note
----------|-------------|--------|-----
input | Measured temperature in millidegrees Celsius | RO |
max | Maximum temperature value | RW | max: alarm on; max_hyst: alarm off
max_hyst | Maximum limit for temperature hysteresis value; absolute | RW |
min | Minimum temperature value | RW |
crit | Critical maximum temperature value. Typically greater than corresponding temp_max values. | RW |
crit_hyst | Critical limit for temperature hysteresis value; absolute | RW |
lcrit | Critical minimum temperature value. Typically lower than corresponding temp_min values. | RW |
emergency | Emergency maximum temperature value; for chips supporting more than two upper temperature limits. Must be equal to or greater than corresponding temp_crit value. | RW |
emergency_hyst | Temperature hysteresis value for emergency limit | RW |
lowest | Historical minimum temperature | RO |
highest | Historical maximum temperature | RO |
alarm | 1 means an alarm condition exists; 0 means no alarm | RO | “alarm” does not show if “min_alarm”, “max_alarm” or any other specific alarm is presented, and vice versa
min_alarm | 1 means an alarm condition exists; temp too low | RO |
max_alarm | 1 means an alarm condition exists; temp too high | RO |
crit_alarm | 1 means an alarm condition exists; temp critical | RO |
emergency_alarm | 1 means an alarm condition exists; temp emergency | RO |
lcrit_alarm | 1 means an alarm condition exists; temp low critical | RO |
fault | This can be used to notify open diodes, unconnected fans etc. | RO |
type | Sensor type selection, enum: 3-thermal diode, 4-thermistor | RW |
offset | Temperature offset added to the chip temperature reading. | RO |
beep | Beeped when an alarm occurs | RW |
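The relationship between max and max_hyst in the table above (alarm on at max, alarm off at max_hyst) is a hysteresis loop. The sketch below shows that behaviour in isolation. The temp_alarm structure and function names are hypothetical, and the choice of inclusive comparisons at both thresholds is an assumption. All values are in millidegrees Celsius, matching the unit of the input file.

```c
/* Hysteresis state for one upper threshold (hypothetical structure).
 * All temperatures are in millidegrees Celsius. */
struct temp_alarm {
    long max;       /* alarm turns on when the reading reaches this */
    long max_hyst;  /* alarm turns off when the reading falls to this */
    int  active;    /* 1 while the alarm is raised */
};

/* Feed one reading and return the resulting alarm state. */
int temp_alarm_update(struct temp_alarm *a, long input_mc)
{
    if (!a->active && input_mc >= a->max)
        a->active = 1;
    else if (a->active && input_mc <= a->max_hyst)
        a->active = 0;
    return a->active;
}

/* Convert a millidegree reading to degrees Celsius for display. */
double millideg_to_celsius(long mc)
{
    return mc / 1000.0;
}
```

Between the two thresholds the alarm keeps its previous state, which prevents it from toggling rapidly when the temperature hovers around max.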

The web page http://www.mellanox.com/page/ocp hosts the “OCP 10 and 40 Gigabit Ethernet Switch Platform Management Demo” video, which demonstrates switch platform management based on the above definitions.


5  Revision History

Revision | Date | Description
---------|------|------------
1.0 | 27-Jan-14 | First release
1.01 | 30-Jan-14 | Added a pointer to the OCP demo video