SW Interfaces for an Open Compute Project Switch
Authors: Aviad Raveh, Matty Kadosh, Ariel Almog
January 2014
1 Scope
This document provides technical specifications for software interfaces of an Open Compute Project switch.
2 Contents
1 Scope
2 Contents
3 Overview
    3.1 License
4 Interfaces Definition
    4.1 Open Ethernet Switch API
    4.2 OCP Platform Control
5 Revision History
3 Overview
This document proposes standard, unified interfaces for controlling the switch silicon and platform of an OCP switch. To comply with these interfaces, each OCP switch should be supplied with a Software Development Kit (SDK) containing drivers that implement the logic behind them.
Once this is in place, the unified interface significantly reduces the effort of porting protocol and system monitoring stacks when integrating a solution with any system or switch hardware vendor.
Figure 1 shows a high-level software (SW) architecture containing the following components:
- Open Network Install Environment (ONIE) compatible boot loader
- Network OS: software that manages the switching/routing protocols together with the system management logic. This element is outside the scope of this document.
- Open Ethernet Switch (OES) APIs: a set of functions that configure the fast path switch HW.
- Platform interfaces: a set of functions that control the system components: fans, temperature sensors, power supplies and LEDs. Other components, such as a watchdog and an interrupt controller, can be added as the standard is adopted by the OCP community.
Figure 1 - OCP Software Components
3.1 License
As of April 7, 2011, the following persons or entities have made this Specification available under the Open Web Foundation Final Specification Agreement (OWFa 1.0), which is available at http://www.openwebfoundation.org/legal/the-owf-1-0-agreements/owfa-1-0 :
Facebook, Inc.
You can review the signed copies of the Open Web Foundation Agreement Version 1.0 for this Specification at http://opencompute.org/licensing/ , which may also include additional parties to those listed above.
Your use of this Specification may be subject to other third party rights. THIS SPECIFICATION IS PROVIDED "AS IS." The contributors expressly disclaim any warranties (express, implied, or otherwise), including implied warranties of merchantability, non-infringement, fitness for a particular purpose, or title, related to the Specification. The entire risk as to implementing or otherwise using the Specification is assumed by the Specification implementer and user.
IN NO EVENT WILL ANY PARTY BE LIABLE TO ANY OTHER PARTY FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER FROM ANY CAUSES OF ACTION OF ANY KIND WITH RESPECT TO THIS SPECIFICATION OR ITS GOVERNING AGREEMENT, WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR NOT THE OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
4 Interfaces Definition
This section describes generic interfaces for switch and platform management.
4.1 Open Ethernet Switch API
Current merchant switch silicon devices are each supplied with a proprietary SDK. A protocol-driven SW stack therefore needs to implement a hardware abstraction layer (HAL) to isolate its common code from the vendor-specific logic, and this work has to be repeated for each hardware (HW) device that the stack supports.
The Open Ethernet Switch (OES) is a standards-driven (e.g. .1Q) interface. Each HW vendor should provide a glue layer between OES and its SDK APIs. This implies that, for all standards-driven functionality, the protocol stack needs no changes when porting from one HW device to another. The SDK and glue logic provided by each vendor are responsible for implementing the HW-specific logic.
Since HW devices are not completely aligned in their functionality, the OES interface definitions include, for each API, a pointer for vendor-specific logic. This pointer can be assigned a null value when no vendor-specific logic is needed.
The lack of a feature in a given device cannot always be compensated for by adding a vendor extension parameter to a single API. This document suggests not solving such conflicts in the OES layer. Instead, the protocol stack implements a per-HW HAL that calls the SDK API directly. With the OES API in place, this effort is required only for the conflicting features.
The OES API repository is located at https://github.com/open-ethernet/OES . Vendors should consider uploading their OES-to-SDK glue logic to the same repository.
OES API Convention
The APIs presented are given as an example of the API coding convention.
This API creates/destroys a new/existing LAG group and adds/deletes ports to/from an existing
LAG ports group.
oes_status_e oes_api_lag_port_group_set (
    const oes_access_cmd_e access_cmd,
    const int br_id,
    unsigned long * lag_port_p,
    const unsigned long * log_port_list_p,
    const unsigned short port_cnt,
    void * lag_port_group_vs_ext_p
);
Parameters:
[in] access_cmd - CREATE/DESTROY/ADD/DELETE
[in] br_id - bridge ID
[in,out] lag_port_p - In: Already created LAG ports group ID. Out: Newly created LAG ports group ID. This parameter should be set to null in all the commands besides ADD.
[in] log_port_list_p - list of logical ports to ADD/DELETE to/from a LAG ports group
[in] port_cnt - number of logical ports to ADD/DELETE to/from a LAG ports group
[in,out] lag_port_group_vs_ext_p - vendor specific extension
Returns:
OES_STATUS_SUCCESS - operation completes successfully
OES_STATUS_PARAM_ERROR - parameter is invalid
OES_STATUS_RESOURCE_UNAVAILABLE - resource error
OES_STATUS_ERROR - general error
This API retrieves the member ports of an existing LAG ports group.
oes_status_e oes_api_lag_port_group_get (
    const int br_id,
    const unsigned long lag_port,
    unsigned long * log_port_list_p,
    unsigned short * port_cnt_p,
    void * lag_port_group_vs_ext_p
);
Parameters:
[in] br_id - bridge ID
[in] lag_port - LAG ports group ID
[out] log_port_list_p - list of logical ports that are queried
[in,out] port_cnt_p - number of logical ports that are queried. If a smaller number is provisioned, the actual count is returned.
[in,out] lag_port_group_vs_ext_p - vendor specific extension
Returns:
OES_STATUS_SUCCESS - operation completes successfully
OES_STATUS_PARAM_ERROR - parameter is invalid
OES_STATUS_ERROR - general error
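As a minimal sketch of how a protocol stack might call the two example APIs, the C code below pairs them with a small in-memory mock. The enum values, the MAX_LAGS and MAX_PORTS limits, the lag_demo helper and the mock behavior are all assumptions made for illustration; the real definitions come from the OES headers and the vendor glue layer. The mock also assumes that the caller always passes valid storage for lag_port_p.

```c
#include <stddef.h>
#include <string.h>

/* Assumed enum values; the real ones are defined by the OES headers. */
typedef enum { OES_STATUS_SUCCESS, OES_STATUS_PARAM_ERROR,
               OES_STATUS_RESOURCE_UNAVAILABLE, OES_STATUS_ERROR } oes_status_e;
typedef enum { OES_ACCESS_CMD_CREATE, OES_ACCESS_CMD_DESTROY,
               OES_ACCESS_CMD_ADD, OES_ACCESS_CMD_DELETE } oes_access_cmd_e;

#define MAX_LAGS  4   /* hypothetical limits of the mock */
#define MAX_PORTS 8

struct mock_lag { int used; unsigned short cnt; unsigned long ports[MAX_PORTS]; };
static struct mock_lag lag_db[MAX_LAGS];

oes_status_e oes_api_lag_port_group_set(
    const oes_access_cmd_e access_cmd, const int br_id,
    unsigned long *lag_port_p, const unsigned long *log_port_list_p,
    const unsigned short port_cnt, void *lag_port_group_vs_ext_p)
{
    struct mock_lag *lag;
    unsigned long i;

    (void)br_id;                      /* unused by the mock */
    (void)lag_port_group_vs_ext_p;    /* NULL means no vendor extension */
    if (lag_port_p == NULL)
        return OES_STATUS_PARAM_ERROR;

    if (access_cmd == OES_ACCESS_CMD_CREATE) {
        for (i = 0; i < MAX_LAGS; i++) {
            if (!lag_db[i].used) {
                lag_db[i].used = 1;
                lag_db[i].cnt = 0;
                *lag_port_p = i;      /* Out: newly created LAG ID */
                return OES_STATUS_SUCCESS;
            }
        }
        return OES_STATUS_RESOURCE_UNAVAILABLE;
    }
    if (*lag_port_p >= MAX_LAGS || !lag_db[*lag_port_p].used)
        return OES_STATUS_PARAM_ERROR;
    lag = &lag_db[*lag_port_p];

    if (access_cmd == OES_ACCESS_CMD_DESTROY) {
        lag->used = 0;
        return OES_STATUS_SUCCESS;
    }
    if (access_cmd == OES_ACCESS_CMD_ADD) {
        if (log_port_list_p == NULL)
            return OES_STATUS_PARAM_ERROR;
        if (lag->cnt + port_cnt > MAX_PORTS)
            return OES_STATUS_RESOURCE_UNAVAILABLE;
        memcpy(&lag->ports[lag->cnt], log_port_list_p,
               port_cnt * sizeof(unsigned long));
        lag->cnt = (unsigned short)(lag->cnt + port_cnt);
        return OES_STATUS_SUCCESS;
    }
    return OES_STATUS_ERROR;          /* DELETE is omitted from this sketch */
}

oes_status_e oes_api_lag_port_group_get(
    const int br_id, const unsigned long lag_port,
    unsigned long *log_port_list_p, unsigned short *port_cnt_p,
    void *lag_port_group_vs_ext_p)
{
    (void)br_id;
    (void)lag_port_group_vs_ext_p;
    if (log_port_list_p == NULL || port_cnt_p == NULL ||
        lag_port >= MAX_LAGS || !lag_db[lag_port].used)
        return OES_STATUS_PARAM_ERROR;
    /* Copy at most *port_cnt_p entries and report the actual count. */
    if (*port_cnt_p > lag_db[lag_port].cnt)
        *port_cnt_p = lag_db[lag_port].cnt;
    memcpy(log_port_list_p, lag_db[lag_port].ports,
           *port_cnt_p * sizeof(unsigned long));
    return OES_STATUS_SUCCESS;
}

/* Example caller: create a LAG, add two ports, read them back.
 * Returns the member count, or -1 on any failure. */
int lag_demo(void)
{
    const unsigned long ports[2] = { 0x10001, 0x10002 }; /* made-up port IDs */
    unsigned long lag_id = 0, members[MAX_PORTS];
    unsigned short cnt = MAX_PORTS;

    if (oes_api_lag_port_group_set(OES_ACCESS_CMD_CREATE, 0, &lag_id,
                                   NULL, 0, NULL) != OES_STATUS_SUCCESS ||
        oes_api_lag_port_group_set(OES_ACCESS_CMD_ADD, 0, &lag_id,
                                   ports, 2, NULL) != OES_STATUS_SUCCESS ||
        oes_api_lag_port_group_get(0, lag_id, members, &cnt,
                                   NULL) != OES_STATUS_SUCCESS)
        return -1;
    return cnt;
}
```

The caller passes NULL for the vendor extension in every call, so the same code would run unchanged on any vendor's glue layer.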
OES API Names
Each API is built using the following naming convention: oes_api_<subfile>_<functionality>_set/get. A “set” API changes the configuration, while a “get” API queries the existing HW values.
Access Command
To reduce the number of APIs, a function can offer several operations, selected by its “access_cmd” parameter. Some of the access command values are as follows:
- Create: Allocates and sets a new resource
- Destroy: De-allocates and deletes an already allocated resource
- Add: Sets a value to a field where no resource needs to be allocated
- Delete: Erases a previously configured value and returns it to the default
- Get; get first; get next: Controls the read location from the database
Variable Types
Variables should be as generic as possible (int, long, short, etc.). Typedefs should be avoided.
Structures/enumerators should be declared at the function header.
Return Value
The API may return one of the following status values:
- Success: Everything is OK
- Parameter error: A parameter failed the sanity check
- Error: Unexpected error from the SDK
- Resource unavailable: No HW resources are available to accept the configuration
Vendor Extensions
Each API supports vendor-specific enhancements through a pointer to memory that is allocated by the calling function and freed by the SDK implementation. The type that the pointer refers to should be agreed on by both the stack and the vendor implementation and is outside the OES scope. Passing a NULL pointer tells the API to ignore vendor extensions.
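The ownership rule above can be sketched in C as follows. The struct acme_lag_ext type, its hash_mode field and both helper functions are hypothetical; the real extension layout is whatever the stack and the vendor agree on outside OES.

```c
#include <stdlib.h>

/* Hypothetical vendor extension, agreed between the stack and the vendor. */
struct acme_lag_ext {
    int hash_mode;
};

#define ACME_DEFAULT_HASH_MODE 0

/* Stack side: allocate an extension to pass through the void * parameter. */
void *acme_make_ext(int hash_mode)
{
    struct acme_lag_ext *ext = malloc(sizeof *ext);
    if (ext != NULL)
        ext->hash_mode = hash_mode;
    return ext;
}

/* Vendor (SDK) side: read the extension if one was passed, then free it.
 * A NULL pointer means "no extension", so the default is used. */
int acme_take_hash_mode(void *vs_ext_p)
{
    int mode = ACME_DEFAULT_HASH_MODE;
    if (vs_ext_p != NULL) {
        struct acme_lag_ext *ext = vs_ext_p;
        mode = ext->hash_mode;
        free(ext);
    }
    return mode;
}
```

Because the SDK side frees the memory, the caller must not touch the pointer again after the API call returns.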
4.2 OCP Platform Control
The platform interface is implemented through a set of files exposed by the Linux kernel (sysfs). The following sections define the sysfs tree and the file syntax of the HW components.
To implement basic system control, the following components should be monitored/configured:
- Fan speed
- Various thermal sensors
- Power supply units
- LEDs
Figure 2 shows a high-level management SW view from the user interface to the HW drivers. This document focuses on the kernel-user interface. Each OCP platform vendor should implement the platform drivers aligned with the logic defined below.
Figure 2 - Platform Control Structure
[Figure: in user space (U), management code (Web, CLI, SNMP) sits on top of the platform daemon (monitor/configure/alarms), which reads sensors.conf. In kernel space (K), the sysfs platform interfaces (e.g. temp1, temp2, LED1) are provided by vendor Linux drivers over USB, I2C and PCI. The legend distinguishes platform provider code from management code, and structural entities from interfaces.]
To leverage an existing open source implementation, the lm-sensors framework (http://www.lm-sensors.org/) is adopted to control the switch system.
Platform Daemon
The platform daemon is platform management code that should be platform-independent. The per-platform unique variables are defined in a sensors.conf file that is based on a predefined syntax and contains the translation from the driver semantics (e.g. fan1) to the platform semantics (e.g. fan_cpu_1).
Some sensors might be replaceable (e.g. a replaceable power supply). The platform daemon shall provide insert and eject events to the upper layers. If an upper layer attempts to set information on a device that is not currently present in the system, the configuration shall be stored in the platform daemon. If an upper layer attempts to get information from a device that is not currently present in the system, the platform daemon returns a failure notification with a matching cause.
The sensors.conf file contains platform-specific data. Below is an example of default temperature sensor values:
set temp1_max_hyst 45
set temp1_max 52
set temp1_crit_hyst 57
set temp1_crit 62
label temp1 "CPU Temp"
The “set” commands assign values to their matching fields. A label identifies a device in a more readable way. The following common label names are used (# represents a numeric variable):
- fan_ps_#
- fan_chassis_#
- ps_#
- led_ps_#
- led_chassis_#
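As an illustration of the “set” and “label” statements shown in the sensors.conf example, the C sketch below parses one line at a time. It handles only those two statement forms with a plain numeric value or a quoted label; the struct conf_stmt type and the parse_conf_line function are hypothetical names, and a full sensors.conf parser (chip sections, expressions) is out of its scope.

```c
#include <stdio.h>
#include <string.h>

/* One parsed statement (hypothetical type, for illustration only). */
struct conf_stmt {
    char kind[8];     /* "set" or "label" */
    char name[32];    /* e.g. temp1_max or temp1 */
    char value[64];   /* numeric text, or label text without quotes */
};

/* Parse a single line. Returns 1 for a recognized "set" or "label"
 * statement, 0 for anything else (comments, blank lines, other keywords). */
int parse_conf_line(const char *line, struct conf_stmt *out)
{
    int used = 0;
    const char *rest;

    /* Read the keyword and the name, and note where the value starts. */
    if (sscanf(line, "%7s %31s %n", out->kind, out->name, &used) < 2)
        return 0;
    rest = line + used;

    if (strcmp(out->kind, "set") == 0)
        return sscanf(rest, "%63s", out->value) == 1;
    if (strcmp(out->kind, "label") == 0)
        return sscanf(rest, "\"%63[^\"]\"", out->value) == 1;
    return 0;
}
```

A platform daemon could feed each recognized statement into its own threshold and naming tables; how it does so is left open here.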
In some systems, the hardware is more autonomous than in others. For example, some platforms might control the fan speed by hardware while others might do it by software. To support both models, the platform code shall support monitoring functionality for temperature and voltage. The platform daemon is configured by the sensors.conf file to handle monitoring by hardware or by software.
sysfs Interface
In the sysfs interface, each hardware device gets its own directory under the /sys/devices tree. The sensors utility scans the symbolic links under the /sys/class/hwmon/ tree to locate the platform devices. Figure 3 shows the relations between the per-bus and per-device sysfs trees.
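Figure 3 - sysfs Structure
[Figure: directory tree rooted at /sys. The devices branch holds per-bus directories (i2c, ...) containing device directories with entries such as fan1 and temp1, each with input and min files. The class/hwmon branch holds hwmon1 and hwmon2, which are symbolic links into the devices branch.]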
When a device is not currently present in the system (e.g. if the fan tray is pulled out for
maintenance/replacement), the driver deletes this device from the file system.
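Because an absent device simply has no files, a reader of the sysfs tree can tell "device not present" apart from other failures. The C sketch below shows one way to do this; the sysfs_read_long function name and its choice of return codes are assumptions, not part of this specification.

```c
#include <errno.h>
#include <stdio.h>

/* Read one integer attribute from a sysfs-style file.
 * Returns 0 on success, -ENOENT if the file does not exist (the device
 * is not currently present), or -EIO on any other failure. */
int sysfs_read_long(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return errno == ENOENT ? -ENOENT : -EIO;
    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -EIO;
}
```

A caller might pass a path such as /sys/class/hwmon/hwmon1/fan1_input (an assumed example path) and translate -ENOENT into the failure notification that the platform daemon reports for a missing device.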
Some of the sensors may be related to others. For example, a fan tray may have two fans, each equipped with its own speed meter, while a single PWM unit is shared by both fans. In such a case, “set speed” commands set the speed of both fan units (i.e. setting the first fan's speed also sets the second fan's speed, and vice versa).
Fans
Note: The fan definitions in this document are based on lm-sensors 3.3.4 (May 13).
The fan file system supports the following information (or part of it).
File Name | Description | Access | Note
input | Measured fan speed (in RPM) | RO | -
min | Minimum fan speed value (in RPM) | RW | If the software controls the speed, only the min shows.
max | Maximum fan speed value (in RPM) | RW | -
div | Fan divisor. Integer value in powers of two (1, 2, 4, 8, 16, 32, 64, 128). | RW | -
pulses | Number of tachometer pulses per fan revolution. Integer value, typically between 1 and 4. | RW | -
alarm | 1 means an alarm condition exists; 0 means no alarm | RO | “alarm” is not shown if “min_alarm” or “max_alarm” is present, and vice versa.
min_alarm | 1 means an alarm condition exists; speed too slow | RO | -
max_alarm | 1 means an alarm condition exists; speed too high | RO | -
fault | This can be used to report open diodes, unconnected fans etc. | RO | -
beep | Beep when an alarm occurs | RW | -
LEDs
The LEDs file system supports the following information.
File Name | Description | Access
operation | Color and flashing rate: nocolor, yellow, green, red, blue, yellow_blink, green_blink, red_blink, yellow_fast, green_fast, red_fast, hw_control. | RW
Temp
Note: The temp definitions in this document are based on lm-sensors 3.3.4 (May 13).
Figure 4 shows the various thresholds and alarms for temperature sensors.
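Figure 4 - Temp Sensor Alarms and Thresholds
[Figure: temperature axis marked with the lcrit, min, max, crit and emergency thresholds and the max hyst, crit hyst and emergency hyst levels, together with the points at which the lcrit, min, max, crit and emergency alarms turn on and off.]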
The temp file system supports the following information (or part of it).
File Name | Description | Access | Note
input | Measured temperature in millidegrees Celsius | RO | -
max | Maximum temperature value | RW | The alarm turns on at max and turns off at max_hyst.
max_hyst | Maximum limit for temperature hysteresis value; absolute | RW | -
min | Minimum temperature value | RW | -
crit | Critical maximum temperature value. Typically greater than the corresponding temp_max value. | RW | -
crit_hyst | Critical limit for temperature hysteresis value; absolute | RW | -
lcrit | Critical minimum temperature value. Typically lower than the corresponding temp_min value. | RW | -
emergency | Emergency maximum temperature value, for chips supporting more than two upper temperature limits. Must be equal to or greater than the corresponding temp_crit value. | RW | -
emergency_hyst | Temperature hysteresis value for the emergency limit | RW | -
lowest | Historical minimum temperature | RO | -
highest | Historical maximum temperature | RO | -
alarm | 1 means an alarm condition exists; 0 means no alarm | RO | “alarm” is not shown if “min_alarm”, “max_alarm” or any other specific alarm is present, and vice versa.
min_alarm | 1 means an alarm condition exists; temp too low | RO | -
max_alarm | 1 means an alarm condition exists; temp too high | RO | -
crit_alarm | 1 means an alarm condition exists; temp critical | RO | -
emergency_alarm | 1 means an alarm condition exists; temp emergency | RO | -
lcrit_alarm | 1 means an alarm condition exists; temp low critical | RO | -
fault | This can be used to report open diodes, unconnected fans etc. | RO | -
type | Sensor type selection, enum: 3 - thermal diode, 4 - thermistor | RW | -
offset | Temperature offset added to the chip temperature reading. | RO | -
beep | Beep when an alarm occurs | RW | -
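The note on the max row describes hysteresis: the alarm turns on at max and turns off only once the temperature has dropped to max_hyst. A minimal C sketch of that rule follows. The update_max_alarm function name is hypothetical, and treating both comparisons as inclusive is an assumption, since the table does not state what happens exactly at the threshold.

```c
/* Returns the new max-alarm state (1 = on, 0 = off).
 * All temperatures are in millidegrees Celsius, as in the input file. */
int update_max_alarm(int alarm_on, long temp, long max, long max_hyst)
{
    if (temp >= max)
        return 1;          /* at or above max: alarm on */
    if (temp <= max_hyst)
        return 0;          /* at or below max_hyst: alarm off */
    return alarm_on;       /* between the two: keep the previous state */
}
```

With the sensors.conf example values (max 52, max_hyst 45, i.e. 52000 and 45000 millidegrees), a reading of 50000 leaves an active alarm on and an inactive alarm off, which prevents the alarm from toggling around a single threshold.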
The web page http://www.mellanox.com/page/ocp hosts the “OCP 10 and 40 Gigabit Ethernet Switch Platform Management Demo” video, which demonstrates switch platform management based on the above definitions.
5 Revision History
Revision Date Description
1.0 27-Jan-14 First release
1.01 30-Jan-14 Added a pointer to the OCP demo video