Interface between kernel and user space
![Page 2: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/2.jpg)
Overview
User Space
Kernel Space
netlink socket
rtnetlink socket
include/linux/pkt_cls.h
include/linux/pkt_sched.h
net/netlink
tc
struct sockaddr_nl
struct nlmsghdr
net/core/rtnetlink.c
include/linux/rtnetlink.h
![Page 3: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/3.jpg)
Boot Time
__initfunc
pktsched_init
net/core/dev.c
net/sched/sch_api.c
• declarations
• binding
![Page 4: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/4.jpg)
pktsched_init
struct rtnetlink_link *link_p;
if (link_p) {
    link_p[RTM_NEWQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_DELQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_GETQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_GETQDISC-RTM_BASE].dumpit = tc_dump_qdisc;
    link_p[RTM_NEWTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_DELTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_GETTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_GETTCLASS-RTM_BASE].dumpit = tc_dump_tclass;
}
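The binding above is a table of function pointers indexed by (message type - RTM_BASE). The following is a minimal user-space sketch of that idea, not kernel code: every `demo_` / `DEMO_` name is hypothetical, the handlers are mocks that only return a marker value, and `DEMO_RTM_MAX` is an arbitrary bound chosen for the demo. The message numbers are local copies of the rtnetlink values.

```c
#include <stddef.h>

/* Local copies of rtnetlink message numbers, for a user-space demo only. */
enum {
    DEMO_RTM_BASE     = 16,
    DEMO_RTM_NEWQDISC = 36,
    DEMO_RTM_DELQDISC = 37,
    DEMO_RTM_GETQDISC = 38,
    DEMO_RTM_MAX      = 48   /* hypothetical upper bound for this demo */
};

/* Simplified stand-in for struct rtnetlink_link: a handler pair per type. */
struct demo_link {
    int (*doit)(int msg_type);
    int (*dumpit)(int msg_type);
};

/* Mock handlers: they only return a value that identifies the call. */
int demo_ctl_qdisc(int msg_type)  { return msg_type; }
int demo_dump_qdisc(int msg_type) { return -msg_type; }

/* Zero-initialized, so every unbound slot has NULL handlers. */
struct demo_link demo_table[DEMO_RTM_MAX - DEMO_RTM_BASE];

/* Mirrors the slide: fill the slots at index (type - RTM_BASE). */
void demo_pktsched_init(struct demo_link *link_p)
{
    if (link_p) {
        link_p[DEMO_RTM_NEWQDISC - DEMO_RTM_BASE].doit   = demo_ctl_qdisc;
        link_p[DEMO_RTM_DELQDISC - DEMO_RTM_BASE].doit   = demo_ctl_qdisc;
        link_p[DEMO_RTM_GETQDISC - DEMO_RTM_BASE].doit   = demo_ctl_qdisc;
        link_p[DEMO_RTM_GETQDISC - DEMO_RTM_BASE].dumpit = demo_dump_qdisc;
    }
}

/* Look up and call the doit handler; returns -1 if none is registered. */
int demo_dispatch(const struct demo_link *link_p, int msg_type)
{
    if (msg_type < DEMO_RTM_BASE || msg_type >= DEMO_RTM_MAX)
        return -1;
    const struct demo_link *l = &link_p[msg_type - DEMO_RTM_BASE];
    return l->doit ? l->doit(msg_type) : -1;
}
```

After `demo_pktsched_init(demo_table)`, a call such as `demo_dispatch(demo_table, DEMO_RTM_NEWQDISC)` reaches the mock handler, while an unbound type falls through to -1.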
![Page 5: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/5.jpg)
User level Application
Create netlink socket
sendto
netlink_sendmsg
rtnetlink_rcv_msg
call function in rtnetlink_link
net/core/rtnetlink.c
net/netlink/af_netlink.c
![Page 6: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/6.jpg)
nl_table
nl_table : array of linked lists of INET sockets, one list per netlink protocol
![Page 7: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/7.jpg)
rtnetlink_links
rtnetlink_links : array of pointers to rtnetlink_link
rtnetlink_link : command (doit, dumpit)
![Page 8: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/8.jpg)
TC program
do_qdisc
do_class
do_filter
tc_qdisc_modify
tc_qdisc_list
usage
![Page 9: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/9.jpg)
tc_qdisc_modify
allocate “req”
initialize it
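As an illustration of what allocating and initializing “req” can look like, here is a hedged sketch built on the Linux UAPI headers. `struct demo_req` and `demo_init_req` are hypothetical names, the 1024-byte attribute buffer is an arbitrary size, and the flag combination shown is an assumption that corresponds to an "add" style request rather than something the slide specifies.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Request layout: netlink header, then tcmsg, then room for attributes. */
struct demo_req {
    struct nlmsghdr n;
    struct tcmsg    t;
    char            buf[1024];   /* attribute space; size is arbitrary here */
};

/* Zero the request and fill in the header fields for a new-qdisc message. */
void demo_init_req(struct demo_req *req, int ifindex, unsigned int parent)
{
    memset(req, 0, sizeof(*req));
    req->n.nlmsg_len   = NLMSG_LENGTH(sizeof(struct tcmsg));
    req->n.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL; /* assumed */
    req->n.nlmsg_type  = RTM_NEWQDISC;
    req->t.tcm_family  = AF_UNSPEC;
    req->t.tcm_ifindex = ifindex;   /* device the qdisc is attached to */
    req->t.tcm_parent  = parent;    /* parent handle */
}
```

The kernel side later reads `tcm_ifindex` and `tcm_parent` from this same structure.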
![Page 10: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/10.jpg)
tc_qdisc_modify (cont’d)
rtnl_open : create ‘rtnetlink’ socket
family = AF_NETLINK
type = SOCK_RAW
protocol = NETLINK_ROUTE
set up and bind local address, sockaddr_nl local
call “rtnl_talk”
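The socket parameters listed above map directly onto the `socket` and `bind` system calls. The sketch below assumes a Linux system; `demo_rtnl_open` is a hypothetical name and is not the iproute2 function itself.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Open and bind a rtnetlink socket; returns the fd, or -1 on failure. */
int demo_rtnl_open(struct sockaddr_nl *local)
{
    /* Prepare the local address first so it is defined on every path. */
    memset(local, 0, sizeof(*local));
    local->nl_family = AF_NETLINK;
    local->nl_groups = 0;   /* no multicast groups */
    /* nl_pid stays 0: the kernel assigns a unique port id on bind. */

    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)local, sizeof(*local)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The caller owns the returned descriptor and should `close` it when done.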
![Page 11: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/11.jpg)
rtnl_talk
allocate “msghdr msg”
call “sendmsg”, which enters sys_sendmsg
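A sketch of how the “msghdr msg” might be assembled before `sendmsg` is called: one iovec that points at the request, plus a destination address for the kernel. `demo_prepare_msg` is a hypothetical helper; it only fills the structures and does not send anything.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Fill msg so that sendmsg(fd, msg, 0) would deliver request n to the kernel. */
void demo_prepare_msg(struct msghdr *msg, struct iovec *iov,
                      struct sockaddr_nl *kernel, struct nlmsghdr *n)
{
    memset(kernel, 0, sizeof(*kernel));
    kernel->nl_family = AF_NETLINK;   /* nl_pid 0, nl_groups 0: the kernel */

    iov->iov_base = n;                /* the request starts at its header */
    iov->iov_len  = n->nlmsg_len;

    memset(msg, 0, sizeof(*msg));
    msg->msg_name    = kernel;
    msg->msg_namelen = sizeof(*kernel);
    msg->msg_iov     = iov;
    msg->msg_iovlen  = 1;
}
```

With a bound rtnetlink socket `fd`, the prepared message would then be passed to `sendmsg(fd, &msg, 0)`.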
![Page 12: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/12.jpg)
sys_sendmsg
copy msg (with req) from user space to kernel space
• sock_sendmsg
scm_cookie scm
call ‘scm_send’
call socket’s ‘sendmsg’ (netlink_ops) = netlink_sendmsg
![Page 13: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/13.jpg)
netlink_sendmsg
memcpy_from_iovec : copy msg into skbuff
• netlink_broadcast, if dst_groups is set
• netlink_unicast
![Page 14: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/14.jpg)
netlink_unicast
using socket’s protocol, find ‘linked list’ in nl_table
match pid
add_wait_queue
put skbuff on socket’s receive queue
call ‘data_ready’ = rtnetlink_rcv
![Page 15: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/15.jpg)
rtnetlink_rcv
take skbuff from socket’s receive queue
invoke ‘rtnetlink_rcv_skb’
![Page 16: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/16.jpg)
rtnetlink_rcv_skb
get nlh from skbuff
invoke ‘rtnetlink_rcv_msg’, passing ‘nlh’
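Extracting each ‘nlh’ from a received buffer can be illustrated in user space with the `NLMSG_OK` and `NLMSG_NEXT` macros from `<linux/netlink.h>`. This is an analogue of the kernel loop, not the kernel code itself; `demo_walk_messages` is a hypothetical name, and where the kernel would hand each header to ‘rtnetlink_rcv_msg’ this sketch only records the message type.

```c
#include <linux/netlink.h>

/* Count well-formed netlink messages in buf and record up to max types. */
int demo_walk_messages(void *buf, int len, unsigned short *types, int max)
{
    int count = 0;
    struct nlmsghdr *nlh = (struct nlmsghdr *)buf;

    while (NLMSG_OK(nlh, len)) {
        if (count < max)
            types[count] = nlh->nlmsg_type;   /* the kernel would dispatch here */
        count++;
        nlh = NLMSG_NEXT(nlh, len);           /* advances nlh and shrinks len */
    }
    return count;
}
```

The buffer must be suitably aligned for `struct nlmsghdr`, and `len` is the number of valid bytes in it.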
![Page 17: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/17.jpg)
rtnetlink_rcv_msg
invoke ‘doit’ in ‘rtnetlink_link’
In this case, doit = tc_modify_qdisc
![Page 18: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/18.jpg)
Middle summary
User Space
Kernel Space
tc
netlink, rtnetlink
nlmsghdr, tcmsg
rtnetlink_rcv
tc_modify_qdisc
tc_ctl_tfilter
tc_get_qdisc
![Page 19: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/19.jpg)
tc_modify_qdisc
dev_get_by_index, index = tcm->tcm_ifindex
if qdisc parent is set, call ‘qdisc_lookup’ : find parent Q
call ‘qdisc_leaf’
![Page 20: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/20.jpg)
tc_modify_qdisc (cont’d)
if tcm->tcm_handle is not empty, call ‘qdisc_lookup’ for band Q
graft
create_n_graft
fail
![Page 21: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/21.jpg)
tc_modify_qdisc (cont’d)
if tcm->tcm_handle is empty,
if q is empty
else
create_n_graft
create
graft
![Page 22: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/22.jpg)
tc_modify_qdisc (cont’d)
if (tcm->tcm_parent is not specified),
if (tcm->tcm_handle is not empty)
then call ‘qdisc_lookup’
call qdisc_change(q,tca)
‘qdisc_change’ calls ‘prio_tune’
![Page 23: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/23.jpg)
create_n_graft
qdisc_create(dev, tcm->tcm_handle, tca, &err)
![Page 24: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/24.jpg)
qdisc_create
find qdisc’s kind
using kind, get ‘Qdisc_ops’
allocate space for queueing discipline
call ‘skb_queue_head_init’
set up ‘enqueue’, ‘dequeue’
call ‘ops->init’ = prio_init
insert new Q into qdisc_list
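The steps above (look up the ops by kind, allocate, call init, link into a list) can be sketched in user space with mock types. Every `demo_` name below is hypothetical, the private data is reduced to one `bands` field, and the value 3 set by the mock init is an assumption for the demo. The queue-head and enqueue/dequeue setup is omitted.

```c
#include <stdlib.h>
#include <string.h>

struct demo_qdisc;

/* Mock of Qdisc_ops: a registry entry identified by its kind string. */
struct demo_qdisc_ops {
    struct demo_qdisc_ops *next;
    const char *id;                       /* the qdisc kind, e.g. "prio" */
    int (*init)(struct demo_qdisc *q);
};

/* Mock of a queueing discipline instance. */
struct demo_qdisc {
    struct demo_qdisc *next;
    const struct demo_qdisc_ops *ops;
    unsigned int handle;
    int bands;                            /* private data, simplified */
};

int demo_prio_init(struct demo_qdisc *q) { q->bands = 3; return 0; }

struct demo_qdisc_ops demo_prio_ops = { NULL, "prio", demo_prio_init };
struct demo_qdisc_ops *demo_ops_base = &demo_prio_ops;  /* registered kinds */
struct demo_qdisc *demo_qdisc_list = NULL;              /* created qdiscs */

/* Using kind, get the matching ops entry, or NULL if none is registered. */
const struct demo_qdisc_ops *demo_lookup_ops(const char *kind)
{
    for (const struct demo_qdisc_ops *o = demo_ops_base; o; o = o->next)
        if (strcmp(o->id, kind) == 0)
            return o;
    return NULL;
}

/* Allocate, initialize through ops->init, and insert at the list head. */
struct demo_qdisc *demo_qdisc_create(const char *kind, unsigned int handle)
{
    const struct demo_qdisc_ops *ops = demo_lookup_ops(kind);
    if (!ops)
        return NULL;
    struct demo_qdisc *q = calloc(1, sizeof(*q));
    if (!q)
        return NULL;
    q->ops = ops;
    q->handle = handle;
    if (ops->init(q) != 0) {
        free(q);
        return NULL;
    }
    q->next = demo_qdisc_list;
    demo_qdisc_list = q;
    return q;
}
```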
![Page 25: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/25.jpg)
graft
call ‘qdisc_graft’
connect ‘new’ to parent’s class or dev
if parent queueing discipline is empty, call ‘dev_graft_qdisc(dev,new)’
else call ‘get’ from class
call ‘qdisc_notify’
![Page 26: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/26.jpg)
dev_graft_qdisc
dev_deactivate
put old ‘qdisc_sleeping’ to ‘oqdisc’
if new Q is empty, set new Q to noop_qdisc
then, set dev’s qdisc_sleeping to new Q, dev->qdisc to noop_qdisc
reactivate device
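The swap described above can be modelled in user space as follows. This is a simplified mock under stated assumptions: all `demo_` types and functions are hypothetical, locking is ignored, and "reactivate" is reduced to pointing `qdisc` back at `qdisc_sleeping` when the device was up.

```c
#include <stddef.h>

struct demo_qdisc { const char *name; };

/* Stand-in for noop_qdisc: the placeholder that does no queueing. */
struct demo_qdisc demo_noop_qdisc = { "noop" };

struct demo_device {
    int up;                             /* nonzero while the device is active */
    struct demo_qdisc *qdisc;           /* qdisc currently in use */
    struct demo_qdisc *qdisc_sleeping;  /* qdisc to use once activated */
};

void demo_dev_deactivate(struct demo_device *dev) { dev->qdisc = &demo_noop_qdisc; }
void demo_dev_activate(struct demo_device *dev)   { dev->qdisc = dev->qdisc_sleeping; }

/* Replace the device's root qdisc and return the old one. */
struct demo_qdisc *demo_dev_graft_qdisc(struct demo_device *dev,
                                        struct demo_qdisc *new_q)
{
    int was_up = dev->up;
    if (was_up)
        demo_dev_deactivate(dev);

    struct demo_qdisc *oqdisc = dev->qdisc_sleeping;  /* remember old qdisc */
    if (new_q == NULL)
        new_q = &demo_noop_qdisc;                     /* empty means noop */
    dev->qdisc_sleeping = new_q;
    dev->qdisc = &demo_noop_qdisc;

    if (was_up)
        demo_dev_activate(dev);                       /* reactivate device */
    return oqdisc;
}
```

Keeping `qdisc` pointed at the noop placeholder during the swap means no packet is ever queued to a half-installed discipline.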
![Page 27: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/27.jpg)
prio_get
get minor class ID
prio_graft
using minor class ID as index to select which band
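To make the band selection concrete: a class ID packs a major number in its upper 16 bits and a minor number in its lower 16 bits, and the minor number (1-based) picks the band. The sketch below is a user-space approximation with hypothetical `demo_` names. `DEMO_TC_H_MIN` is a local copy of the minor-number mask, and child queues are represented by plain strings.

```c
/* Local copy of the minor-number mask applied to a 32-bit class ID. */
#define DEMO_TC_H_MIN(h) ((h) & 0x0000FFFFU)

/* Returns the 1-based band number, or 0 when the class ID names no band. */
unsigned long demo_prio_get(unsigned int classid, unsigned int bands)
{
    unsigned long band = DEMO_TC_H_MIN(classid);
    if (band == 0 || band > bands)
        return 0;
    return band;
}

/* Attach new_q to the band selected by arg (as returned by demo_prio_get). */
int demo_prio_graft(const char **queues, unsigned int bands,
                    unsigned long arg, const char *new_q, const char **old)
{
    unsigned long band = arg - 1;   /* arg == 0 wraps around and is rejected */
    if (band >= bands)
        return -1;
    *old = queues[band];            /* hand the previous child back */
    queues[band] = new_q;
    return 0;
}
```

For example, with three bands, class ID 0x00010002 has minor number 2 and therefore selects `queues[1]`.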
![Page 28: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/28.jpg)
qdisc_change
directly call ‘sch->ops->change’
change = prio_tune
![Page 29: Interface between kernel and user space](https://reader033.vdocument.in/reader033/viewer/2022061300/54c828194a7959cc278b4645/html5/thumbnails/29.jpg)
prio_tune
argument opt contains ‘bands’
bands outside the new range are set to ‘noop_qdisc’
update child Q by ‘prio2band’ array
if Q == noop_qdisc
qdisc_create_dflt
set up child Q
set up operator to ‘pfifo_qdisc_ops’