Saturday, October 27, 2012

net_device, who are you?

Hi all,
Today I'll be talking about a basic, well-known structure in the Linux
kernel's networking code: the net_device structure.
The net_device holds a large amount of information about the device.

Don't forget that a net_device can be a virtual device too;
it doesn't necessarily have to be a physical device such as a NIC.
Examples of virtual devices are:
a bridge interface, which is a virtual representation of a bridge,
and tunnel interfaces - the implementation of IP-over-IP tunnelling (IPIP) and of the Generic Routing Encapsulation (GRE) protocol is based on the creation of a virtual device.


So now I'll try to simplify this well-known structure and its usage.
Oh, by the way, the reference source code I'll be talking about is Linux kernel 3.5.4.
Some of the fields have changed between kernel versions, but it's not that difficult to understand.

So let's start. I'll begin with the simpler fields.
The fields of the net_device structure can be classified into the following categories:

a. Configuration
Configuration fields are:



(1) char name[IFNAMSIZ];
Name of the device, for example wifi0 (the base name wifi followed by a unit number).

(2) int ifindex;
Interface index, a unique device identifier. It is a positive integer assigned
when the device is registered; for example, the loopback device lo usually gets ifindex 1.
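
From user space, the mapping between a device name and its ifindex can be queried with the standard if_nametoindex() and if_indextoname() calls. Here is a minimal sketch; the helper name print_ifindex is made up for this example:

```c
#include <stdio.h>
#include <net/if.h>   /* if_nametoindex(), if_indextoname(), IF_NAMESIZE */

/* Hypothetical helper: look up the ifindex for a name and print it.
 * Returns the index, or 0 if no such interface exists. */
unsigned int print_ifindex(const char *ifname)
{
    char back[IF_NAMESIZE];
    unsigned int idx = if_nametoindex(ifname);

    if (idx == 0) {
        printf("%s: no such interface\n", ifname);
        return 0;
    }
    /* Map the index back to a name to show the relation works both ways. */
    if (if_indextoname(idx, back) != NULL)
        printf("%s -> ifindex %u -> %s\n", ifname, idx, back);
    return idx;
}
```

Calling print_ifindex("lo") on a typical Linux box should print the loopback index; the exact number depends on the system.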
     
(3) unsigned char if_port;
The type of port being used for this interface (for example 10Base2 or 10BaseT).
if_port stores the media type that the network adapter is currently using.
For Ethernet, we distinguish between BNC, twisted pair (TP), and AUI.


(4) unsigned short flags;

Standard interface flags (netdevice->flags); see the file if.h under /include/linux.

For example:
IFF_UP - interface is up
IFF_PROMISC - receive all packets
IFF_LOOPBACK - is a loopback net
IFF_ALLMULTI - receive all multicast packets
IFF_RUNNING - interface is operational (RFC 2863 OPER_UP)


(5) unsigned int mtu;
MTU stands for Maximum Transmission Unit and represents the maximum payload size (not counting the link-layer header) that the device can handle; for Ethernet the default is 1500 bytes.

Of course, you can get and set the MTU with ioctl, using SIOCGIFMTU and
SIOCSIFMTU.

(6) unsigned short type;
Interface hardware type: the category of devices to which it belongs (Ethernet, Frame Relay, etc.). /include/linux/if_arp.h contains the complete list of possible types.
 
     
(7) unsigned char addr_len;
Length of the device's link-layer address, which is stored in dev_addr (this is the hardware address, not an IP address). The value of addr_len depends on the type of device; Ethernet addresses are six octets long.

(8) unsigned short hard_header_len;
Hardware header length. For example, the Ethernet header holds the destination and source MAC addresses, six octets each, and the two-octet EtherType protocol identifier, for 14 octets in total; an optional IEEE 802.1Q tag adds four more.
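
The Ethernet numbers can be checked from user space with the definitions in <linux/if_ether.h>. This is only a sketch: eth_header_len and SKETCH_VLAN_TAG_LEN are names invented for this example.

```c
#include <stddef.h>
#include <linux/if_ether.h>  /* struct ethhdr, ETH_ALEN, ETH_HLEN */

/* Assumption for this sketch: one 802.1Q tag takes 4 octets. */
#define SKETCH_VLAN_TAG_LEN 4

/* Header length of an Ethernet frame, with or without one 802.1Q tag. */
size_t eth_header_len(int vlan_tagged)
{
    /* 6 (destination) + 6 (source) + 2 (EtherType) = ETH_HLEN = 14 */
    size_t len = sizeof(struct ethhdr);

    if (vlan_tagged)
        len += SKETCH_VLAN_TAG_LEN;
    return len;
}
```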


(9) struct net_device_stats stats;
Holds statistics about transmitting and receiving: the number of packets, bytes, errors, collisions, CRC errors and so on. The counters are unsigned long fields such as rx_packets, tx_packets, rx_bytes, tx_bytes, rx_errors, tx_errors, collisions and rx_crc_errors.
If you would like to see those stats, you can easily retrieve them through
sysfs, under:
/sys/class/net/<device_name>/statistics
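
Each counter under that directory is a small text file holding one number, so reading it from a program is straightforward. A minimal sketch, where read_net_stat is a hypothetical helper name:

```c
#include <stdio.h>

/* Read one counter (for example "rx_packets") for 'ifname' from sysfs.
 * Returns 0 and stores the number in *value on success, -1 on failure. */
int read_net_stat(const char *ifname, const char *stat,
                  unsigned long long *value)
{
    char path[256];
    FILE *f;
    int ok;

    snprintf(path, sizeof(path),
             "/sys/class/net/%s/statistics/%s", ifname, stat);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;              /* no such device or counter */

    ok = (fscanf(f, "%llu", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```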


(10) unsigned int promiscuity;
You are probably asking yourself: haven't we mentioned promiscuous mode before? And why do we need an unsigned int for it? You are right that the flags field holds the current state of the device, so to check it we can AND flags with the IFF_PROMISC mask.
But let's think things through: the net_device serves the whole system (OS), and many processes running on top of the OS use those net devices.
The net_device is shared between processes, so if, let's say, I run Wireshark and tcpdump at the same time to sniff traffic, both of them need promiscuous mode. promiscuity is therefore a reference counter.
When promiscuity drops back to zero, we turn off the IFF_PROMISC bit
in flags.



(11) unsigned long state;
Device status, kept as a set of bits. Three of them are:

a.  __LINK_STATE_START
     The interface is up (bit set) or down (bit clear); checked via the function
      netif_running().

b.  __LINK_STATE_PRESENT
    The device is present (bit set) or has been removed from the system (bit clear); checked
    via the function netif_device_present().


c. __LINK_STATE_NOCARRIER
    Set when there is no carrier on the device, so netif_carrier_ok() returns true only while this bit is clear.
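
Note the inverted sense of the carrier bit: the bit means "no carrier", so the carrier check is a negated bit test. A small user-space model of the three checks follows; all toy_ names are invented here, and the bit numbers simply follow the order of the list above.

```c
#include <stdbool.h>

/* Bit numbers within the state word (toy model). */
enum toy_link_state {
    TOY_LINK_STATE_START,
    TOY_LINK_STATE_PRESENT,
    TOY_LINK_STATE_NOCARRIER,
};

/* Return true if bit number 'nr' is set in 'state'. */
static bool toy_test_bit(int nr, unsigned long state)
{
    return (state >> nr) & 1UL;
}

bool toy_netif_running(unsigned long state)
{
    return toy_test_bit(TOY_LINK_STATE_START, state);
}

bool toy_netif_device_present(unsigned long state)
{
    return toy_test_bit(TOY_LINK_STATE_PRESENT, state);
}

/* Carrier is OK exactly when the NOCARRIER bit is clear. */
bool toy_netif_carrier_ok(unsigned long state)
{
    return !toy_test_bit(TOY_LINK_STATE_NOCARRIER, state);
}
```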


(12) unsigned short padded;
How much padding was added by alloc_netdev_mqs()

(13) unsigned int num_rx_queues;
Number of RX queues allocated at register_netdev() time

(14) unsigned int real_num_rx_queues;
Number of RX queues currently active in device

(15) unsigned int num_tx_queues;
Number of TX queues allocated at alloc_netdev_mq() time

(16) unsigned int real_num_tx_queues;
Number of TX queues currently active in device
 

  • Next to each net_device structure resides a private data area whose size and layout are set by the driver; it stores driver-specific information about the interface. Drivers typically keep things here such as their own counters of packets transmitted and received and of errors encountered. The size of the private area is not necessarily the same for each net_device, since different devices come from different vendors and are handled by different drivers. To get a pointer to the network device's private data, we should use the function:
    static inline void *netdev_priv(const struct net_device *dev)
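
The "next to" layout can be pictured as one allocation: the device structure first, then the driver's private area at the next aligned offset. The sketch below is a user-space model under that assumption; struct toy_dev, toy_alloc_dev and toy_dev_priv are made-up names, and the 32-byte alignment mirrors the kernel's NETDEV_ALIGN.

```c
#include <stdlib.h>
#include <stddef.h>

#define TOY_ALIGN 32   /* mirrors the kernel's NETDEV_ALIGN */

/* Round x up to the next multiple of a (a must be a power of two). */
#define TOY_ROUND_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

/* Toy stand-in for struct net_device. */
struct toy_dev {
    char name[16];
    int ifindex;
};

/* Allocate the device and its private area as one zeroed block. */
struct toy_dev *toy_alloc_dev(size_t sizeof_priv)
{
    return calloc(1, TOY_ROUND_UP(sizeof(struct toy_dev), TOY_ALIGN)
                     + sizeof_priv);
}

/* The private area starts at the first aligned offset past the struct. */
void *toy_dev_priv(const struct toy_dev *dev)
{
    return (char *)dev + TOY_ROUND_UP(sizeof(struct toy_dev), TOY_ALIGN);
}
```

A driver would pass sizeof() of its own private structure to the allocation call and cast the returned pointer back to that type.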


(17) unsigned short priv_flags;
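Like flags, but invisible to user space; the possible values are also defined in if.h. (The user-visible flags from (4) are the ones changed through the dev_change_flags function.)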
 

b. List management

  • Now I'll talk about how those net devices get stored. Of course, we would like to retrieve a net_device's data as quickly as possible, so let's see how it is done:

(18) struct list_head dev_list;
Each net_device holds a field named dev_list, a list_head that consists of two pointers:
next, which points to the next net_device, and prev, which points to the previous one.


(19) struct hlist_node index_hlist;
device index hash chain

 
(20) struct hlist_node name_hlist;
device name hash chain


To insert a net_device into our "database" we use:
static int list_netdevice(struct net_device *dev);


From (19) and (20) we can see that a new net_device gets stored in two hash tables: one keyed by the name of the net_device and the other keyed by its ifindex.

static inline struct hlist_head *dev_index_hash(struct net *net, int ifindex)
{
    return &net->dev_index_head[ifindex & (NETDEV_HASHENTRIES - 1)];
}


static inline struct hlist_head *dev_name_hash(struct net *net, const char *name)
{
    unsigned int hash = full_name_hash(name, strnlen(name, IFNAMSIZ));

    return &net->dev_name_head[hash_32(hash, NETDEV_HASHBITS)];
}
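
To see what the index hash does in practice, here is a user-space model of the ifindex table: the bucket is the low bits of the ifindex, new devices go to the head of their bucket's chain, and a lookup walks that chain. All toy_ names are invented for this sketch; the 8 hash bits mirror the kernel's NETDEV_HASHBITS.

```c
#include <stddef.h>

#define TOY_HASHBITS    8                     /* mirrors NETDEV_HASHBITS */
#define TOY_HASHENTRIES (1 << TOY_HASHBITS)   /* 256 buckets */

/* Toy device: just the key and the chain link. */
struct toy_hdev {
    int ifindex;
    struct toy_hdev *next;
};

static struct toy_hdev *toy_index_head[TOY_HASHENTRIES];

/* Same masking as dev_index_hash(): keep the low TOY_HASHBITS bits. */
unsigned int toy_index_bucket(int ifindex)
{
    return ifindex & (TOY_HASHENTRIES - 1);
}

/* Insert at the head of the bucket's chain. */
void toy_list_dev(struct toy_hdev *dev)
{
    unsigned int b = toy_index_bucket(dev->ifindex);

    dev->next = toy_index_head[b];
    toy_index_head[b] = dev;
}

/* Walk one chain only; return NULL if the index is not registered. */
struct toy_hdev *toy_get_by_index(int ifindex)
{
    struct toy_hdev *d;

    for (d = toy_index_head[toy_index_bucket(ifindex)]; d; d = d->next)
        if (d->ifindex == ifindex)
            return d;
    return NULL;
}
```

With 256 buckets, ifindex 1 and ifindex 257 land in the same bucket, which is why each bucket is a chain rather than a single slot.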




For example, I have drawn a sketch of storing 3 net_devices.
Let's say an eth1 net_device was added; it was the first net_device to be added to the system.

Afterwards a net_device called wifi2 was added, so a new entry gets initialised in the hash array of names (the wifi entry).

Now a new device called eth2 was added, so now it becomes the head of the eth list. Note that this is a simplification: as dev_name_hash() above shows, the hash is computed over the full name, so eth1 and eth2 end up on the same chain only if their hashes collide.

So we should get the following image:





That’s it for now.
I hope you learned a few neat things from today’s talk. See you in the next blog post!
