Discussion:
Right way to configure a driver? (sysfs, ioctl, proc, configfs,....)
Aritz Bastida
2006-01-26 20:06:28 UTC
Hello everybody.

I'm quite new to kernel development, but I'm writing a kernel
module and would like to do things right. What I'm trying to write
is, more or less, a kind of "virtual" network device (not really that,
but it will suffice).

This network device can be configured from userspace. I have read the
books "Linux Device Drivers 3" (LDD3) and "Linux Kernel Development",
but I didn't find the answer to the question I was asking
myself: what is the right way to configure it?

LDD3 says that ioctls have fallen out of favor among kernel developers,
but there is no strong _advice_ to use another method. It says
that sysfs _could_ be used instead. Of course, that was almost
a year ago. I'm sure things have changed since then.

So, well, I did all this configuration using ioctls and proc, which
was the fastest for me, but maybe not the best solution. So I'm
asking for advice here.

The configuration I need to do is actually quite simple. Most of the
commands just set or get a variable defined in my module (for
example, writing to a flags variable, just like in real network devices,
e.g. IFF_UP). The most difficult "config" I need to do is to write a
struct to the module (just write two variables in the same command).

What way do you suggest for all this? Is sysfs correct for this? What
about the new filesystem "configfs"? I've just heard about it, but I
don't even have it mounted on my system. Would it be what I need?

On the other hand, I also need to export some statistics to userspace.
These are similar to the ones in a network device: packets received,
dropped,... but I would like to export not just the number of packets
received, but the number received by _each_ cpu, as well as the total.
Would you recommend /proc or sysfs?

In case of using sysfs, would this be the correct approach, or would
you recommend one value per file?
$ cat rx_packets
10 15 25
where the first value is packets received in CPU0, the second in CPU1
and the last the total.


Thank you for your time
Regards
Aritz
Greg KH
2006-01-27 05:01:09 UTC
Post by Aritz Bastida
Hello everybody.
I'm quite a newbie in the kernel development, but I'm writing a kernel
module and would like to do the things right. What I'm trying to do
is, more or less, a kind of "virtual" network device (not really that
but it will suffice).
This network device can be configured from userspace. I have read the
books "Linux Device Drivers 3" (LDD3) and "Linux Kernel Development"
and after that I didn't find the answer to the question I was making
myself: What should be the right way to configure it?
It all depends on what you want to configure, and what type of thing you
are configuring.
Post by Aritz Bastida
In LDD3 it says that ioctls fall out of favor among kernel developers,
but there is not a strong _advice_ to use another method. It says
that instead of that sysfs _could_ be used. Of course, that was almost
a year ago. I'm sure things have changed since then.
No, not really.
Post by Aritz Bastida
So, well, I did all this configuration using ioctls and proc, which
was the fastest for me, but maybe not the best solution. So I'm
asking for advice here.
Do NOT use proc, unless you are doing things that concern processes.
Post by Aritz Bastida
The configuration I need to do is actually quite simple. Most of the
commands are just set or get a variable defined in my module (for
example, write to a flags variable, just like in real network devices
-- i.e. IF_UP). The most difficult "config" I need to do is write a
struct to the module (just write two variables in the same command).
That sounds like something that sysfs or even debugfs is perfect for.
Post by Aritz Bastida
What way do you suggest for all this? Is sysfs correct for this? What
about the new filesystem "configfs"? I've just heard about it, but I
don't even have it mounted on my system. Would it be what I need?
Read the docs on configfs for details on that. But for simple variables
like you describe, either sysfs or debugfs is the best choice.
Post by Aritz Bastida
On the other hand, I also need to export some statistics to userspace.
These are similar to the ones in a network device: packets received,
dropped,... but I would like to export not just the number of packets
received, but the number received by _each_ cpu, as well as the total.
Would you recommend me /proc or sysfs?
Again, not proc. So sysfs.
Post by Aritz Bastida
In case of using sysfs, would this be the correct approach or you
would recommend one value per file?
Yes.
Post by Aritz Bastida
$ cat rx_packets
10 15 25
where the first value is packets received in CPU0, the second in CPU1
and the last the total.
No. Have 3 different files:
rx_packets_cpu0
rx_packets_cpu1
rx_packets_total
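
A minimal userspace sketch of the one-value-per-file layout, assuming two CPUs and the sample counts from the question (10 and 15). The names `DEMO_NR_CPUS`, `demo_rx_packets_cpu_show` and `demo_rx_packets_total_show` are hypothetical; in a real driver each function would roughly correspond to the body of one attribute's show routine, which this sketch does not try to reproduce.

```c
#include <stdio.h>

#define DEMO_NR_CPUS 2  /* hypothetical: number of CPUs in this sketch */

/* Hypothetical per-CPU counters, filled with the values from the question. */
static unsigned long demo_rx_packets[DEMO_NR_CPUS] = { 10, 15 };

/* Content of the file rx_packets_cpuN: exactly one value and a newline. */
int demo_rx_packets_cpu_show(unsigned int cpu, char *buf, size_t len)
{
    if (cpu >= DEMO_NR_CPUS)
        return -1;
    return snprintf(buf, len, "%lu\n", demo_rx_packets[cpu]);
}

/* Content of the file rx_packets_total: the sum over all CPUs. */
int demo_rx_packets_total_show(char *buf, size_t len)
{
    unsigned long total = 0;

    for (unsigned int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        total += demo_rx_packets[cpu];
    return snprintf(buf, len, "%lu\n", total);
}
```

With this split, a reader never has to know which column of a multi-value line means what; each file name carries the meaning.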

Hope this helps,

greg k-h
Aritz Bastida
2006-01-27 10:30:26 UTC
Hi!
Post by Greg KH
Hope this helps,
greg k-h
Yes, it helped me a lot. I'll move all the configuration/statistics to
sysfs. I will carefully read the corresponding chapter in LDD3 :)
But before that, I've got a few questions based on what I know so far:

1.- In what directory should I do all this configuration? I guess that, as
I'm writing a module, it should be in /sys/module/<my_module>, right?
Or would you recommend /sys/class/net or something else?

2.- In my sysfs directory I would create two subdirectories: "config"
and "stats". In the first I would place read/write files used for
configuration, for example "config/flags" for the flags variable. In
the second I would place read-only files with the statistics. Is this
approach correct?

3.- Actually the most difficult config I must do is to pass three
values from userspace to my module. Specifically two integers and a
long (it's an offset to a memory zone I've previously defined)

struct meminfo {
        unsigned int id;        /* segment identifier */
        unsigned int size;      /* size of the memory area */
        unsigned long offset;   /* offset to the information */
};

How would you pass this information in sysfs? Three values in the same
file? Note that using three different files wouldn't be atomic, and I
need atomicity.

4.- Last, you suggested that I had three files for the rx_packets count:
rx_packets_cpu0
rx_packets_cpu1
rx_packets_total

I have quite a few counters, and if that number of files is multiplied by
the number of CPUs, the number of files could become very large (imagine an
8-CPU box), don't you think? And after all, reading a file with three
values can be done very easily with awk...



That's all. Thank you for your help
Regards
Aritz
Aritz Bastida
2006-01-30 11:23:14 UTC
Thank you Antonio and Greg
Post by Aritz Bastida
3.- Actually the most difficult config I must do is to pass three
values from userspace to my module. Specifically two integers and a
long (it's an offset to a memory zone I've previously defined)
struct meminfo {
unsigned int id; /* segment identifier */
unsigned int size; /* size of the memory area */
unsigned long offset; /* offset to the information */
};
How would you pass this information in sysfs? Three values in the same
file? Note that using three different files wouldn't be atomic, and I
need atomicity.
I guess I could pass three values on the same file, like this:
$ echo "5 1000 500" > meminfo

I know that breaks the sysfs golden rule, but how can I pass those
values _atomically_ then? Having three different files wouldn't be
atomic...
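
A minimal sketch of what the single-file variant could look like on the receiving side, written as plain userspace C. It assumes the whole line written by `echo` arrives in one buffer, and it only touches the destination once all three fields have parsed. `demo_meminfo_parse` is a hypothetical name, not an existing kernel helper, and a real driver would additionally take a lock around the final assignment.

```c
#include <stdio.h>

struct meminfo {
    unsigned int id;        /* segment identifier */
    unsigned int size;      /* size of the memory area */
    unsigned long offset;   /* offset to the information */
};

/*
 * Parse "id size offset" from one buffer.
 * Returns 0 and updates *dst only if exactly three values are present;
 * returns -1 and leaves *dst untouched otherwise.
 */
int demo_meminfo_parse(const char *buf, struct meminfo *dst)
{
    struct meminfo tmp;
    char extra;

    /* The trailing " %c" catches unexpected text after the third value. */
    if (sscanf(buf, "%u %u %lu %c",
               &tmp.id, &tmp.size, &tmp.offset, &extra) != 3)
        return -1;

    *dst = tmp;   /* all-or-nothing update of the three fields */
    return 0;
}
```

Parsing into a temporary first is what gives the all-or-nothing behaviour: a malformed write such as `echo "5 1000" > meminfo` is rejected without leaving a half-updated struct behind.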

Regards
Aritz
Greg KH
2006-01-30 21:39:08 UTC
Post by Aritz Bastida
Thank you Antonio and Greg
Post by Aritz Bastida
3.- Actually the most difficult config I must do is to pass three
values from userspace to my module. Specifically two integers and a
long (it's an offset to a memory zone I've previously defined)
struct meminfo {
unsigned int id; /* segment identifier */
unsigned int size; /* size of the memory area */
unsigned long offset; /* offset to the information */
};
How would you pass this information in sysfs? Three values in the same
file? Note that using three different files wouldn't be atomic, and I
need atomicity.
$ echo "5 1000 500" > meminfo
I know that breaks the sysfs golden-rule, but how can I pass those
values _atomically_ then? Having three different files wouldn't be
atomic...
That's what configfs was created for. I suggest using that for things
like this, as sysfs is not intended for it.

thanks,

greg k-h
Jan Engelhardt
2006-02-01 14:54:22 UTC
Post by Greg KH
Post by Aritz Bastida
$ echo "5 1000 500" > meminfo
I know that breaks the sysfs golden-rule, but how can I pass those
values _atomically_ then? Having three different files wouldn't be
atomic...
That's what configfs was created for. I suggest using that for things
like this, as sysfs is not intended for it.
Can't we just somewhat merge all the duplicated functionality between procfs,
sysfs and configfs...


Jan Engelhardt
--
Greg KH
2006-02-01 15:11:45 UTC
Post by Jan Engelhardt
Post by Greg KH
Post by Aritz Bastida
$ echo "5 1000 500" > meminfo
I know that breaks the sysfs golden-rule, but how can I pass those
values _atomically_ then? Having three different files wouldn't be
atomic...
That's what configfs was created for. I suggest using that for things
like this, as sysfs is not intended for it.
Can't we just somewhat merge all the duplicated functionality between procfs,
sysfs and configfs...
What "duplicated functionality"? They all do different, unique things.

Patches are always welcome...

thanks,

greg k-h
Aritz Bastida
2006-02-01 15:44:44 UTC
Post by Greg KH
Post by Jan Engelhardt
Can't we just somewhat merge all the duplicated functionality between procfs,
sysfs and configfs...
What "duplicated functionality"? They all do different, unique things.
Patches are always welcome...
thanks,
greg k-h
Maybe not "duplicated functionality", but _yes_, lots of ways to do
the same thing. I know, that's the magic of Linux, that there are
different ways to achieve the same goal, but in an effort to write
standard, generic, readable code, some ways should be preferred over
others.

As I said in previous messages, my driver is a kind of virtual network
device (imagine something like "snull" in LDD3) and my question was:
what would be the right way to configure it? I know, I know, there is
no single answer to that, but I'm sure at least there are
suggestions. Some years ago, maybe there was no alternative other than
using system calls or ioctls, but the spectrum is a lot wider now.

I'll try to summarize here the different ways that could be used, and their
original purpose:

* IOCTLS: as far as I know, these are deprecated for new drivers,
although they will still be there for a long time for backward
compatibility.
* PROCFS: it has been used for a lot more than exporting process
information, but it seems that this is finally moving toward some new
filesystems (sysfs, configfs).
* SYSFS: this is to export system information.
* CONFIGFS: this is to configure kernel modules/subsystems. This is
new to me, and I don't have it (at least by default) on my Linux 2.6.15
box (I guess it will be in menuconfig somewhere :P).
* NETLINK: as far as I know, this is used for configuring network
devices, routers, firewalls, ...

I guess that, with what I know now (surely more than when I started this
thread), the right way could be either configfs or netlink. For my
purposes netlink would be more convenient (since the communication
link is bidirectional), but I still don't know if you guys think that
this method is all right.

Bye
Aritz
Jan Engelhardt
2006-02-01 16:17:47 UTC
Post by Aritz Bastida
As I said in previous messages, my driver is a kind of virtual network
what would be the right way to configure it? I know, i know, there is
not a unique question for that, but I'm sure at least there are
suggestions. Some years ago, maybe there was no alternative other than
using system calls or ioctls, but the spectrum is a lot wider now.
I try to resume here the different ways that could be used, and their
* SYSFS: this is to export system information
* CONFIGFS: this is to configure kernel modules/subsystems.
So there basically is an "exportfs" (sysfs) and an "importfs" (configfs).
[This has nothing to do with the nfs-exportfs for the moment.]
Can't these be merged into an "importexportfs"? It would make things
simpler, especially since system parameters (exported stuff, /sys) can be
changed (aka imported).


Jan Engelhardt
--
Neil Brown
2006-02-02 06:31:25 UTC
Post by Greg KH
Post by Jan Engelhardt
Post by Greg KH
Post by Aritz Bastida
$ echo "5 1000 500" > meminfo
I know that breaks the sysfs golden-rule, but how can I pass those
values _atomically_ then? Having three different files wouldn't be
atomic...
That's what configfs was created for. I suggest using that for things
like this, as sysfs is not intended for it.
Can't we just somewhat merge all the duplicated functionality between procfs,
sysfs and configfs...
What "duplicated functionality"? They all do different, unique things.
So why do you recommend configfs for something that is *almost* what
sysfs does well?

sysfs both exports information, and allows changes to some (not all)
of that information.
But as soon as someone wants an atomic change, which is conceptually a
very small difference, you say "use configfs" which is conceptually a
very big difference in interface.

Configfs - as I read the doco - is not really about providing generic
atomic configuration changes.

Configfs is for *Creating* kernel objects.
The basic sequence is:
mkdir /config/subsystem/objectname
# where you choose 'objectname' to be whatever you want.
echo value > /config/subsystem/objectname/param1
echo value > /config/subsystem/objectname/param2
echo value > /config/subsystem/objectname/param3
echo value > /config/subsystem/objectname/param4

and then the object is ready to go.
Notice that there is *NO* 'commit' step. There is nothing here that
makes anything atomic.

So saying 'use configfs for atomic updates' doesn't seem rational...

To be fair, configfs is meant to have 'committable items', but they are
'currently unimplemented'.
If you have a 'committable' item, then the sequence instead would go
something like:

mkdir /config/subsystem/pending/objectname
echo value > /config/subsystem/pending/objectname/param1
...
mv /config/subsystem/pending/objectname /config/subsystem/live

This does provide atomic updates but, apart from not being
implemented, it only allows atomic updates at object creation time.

If I have a live object, and I want to change some attributes
atomically, configfs DOES NOT LET ME DO THAT.

Conversely, it is quite easy to do this with sysfs.
As you have control over the 'read' and 'write' routines for each
attribute, you simply:
- in your object, store the real attribute and a 'pending' copy.
- define a special attribute, maybe called 'commit', such that:
    writing 'clear' copies the real attributes into the pending
    copies as well
    writing 'commit' copies the 'pending' copies into the real
    attributes atomically
- when you write to an attribute, it updates the 'pending' copy.

So to do an atomic update, you:

1/ get a flock lock on the directory (do sysfs directories support
flock?)
2/ write 'clear' to 'commit'
3/ make your changes
4/ write 'commit' to 'commit'
5/ unflock.

Obviously the 'flock' could be replaced by lockfiles in /tmp or whatever
you want.
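
The pending/commit scheme described above can be sketched in plain userspace C as follows. This is only an illustration under stated assumptions: `struct demo_object`, the `demo_set_*` helpers and `demo_commit_store` are hypothetical names standing in for the per-attribute write routines, and the locking that a real driver would need around the copy is left out.

```c
#include <string.h>

struct meminfo {
    unsigned int id;
    unsigned int size;
    unsigned long offset;
};

/* Hypothetical object holding the real attributes and a 'pending' copy. */
struct demo_object {
    struct meminfo live;
    struct meminfo pending;
};

/* Writes to the ordinary attributes only ever touch the pending copy. */
void demo_set_id(struct demo_object *o, unsigned int id) { o->pending.id = id; }
void demo_set_size(struct demo_object *o, unsigned int size) { o->pending.size = size; }
void demo_set_offset(struct demo_object *o, unsigned long off) { o->pending.offset = off; }

/* Match a written word, tolerating the newline that echo appends. */
static int demo_word_is(const char *buf, const char *word)
{
    size_t n = strlen(word);

    return strncmp(buf, word, n) == 0 &&
           (buf[n] == '\0' || (buf[n] == '\n' && buf[n + 1] == '\0'));
}

/* Write routine of the special 'commit' attribute. */
int demo_commit_store(struct demo_object *o, const char *buf)
{
    if (demo_word_is(buf, "clear")) {
        o->pending = o->live;   /* start from the current real values */
        return 0;
    }
    if (demo_word_is(buf, "commit")) {
        o->live = o->pending;   /* a driver would hold a lock here */
        return 0;
    }
    return -1;                  /* unknown word */
}
```

Until 'commit' is written, readers of the real attributes keep seeing the old, consistent set of values.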

This doesn't mean that we don't need configfs (though I'm not yet
convinced). The point of configfs seems to be *Creating* objects.
Maybe it is a good thing to use for this purpose (though if those
objects end up appearing in sysfs, it would seem like unnecessary
duplication).
Post by Greg KH
Patches are always welcome...
True, patches are good.
But they don't stop people from recommending the wrong tool for the
job :-)
And they aren't needed to support atomic updates in sysfs.

Maybe what would be good is support for 'mkdir' in sysfs.
I would really like to be able to use 'mkdir' to create md devices,
but '/sys/block' is too flat. If it had
/sys/block/sd/scsi-block-devices
/sys/block/hd/ide-block-devices
/sys/block/loop/loop-block-devices
it would also have
/sys/block/md/md-block-devices
and it would make sense to do a
mkdir /sys/block/md/0
or whatever to create a new md device. But I don't think it makes
sense to
mkdir /sys/block/md0
because someone would have to parse the device name ('md0') to decide
which module to hand it off to.... oh well :-(

NeilBrown

Greg KH
2006-01-30 21:41:18 UTC
Post by Aritz Bastida
Hi!
Post by Greg KH
Hope this helps,
greg k-h
Yes, it helped me much. I'll move all the configuration/statistics to
sysfs. I will read carefully the corresponding chapter in LDD3 :)
1.- In what directory should I do all this configuration? I guess as,
I'm writing a module it should be in /sys/module/<my_module>, right?
Or would you recommend /sys/class/net or anything?
Your device directory is usually the best for device specific options.
Your driver directory (for the type of bus driver that your device
lives on) is for driver-wide options.

Not in the module directory, that's not easy to get to and not
recommended at all. Only module parameters go there.
Post by Aritz Bastida
2.- In my sysfs directory I would create two subdirectories: "config"
and "stats". In the first I would place read/write files used for
configuration. For example "config/flags" for the flags variable. In
the second read-only files with the statistics. Is this approach
correct?
The config stuff might be better off in configfs, not sysfs.
Post by Aritz Bastida
3.- Actually the most difficult config I must do is to pass three
values from userspace to my module. Specifically two integers and a
long (it's an offset to a memory zone I've previously defined)
struct meminfo {
unsigned int id; /* segment identifier */
unsigned int size; /* size of the memory area */
unsigned long offset; /* offset to the information */
};
How would you pass this information in sysfs? Three values in the same
file? Note that using three different files wouldn't be atomic, and I
need atomicity.
Use configfs.
Post by Aritz Bastida
rx_packets_cpu0
rx_packets_cpu1
rx_packets_total
I have quite a few counters, and if that number of files is multiplied by
the number of cpus, the number of files could be very large (imagine a
8-cpu box), don't you think so? And after all reading a file with three
values could be done very easily with awk...
Lots of files is not a problem. Have you looked at the sysfs file
entries for the sensor/hwmon drivers in a while? There are zillions of
them :)

thanks,

greg k-h
Aritz Bastida
2006-02-01 13:37:15 UTC
Post by Greg KH
Post by Aritz Bastida
3.- Actually the most difficult config I must do is to pass three
values from userspace to my module. Specifically two integers and a
long (it's an offset to a memory zone I've previously defined)
struct meminfo {
unsigned int id; /* segment identifier */
unsigned int size; /* size of the memory area */
unsigned long offset; /* offset to the information */
};
How would you pass this information in sysfs? Three values in the same
file? Note that using three different files wouldn't be atomic, and I
need atomicity.
Use configfs.
Ummhh, and would it be correct to configure my device via a netlink
socket? Remember that my driver is a kind of network "virtual" device.

There are so many old and new ways to configure a driver that I'm a
bit overwhelmed...

Regards
Aritz
linux-os (Dick Johnson)
2006-02-01 13:53:17 UTC
Post by Aritz Bastida
Post by Greg KH
Post by Aritz Bastida
3.- Actually the most difficult config I must do is to pass three
values from userspace to my module. Specifically two integers and a
long (it's an offset to a memory zone I've previously defined)
struct meminfo {
unsigned int id; /* segment identifier */
unsigned int size; /* size of the memory area */
unsigned long offset; /* offset to the information */
};
How would you pass this information in sysfs? Three values in the same
file? Note that using three different files wouldn't be atomic, and I
need atomicity.
Use configfs.
Ummhh, and would it be correct to configure my device via a netlink
socket? Remember that my driver is a kind of network "virtual" device.
There are so many old and new ways to configure a driver that I'm a
bit overwhelmed...
Regards
Aritz
At the risk of the obvious....

struct meminfo meminfo;
ioctl(fd, UPDATE_PARAMS, &meminfo);

... and define UPDATE_PARAMS and other function codes to start
above those normally used by kernel stuff so that `strace` doesn't
make up stories.

This is what the ioctl() interface is for. Inside the kernel
you can use spinlocks (after you got the data from user-space)
to make the operations atomic.
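
A minimal userspace sketch of the ordering described here: copy the whole struct first, then update the shared state under a lock. Everything in it is a stand-in: `memcpy` plays the role of the copy from user space, a C11 `atomic_flag` plays the role of a kernel spinlock, and the `demo_*` names are hypothetical. The copy is done before taking the lock because, in the kernel, copying from user space may sleep and so must not happen while a spinlock is held.

```c
#include <stdatomic.h>
#include <string.h>

struct meminfo {
    unsigned int id;
    unsigned int size;
    unsigned long offset;
};

static atomic_flag demo_lock = ATOMIC_FLAG_INIT; /* stand-in for a spinlock */
static struct meminfo demo_params;               /* shared driver state */

static void demo_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&demo_lock, memory_order_acquire))
        ; /* spin until the flag is released */
}

static void demo_lock_release(void)
{
    atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* Handler for the hypothetical UPDATE_PARAMS command. */
int demo_update_params(const void *user_arg)
{
    struct meminfo tmp;

    /* Step 1: fetch the caller's data without holding the lock. */
    memcpy(&tmp, user_arg, sizeof(tmp));

    /* Step 2: publish all three fields in one critical section. */
    demo_lock_acquire();
    demo_params = tmp;
    demo_lock_release();
    return 0;
}

/* Readers take the same lock, so they never see a half-updated struct. */
struct meminfo demo_read_params(void)
{
    struct meminfo copy;

    demo_lock_acquire();
    copy = demo_params;
    demo_lock_release();
    return copy;
}
```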

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.66 BogoMips).
Warning : 98.36% of all statistics are fiction.


****************************************************************
The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to ***@analogic.com - and destroy all copies of this information, including any attachments, without reading or disclosing them.

Thank you.
Aritz Bastida
2006-02-01 14:19:40 UTC
Hi!
Post by linux-os (Dick Johnson)
At the risk of the obvious....
struct meminfo meminfo;
ioctl(fd, UPDATE_PARAMS, &meminfo);
... and define UPDATE_PARAMS and other function codes to start
above those normally used by kernel stuff so that `strace` doesn't
make up stories.
This is what the ioctl() interface is for. Inside the kernel
you can use spinlocks (after you got the data from user-space)
to make the operations atomic.
Well, actually that's more or less what I had done before, and it
worked. I had a group of ioctl commands for configuring my device. The
command number was based on an unused "magic number", as I was told
when reading Linux Device Drivers, 3rd edition:

#define SCULL_IOCSQUANTUM _IOW(SCULL_IOC_MAGIC, 1, int)
#define SCULL_IOCSQSET _IOW(SCULL_IOC_MAGIC, 2, int)

This actually works, but it doesn't seem to be "elegant" code for new
drivers in 2.6. Or at least that's what it says in LDD3, since
ioctls are system wide, unstructured, and so on.
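
For readers unfamiliar with how such command numbers are built, here is a hand-rolled sketch of the encoding. The `DEMO_*` macros are hypothetical and deliberately do not use the kernel headers; the bit layout (8 bits of command number, 8 bits of magic, 14 bits of argument size, 2 bits of direction) is my understanding of the generic Linux layout, and some architectures split the upper bits differently, so treat the positions as an assumption. The magic value 'k' is likewise only an assumed example.

```c
#include <stddef.h>

/* Assumed bit positions, modelled on the generic layout. */
#define DEMO_IOC_NRSHIFT    0
#define DEMO_IOC_TYPESHIFT  8
#define DEMO_IOC_SIZESHIFT  16
#define DEMO_IOC_DIRSHIFT   30
#define DEMO_IOC_WRITE      1U   /* userspace writes, the driver reads */

/* Build a "write" command number from magic, ordinal and argument type. */
#define DEMO_IOW(type, nr, argtype)                                 \
    ((DEMO_IOC_WRITE << DEMO_IOC_DIRSHIFT) |                        \
     ((unsigned int)sizeof(argtype) << DEMO_IOC_SIZESHIFT) |        \
     ((unsigned int)(type) << DEMO_IOC_TYPESHIFT) |                 \
     ((unsigned int)(nr) << DEMO_IOC_NRSHIFT))

#define DEMO_IOC_MAGIC    'k'    /* assumed magic number for this sketch */
#define DEMO_IOCSQUANTUM  DEMO_IOW(DEMO_IOC_MAGIC, 1, int)
#define DEMO_IOCSQSET     DEMO_IOW(DEMO_IOC_MAGIC, 2, int)

/* Decoders, useful for validating a command inside a handler. */
unsigned int demo_ioc_nr(unsigned int cmd)   { return cmd & 0xffU; }
unsigned int demo_ioc_type(unsigned int cmd) { return (cmd >> DEMO_IOC_TYPESHIFT) & 0xffU; }
unsigned int demo_ioc_size(unsigned int cmd) { return (cmd >> DEMO_IOC_SIZESHIFT) & 0x3fffU; }
```

Under this layout the command value depends only on the direction, the magic, the ordinal and the size of the argument type.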

That's why I was asking for a more elegant and cleaner configuration
method. It seems that the new filesystem "configfs" is perfect for
that, but I would like to know if netlink sockets can also be used for
that purpose (as I'm writing a kind of network device).

Thank you anyway
Regards

Aritz
linux-os (Dick Johnson)
2006-02-01 15:11:11 UTC
Post by Aritz Bastida
Hi!
Post by linux-os (Dick Johnson)
At the risk of the obvious....
struct meminfo meminfo;
ioctl(fd, UPDATE_PARAMS, &meminfo);
... and define UPDATE_PARAMS and other function codes to start
above those normally used by kernel stuff so that `strace` doesn't
make up stories.
This is what the ioctl() interface is for. Inside the kernel
you can use spinlocks (after you got the data from user-space)
to make the operations atomic.
Well, actually that's more or less what I had done before, and it
worked. I had a group of ioctl commands for configuring my device. The
command number was based on an unused "magic number", as I was told
#define SCULL_IOCSQUANTUM _IOW(SCULL_IOC_MAGIC, 1, int)
#define SCULL_IOCSQSET _IOW(SCULL_IOC_MAGIC, 2, int)
This is not good. If you change your kernel, you would have to
recompile your applications as well. The ioctl() command value
needs to be a constant that doesn't change when kernel headers
change.
Post by Aritz Bastida
This actually works, but it doesn't seem to be "elegant" code for new
drivers in 2.6. Or at least that's what it says in LDD3, since the
ioctls are system wide, unstructured, and so on.
An ioctl() call takes a file descriptor, which means it is for your
device only, not something "system wide". I noticed some bugs in
some kernel versions where all ioctls seemed to be treated the same,
so a function meant for my device could actually do something
to a terminal. I don't know whether these bugs still exist; if they
do, they are bugs.
Post by Aritz Bastida
That's why I was asking for a more elegant and cleaner configuration
method. It seems that the new filesystem "configfs" is perfect for
that, but I would like to know if netlink sockets can also be used for
that purpose (as I'm writing a kind of network device).
Well, elegant isn't necessarily better, just different. The writers
of sysfs want you to use their interface, the writers of xxxfs
want you to use their interface, etc. Ioctls are forever.
They will exist as long as there is a kernel. Even network
manipulating things like ethtool, ifconfig, and route use
ioctls.
Post by Aritz Bastida
Thank you anyway
Regards
Aritz
Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.66 BogoMips).
Warning : 98.36% of all statistics are fiction.